This month Intel moves on from the Core microarchitecture to its next generation of processors for mobile, desktop and servers, codenamed Nehalem and officially named the Core i7 family.
We’ve spent a few weeks with Intel’s test kit for the new desktop part, codenamed Bloomfield, as well as the new compatible motherboard chipset, the X58 Express, codenamed Tylersburg.
The new platform represents a fundamental change in the way Intel processors communicate with the rest of the system, but more on that later.
Tick-Tock: this one is a tock
Intel’s tick-tock development process means every “tick” of the clock is a minor update to processor architecture (such as a process shrink) while the “tock” is a major upgrade to the architecture.
Since adopting this process, Intel’s first “tick” was the process shrink of the Presler, Yonah and Dempsey CPUs to 65nm, and the first “tock” was the release of the Core microarchitecture, which superseded them.
Since then there’s been another “tick”, with a process shrink from 65nm to 45nm for the Penryn processors.
Now we’re seeing the “tock” side of the development process with the new Nehalem microarchitecture, so this is a major relaunch.
Intel’s Tick-Tock Model
Looking forward, we can expect to see a process shrink of the Nehalem family to 32nm around this time next year and, roughly 12 months after that, a new microarchitecture codenamed Sandy Bridge (formerly Gesher).
Currently we don’t know much about Sandy Bridge, except for unconfirmed whispers that it will focus on power efficiency and include a combined CPU and GPU on die.
[#PAGE-BREAK#New to the core#]
Bloomfield has a number of fundamental changes from the Penryn processors we’re used to. Physically, Bloomfield is a larger processor in a new socket that is incompatible with existing coolers and motherboards.
Though we’ve been steady with socket LGA775 since 2004, moving to the new on-chip memory controller means a larger number of physical connections are needed for the interface, and so a new socket is born: socket LGA1366. It’s physically larger than socket LGA775 and is incompatible with existing CPU coolers, so no easy upgrade there.
The rear of the Bloomfield Core i7 processor, showing the new socket interface
More importantly, Bloomfield does away with the front side bus (FSB) altogether and integrates the memory controller directly into the processor itself. Replacing the FSB is a new interconnect named QuickPath Interconnect (QPI).
The processor’s cache has also been revised, with some major changes over Penryn.
[#PAGE-BREAK#Death of Frontside Bus means better connection of multiple processors#]
The biggest news in the Nehalem family is that the FSB is no more. As the FSB has become a bottleneck in modern Intel systems (hence the constant increases in FSB speeds in recent years), Intel has replaced it with a new communications path named QuickPath Interconnect.
QPI can handle up to 6.4GT/sec using an Extreme Edition processor, or 4.8GT/sec in lower performance chips, between the processor and the IOH (northbridge) and other processors in a multi-CPU setup. Compared to the 1333MT/s FSB of Penryn, that’s a significant speed increase.
(GT/sec stands for gigatransfers per second: the number of data transfers, in billions, made across the link each second.)
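Transfer rates only turn into bandwidth once you know how many bytes move per transfer. The sketch below is a rough back-of-the-envelope comparison, not an Intel figure: it assumes each QPI link carries 2 bytes of payload per transfer in each direction and that the FSB is 8 bytes (64 bits) wide. Neither width is stated in this article, and the function name is ours.

```python
# Peak theoretical bandwidth = transfers per second x bytes per transfer.
# Assumptions (not from the article): 2 bytes of payload per QPI transfer in
# each direction, and an 8-byte (64-bit) wide front side bus.
QPI_BYTES_PER_TRANSFER = 2
FSB_BYTES_PER_TRANSFER = 8


def peak_bandwidth_gb_s(mega_transfers_per_sec, bytes_per_transfer):
    """Return peak bandwidth in GB/s, taking 1 GB as 10**9 bytes."""
    return mega_transfers_per_sec * bytes_per_transfer / 1000


# Rates quoted in the article, expressed in MT/s.
links = {
    "QPI 6.4GT/sec (per direction)": (6400, QPI_BYTES_PER_TRANSFER),
    "QPI 4.8GT/sec (per direction)": (4800, QPI_BYTES_PER_TRANSFER),
    "FSB 1333MT/s (shared)": (1333, FSB_BYTES_PER_TRANSFER),
}

for name, (rate, width) in links.items():
    print(f"{name}: {peak_bandwidth_gb_s(rate, width):.1f} GB/s")
```

Under these assumptions a single QPI link moves data in both directions at once, whereas the FSB is one shared path that also has to carry memory traffic, so the per-direction figures understate the practical gap.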
Communication with the system RAM is done directly from the CPU, bypassing the northbridge altogether and leaving more bandwidth available for IO.
QPI is a more efficient way of moving data around a platform as it’s a point-to-point system. This will understandably increase the complexity of motherboards, especially in a multi-CPU system, but because there are no third-party stops along the data path, direct communication between a CPU and devices like the IOH (northbridge), other CPUs and their memory sees substantial increases in performance.
Unlike the Skulltrail platform, which used an entry-level server/workstation board to support a dual processor system, X58 already includes two QPI endpoints, allowing two physical CPUs to be connected.
The matching CPU with two QPI connections, allowing a dual processor setup, is codenamed Gainestown, but no specific announcement about Gainestown compatibility with X58 has been made.
[#PAGE-BREAK#Intel adds a whole new layer of Cache: L3#]
Bloomfield has some significant changes in its cache configuration compared to Penryn, with some aspects slower and some much faster.
Level 1 cache is unchanged in size from the Core microarchitecture at 64KB per core (32KB four-way instruction cache and 32KB eight-way data cache).
Bloomfield’s level 2 cache is set at 256KB per core, so there’s 1MB in total for the first lot of four-core Bloomfield processors.
Each L2 cache is private to its core.
Bloomfield sports an 8MB Level 3 cache shared between the four cores.
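To put the three cache levels side by side, here is a small sketch that totals them for a four-core Bloomfield using only the sizes quoted above. It assumes the 64KB of L1 is per core; the variable and function names are ours.

```python
KB = 1024
MB = 1024 * KB

CORES = 4
L1_PER_CORE = 64 * KB    # 32KB instruction + 32KB data (assumed to be per core)
L2_PER_CORE = 256 * KB   # private to each core
L3_SHARED = 8 * MB       # one pool shared by all cores


def cache_totals(cores=CORES):
    """Return the total bytes of L1, L2 and L3 cache for the given core count."""
    return {
        "L1": cores * L1_PER_CORE,
        "L2": cores * L2_PER_CORE,
        "L3": L3_SHARED,  # shared, so it does not scale with the core count here
    }


totals = cache_totals()
for level, size in totals.items():
    print(f"{level}: {size // KB}KB")
print(f"All levels: {sum(totals.values()) / MB:.2f}MB")
```

The private L2 is small next to the shared L3, which is where most of the on-die cache capacity now sits.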
[#PAGE-BREAK#Tylersburg: Intel's enthusiast motherboard chipset#]
As Nehalem processors aren’t compatible with existing motherboard chipsets, the first on the block to support the new chips is Intel’s own X58 Express performance chipset, codenamed Tylersburg. Intel’s motherboard integrating the X58 chipset is the “Smackover” DX58SO.
X58 boards will be available at Nehalem’s launch from a number of vendors including GIGABYTE, MSI, ASUS and EVGA. Though X58 is an enthusiast chipset (with multiple graphics card and cooling options), there will be lower-end SKUs shortly after today’s launch for other parts of the market such as home entertainment or budget solutions.
X58 has also undergone major changes from the X48 chipset, such as the removal of the memory controller and the integration of QPI to talk to the processor. X58 still uses DMI to talk to the southbridge and PCI Express 2.0 to communicate with add-on cards and graphics options.
For enthusiasts and gamers, of note is the (optional) integration of NVIDIA SLI capability into an Intel motherboard, so at least some X58 motherboards will support both NVIDIA SLI and AMD CrossFire multi-GPU setups. The SLI support is allowed through an NVIDIA certification scheme and only for X58 motherboards. The support only extends to a two-way SLI configuration with both cards at x16, or a three-way setup with one card at x16 and two at x8.
[#PAGE-BREAK#One technology back from the dead PLUS a doozy...#]
Nehalem sees the comeback of Hyperthreading in mainstream Intel processors. With Nehalem family processors having a real core count of two to eight cores, Hyperthreading gives the processor the opportunity to execute two threads per core concurrently, meaning a thread count per processor of 4 to 16.
It does add performance to heavily threaded applications, which are becoming increasingly (but slowly) more common. Of course, assuming you’re using a four-core CPU, any application that doesn’t use at least five threads won’t see any advantage from Hyperthreading, as four threads would be handled by the processor natively.
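The thread arithmetic above is simple enough to sketch. This is only an illustration of the rule of thumb in the text (an application needs more threads than there are physical cores before Hyperthreading can help); the function names are ours and it says nothing about how much faster a given application will actually run.

```python
THREADS_PER_CORE = 2  # Hyperthreading runs two threads per physical core


def hardware_threads(physical_cores):
    """Number of threads the processor can execute concurrently."""
    return physical_cores * THREADS_PER_CORE


def can_benefit_from_hyperthreading(app_threads, physical_cores):
    """True if the application has more threads than there are physical cores."""
    return app_threads > physical_cores


for cores in (2, 4, 8):
    print(f"{cores} cores -> {hardware_threads(cores)} hardware threads")

for app_threads in (4, 5, 8):
    verdict = can_benefit_from_hyperthreading(app_threads, physical_cores=4)
    print(f"{app_threads}-thread application on a four-core CPU: can benefit = {verdict}")
```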
There’s a list of heavily threaded commercial applications at the end of this article, which will give a feel for the kinds of applications that will be able to take advantage of Hyperthreading on the Bloomfield processor.
The doozy: three channel DDR3 memory controller
The doozy, and a marked change from the way Intel’s processors have worked in the past, is the integration of a memory controller directly into the processor package. Having the controller integrated allows for much faster communication with system memory and offers around 300% increased bandwidth between the processor and memory.
Nehalem processors all include a three-channel integrated DDR3 memory controller with support for two memory slots per channel, for a total of up to six per processor. Intel’s X58 motherboard includes four memory slots, while the boards from other brands we’ve seen all include six.
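As a rough guide to what the extra channel is worth, the sketch below works out theoretical peak memory bandwidth and the slot count from the figures above. It assumes each DDR3 channel is 8 bytes (64 bits) wide, which this article does not state, and it ignores timings and real-world efficiency; DDR3-1600 is simply the memory speed used in our test systems, and the function names are ours.

```python
DDR3_BYTES_PER_TRANSFER = 8  # assumption: each DDR3 channel is 64 bits wide


def peak_memory_bandwidth_gb_s(mega_transfers_per_sec, channels):
    """Theoretical peak bandwidth in GB/s, taking 1 GB as 10**9 bytes."""
    return mega_transfers_per_sec * DDR3_BYTES_PER_TRANSFER * channels / 1000


def max_memory_slots(channels, slots_per_channel=2):
    """Maximum memory slots a processor can address across its channels."""
    return channels * slots_per_channel


triple = peak_memory_bandwidth_gb_s(1600, channels=3)
dual = peak_memory_bandwidth_gb_s(1600, channels=2)

print(f"Three channels of DDR3-1600: {triple:.1f} GB/s peak")
print(f"Two channels of DDR3-1600:   {dual:.1f} GB/s peak")
print(f"Slots per processor:         {max_memory_slots(3)}")
```

At the same memory speed, going from two channels to three raises the theoretical peak by half, before counting anything gained by moving the controller onto the processor.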
We put the Intel Core i7 965 Extreme Edition through some benchmarks against an Intel Core 2 Extreme QX9770 on the following testbenches:
Bloomfield test system
CPU: Core i7 965 Extreme Edition
Motherboard: ASUS P6T Deluxe
Memory: Corsair Dominator @ DDR3-1600 8-8-8-24 1T 6GB (3x2GB)
Graphics: GIGABYTE HD4870 1024MB
PSU: Silverstone OP1000
Storage: Intel X25-M
Penryn test system
CPU: Core 2 Extreme QX9770
Motherboard: Foxconn BlackOps
Memory: Corsair Dominator @ DDR3-1600 7-7-7-21 2T 2GB (2x1GB)
Graphics: GIGABYTE HD4870 1024MB
PSU: Thermaltake Tough Power 1200
Storage: Intel X25-M
For this first look we prepared some basic benchmarks including PCMark Vantage, 3DMark Vantage and Crytek’s Crysis CPU benchmarks. The Vantage benchmarks show a marked difference in performance between the Core i7 and QX9770 processors; however, under Crysis there’s less difference to be seen, as optimisations for multiple cores aren’t enabled under Vista.
Crytek Crysis – Windows Vista 64
Products on Shelves
Released in the next few weeks will be three Bloomfield products: the Intel Core i7 965 Extreme Edition at 3.2GHz, the Intel Core i7 920 at 2.66GHz and another Core i7 product at 2.93GHz. Intel X58 Express based motherboards will be available from launch from Intel, MSI, GIGABYTE, ASUS and EVGA.
[#PAGE-BREAK#Software that heavily uses multithreading#]
A snapshot of software titles that take advantage of 4 or more threads. List supplied by Intel.
- THQ Relic Company of Heroes
- Sierra World in Conflict: Soviet Assault
- EA Flagship Hellgate: London (extra particle effects)
- Crytek Crysis (Windows XP only)
- Ubisoft Assassin’s Creed
- Ubisoft Far Cry 2+
- Capcom Lost Planet Colonies
- Kingsoft Mission Against Terror
- Midway/Epic Unreal Tournament 3
CONSUMER/MAINSTREAM CONTENT CREATION
- Sonic Easy Media Creator 10
- Cyberlink Power Director 6 Plus
- ProShow Gold 3.2
- TMPEGEnc XPress 4.4
- Avid Pinnacle Studio 12
- Corel DVD Movie Factory 7
- Cyberlink Power Producer 5
- Cyberlink Power Director 7
- Corel® Video Studio X2
PROSUMER/PROFESSIONAL CONTENT CREATION
- Adobe Photoshop CS3
- Adobe After Effects CS4+
- DivX Codec v6.8
- Autodesk 3d Studio Max
- POV-Ray 3.7 Beta 23
- Maxon Cinema v11+
- Main Concept Reference Encoder and Decoder v. 1.5
- 3ivx MPEG 4
- Sobey Edit Max 7
- Newtek Lightwave v9.5
- Sony Vegas v8.0b
- Cineform Prospect HD
- Thomson Canopus EDIUS Pro 5
- Microsoft Office Excel 2007
- Abbyy FineReader v9.0
- Yuan Fang InteriCAD6000