Single cores bite the dust as Intel’s first true multicore processor hits the launchpad with up to six cores plus an enhanced ‘Turbo’ speed-boost mode.
This week is pretty much the ‘coming out party’ for Intel’s next-gen Nehalem microarchitecture, which succeeds the current Core platform in servers, desktops and notebooks. But it’ll be a long haul.
The first Nehalem chips will land in Q4 and belong to the flagship quad-core desktop series, which will see the new Core i7 brand replace the Core 2 Duo badges. These will be the 2.66GHz i7 800, the 2.93GHz i7 900, and the 3.2GHz overclock-friendly i7 900 Extreme Edition. The Xeon X7460 server edition will have six cores.
Mainstream desktop and mobile Nehalem-class processors, however, won’t hit production until the second half of 2009 – yes, a full 12 months away. Nor will they carry the i7 identifier: they’ll have their own badges and i-numbered brands. To lessen your confusion, think of it as something akin to BMW’s line-up: the premium 7 series, the mid-range 5 series and the compact 3 series.
While Intel has outlined several features of the microarchitecture in recent months, we’ve been able to dive a little deeper into the recipe for Nehalem’s secret sauce.
First, a quick recap: Nehalem is a very different design to the Core engine, indeed to all of Intel’s previous processors. All the cores are cast onto a single silicon die, rather than the previous case of a quad-core chip being two dual-core dies strapped together. Each core can handle simultaneous multi-threading at two threads per core (what Intel calls Hyper-Threading).
This native multicore design appears set to put single core chips out to pasture. “Never is a long time”, Intel senior veep Pat Gelsinger told apcmag.com, “but at this point we have no single core versions of Nehalem on our roadmap.”
There’s also a single slab of shared cache common to all of the cores, which sees L2 cache scaled back in size and importance – for instance, the three i7 debutantes each have 8MB of L3 but just 256KB of L2 cache. The aim is to boost performance by putting data as close as possible to the processing cores.
Nehalem’s memory controller is baked into the silicon, rather than being an external ‘northbridge’ hub, while an integrated microcontroller is dedicated to overseeing the chip’s power management. Hooking it all together is a dedicated CPU pipeline called QuickPath Interconnect which replaces the front side bus.
Intel has now revealed another item in Nehalem’s bag of tricks: a facility known as ‘Turbo Mode’ which dramatically boosts the speed of one core when running single-threaded applications. You could think of it as automatic overclocking, although Intel’s boffins might blanch because this implies pushing the processor well beyond its decreed performance ceiling.
The thinking is that when only one core is needed, that core’s operating frequency is ramped up while the other, unused cores are shut down. It’s an instant performance boost for single-threaded apps because your 2.6GHz processor suddenly becomes a 2.8GHz or even 3GHz engine. And since the other cores aren’t running, the system won’t overheat: the total thermal limits are set on the basis of all of the cores humming along at the nominal speed.
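To put rough numbers on that thermal argument, here is a toy sketch in Python. It is not Intel's algorithm: the function name, the 100 W package budget and the even per-core split are all assumptions made for illustration, and it further assumes that a shut-down core draws nothing at all.

```python
def single_core_headroom(package_budget_w: float,
                         total_cores: int,
                         active_cores: int) -> float:
    """Watts left over for the active cores when the rest are shut down.

    Toy model (assumption): the package budget is sized for every core
    running at nominal speed, each core takes an equal share, and a
    shut-down core consumes nothing.
    """
    if not 0 < active_cores <= total_cores:
        raise ValueError("active_cores must be between 1 and total_cores")
    per_core_w = package_budget_w / total_cores
    idle_cores = total_cores - active_cores
    return per_core_w * idle_cores


# Hypothetical quad-core chip with a 100 W budget, running one thread.
print(single_core_headroom(100.0, 4, 1))  # 75.0
```

In this simplified picture the lone busy core can borrow the share the three idle cores would otherwise use, which is why it can be clocked higher without the package exceeding its overall limit.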
If this sounds familiar, it’s because the Penryn-class processors introduced a similar feature last year. However, Nehalem builds this into the entire microarchitecture and takes it a few steps further. Penryn’s turbocharge was good for only one bump or ‘bin’ (in Intel parlance) of speed, such as 2.4GHz to 2.6GHz.
“We demonstrated today two bins of performance” said Gelsinger, “and in the future we’ll see well above two bins of performance delta”. Translation: a 2.4GHz Nehalem chip could conceivably set one core to push past 3GHz as needed.
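The bin arithmetic can be sketched in a few lines of Python. The 200MHz bin size is an assumption taken from the 2.4GHz-to-2.6GHz example above rather than a figure Intel has confirmed for Nehalem, and the function names are made up for illustration.

```python
def turbo_mhz(nominal_mhz: int, bin_mhz: int, bins: int) -> int:
    """Clock speed after stepping up a given number of speed bins."""
    return nominal_mhz + bin_mhz * bins


def bins_to_exceed(nominal_mhz: int, bin_mhz: int, target_mhz: int) -> int:
    """Smallest number of bins needed to push strictly past target_mhz."""
    if nominal_mhz > target_mhz:
        return 0
    return (target_mhz - nominal_mhz) // bin_mhz + 1


BIN_MHZ = 200  # assumed bin size, per the 2.4GHz -> 2.6GHz example

print(turbo_mhz(2400, BIN_MHZ, 2))          # 2800: the two bins demonstrated
print(bins_to_exceed(2400, BIN_MHZ, 3000))  # 4: bins needed to pass 3GHz
```

On those assumptions, the two demonstrated bins take a 2.4GHz part to 2.8GHz, and it would take four bins for a single core to push past 3GHz.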
Nor does Penryn have the ‘power gating’ technology of Nehalem, which enables unused cores to be completely shut down rather than set to a low-power sleep mode. “The amount of thermal headroom that can be provided in a two core environment to the other core is very small because you still have the leakage power that’s being consumed, and you only have the power of one core being available for the second core” Gelsinger explained.
David Flynn is attending IDF Fall 2008 in San Francisco as a guest of Intel.