The Bulldozer's here, but is AMD's new platform better suited to a future when heavily threaded apps rule?
AMD has (finally) released its next-generation all-purpose processor family, based on the architecture previously codenamed Bulldozer. These new desktop processors, codenamed Zambezi, fall under a reworking of AMD’s high-end brand name: FX. The CPUs work with already available chipsets (the 990FX, 990X and 970 models) and operate in boards with support for AM3+ processors, usually with a BIOS update.
The AMD FX processors are designed from the ground up for expansion. AMD’s focus is on providing many cores in a CPU, and it has developed an interesting approach to raising the core count without duplicating unnecessary parts of the processor. The basic building block of an FX processor is a module. A module consists of two CPU cores that share a number of components, including the L2 cache, the floating point units and the decode stage. Sharing these components saves die space and reduces heat, but there's a slight danger that both cores will need the same resource at the same time, which slows them down. It’s not really the same as having complete cores -- the two cores in a module aren’t quite up to the performance level they would reach if they were 100% independent on the same die -- but the savings in space and heat mean AMD can fit more of them in a chip, more than making up for that inefficiency.
Finished Zambezi CPUs come with two, three or four Bulldozer modules, for a four-, six- or eight-core chip. Just as the cores share resources within a module, the modules share resources within the processor, though in a more traditional design. All the modules have access to the processor's 8MB L3 cache and share I/O resources to the mainboard, such as memory, through the integrated northbridge and HyperTransport links.
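The module arithmetic described above can be sketched in a few lines of Python. This is only an illustrative model, not an AMD specification: the class and constant names are our own invention, and the shared-resource lists simply restate what is described in the text.

```python
from dataclasses import dataclass

CORES_PER_MODULE = 2  # each module pairs two cores

# Resources shared by the two cores inside one module (as described above)
MODULE_SHARED = ("L2 cache", "floating point units", "decode stage")
# Resources shared by every module on the chip (as described above)
CHIP_SHARED = ("8MB L3 cache", "integrated northbridge", "HyperTransport links")


@dataclass
class ZambeziChip:
    """Hypothetical model of a Zambezi CPU assembled from modules."""
    modules: int

    @property
    def cores(self) -> int:
        # Core count is always twice the module count
        return self.modules * CORES_PER_MODULE


for modules in (2, 3, 4):
    chip = ZambeziChip(modules)
    print(f"{chip.modules} modules -> {chip.cores} cores")
```

The point of the model is that core count scales in steps of two, while the per-module resources are only ever duplicated once per pair of cores.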
Like the competition, AMD has implemented Turbo functionality in the Zambezi design. The system analyses the activity on each module, then works out how much power is being used and how much heat is being generated. If the processor is under its thermal cap, the clock speed can be increased on every core for as long as it stays under that cap. That’s Turbo Core. When only a few cores or modules are under load, the processor can increase the frequency even more, by 600MHz over the stock speed on the FX-8150, for example, and that scenario is Max Turbo.
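As a rough illustration of that decision, the sketch below picks a clock speed from the number of busy modules and the available power headroom. It is a simplified model under our own assumptions, not AMD's actual algorithm: the function name, the "half the modules or fewer" rule and the wattage figures are all hypothetical, and the intermediate Turbo Core clock is an assumed value sitting between the FX-8150's base and Max Turbo speeds.

```python
def pick_clock_ghz(busy_modules, total_modules, power_watts, power_cap_watts,
                   base=3.6, turbo_core=3.9, max_turbo=4.2):
    """Return a clock speed (GHz) under a hypothetical Turbo policy."""
    if power_watts >= power_cap_watts:
        # No headroom left: stay at the stock speed
        return base
    if busy_modules <= total_modules // 2:
        # Only a few modules loaded (assumed threshold): largest boost
        return max_turbo
    # Most or all modules loaded but still under the cap: modest boost
    return turbo_core


# Wattage values below are made up purely for the example
print(pick_clock_ghz(4, 4, 130, 125))  # over the cap
print(pick_clock_ghz(4, 4, 100, 125))  # every module busy, under the cap
print(pick_clock_ghz(2, 4, 100, 125))  # half the modules busy, under the cap
```

A real implementation would re-evaluate continuously and step the clock up and down; this sketch captures only the three outcomes described above.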
Three processors are technically for sale (but not necessarily available at time of writing). The big gun is the FX-8150, retailing at US$245, an eight-core model running at a base of 3.6GHz and with a Max Turbo of 4.2GHz. Next in line is the FX-8120 at US$205, an eight-core, 3.1GHz model with a 4.0GHz Max Turbo. Down from there is the FX-6100 at US$165, a six-core model at 3.3GHz and 3.9GHz Max Turbo.
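To put those clock speeds side by side, the short sketch below works out how far each model's Max Turbo sits above its base clock. The figures are the ones listed above; the table layout and function name are ours.

```python
# (model, cores, base GHz, Max Turbo GHz, US$ price) as listed above
LINEUP = [
    ("FX-8150", 8, 3.6, 4.2, 245),
    ("FX-8120", 8, 3.1, 4.0, 205),
    ("FX-6100", 6, 3.3, 3.9, 165),
]


def turbo_headroom_mhz(base_ghz, max_turbo_ghz):
    """Gap between Max Turbo and base clock, rounded to whole MHz."""
    return round((max_turbo_ghz - base_ghz) * 1000)


for model, cores, base, turbo, price in LINEUP:
    headroom = turbo_headroom_mhz(base, turbo)
    print(f"{model}: {cores} cores, +{headroom}MHz headroom, US${price}")
```

By this arithmetic the FX-8150 and FX-6100 each have 600MHz of headroom, while the lower-clocked FX-8120 has 900MHz.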
What you’ll notice right away are the high clock speeds of these processors. Though we’re seeing more and more performance gains from multi-threading, thanks to more modern software design practices, it’s still important to keep clock speeds up, especially in a scenario with shared resources, so the chip can move past blocked threads and resource contention quickly. Clock speed is also one of the bigger factors in single-threaded performance, and that’s important for this chip as well.
We received an AMD FX-8150 to run through our testing and promptly set it up with a Corsair H100 water cooling kit, 4GB of DDR3-1866-compatible Kingston HyperX memory, an MSI Radeon 6970 at stock speeds and a Patriot Wildfire SSD, along with the provided ASUS Crosshair V Formula motherboard. The only street pricing for the FX-8150 we could find at time of writing was a pre-order for $315.
Alongside it, we set up an Intel Core i5-2500K system, a popular contender from the competition at a competitive price point. This system used the same memory, SSD and cooler, with a GIGABYTE G1.Sniper2 motherboard. The Core i5-2500K processor is available for around $255 at retail.
Our testing showed us a few things about the Zambezi design right off the bat. Firstly, its performance with single-threaded applications (most smaller, less complicated software) is unimpressive. Secondly, its performance with heavily threaded applications is good.
A prime example of the former is our result from the LAME MP3 encoding tool. Encoding a 57-minute WAV file to MP3 took 2 minutes, 1 second with the FX-8150 and 1 minute, 47 seconds with the Core i5. Conversely, in heavily threaded scenarios, such as the compression tool 7-Zip, we were able to push Bulldozer well past the Core i5-2500K, compressing our test files in 31 seconds compared to the Core i5’s 55 seconds.
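To express those timings as percentages, a short calculation like the following does the job. The helper names are ours; the times are the ones quoted above.

```python
def seconds(minutes, secs):
    """Convert a minutes-and-seconds timing to total seconds."""
    return minutes * 60 + secs


def percent_longer(time_s, reference_s):
    """How much longer time_s is than reference_s, as a percentage."""
    return (time_s / reference_s - 1) * 100


lame_fx, lame_i5 = seconds(2, 1), seconds(1, 47)  # 121s vs 107s
zip_fx, zip_i5 = 31, 55                           # 7-Zip, in seconds

print(f"LAME:  FX-8150 takes {percent_longer(lame_fx, lame_i5):.0f}% longer")
print(f"7-Zip: Core i5 takes {percent_longer(zip_i5, zip_fx):.0f}% longer")
```

On these numbers the FX-8150 takes roughly 13% longer than the Core i5 in LAME, while the Core i5 takes roughly 77% longer than the FX-8150 in 7-Zip.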
We also ran both systems through our regular benchmarking tools, PCMark 7 and 3DMark 11, and got some results that make us think this design isn’t all it's cracked up to be. The AMD FX-8150 earned a score of 4,088 in PCMark 7, while the Core i5-2500K managed 4,666. The results were neck and neck in 3DMark 11, however, with the two systems within 20 points of each other: the AMD earned 5,911 and the Core i5 5,928.
Not resting there, we ran through a number of game benchmarks and found a few more differences. Many games became GPU-bound before we could put any real stress on the CPU. For example, in Metro 2033, both the Intel and AMD processors scored 87fps in the low-resolution 1,024 x 768 test. The graphics card was hitting a wall before the CPU became a major bottleneck.
We did find some games where the processors were more heavily involved. Dragon Age at maximum settings relies on the CPU heavily for effects, and showed the Intel Core i5-2500K as capable of 141fps while the FX-8150 wallowed with 117fps. One of our favourites, Warhammer 40K: Dawn of War II, clocked the Core i5 at 80.8fps and the FX-8150 at 58fps.
As with our other benchmarks, it wasn’t all one way. With the Battlefield 3 beta in hand and pre-release drivers, we clocked the AMD FX-8150 at an average of 50fps while the Intel Core i5-2500K scored 47fps.
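For a quick side-by-side view, the benchmark scores and frame rates quoted above can be normalised against the Core i5-2500K. The sketch below is just arithmetic on our published figures; the list layout and function name are ours.

```python
# (test, FX-8150 result, Core i5-2500K result); higher is better in every case
RESULTS = [
    ("PCMark 7", 4088, 4666),
    ("3DMark 11", 5911, 5928),
    ("Metro 2033 (fps)", 87, 87),
    ("Dragon Age (fps)", 117, 141),
    ("Dawn of War II (fps)", 58, 80.8),
    ("Battlefield 3 beta (fps)", 50, 47),
]


def fx_relative(fx, i5):
    """FX-8150 result as a percentage of the Core i5-2500K result."""
    return fx / i5 * 100


for test, fx, i5 in RESULTS:
    print(f"{test:26s} FX-8150 at {fx_relative(fx, i5):5.1f}% of the Core i5")
```

Anything under 100% is a loss for the FX-8150. Only the Battlefield 3 beta comes out above that line, with Metro 2033 a dead heat.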
The bottom line
So, are Bulldozer and Zambezi all they’re cracked up to be? Well, it’s complicated. The single-threaded performance really lets the design down in the short term, purely because so little of the software we use is heavily threaded. Ultimately, Bulldozer is a template for future growth. And now, with AMD and Intel beating the drum together to encourage more and more threaded software, its future doesn’t look as grim as its present.