Hardware in Review

Dual-core processing

By Jem Matzan

With the recent introduction of dual-core processors come more questions about system performance. Does a dual-core computer really perform better than one with a single-core CPU? How does it compare to a true multi-CPU system? What about Hyper-Threading Technology -- is it a thing of the past? Read on for an explanation of dual-core processors and symmetric multiprocessing and what they mean to your computing experience.

Dual cores or dual CPUs?

CPUs with two processing cores are the latest fashion at Intel and AMD, but the idea is by no means new -- IBM has had multi-core CPUs for a few years, and one of its most recent designs has 9 logic cores in a single processor. Dual-core processors can perform the logic work of two discrete processors; the only downside is that the two cores must share the same resources (cache memory, RAM, and memory controller).

The on-die cache memory on AMD and Intel dual-core processors is divided evenly between the two cores -- meaning the cache isn't shared globally -- which also means that there is less cache memory for each core. The Intel Pentium D and the higher-end Athlon 64 X2 models have 2MB of cache memory in total, so each core gets 1MB; lower-end X2 models such as the 3800+ have 512KB per core.

Computers that use two discrete processors have two RAM-related advantages over single processors with dual cores. To begin with, dual-CPU systems have one memory controller for each CPU, which means that each processor has access to its own RAM banks. The second advantage is the increased number of RAM banks, which allows for more memory modules to be installed. A dual-core processor shares its memory controller and RAM between the two cores, so there is some loss in efficiency and available resources when compared with a multiprocessor machine.

When adding CPUs to solve a performance problem, there is a point of diminishing returns with some processor architectures. 32-bit IA32, for instance, won't see a significant performance gain in systems that use more than 4 CPUs in tandem. Companies like Cray and Unisys, however, have designed IA32 and AMD64 servers that use up to 32 CPUs in a series of symmetrical processing subsystems; CPUs are installed in pairs on a rack of internally connected motherboards, each with its own discrete resources. Essentially this is more of a self-contained cluster than a single computer in the traditional sense.

Multithreading: the software side

Most business software (including operating systems) is multithreaded, meaning it can handle several operations at the same time. The more processors or cores you have in a computer, the more threads you can run simultaneously, and the more calculations can be completed in the same timeframe. If you have, for instance, a multithreaded graphical visualization program that renders an entire 3-dimensional blueprint of a building, the time it takes to render and display that blueprint will be significantly reduced by using more processors to perform the calculations. Smaller calculations or single-threaded programs will see no significant benefit to having multiple CPUs or logic cores. Some operating systems are designed for 2 or 4 CPUs -- or at least they are licensed or supported that way -- and might not see a benefit from using more than the stated number of processors or cores.
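The point about single-threaded programs seeing no benefit can be put into numbers with Amdahl's law, which estimates the best-case speedup when only part of a program can run in parallel. The following is a minimal sketch; the function name and the example fractions are illustrative assumptions, not measurements of any real application.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Best-case speedup predicted by Amdahl's law.

    parallel_fraction: share of the run time that can be spread across
                       cores (0.0 = fully serial, 1.0 = fully parallel).
    cores:             number of processors or cores available.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical workloads, chosen only to show the trend.
    workloads = [("single-threaded", 0.0), ("half parallel", 0.5), ("mostly parallel", 0.95)]
    for label, fraction in workloads:
        for cores in (1, 2, 4):
            print(f"{label:16s} {cores} core(s): {amdahl_speedup(fraction, cores):.2f}x")
```

A fully serial program stays at 1.00x no matter how many cores are added, while a program that is only half parallel cannot reach 2x even with unlimited cores.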

So you first need a multi-threaded operating system with multiprocessor support. Windows XP Professional (XP Home is licensed for only one physical processor), Windows 2000 (all versions), Windows Server 2003, GNU/Linux, Solaris, OS X, and Free/Open/NetBSD are the most popular multi-threaded operating systems that are still sold and supported. Of those, GNU/Linux and Solaris Unix have the most thorough and efficient multi-processing abilities.
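One quick way to see whether an operating system is actually exposing more than one processor is to ask it. The sketch below uses Python's standard library; the helper name is an assumption made for illustration. Keep in mind that the figure is a count of logical processors, so a Hyper-Threading CPU will typically be reported as two even though it has one core.

```python
import os
import platform


def logical_processor_count() -> int:
    """Return the number of logical processors the OS reports (at least 1)."""
    # os.cpu_count() returns None when the count cannot be determined.
    return os.cpu_count() or 1


if __name__ == "__main__":
    count = logical_processor_count()
    print(f"{platform.system()} reports {count} logical processor(s)")
    if count == 1:
        print("Only one processor is visible to this operating system.")
```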

Secondly, your application software must also be multi-threaded. One sure-fire way to verify that all of your applications are getting the best performance in a multi-core machine is to compile them yourself. This is easy to do on GNU/Linux distributions like Gentoo, and on most BSD variants. Since it's not usually possible to custom-compile a proprietary software program, you may have to check with the software manufacturer or vendor to make sure that your multi-CPU hardware and software configuration is supported.
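To give a feel for what "multi-threaded" means at the application level, here is a minimal sketch of a program that splits one job into independent chunks and hands them to a pool of workers. The function names and the sum-of-squares workload are hypothetical stand-ins for a real calculation. The sketch defaults to worker processes rather than threads because CPU-bound threads in the standard Python interpreter do not run in parallel; that is a property of Python, not of dual-core hardware.

```python
from concurrent.futures import ProcessPoolExecutor
from typing import List, Sequence


def sum_of_squares(chunk: Sequence[int]) -> int:
    """Stand-in for one independent piece of a larger calculation."""
    return sum(n * n for n in chunk)


def split_into_chunks(values: Sequence[int], parts: int) -> List[Sequence[int]]:
    """Split values into at most `parts` contiguous, roughly equal chunks."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    size = max(1, -(-len(values) // parts))  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def parallel_sum_of_squares(values: Sequence[int], workers: int = 2,
                            executor_cls=ProcessPoolExecutor) -> int:
    """Compute the sum of squares with one chunk per worker.

    executor_cls is any concurrent.futures executor class; worker
    processes are the default so chunks can run on separate cores.
    """
    chunks = split_into_chunks(values, workers)
    with executor_cls(max_workers=workers) as pool:
        # map() hands each chunk to a worker and yields results in order.
        return sum(pool.map(sum_of_squares, chunks))
```

When calling `parallel_sum_of_squares(list(range(1_000_000)), workers=2)` from a script, place the call under an `if __name__ == "__main__":` guard, which the process-based executor needs on platforms that start workers by re-importing the script.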

The Intel Pentium D

Intel redesigned the Pentium 4 Prescott processor to support two logic cores instead of one. It is by no means a complete redesign à la the Xeon or Pentium M -- it's just a standard LGA775 Pentium 4 with two cores. The Pentium D does have an advantage over first-generation Prescott P4 CPUs, though: all Pentium D processors have the Extended Memory 64 Technology (EM64T) extension, which is Intel's implementation of x86_64 (better known as AMD64). In other words, it's a 64-bit instruction set architecture with the ability to flawlessly run 32-bit binaries.

All Pentium D motherboards use DDR2 memory, which is a significant redesign of the DDR SDRAM that Intel previously supported. While its latency is potentially higher, DDR2 can theoretically transfer twice as much data as DDR at the same frequency. DDR2 also uses less power and can operate at frequencies that match the Pentium D's frontside bus. Intel's processor architecture still calls for a frontside bus, which is a physical pathway between the CPU and the memory controller. For maximum efficiency and performance, the RAM should run at the same frequency as the CPU's frontside bus, so choose memory that matches it.
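The "twice as much data" figure can be worked out with simple arithmetic. The sketch below assumes that "the same frequency" refers to the memory chips' internal clock, that both memory types move data over a 64-bit module, and that DDR performs two transfers per internal clock while DDR2 performs four. The function and constant names are made up for illustration, and the results are theoretical peaks, not measured throughput.

```python
BUS_WIDTH_BYTES = 8  # assumption: a standard 64-bit memory module

# Data transfers per cycle of the memory chips' internal clock.
TRANSFERS_PER_CORE_CLOCK = {"DDR": 2, "DDR2": 4}


def peak_bandwidth_mb_s(generation: str, core_clock_mhz: float) -> float:
    """Theoretical peak bandwidth in MB/s for one memory module."""
    if generation not in TRANSFERS_PER_CORE_CLOCK:
        raise ValueError(f"unknown memory generation: {generation}")
    # MHz times transfers per clock gives millions of transfers per second.
    mega_transfers = core_clock_mhz * TRANSFERS_PER_CORE_CLOCK[generation]
    return mega_transfers * BUS_WIDTH_BYTES


if __name__ == "__main__":
    for generation in ("DDR", "DDR2"):
        bandwidth = peak_bandwidth_mb_s(generation, 200)
        print(f"{generation} at a 200 MHz internal clock: {bandwidth:.0f} MB/s peak")
```

Under these assumptions a 200 MHz internal clock corresponds to DDR400 and to DDR2-800 respectively, which is where the doubling comes from.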

Pentium D processors are rated by model number, not by Intel's traditional measure of clock speed. The 820 is the first Pentium D model, and higher model numbers (830, 840, etc.) operate at higher frequencies.

Hyper-Threading Technology (HTT), which reduces the number of unused processor cycles by presenting a single core to the operating system as two, is not included in the initial Pentium D designs. Intel has not abandoned HTT, though -- it'll be re-implemented in future processor models. In effect, a Hyper-Threading processor is still only one core that works more efficiently when multi-tasking, and under heavy processor loads HTT can actually hinder performance. Since it was initially designed as a desktop computing enhancement, perhaps Intel needs to redesign HTT to be more compatible with multi-core and multi-CPU systems.

The AMD Athlon 64 X2

Like Intel, AMD did little more than add a second core to its Socket 939 Athlon 64 line to make the Athlon 64 X2. That makes the X2 the third generation of the Athlon 64, and the sixth distinct AMD64 processor design.

Athlon 64 X2 processors work with motherboards that use DDR memory. AMD will implement DDR2 memory controllers into future Athlon 64 designs, but the initial X2 processors do not support it. To boost memory bandwidth, AMD integrates the memory controller in the CPU itself, which reduces latency and eliminates the need for a frontside bus.

Athlon 64 X2 processors, like all previous Athlon 64 and Athlon XP models, are performance-rated according to model number, so the dual-core Athlon 64 X2 3800+ offers roughly the same performance as the single-core Athlon 64 3800+. The model numbers can be deceiving: the X2 3800+ actually runs at a significantly slower clock speed than the standard Athlon 64 3800+, and it relies on the second execution core to "catch up" to the faster single-core processor. In other words, a processor marked as a 3800+ is going to perform roughly the same as all other AMD processors labeled 3800+ regardless of what technologies are in use.

Power usage

Dual-core processors inherently use less electricity than dual-CPU systems because there is less hardware to consume it. The table below shows data on power consumption that I gathered recently. It compares both Intel and AMD dual-core systems to a dual Opteron system.

The dual-core systems both had 1GB RAM (two 512MB modules of Corsair TwinX-LL DDR400 and Crucial DDR2 800), an Asus P5WD2 and an A8N-E motherboard, an Antec TrueBlue 480 power supply, a Matrox G550 1X PCIe 32MB graphics card, a Seagate SATA-V 160GB hard drive, and an AOpen DVD/CDRW combo drive. The Opteron system was a Sun Java Workstation w2100z with two Opteron 252 processors, 4GB DDR400 RAM in 4 modules, a Fujitsu SCSI-320 73GB hard drive, a graphics board based on the Nvidia Quadro FX3000, and an NEC DVDRW drive. Because of the large difference in peripherals between the Opteron system and the dual-core systems, the Opteron numbers are going to read higher -- possibly by as much as 15 or 20 watt hours.

Power was measured using the Watts Up Pro watt meter. I took one reading every second for fifteen minutes, starting at the system POST beep. The operating system was Knoppix version 3.9 for x86. This runs from the CD, so the numbers will be slightly higher (~5 watt hours) than with an operating system that runs from the hard drive. I had to choose Knoppix because all other SMP-aware OSes that I tried (including Windows XP Professional) had difficulty installing on the Athlon 64 X2 machine. Knoppix, however, worked well when running from the CD. I performed three test runs on one machine to determine the margin of error; it is approximately 0.5 watt hours.

The cost calculations are based on $0.0731 per kilowatt hour, which is the average for the state of Florida as of 2003, and approximately in the middle of the energy cost spectrum in the United States. You can plug in your own cost per kWh and multiply it by the average kWh per month to get the estimated cost of operation for your area.

CPU                      Min/max watts measured   Watt hours consumed (15 min)   Average monthly kWh   Average monthly cost (USD)
Intel Pentium D 820      156/264                  42.6                           123                   $8.97
AMD Athlon 64 X2 3800+   95/168                   28.4                           81                    $5.91
AMD Opteron 252 (x2)     227/287                  60.6                           175                   $12.79
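To plug in your own electricity rate, the arithmetic can be sketched as follows. This assumes the watt-hour figures cover the fifteen-minute test window and that a month is 30 days of continuous operation; neither the constant names nor the 30-day month come from the measurements themselves. Under those assumptions the results land within roughly one kilowatt hour of the monthly figures in the table, not exactly on them.

```python
MEASUREMENT_HOURS = 0.25    # the fifteen-minute test window
HOURS_PER_MONTH = 24 * 30   # assumption: 30 days of continuous operation
RATE_USD_PER_KWH = 0.0731   # the Florida average quoted above


def average_watts(watt_hours: float, hours: float = MEASUREMENT_HOURS) -> float:
    """Average power draw over the measurement window, in watts."""
    return watt_hours / hours


def monthly_kwh(watt_hours: float) -> float:
    """Estimated energy use for a month of continuous operation, in kWh."""
    return average_watts(watt_hours) * HOURS_PER_MONTH / 1000.0


def monthly_cost(watt_hours: float, rate: float = RATE_USD_PER_KWH) -> float:
    """Estimated monthly cost in dollars at the given rate per kWh."""
    return monthly_kwh(watt_hours) * rate


if __name__ == "__main__":
    measured = {
        "Intel Pentium D 820": 42.6,
        "AMD Athlon 64 X2 3800+": 28.4,
        "AMD Opteron 252 (x2)": 60.6,
    }
    for system, watt_hours in measured.items():
        print(f"{system}: {monthly_kwh(watt_hours):.1f} kWh/month, "
              f"${monthly_cost(watt_hours):.2f}/month")
```

Pass a different `rate` to `monthly_cost` to estimate the cost of operation in your own area.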

You could run two dual-core AMD machines for the same utility cost as the dual-CPU machine. And check out how much more power the Intel machine draws than the AMD dual-core system. In large labs, offices, and schools, choosing energy-efficient computers could make a significant impact on operating costs. In this case there is no performance loss, either -- the Athlon 64 X2 machine outperforms the Pentium D 820.

Conclusions

Convinced that you need a dual-core computer now? Maybe you don't. If your current machine is good enough for your needs, there's no reason to upgrade. Those who are thinking of buying or building a new system can't ignore dual-core CPUs, though; soon they may be the only option available, as single-core CPUs gradually cease production.

Dual-core processing is not the wave of the future; multi-core processing is, however. It won't be long before we see desktop processors with more than two cores on a single die, and those computers will offer even greater efficiency and performance than the current generation of power-hungry machines. Even multi-CPU machines will eventually have multiple cores, which could double their computing power.

But to take advantage of such robust hardware, your software will have to be able to match its abilities. Don't ignore the software half of the equation, or you'll miss out on your new dual-core computer's potential.