AMD’s Barcelona CPU is loaded with “invented here” innovation. It is also inspired by IBM’s Power architecture. IBM’s newest Power CPU, Power6, is due mid-year, along with quad-core processors from Intel and AMD. And while x86 will get more headlines in IT publications, Power6 is arguably more deserving.
Power6 is a dual-core CPU, and seeing everything through an x86 lens makes it appear that Power6 is at a disadvantage compared with the x86. But IBM doesn’t have any races to run against competitors; it seeks only to outdo itself. Maybe that’s why Power is still the world’s fastest microprocessor architecture.
With Power6, what strikes you first is the clock speed: 5GHz. IBM’s Power6 gigahertz are honest, meaning the chip design reflects an obsession with keeping all of Power’s resources busy, simultaneously, during every clock cycle. However, recognising that modern software rarely requires that level of hardware optimisation, Power6 turns off elements of the CPU that aren’t being used in a given cycle. Plus, IBM has developed a new, more power-efficient type of transistor. The result: despite its incredible performance, Power is not the infamously hot and inefficient monster it once was.
Power6 builds on a concept that Intel (unwisely) sidelined: SMT (simultaneous multithreading). SMT takes responsibility for fine-grained scheduling optimisation away from the operating system by creating the mirage that each CPU core is actually two. The OS feeds instruction streams (threads) to each of those virtual cores, and Power6 performs the magic of blending the threads together. Intel’s gains from its Pentium 4 implementation of SMT, called hyperthreading, were primarily evident in demanding graphical applications because Intel’s x86 system architecture is bus-bound. The sum of the bandwidth available through all of Power6’s on-CPU bus controllers is 300GB/s, so IBM’s SMT makes I/O-heavy applications such as databases take off.
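To make the blending idea concrete, here is a toy sketch of why sharing one core between two threads can pay off. It is not a model of Power6's actual pipeline: the single issue slot, the thread strings and both function names are assumptions made purely for illustration. Each thread is a string in which "x" is an instruction that needs the issue slot and "." is a cycle spent stalled, say on a cache miss.

```python
def cycles_back_to_back(threads):
    """Run each thread to completion in turn; stall cycles are simply lost."""
    return sum(len(t) for t in threads)


def cycles_smt(threads):
    """Toy SMT: one issue slot per cycle, shared by every thread.

    A stalled thread ('.') burns its stall cycle in the background, while a
    ready thread ('x') may take the slot if nobody has claimed it this cycle.
    Lower-numbered threads get priority, which is a simplification.
    """
    pos = [0] * len(threads)
    cycles = 0
    while any(p < len(t) for p, t in zip(pos, threads)):
        cycles += 1
        slot_used = False
        for i, t in enumerate(threads):
            if pos[i] >= len(t):
                continue  # this thread has finished
            if t[pos[i]] == ".":
                pos[i] += 1  # stall elapses without needing the slot
            elif not slot_used:
                pos[i] += 1  # issue this instruction
                slot_used = True
    return cycles


thread_a = "x..x..x"  # hypothetical thread that stalls often
thread_b = "xx.xx.x"  # hypothetical thread that stalls less
print(cycles_back_to_back([thread_a, thread_b]), cycles_smt([thread_a, thread_b]))
```

In this toy, the blended run finishes in fewer cycles than running the threads one after the other because one thread's stalls are filled with the other's work. Real SMT hardware can also issue from several threads in the same cycle, which this sketch deliberately leaves out.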
Both of Power6’s two cores are blessed with plenty of cache: 128KB of Level 1, 4MB of Level 2, and a share of up to 32MB of off-chip Level 3. Power6 is loaded with many more advances, but the ones that impress most relate to virtualisation and self-healing. Power6 has room for up to 1,024 hardware-managed partitions, IBM’s term for virtual machines. Memory segments are locked with keys to make one partition’s memory entirely inaccessible to malicious software, and memory can be reallocated among partitions both to improve utilisation and to work around memory errors.
There is more to Power6, of course, but any rundown of big iron RISC CPU features has to end with one that distinguishes that CPU from x86 contenders. A Power6 CPU checkpoints (that is, makes a copy of) its execution state with every clock tick. If the CPU uncovers an error in external or internal memory, or in processor behaviour, it silently restarts itself (not the entire system) and resumes execution at the last checkpoint. In other words, it doesn’t skip a beat. If the restart doesn’t resolve the problem, Power6 shuts the whole CPU down, but only after dumping its state to a healthy CPU. The healthy CPU intelligently integrates the migrated workload with its own.
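The checkpoint, retry and hand-off sequence described above can be sketched in a few lines of software. This is only an analogy under stated assumptions: ToyCore, DetectedFault, run and the injected-fault table are hypothetical names invented for the example, and the real mechanism lives in hardware rather than in a loop like this one.

```python
import copy


class DetectedFault(Exception):
    """Stands in for an error the hardware has detected (hypothetical)."""


class ToyCore:
    def __init__(self, name):
        self.name = name
        self.state = {"pc": 0, "acc": 0}
        self.checkpoint = copy.deepcopy(self.state)

    def step(self, program, faults):
        """Execute one instruction, then checkpoint the resulting state."""
        pc = self.state["pc"]
        key = (self.name, pc)
        if faults.get(key, 0) > 0:
            faults[key] -= 1
            self.state["acc"] = -1  # simulate corrupted working state
            raise DetectedFault(key)
        self.state["acc"] += program[pc]
        self.state["pc"] = pc + 1
        self.checkpoint = copy.deepcopy(self.state)


def run(program, core, spare, faults):
    """Sum `program`, retrying once per fault and then migrating to `spare`.

    `faults` maps (core name, pc) to how many times a fault fires there.
    """
    faults = dict(faults)
    log = []
    retried = False
    while core.state["pc"] < len(program):
        try:
            core.step(program, faults)
            retried = False
        except DetectedFault:
            # Roll back to the last good checkpoint before doing anything else.
            core.state = copy.deepcopy(core.checkpoint)
            if not retried:
                retried = True
                log.append(("retry", core.name, core.state["pc"]))
            else:
                # The retry failed too: hand the checkpoint to a healthy core.
                spare.state = copy.deepcopy(core.checkpoint)
                spare.checkpoint = copy.deepcopy(core.checkpoint)
                log.append(("migrate", core.name, spare.name))
                core, retried = spare, False
    return core.state["acc"], log


# A fault that fires twice on core "A" forces a retry and then a migration.
print(run([1, 2, 3, 4], ToyCore("A"), ToyCore("B"), {("A", 2): 2}))
```

In both the transient case (the fault fires once) and the persistent case (it fires twice), the program produces the same sum as a fault-free run, which is the software-level meaning of not skipping a beat.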
Features like these would be overkill for x86, since no operating systems or applications could take advantage of them. But then I wonder whether having potential like this in hardware would spur a much-needed revolution in software.