The advent of the microprocessor created a new class of system that liberated computing from air-conditioned rooms and storage units the size of washing machines. But as small systems have evolved, we’ve drifted from the design traits that made mainframes invaluable in critical applications.
Well, there’s no reason the x86 advantages of low cost, standardisation and low power should be at odds with manageability, resource partitioning, and good old RAS (reliability, availability, serviceability). We need to break away from our love affair with cores and caches, and focus on making x86 servers self-diagnosing and self-adapting by nature.
As with virtualisation, getting RAS right means it must be a design priority all the way down to the copper and silicon.
IBM gets that. Shortly after uniting its Power and mainframe engineering design teams, it produced a microprocessor, Power6, that represents the crucial first step along the path towards bringing mainframe-grade reliability to small servers. The Power6 design, and the p 570 total system architecture built around it, incorporates several mainframe RAS qualities that seem impossible on an air-cooled microprocessor.
Genuine RAS is predictive and proactive, not reactive, and that capability cannot be extended to a server by the OS or by an external management controller. The x86 paradigm emphasises the sensing of coarse-grained failures, such as an overheated CPU or a bad memory module, and invokes a coarse-grained response, often shutting down the system or, in servers of better design, taking out the failed part while leaving the system running.
Power6 watches gross indicators such as thermometers, fan tachometers, and parity bits just like x86 does, but Power6’s microcode is loaded with little sanity checks. The Power6 CPU analyses the consistency of the states of its various modules, and if it finds a mismatch, it takes fine-grained action ranging from retrying a failed instruction to reverting to a known, healthy state.
If an invalid state persists, Power6 recovers by moving in-process workload to a healthy CPU. This isn’t something that Power6 does in its idle time. It does so on every clock cycle, 4.7 billion times each second. And yet Power6 eclipses Power5’s performance and power efficiency. IBM couldn’t have done mainframe-grade RAS in the Power6 CPU if it had decided that Power6 would be an eight-core chip. There wouldn’t have been room on the processor die for RAS. Instead, IBM decided that Power6 would be a dual-core CPU with two SMT (simultaneous multithreading) threads per core. IBM managed to advance performance, doubling that of Power5, and reduce power consumption while implementing RAS in the CPU.
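For readers who think in code, the recovery ladder described above (retry the failed instruction, revert to a known, healthy state, and finally move the workload to a healthy CPU) can be sketched as a toy software model. This is only an illustration of the escalation order: Power6 does this in hardware, and every name here (`Core`, `StateMismatch`, `run_with_recovery`, `flaky_increment`) is hypothetical rather than anything from IBM.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


class StateMismatch(Exception):
    """Hypothetical signal that a consistency check failed."""


@dataclass
class Core:
    """Toy stand-in for a CPU core with a saved known-good state."""
    name: str
    state: Dict[str, int] = field(default_factory=dict)
    checkpoint: Dict[str, int] = field(default_factory=dict)

    def save_checkpoint(self) -> None:
        self.checkpoint = dict(self.state)

    def restore_checkpoint(self) -> None:
        self.state = dict(self.checkpoint)


def run_with_recovery(instruction: Callable[[Core], None],
                      primary: Core, spare: Core,
                      retries: int = 1) -> Tuple[str, str]:
    """Run one instruction, escalating from retry to revert to migration.

    Returns (name of the core that completed the work, action taken).
    """
    primary.save_checkpoint()

    # First attempt, then plain retries of the failed instruction.
    for attempt in range(1 + retries):
        try:
            instruction(primary)
            return primary.name, "ok" if attempt == 0 else "retried"
        except StateMismatch:
            pass

    # Retries exhausted: revert to the known, healthy state and try again.
    primary.restore_checkpoint()
    try:
        instruction(primary)
        return primary.name, "reverted"
    except StateMismatch:
        pass

    # Fault persists: hand the checkpointed workload to a healthy core.
    spare.state = dict(primary.checkpoint)
    instruction(spare)
    return spare.name, "migrated"


def flaky_increment(fail_times: int, broken_core: str = "") -> Callable[[Core], None]:
    """Build a toy instruction that fails `fail_times` times in total,
    and always fails when run on the core named `broken_core`."""
    remaining = [fail_times]

    def instruction(core: Core) -> None:
        if core.name == broken_core:
            raise StateMismatch(core.name)
        if remaining[0] > 0:
            remaining[0] -= 1
            raise StateMismatch(core.name)
        core.state["acc"] = core.state.get("acc", 0) + 1

    return instruction
```

Calling `run_with_recovery(flaky_increment(0, broken_core="cpu0"), Core("cpu0", {"acc": 5}), Core("cpu1"))` walks the whole ladder and finishes the work on `cpu1`. The caller only ever sees a completed instruction, which is the software analogue of an OS that never learns anything went wrong.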
And when I say “in the CPU”, I mean that literally. The fault conditions that Power6 senses in hardware are handled in hardware. The OS is clueless that anything’s amiss, and that’s as it should be. Rapid growth in OS complexity is one of the reasons to pull RAS into hardware, and as I alluded to earlier, by the time the OS learns of a hardware failure, it’s too late to do anything but fail over. IBM will face a lot of criticism over Power6’s clock speed (and the perceived reduced density) stemming from its decision to favour RAS over multicore mania. I think that IBM made precisely the right decision. In fact, I’d like to see Intel and AMD back away from “core wars” and start devoting on-CPU real estate to RAS.