Five Nines, by the book

          I recently had a junior engineer in our lab ask me about the term "five nines." I launched into my canned script about the term representing 99.999% uptime, translating into a five-minute, 15-second downtime budget.
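          The arithmetic behind that budget is easy to check. The sketch below assumes a 365-day year and counts every second of it as time the box is supposed to be up; the function names are illustrative and not drawn from any standard.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a 365-day, non-leap year


def downtime_budget_seconds(availability_percent, period_seconds=SECONDS_PER_YEAR):
    """Return the seconds of downtime allowed in the period at this availability."""
    return period_seconds * (1 - availability_percent / 100)


def format_minutes_seconds(seconds):
    """Render a duration as whole minutes and seconds, rounded to the nearest second."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes} min {secs} s"


print(format_minutes_seconds(downtime_budget_seconds(99.999)))  # 5 min 15 s
```

          Changing the period (a month, or the service life of the device) changes the budget in direct proportion, so a percentage on its own says little until the measurement window is stated.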

          Still with furrowed brow, he peppered me with more questions. "Is that five minutes, 15 seconds in a year or over the life of the box?" "How do you define 'downtime'?" "Does that include software patches?"

          The kid caught me flat-footed, and I realised I was passing along "spoon-fed" information about an important subject. I needed to educate myself on what five nines really means.

          Five nines is not a hard metric, but rather the result of a predictive calculation. When a company claims that its device is five-nines reliable, it is talking about an absurdly complicated mathematical calculation based on industry-standard formulas used to predict the reliability of the box.

          For every possible definition of "failure" - ranging from a hint of trouble to a total meltdown - these formulas take into account the extent of the failures, the probability with which they will occur, how quickly the failures can be diagnosed, and how soon service can be restored.
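          To give a feel for how failure frequency and restoration time combine, here is a minimal sketch of one widely used textbook relationship: steady-state availability as MTBF / (MTBF + MTTR). This is an assumption chosen for illustration, not necessarily the formula behind any particular vendor's claim, and the input figures are hypothetical.

```python
def steady_state_availability(mtbf_hours, mttr_hours):
    """Fraction of time in service, given mean time between failures (MTBF)
    and mean time to restore service (MTTR), both in hours."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must be non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures: one failure every 100,000 hours, restored in one hour.
fast_repair = steady_state_availability(100_000, 1)
# Same failure rate, but a four-hour restoration time.
slow_repair = steady_state_availability(100_000, 4)

print(f"{fast_repair * 100:.4f}%")
print(f"{slow_repair * 100:.4f}%")
```

          Even in this simplified form, the predicted figure depends as much on how quickly service is restored as on how rarely the box fails.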

          Fuzzy line around availability

          Five-nines discussions blur the line between availability and reliability. A five-nines claim could be referring to either availability or reliability, depending upon which predictive formula is used. It's important to understand the difference between the two ways vendors can spin these terms.

          For any given product, availability measures how much of the time the product was up. Reliability measures how often the product went down. So you can have one big outage, and the box will reflect high reliability but low availability. Or you could have two dozen outages of five seconds or less, and the box could be accurately described as highly available but unreliable.
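          The contrast between one long outage and two dozen brief ones can be put in numbers. The sketch below uses two hypothetical one-year outage logs (durations in seconds) and assumes a 365-day year; the eight-hour figure for the "big" outage is invented for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a 365-day year


def summarize(outage_durations_seconds, period_seconds=SECONDS_PER_YEAR):
    """Report the outage count (a rough reliability indicator) and the
    percentage of the period the product was up (availability)."""
    downtime = sum(outage_durations_seconds)
    return {
        "outages": len(outage_durations_seconds),
        "availability_percent": 100 * (1 - downtime / period_seconds),
    }


one_big_outage = [8 * 60 * 60]   # hypothetical: a single eight-hour outage
many_brief_outages = [5] * 24    # two dozen outages of five seconds each

big = summarize(one_big_outage)
brief = summarize(many_brief_outages)

print(big["outages"], f'{big["availability_percent"]:.4f}%')
print(brief["outages"], f'{brief["availability_percent"]:.4f}%')
```

          The first log shows a single failure but falls well short of five nines on availability; the second clears five nines on availability despite failing 24 times.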

          Confusing, I know. But that's the point. When a vendor or marketeer says "five nines," they're probably using it as a catchall phrase devoid of any real meaning.

          If you invoked the "spirit" of the law, very few products would stand the test. One of the originators of the concept of five nines is Telcordia, formerly Bellcore, which geared its specifications toward Local Access and Transport Area office and tandem switching systems. When I engaged in the ascetic exercise of reading the core specification document, I found out that "five nines" - however you define it - is entirely too forgiving. For example, as it relates to availability, Telcordia's allowable downtime budget for an entire system failure, in which all end users are down, is 24 seconds per year. So much for five minutes, 15 seconds.
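          To see how far a 24-second budget sits from the usual five-nines figure, a downtime allowance can be converted back into a "number of nines." This is a rough sketch assuming a 365-day year; the function name is illustrative.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a 365-day year


def nines(downtime_seconds, period_seconds=SECONDS_PER_YEAR):
    """Express a downtime allowance as a count of nines, where 5.0
    corresponds to 99.999% and 6.0 to 99.9999%."""
    return -math.log10(downtime_seconds / period_seconds)


print(f"{nines(315.36):.2f}")  # the five-minute, 15-second budget
print(f"{nines(24):.2f}")      # the 24-second total-failure budget
```

          Under these assumptions, the 24-second allowance works out to a little over six nines, roughly a thirteenth of the downtime that five nines would permit.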

          So what should a five-nines claim mean to you as a purchaser of network gear? If not backed up by independently verified testing over time, not a thing. Just throw it on the trash heap of marketing buzzwords. Focus instead on the specific redundancy features of the gear you're purchasing, and you'll be much better off.

          Percy is a technology analyst at Miercom, a network consultancy and testing centre in Princeton Junction, New Jersey.
