Which bits of the business can you do without? Companies need to decide which operations are most vital if things go pear-shaped, before spending any money, as they probably can’t afford to secure everything equally.
Too many one-size-fits-all disaster recovery solutions are being sold to managers and IT people who have not thought carefully about which data and capabilities they are securing, or how important each is to the continuity of the business, says a data centre specialist.
Former Hitachi Data Systems executive Roger Cockayne now runs Hosting and Datacentre Services, part-owned by Hitachi. With 30 years’ experience in the IT business under his belt, he says he’s seen “two iterations of IT trying to lead business in disaster recovery, when they know nothing about what they’re trying to secure”. In between these phases, business has to some extent reassumed control.
The business needs to know and tell IT which elements are being protected, how important each is, what cost they’re willing to pay for that protection, and how quickly they need each part of the system to be recovered in the event of a disaster, he says.
“Those are matters that need business expertise. People tend to make binary decisions, when there are gradations of risk and cost.”
The gradation, it should be noted, is not continuous, Cockayne says. “If the business says they can’t manage without something for an hour, then that’s as good as immediate; the only way you’ll get that standard of recovery is to have a hot backup site.”
Recovery-time standards fall into three broad bands: “instantaneous, a number of hours and a number of days”, and each has its own appropriate regime and cost.
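To make the banding concrete, here is a minimal Python sketch that sorts a tolerable outage into one of the three bands Cockayne names. The one-hour cut-off follows his remark that an hour is "as good as immediate"; the 24-hour boundary between "hours" and "days", and the names `RecoveryBand` and `recovery_band`, are assumptions made for illustration, not figures or terms from him.

```python
from enum import Enum


class RecoveryBand(Enum):
    INSTANTANEOUS = "instantaneous"
    HOURS = "a number of hours"
    DAYS = "a number of days"


# Cut-offs in hours. The first reflects the "an hour is as good as
# immediate" remark; the second is an assumed boundary for illustration.
HOT_SITE_MAX_HOURS = 1.0
HOURS_BAND_MAX_HOURS = 24.0


def recovery_band(tolerable_outage_hours: float) -> RecoveryBand:
    """Map the outage the business says it can tolerate to a recovery band."""
    if tolerable_outage_hours < 0:
        raise ValueError("tolerable outage cannot be negative")
    if tolerable_outage_hours <= HOT_SITE_MAX_HOURS:
        return RecoveryBand.INSTANTANEOUS
    if tolerable_outage_hours <= HOURS_BAND_MAX_HOURS:
        return RecoveryBand.HOURS
    return RecoveryBand.DAYS


# Hypothetical business functions and the outage each can tolerate, in hours.
requirements = {"payments": 0.5, "payroll": 8, "archive search": 72}
for name, hours in requirements.items():
    print(f"{name}: {recovery_band(hours).value}")
```

The point of the step function is that requirements do not trade off smoothly against cost: anything in the first band implies a hot site, whatever the exact number of minutes.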
IT should remind business leaders of technical limitations. Expectations of recovery time from slow backup media like tape are often unrealistic, Cockayne says. “If a database is more than a certain size, it’s unreasonable to expect recovery in less than a few hours from tape.”
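A back-of-envelope calculation shows why size matters. The Python sketch below estimates restore time as data volume divided by sustained throughput; the function name, the 2,000 GB database and the 100 MB/s rate are illustrative assumptions, not figures from Cockayne, and real restores add time for locating and mounting tapes and replaying logs.

```python
def tape_restore_hours(database_gb: float,
                       throughput_mb_per_s: float,
                       fixed_overhead_hours: float = 0.0) -> float:
    """Rough restore time: transfer time plus any fixed handling overhead."""
    if database_gb < 0 or throughput_mb_per_s <= 0:
        raise ValueError("size must be >= 0 and throughput must be > 0")
    transfer_seconds = database_gb * 1000 / throughput_mb_per_s  # GB -> MB
    return transfer_seconds / 3600 + fixed_overhead_hours


# Assumed example: a 2,000 GB database restored at a sustained 100 MB/s.
hours = tape_restore_hours(2000, 100)
print(f"about {hours:.1f} hours before any handling overhead")
```

Under those assumed figures the transfer alone takes about 5.6 hours, which already rules out the "instantaneous" band.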
Critical New Zealand operations such as banks don’t have a backup site in the narrow meaning of the word, he says; instead, two data centres run constantly in parallel, with appropriate load-balancing, each standing ready to take over the whole load if the other goes down.
Clustering of geographically separated processors is an under-used technology in a disaster recovery context, he says. A company with four processors may think it needs an equivalent resource in its backup or parallel site, but those processors may well be running at no more than 75% of capacity, so six clustered processors in two geographically separated groups of three will almost certainly provide adequate coverage.
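The arithmetic behind that claim can be checked in a few lines of Python. Four processors at 75% carry three processors' worth of work, which is exactly what one surviving group of three can absorb. The sketch assumes identical processors and no clustering overhead, and `survives_site_loss` is a hypothetical helper name.

```python
def survives_site_loss(processors_per_site: int,
                       sites: int,
                       current_processors: int,
                       peak_utilisation: float) -> bool:
    """True if the remaining sites can carry today's peak load alone.

    Assumes identical processors and no clustering overhead.
    """
    peak_load = current_processors * peak_utilisation   # in processor units
    surviving = processors_per_site * (sites - 1)        # after losing one site
    return surviving >= peak_load


# The case in the text: four processors at 75%, rebuilt as two groups of three.
print(survives_site_loss(3, 2, current_processors=4, peak_utilisation=0.75))
# Two groups of two would not be enough for the same load.
print(survives_site_loss(2, 2, current_processors=4, peak_utilisation=0.75))
```

On these assumptions the clustered layout needs six processors in total, rather than the eight implied by duplicating all four at a second site.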
Networking is actually “the simplest matter to resolve”, Cockayne suggests, though with telecommunications pricing structured as it is in New Zealand, it’s rather expensive.
The ideal setup is to have two physically separate networking centres, each connected into “the cloud that represents your users”, and each cross-connected to the parallel data centres. This means the business will be using only part of the capacity of each network connection; with the normal safety margin, perhaps as little as a quarter of it.
But the telco will still charge for all of it. “There is a lot of opportunity for telcos to build packages that are recognised as being only partly used most of the time and charged accordingly.”
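Cockayne does not break down the "quarter" figure, but one plausible reading can be sketched in Python: each connection is sized to carry the whole load alone, with headroom, while in normal running the load is shared between the parallel connections. The function name and the headroom factor of 2 are assumptions for illustration only.

```python
def normal_link_utilisation(parallel_links: int, headroom_factor: float) -> float:
    """Fraction of one link's capacity used in normal running.

    Assumes each link is sized to carry the full load times headroom_factor
    on its own, and that the load is split evenly while all links are up.
    """
    if parallel_links < 1 or headroom_factor < 1:
        raise ValueError("need at least one link and a headroom factor >= 1")
    return 1.0 / (parallel_links * headroom_factor)


# Two networking centres sharing the load, each link sized at twice peak load.
share = normal_link_utilisation(parallel_links=2, headroom_factor=2.0)
print(f"{share:.0%} of each link is in use")
```

With those assumed values each connection runs at 25% in normal operation, which is the gap between what is used and what is billed.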
Threats are increasing: terrorism directed against computers, if not yet a reality, is becoming something that has to be planned for, and fail-safe operation will increasingly be demanded by the user business and its end-users.
“The time is coming when anyone operating an essential business will have to provide guarantees of complete protection.”