Most companies shouldn’t have to replicate every piece of data to protect their business from the next cataclysmic event. Nor should they necessarily have to cough up millions for a mirror site that traces every network transaction. And let’s face it, unless you’re cyber-cynical, catastrophes are extremely rare. Be that as it may, organisations are increasingly being held accountable for their data and prudence points to being prepared. Computerworld Canada asked three experts what the most commonly overlooked elements are in today’s disaster recovery plans — just in case.
1. Understand business needs
Ultimately IT is there to serve business, and disaster-recovery planning should be no different.
Sound hackneyed? Well, most IT shops still don’t get it, according to EMC Canada consultant Iain Anderson. People are still making technology decisions, and not business decisions, he says.
According to Paul Saxton, lead consultant in business resiliency at IBM Canada, recovery capabilities have to be matched to the business requirements.
“Understand that disaster recovery and business continuity are part of overall risk management,” he says. “It’s not just an IT thing.”
Anderson says IT has a responsibility to understand how business workflow ties in to business applications, and how those applications in turn are supported by infrastructure.
“One of the challenges I see all the time is that business continuity and disaster recovery fall back to the responsibility of IT, and IT’s normal response is to throw technology at it,” says Anderson.
“We tend not to spend enough time communicating out there with the business units and understanding what their business problems are,” he says.
2. Know your enemies
Disaster recovery is a type of insurance policy, so it helps to know what threats and vulnerabilities you’re likely to come up against.
Unless you’re in an area at high risk of a natural disaster, you probably won’t be building a mirror-site of your entire IT infrastructure.
But going through that vulnerability and risk assessment can spark a heated debate, says George Kerns, president and chief executive of IT services provider Fusepoint. The budget for a recovery plan is large compared to the operating budget, and if the chance of a disaster occurring isn’t high, how do you avoid spending too much?
“I think this has to come down to a rational conversation between the CIO and the CEO,” says Kerns. “They have to be aligned on what risks they’re willing to take with their business.”
Anderson notes that catastrophic failures of datacentres are rare. They’re typically built for high availability, located in a secure area and supported by a network operations centre. “Your disaster recovery plan is going to depend on how data-intensive your business is and what your company’s appetite for risk is.”
3. Map your support system
Often it’s not clear how an application is served up to a business process, or how the underlying infrastructure supports that application. Unless you know all those pieces, you won’t be able to determine what a sensible disaster-recovery plan is, says Anderson.
Without system-to-application mapping, it’s impossible to understand the interoperability and interrelationships you need to manage. “This helps you understand the recovery bundles and where those single points of failure are in the environment.”
Having that mapping from business workflow to application to system can also drive the discussion around cost, adds Anderson. It gives executives a clear sense of the extent to which IT supports business, he says.
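The kind of mapping Anderson describes can be sketched in a few lines of Python. This is only an illustration, not a method from any of the experts quoted: the process, application and server names are hypothetical, and it assumes every application needs all of the infrastructure listed for it. It shows how walking from business process to application to infrastructure exposes the components that several processes share.

```python
from collections import defaultdict

# Hypothetical mapping: business process -> applications it relies on.
PROCESS_TO_APPS = {
    "order entry": ["web store", "billing"],
    "payroll": ["hr system"],
    "customer support": ["ticketing", "billing"],
}

# Hypothetical mapping: application -> infrastructure it runs on.
APP_TO_INFRA = {
    "web store": ["web-01", "db-01"],
    "billing": ["db-01"],
    "hr system": ["db-02"],
    "ticketing": ["web-02", "db-01"],
}


def processes_by_component(process_to_apps, app_to_infra):
    """For each infrastructure component, collect the business processes that depend on it."""
    impact = defaultdict(set)
    for process, apps in process_to_apps.items():
        for app in apps:
            for component in app_to_infra.get(app, []):
                impact[component].add(process)
    return dict(impact)


def shared_components(impact, threshold=2):
    """List components whose loss would hit at least `threshold` processes, widest impact first."""
    shared = [(c, sorted(p)) for c, p in impact.items() if len(p) >= threshold]
    return sorted(shared, key=lambda item: (-len(item[1]), item[0]))


impact = processes_by_component(PROCESS_TO_APPS, APP_TO_INFRA)
for component, processes in shared_components(impact):
    print(component, "supports", ", ".join(processes))
```

In this made-up inventory, one database server sits underneath two separate business processes, which is the sort of finding that can anchor a conversation about cost.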
Kerns says a disaster recovery plan can be dissected in different ways. Depending on how fast they need to get a piece of their system back up and running, companies can look at recovery sites that are cold, warm or hot. Not every business application is as critical as the next.
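As a rough sketch of that tiering idea, the Python below sorts applications into hot, warm or cold recovery sites according to how quickly each one has to be restored. The cut-off values of one hour and 24 hours are assumptions chosen for illustration, not figures from Kerns, and the application names and recovery times are hypothetical.

```python
# Hypothetical recovery time targets, in hours, for each application.
RTO_HOURS = {"web store": 0.5, "billing": 8, "hr system": 72}


def recovery_site_tier(rto_hours, hot_limit=1.0, warm_limit=24.0):
    """Pick a site type from a recovery time target.

    The limits are illustrative assumptions; each business would set its own.
    """
    if rto_hours < 0:
        raise ValueError("recovery time cannot be negative")
    if rto_hours <= hot_limit:
        return "hot"
    if rto_hours <= warm_limit:
        return "warm"
    return "cold"


def plan_sites(rto_by_app):
    """Assign a site type to every application in the inventory."""
    return {app: recovery_site_tier(hours) for app, hours in rto_by_app.items()}


for app, tier in plan_sites(RTO_HOURS).items():
    print(f"{app}: {tier} site")
```

Keeping the limits as parameters reflects the point that the thresholds are a business decision rather than a technical constant.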
4. Get the message out
Consistent communication is another key element that’s often overlooked. Business needs IT’s participation and IT needs plenty of time and access to resources.
“You have to have that executive-level buy-in or you’re probably going to put up a facade of a plan without investing in the resources,” says Kerns.
“And the people who come up with the plan have to engage the people who are running IT operations,” he adds.
On another level, no one ever wants to talk about the gap in perceptions and expectations of recovery time, says Anderson.
He says most business units believe their IT systems can be back up and running within hours, while IT will estimate a couple of days, and an actual assessment of the technology will reveal a further gap.
It’s not only about communicating business requirements to the IT people, says Saxton. Sometimes it’s just being able to get in touch with people when something goes wrong.
5. Fail your test
As with any type of planning exercise, you have to test the plan, test it, and test it again.
Saxton says there is very little in the way of full end-to-end testing for applications across multiple platforms. “This is a big area where more testing needs to be done, with a more rigorous, more integrated approach and a stronger level of governance around it,” he says.
And don’t test to pass; you have to test to fail, says Anderson. “If it fails, only then can you understand what to fix.”
Saxton says many organisations think they’re in a better position than they really are. “One of the biggest drawbacks is this pass-fail mentality. It doesn’t help to set yourself up to pass.”
You have to make sure it all fits together and there are no holes in it, says Kerns. “A lot of people put some effort into developing a plan so they can check the box and say they have a plan,” he says. “But forced into a true recovery situation, most companies would find their plans haven’t been updated and they’ve never been through a full-blown test.”
6. Keep up with change
Disaster-recovery planning is not a one-time event. As your IT system evolves with new applications, upgrades and configuration changes, these changes will likely affect your recovery plan.
“Keep your technical recovery capabilities consistent with the latest production configurations,” says Saxton.
As new technologies and applications are added, they frequently don’t get copied over and the recovery side of things quickly falls out of date, he says. Change management processes and the application development lifecycle need to take recovery into account.
“You have to keep rolling these things through so that you’re prepared.”
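One simple way to picture the drift Saxton warns about is to compare an inventory of production against an inventory of the recovery side. The Python below is a minimal sketch under stated assumptions: both inventories are plain dictionaries of application name to version string, and the names and versions are hypothetical. A real environment would draw these from its own change-management records.

```python
def find_recovery_drift(production, recovery):
    """Compare two {application: version} inventories.

    Returns (missing, outdated): applications absent from the recovery
    inventory, and applications whose recovery version differs from production.
    """
    missing = sorted(app for app in production if app not in recovery)
    outdated = sorted(
        app for app in production
        if app in recovery and recovery[app] != production[app]
    )
    return missing, outdated


# Hypothetical inventories, for illustration only.
production = {"billing": "4.2", "web store": "7.0", "ticketing": "1.3"}
recovery = {"billing": "4.2", "web store": "6.5"}

missing, outdated = find_recovery_drift(production, recovery)
print("Not present at recovery site:", missing)
print("Out of date at recovery site:", outdated)
```

Running a comparison like this as part of each change would surface a newly added application or an unapplied upgrade before a test or a real recovery does.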