Database changes behind emergency response failure

Ironically, a simulated outage was planned, to test the cut-over procedures more fully, but the real failure happened first

The cut-off of telecommunications services to the Auckland emergency service’s communications centres early this month happened only two days after the databases for the three emergency centres (in Auckland, Wellington and Christchurch) had been consolidated. The drop-out was the result of a Telecom switch failure.

The three emergency centres used to operate on separate databases, but the need for improved co-ordination led to the decision, taken earlier this year, to consolidate them into one centralised database (Computerworld, July 23).

This was why it took several hours to transfer operations from the main site, in Auckland, to the back-up site, in Wellington, says Police ICT manager Rohan Mendes.

“We had to take it very carefully, because it was the first time we’d done it,” apart from an earlier test, he says.

Ironically, a simulated outage was planned, to test the cut-over procedures more fully, but the real failure happened first.

The failure of communications was confirmed at about 8am on Sunday, November 4, and the progressive cut-over of all three centres from the Auckland to the Wellington database took until about 12.30pm, says Mendes.

Meanwhile, operations continued to function manually, with details of calls being taken down on paper. “Fortunately it was not at a busy period.”

Now that the cut-over procedure has been fully tested, it should take “less than half an hour” in future, says Mendes.

A Telecom spokeswoman says the drop-out was triggered by the failure of an uninterruptible power-supply switch at a major Auckland exchange, during routine testing of its power back-up procedures. The mains power was deliberately cut over to the reserve supply, which runs on batteries with a life of 30 minutes, allowing time for a diesel generator to take up the load.

However, the switch failed to connect to the generator and the systems ran down the batteries before the failure was noticed.

After the supply was switched back to the mains, a number of systems, both in the exchange and with Telecom customers, had to be rebooted.

On a Sunday, all this took longer than it would have during a normal working day.

When Computerworld spoke to Telecom last week, the failure had been traced to a component in the UPS switch, which had been replaced. Investigations continue into what made the component fail and how to avoid a recurrence of the problem, said the spokeswoman.
