A compromised server was the root cause of a series of outages at Christchurch-based web hosting provider Digiweb this week, according to a note to customers obtained by Computerworld.
"It appears that one customer site was compromised, which in turn caused the flood of malformed packets to the firewalls. Our internal network analysis software did not identify these packets as they were not ‘standard’ TCP/IP traffic," a note from Adrian Grant, the managing director of Digiweb-owned Discount Domains, says.
The note catalogues the difficulty technicians had in identifying the root cause of the problem, with help being sought from both Gen-i locally and US engineers from firewall company Check Point.
Digiweb first went down on Tuesday night between 9.30 pm and midnight. That was attributed to a "core switch intermission failure" and two switches were replaced.
However, on Wednesday night the symptoms reoccurred from 7.30 pm. "Clearly this highlighted that the corrective action of the previous night i.e. the replacement of both core switches deferred the issue rather than provided a permanent resolution," Grant writes.
The fault was again identified and the issue was escalated to external maintenance support teams for the Check Point firewall and hardware provider. This identified that the fault appeared to be within the Check Point firewall clustering software.
"With the assistance of Check Point engineers the decision was made to split the firewall cluster and run them as individual stand alone units to resurrect the network. This appeared to temporarily solve the issue just after midnight. However, at 2.45 am the network failed again.
"The team were still onsite monitoring the network. Our firewall maintenance providers were again called who arranged for patches to be downloaded. At 5.10 am the patches were installed and the firewall management server reconfigured to accommodate the patch upgrade. This did not provide a permanent fix."
Digiweb then "reached out to our friends in Gen-i" to help source Check Point firewall hardware and to provide resources to support the technical team that had worked through the night, Grant writes. "In addition to this a decision was taken to move some core applications to the old network that was still functioning as [it] was not reliant on the Check Point firewalls. These include DiscountDomains.co.nz, email and Digiweb.co.nz. However the core network was re-established without the need to deploy this second network."
"Low level analysis with the assistance of Check Point engineers in the USA identified high volumes of fragmented packets originating from one of our shared virtual hosting servers to be the root cause of the issue," Grant writes. "These packets were flooding the firewalls and causing the outage. The source of these packets was identified and blocked at 1.50 pm and the Check Point firewalls then returned to normal service."
Digiweb now intends to move its shared virtual hosting customers behind a separate firewall isolated from the rest of its networks.
"This will ensure that should there be any re-occurrence the offending server is quarantined, and does not cause the kind of outage we have just experienced," Grant writes.