The DNS troubles which took flagship Microsoft websites off air yesterday appeared to have returned – with Microsoft.com, msn.com and other domains failing to resolve this morning.
After yesterday's nearly daylong blackout of many of Microsoft's Web sites, it appears the problem may have been exacerbated by a rookie mistake.
Microsoft's problem was linked to its Domain Name System (DNS) servers, according to spokesman Adam Sohn. He adds that the root cause of the outage remains a mystery.
But the fatal error may have been that all of Microsoft's DNS servers are located on the same network.
DNS servers translate domain names, such as Microsoft.com, into IP addresses. The IP addresses are used to locate servers on a network. Without DNS, therefore, Web surfers can't find Web sites. DNS is an Internet standard and the default name-resolution system in Windows 2000.
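To make the name-to-address translation concrete, here is a minimal Python sketch that asks the local system resolver for the IPv4 addresses behind a host name. The function name `resolve` is an illustrative choice, not something taken from Microsoft or from the sources quoted here.

```python
import socket


def resolve(hostname):
    """Return the sorted, de-duplicated IPv4 addresses for hostname.

    The lookup is delegated to the operating system's resolver, which in
    turn queries DNS for names it does not already know.
    """
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is (address, port).
    return sorted({info[4][0] for info in infos})
```

Calling `resolve("microsoft.com")` returns a list of addresses when DNS is answering. When no DNS server can be reached, the call raises `socket.gaierror`, which is roughly what a browser runs into during an outage like this one.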
With its DNS servers down, Microsoft Web sites -- including Microsoft.com, MSN.com, Expedia.com, CarPoint.com and Encarta.com -- have been unavailable or sporadically available since Tuesday night.
Microsoft is not ruling out a malicious denial-of-service attack, but it appears that part of the problem could be its network architecture -- most notably that its four DNS servers all appear to sit on the same network.
"If that is the case, it is extremely stupid," said one network administrator who runs his own DNS servers and asked not to be named. "The reason you have more than one server is in case the server goes down. The reason you have those servers on different networks is in case the network goes down."
Microsoft's Sohn said that DNS does run on its own network, but that "it is fully fault-tolerant with redundant routers and redundant bandwidth. Our technicians say this is the way we do it, and splitting it apart may not have made a difference." Sohn said that in the final analysis that architecture may change, but for now the focus is on finding the problem and getting things up and running again.
A query with a Unix command called Dig, which looks up domain name servers on the Internet, shows that Microsoft's DNS servers are all on the same network. All of the servers' IP addresses, which consist of four blocks of numbers, begin with 207.46.138. Microsoft also owns all the addresses in the 207.46 address block, which indicates the problem is confined to its network.
The four blocks of numbers can be thought of as a locator akin to "state, county, city, block," said the administrator who ran the Dig query for a Network World reporter.
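The same-network observation can be sketched in a few lines of Python. The host numbers in the final block of each address below are hypothetical, since only the shared 207.46.138 prefix is reported here. Treating a common first three blocks (a /24 prefix) as "the same network" is a simplifying assumption, not a statement about how Microsoft's network is actually divided.

```python
import ipaddress


def networks_covering(addresses, prefix_len=24):
    """Return the set of networks of the given prefix length that contain the addresses."""
    return {
        # strict=False masks off the host bits, leaving only the network part.
        ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        for addr in addresses
    }


def all_on_one_network(addresses, prefix_len=24):
    """True when every address falls inside a single network of that size."""
    return len(networks_covering(addresses, prefix_len)) == 1


# Hypothetical addresses: only the 207.46.138 prefix comes from the Dig query.
servers = ["207.46.138.11", "207.46.138.12", "207.46.138.20", "207.46.138.21"]
```

With these inputs, `all_on_one_network(servers)` is true. Replacing one entry with an address from an unrelated block makes it false, which is the arrangement the administrator argues for.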
In running DNS servers on one network, Microsoft is ignoring strong advice from the Internet Engineering Task Force. The IETF created DNS and recommends in its Best Current Practices under RFC 2182 that "servers for a zone should certainly not all be placed on the same LAN segment in the same room of the same building - or any of those. Such a configuration almost defeats the requirement, and utility, of having multiple servers."
An informal survey using Dig shows many Fortune 500 companies maintain their DNS servers on different networks. IBM, for example, maintains a DNS server in Zurich, Switzerland. But other companies, such as Dell Computer, appear to run their DNS servers on the same network, just like Microsoft.
Ping, a basic Internet program that lets you verify that a particular Internet address exists and can accept requests, shows that Microsoft's DNS servers have been up for short intervals all day.
That fact indicates Microsoft's DNS has not been affected by a cable cut that would sever all communications with the Internet, according to experts. It does, however, suggest a problem with a router or a DNS server.
If it is a router problem, Microsoft would have benefited from having its DNS on separate networks, according to the network administrator.
But Microsoft's troubles could be linked to updates it made to its DNS naming tables. The tables were updated six times on Tuesday, according to information gleaned from pinging Microsoft's DNS servers.
Microsoft's Sohn would not confirm whether those changes or the DNS network architecture led to the problem.
The network administrator who ran the Dig and Ping queries for Network World said it is incomprehensible to think that Microsoft did not have a hot backup while working on its DNS servers.