February 01, 2001, 9:33 AM — After nearly a daylong blackout of many of Microsoft's Web sites, it appears the problem may have been exacerbated by a rookie mistake.
Microsoft's problem was linked to its Domain Name System (DNS) servers, according to spokesman Adam Sohn. He added that the root cause of the outage remains a mystery.
But the fatal error may have been that all of Microsoft's DNS servers are located on the same network.
DNS servers translate domain names, such as Microsoft.com, into IP addresses. The IP addresses are used to locate servers on a network. Without DNS, therefore, Web surfers can't find Web sites. DNS is an Internet standard and the default name-resolution service in Windows 2000.
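To make the translation step concrete, here is a minimal Python sketch of a name lookup. The `resolve` helper and its `lookup` parameter are illustrative names that do not come from the article; the only real API used is the standard library's `socket.gethostbyname`. It shows what a browser is left with when the name servers cannot be reached: no address at all.

```python
import socket


def resolve(hostname, lookup=socket.gethostbyname):
    """Return the IPv4 address for hostname, or None if the lookup fails.

    `lookup` defaults to the system resolver; it is a parameter only so
    that the failure case can be demonstrated without a real outage.
    """
    try:
        return lookup(hostname)
    except socket.gaierror:
        # The resolver could not get an answer, for example because
        # every authoritative name server for the domain is unreachable.
        return None
```

Calling `resolve("microsoft.com")` on a working connection returns a dotted address string. During an outage like the one described here, the same call would be expected to return `None`, even if the Web servers themselves were running.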
With its DNS servers down, Microsoft Web sites -- including Microsoft.com, MSN.com, Expedia.com, CarPoint.com and Encarta.com -- have been unavailable or sporadically available since Tuesday night.
Microsoft is not ruling out a malicious denial-of-service attack, but it appears that part of the problem could be its network architecture -- most notably that its four DNS servers all appear to sit on the same network.
"If that is the case, it is extremely stupid," said one network administrator who runs his own DNS servers and asked not to be named. "The reason you have more than one server is in case the server goes down. The reason you have those servers on different networks is in case the network goes down."
Microsoft's Sohn said that DNS does run on its own network, but that "it is fully fault-tolerant with redundant routers and redundant bandwidth. Our technicians say this is the way we do it, and splitting it apart may not have made a difference." Sohn said that in the final analysis that architecture may change, but for now the focus is on finding the problem and getting things up and running again.
Using a Unix command called Dig, which can locate domain name servers on the Internet, it is evident that Microsoft's DNS servers are all on the same network. All of the servers' IP addresses, each of which contains four blocks of numbers, begin with 207.46.138. Microsoft also owns all the addresses in the 207.46 IP address range, which indicates the problem is confined to its network.
The block of four numbers can be thought of as a locator akin to "state, county, city, block," said the administrator who ran the Dig query for a Network World reporter.
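The same-network check the administrator describes can be sketched in a few lines of Python using the standard `ipaddress` module. This is an illustration under stated assumptions, not a record of the actual query: the article reports only the shared 207.46.138 prefix, so the final number in each address below is made up, the function name `shared_network` is hypothetical, and treating the first three blocks (a /24) as "the same network" is an assumption.

```python
import ipaddress


def shared_network(addresses, prefix_len=24):
    """Return the common network if every address falls in it, else None."""
    networks = {
        # strict=False lets a host address stand in for its enclosing network.
        ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        for addr in addresses
    }
    return networks.pop() if len(networks) == 1 else None


# Hypothetical name-server addresses: only the 207.46.138 prefix is from
# the article; the last number in each is invented for illustration.
same_lan = ["207.46.138.11", "207.46.138.12", "207.46.138.20", "207.46.138.21"]

# Hypothetical dispersed layout, using reserved documentation ranges
# for the servers placed off the main network.
spread_out = ["207.46.138.11", "198.51.100.7", "203.0.113.9"]

print(shared_network(same_lan))
print(shared_network(spread_out))
```

A result other than `None` means every server shares one network, so a single network failure could take all of them offline at once.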
In running DNS servers on one network, Microsoft is ignoring strong advice from the Internet Engineering Task Force. The IETF created DNS and recommends in its Best Current Practices under RFC 2182 that "servers for a zone should certainly not all be placed on the same LAN segment in the same room of the same building -- or any of those. Such a configuration almost defeats the requirement, and utility, of having multiple servers."