Network outages should spur integrators to stay in contact with customers

August 27, 2007, 03:48 PM, ITworld.com

You have to feel for the 20,000 people stranded recently at Los Angeles International Airport's Tom Bradley International Terminal and the 200 million more around the globe who lost access to Skype. Television and general-interest news organizations ascribed the woes to the ubiquitous "computer glitch." We, of course, know better. And usually the truth is a lot scarier.

According to a spokesman at LAX, passengers were stuck in terminals and on 60 planes starting at about 2 p.m. on Aug. 11, and service was not restored until nearly midnight. The failed system is one that contains information about incoming international passengers, including law enforcement data and outstanding arrest warrants. It's a reflection of the times in which we live, but that's another discussion for another venue.

As it turns out, the problem was rooted neither in changes to application software nor in a cyber-attack launched by "evil-doers and their henchmen" against networks operated by U.S. Customs and Border Protection or by the airport. It was a plain old garden-variety hardware failure. But of what? Published reports variously lay the blame on a network switch, a failed server, a rogue network interface card, and a generally ancient network infrastructure.

I don't buy it. Yes, it was hardware that failed, but the blame also lies with people, lots of them with lots of fingers to point. There are the IT organizations from the airport and Customs, including administrators, troubleshooters, security experts, and more. There's the ISP that provides connectivity. There are the public-sector bean counters who try to do things on the cheap (except for the occasional $100 million toilet). There's the lack of network monitoring tools that could zero in on misbehaving hardware. There are even solutions integrators who may not have been persistent enough in getting these organizations to replace hardware and cabling that had reached old age -- as measured in IT years.

The network normally accesses a database residing in Washington, D.C., and wisely maintains a local backup copy for the rare event that communication back to Washington is lost. Good plan. Unfortunately, because it was the local network itself that went down, accessing the local backup became impossible. Further complicating matters was an initial misdiagnosis, leading to a six-hour hunt before it was determined that the problem lay within the local network and not elsewhere.
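The trap here, a backup that shares a dependency with the primary, can be illustrated with a minimal sketch. The component names below are hypothetical labels chosen for illustration; they are not the actual LAX or Customs topology, which the published reports do not describe in detail.

```python
# Hypothetical dependency map: each access path lists the components it needs.
# All names are illustrative assumptions, not the real airport or Customs setup.
PATHS = {
    "primary (Washington database)": {"local_network", "wan_link", "washington_db"},
    "backup (local copy)": {"local_network", "local_backup_server"},
}


def single_points_of_failure(paths):
    """Return the components that every access path depends on."""
    return set.intersection(*paths.values())


def available_paths(paths, failed):
    """Return the names of paths that still work when `failed` components are down."""
    return [name for name, needs in paths.items() if not (needs & failed)]


if __name__ == "__main__":
    print("Shared by all paths:", single_points_of_failure(PATHS))
    print("WAN link down:", available_paths(PATHS, {"wan_link"}))
    print("Local network down:", available_paths(PATHS, {"local_network"}))
```

Under these assumptions, losing the link to Washington still leaves the local copy reachable, while losing the local network leaves nothing. Any component that shows up in every path is a place where redundancy does not actually exist.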

It's a good lesson for systems integrators: Know what your customers' networks look like, know the age of the hardware components, and understand where the redundancies are -- and aren't. That knowledge comes only from frequent, regular contact with the account. It's odd that some see every visit to a customer as a sales call. It need not be that way. The occasional on-site visit to keep the lines of communication open, to make sure everything is all right, and to reassure the account that your expertise and solutions are just a phone call away also serves to keep the integrator up to date. And with a good bedside manner, guiding the conversation and getting the IT director to open up is even better. Confession, after all, is healthier than interrogation.

As for Skype, it is blaming its own software, putting to rest all kinds of nasty rumors. Spokesman Villu Arak denied the problem was due to an attack or to maintenance on its billing system. In fact, Arak has gone to great lengths on his blog to explain in detail exactly what took place. The outage was "triggered by a massive restart of our users' computers across the globe within a very short timeframe as they re-booted after receiving a routine set of patches through Windows Update." And, no, he's not blaming Microsoft.

Here's the mea culpa: "Normally Skype's peer-to-peer network has an inbuilt ability to self-heal, however, this event revealed a previously unseen software bug within the network resource allocation algorithm which prevented the self-healing function from working quickly."

Give Skype credit for taking a bullet. Even a solutions provider would have no clue about the innards of Skype's -- or anyone else's -- proprietary code. Consequently, I see Skype's outage, which left its more than 200 million account holders incommunicado, as a very different situation. But it does make me wonder if VoIP is really ready for prime time. Love or hate my local phone company, I don't ever recall picking up the phone and not hearing a dial tone. Reliability still counts for something.
