Storms in the network backbone
Storms are bad. They cause havoc, disrupt people's lives and damage property. Network storms, although rarely life-threatening, can be equally disruptive.
There are several types of storms that occur on networks. Perhaps the most frequent, and thus the most familiar, are broadcast storms. A broadcast storm occurs when a large number of retransmissions flood the network, either because of a device failure or because of some other instability in the network. Although they may sound innocent, broadcast storms can bring an enterprise network to its knees.
The good news is that broadcast storms are usually of short duration and tend to go away by themselves. Still, because of the loss of service involved, it is crucial to have a good network management package keeping an eye on the health of your network. If there is a broadcast storm, you want to be alerted by your network management package, not by your users.
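As a rough illustration of the kind of check a network management package might run, the sketch below compares two samples of a broadcast packet counter and flags a possible storm when the rate crosses a threshold. The function names and the 1,000 packets-per-second threshold are assumptions made for this example, not values taken from any particular product, and counter wraparound is ignored.

```python
def broadcast_rate(prev_count, curr_count, interval_seconds):
    """Broadcast packets per second between two counter samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return (curr_count - prev_count) / interval_seconds


def is_broadcast_storm(prev_count, curr_count, interval_seconds,
                       threshold_pps=1000.0):
    """True when the broadcast rate exceeds the threshold.

    threshold_pps is an assumed value for illustration; a sensible
    threshold depends on the normal traffic of the network in question.
    """
    rate = broadcast_rate(prev_count, curr_count, interval_seconds)
    return rate > threshold_pps
```

A monitoring loop could call is_broadcast_storm once per polling interval and raise an alert the first time it returns True.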
Less well known, but more insidious, is the so-called OSPF storm. As its name implies, this kind of storm occurs on networks running the OSPF (Open Shortest Path First) protocol. (OSPF version 2 was first specified in RFC 1247; the current specification is RFC 2328.) OSPF is extremely popular in enterprise networks because it supports VLSMs (Variable Length Subnet Masks).
OSPF is referred to as a "link-state" routing protocol. That simply means that routing decisions are based on the status and bandwidth of the physical links between routers. When an OSPF router comes online, it sends out a "hello" message in an attempt to discover its neighbor routers. Once it gets a response, the router can exchange link-state information with its neighbors. This information travels in the form of LSAs (link-state advertisements). Typically, unless there is a change in network topology, LSAs are refreshed every 30 minutes. However, LSAs are sent out more frequently if an interface goes down or some other status within the network changes. If there are too many simultaneous changes, the LSAs can quickly get out of hand, resulting in an OSPF storm.
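To make the idea of LSAs "getting out of hand" concrete, here is a minimal sketch of a sliding-window counter that flags when more LSAs arrive within a short period than expected. The class name, the 10-second window and the limit of 100 LSAs are all hypothetical choices for this example; how the LSA arrival times are obtained from a router is left out entirely.

```python
from collections import deque


class LsaRateMonitor:
    """Counts LSA arrivals in a sliding time window (hypothetical helper)."""

    def __init__(self, window_seconds=10.0, max_lsas=100):
        # Both defaults are assumed values, not protocol constants.
        self.window_seconds = window_seconds
        self.max_lsas = max_lsas
        self._timestamps = deque()

    def record(self, timestamp):
        """Record one LSA arrival time (in seconds).

        Returns True if the number of LSAs inside the window now
        exceeds max_lsas, which may indicate a storm.
        """
        self._timestamps.append(timestamp)
        # Drop arrivals that have fallen out of the window.
        while timestamp - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_lsas
```

On a quiet network, where LSAs are only refreshed every 30 minutes, record would rarely if ever return True; a burst of topology changes would push the count over the limit.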
OSPF storms are especially nasty because OSPF is typically run in the enterprise backbone. Therefore, an OSPF storm can render the entire backbone network inoperable. Worse yet (if there is a "worse" than having the backbone tied up in knots), unlike broadcast storms, OSPF storms generally do not go away by themselves. They often require manual intervention -- and that intervention, more often than not, involves physically shutting down the affected routers.
One way to minimize the potential for OSPF storms is to divide the network into separate, hierarchical domains, called areas. Instead of having one very large OSPF area, the backbone is made up of a number of adjacent OSPF areas. The common rule of thumb is that there should be no more than 50 routers per area, but this number is often reduced depending on the number of interfaces per router and the size of the network backbone. For example, West Virginia University's backbone network consists of five OSPF areas with no more than two routers per area. That may be conservative, but it has helped to keep us relatively storm-free.
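The rule of thumb above is easy to check against an inventory of routers. The sketch below assumes the inventory is available as a dictionary mapping an area ID to a list of router names; that layout, the function name and the sample area IDs are illustrative assumptions, not part of any real tool.

```python
MAX_ROUTERS_PER_AREA = 50  # rule of thumb quoted above


def oversized_areas(area_routers, limit=MAX_ROUTERS_PER_AREA):
    """Return the sorted IDs of areas with more routers than the limit.

    area_routers maps an area ID (string) to a list of router names.
    """
    return sorted(
        area for area, routers in area_routers.items()
        if len(routers) > limit
    )


# Hypothetical inventory: one small area and one that is too large.
layout = {
    "0.0.0.0": ["core-1", "core-2"],
    "0.0.0.1": ["edge-%d" % i for i in range(60)],
}
print(oversized_areas(layout))
```

Passing a lower limit, as a more conservative site might, simply tightens the check.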
With multiple OSPF areas, a storm is less likely to occur because there are only a few devices chatting in each area. If one does occur, it is more likely to be confined to a specific area -- although this is by no means a guarantee. So proceed with caution. Storms may be inevitable, but you will want to do whatever you can to prevent them and to minimize their effect on the backbone.