September 13, 2012, 1:31 PM — This vendor-written tech primer has been edited by Network World to eliminate product promotion, but readers should note it will likely favor the submitter's approach.
The network monitoring industry has been around for a long time, but it's still an immature science relative to other information technologies.
Having participated in the creation of network monitoring solutions for banks, Fortune 500 enterprises, telcos and government agencies, I know a few things about what today's network monitoring technology can do easily and what's more difficult, what works and what doesn't. To get started, there are two fundamental questions that must be addressed: Why is network history critical? And who should own network history?
Why is network history critical?
Most large organizations have deployed some kind of SIM/SIEM (security information and event management) solution to help manage their security posture, as well as some kind of network management system to help manage the network. These are typically major investments that take years to deploy, but they don't provide the actual packet data and network history associated with security events. The most sophisticated deployments pull in hundreds of different data feeds in an attempt to create a single real-time view of what's actually going on across the entire network.
Under the hood, these are large, highly sophisticated correlation engines sifting through vast amounts of data in order to generate intelligence (alarms) that someone somewhere inside the NOC or SOC has to deal with.
The problem with any system based on statistical correlation is that it is non-deterministic. Like any system that infers rather than works from facts, it can generate false positives. While many of the alarms are black-and-white and the remediation process is very straightforward, there's also a lot of gray involved, and where there's gray there's risk.
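To see why false positives matter at scale, here is a rough back-of-the-envelope sketch in Python. The function name and all three rates are hypothetical, chosen only for illustration and not drawn from any real SIEM deployment; the calculation is plain Bayes' rule applied to the share of alarms that point at a real incident.

```python
def alarm_precision(base_rate, detection_rate, false_alarm_rate):
    """Return the fraction of raised alarms that are real incidents (Bayes' rule)."""
    true_alarms = base_rate * detection_rate
    false_alarms = (1.0 - base_rate) * false_alarm_rate
    return true_alarms / (true_alarms + false_alarms)


# Hypothetical figures, for illustration only:
# 1 in 10,000 correlated events is a real incident,
# the engine flags 99% of real incidents,
# and it misfires on 1% of benign events.
precision = alarm_precision(
    base_rate=0.0001,
    detection_rate=0.99,
    false_alarm_rate=0.01,
)
print(f"{precision:.2%} of alarms correspond to real incidents")
```

Under these assumed numbers, only about one alarm in a hundred is real, even though the engine looks accurate on paper. That is the gray area: someone still has to decide what each alarm means.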
Engineers can't just ignore or dismiss alarms; they are required to act -- either to acknowledge and dismiss or to engage and react. In many instances, the act of remediation is very straightforward and has no impact on anyone (a simple firewall rule change perhaps), but in certain instances the act can be significant and impact a lot of people.