Problem notification on small, local networks is easy - just listen for the complaints from your co-workers. On large, global networks, an enterprise-level network management platform can keep watch. But for a midsize network (up to several hundred nodes), you'll need one of these monitoring and alerting software packages. While these tools aren't as feature-laden as ones from Tivoli, Computer Associates or Hewlett-Packard, they can automatically detect network connectivity problems, Windows NT operating system problems or both. In fact, two of the tools we reviewed can only monitor IP-addressable devices and SNMP-manageable nodes (see related story). The other seven offer varying degrees of connectivity surveillance and server health monitoring.
Some monitoring and alerting tools can tell you - via e-mail, pager, SNMP alert or other means - that a server's CPU is overloaded, its memory resources are nearly exhausted, its free disk space is precariously low, or that some other error condition exists. In some cases, these tools can reboot a server, restart a service or take other corrective action without your intervention. For connectivity, a tool might ping IP-addressable devices or poll SNMP-manageable devices to alert you to network failures.
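The kind of threshold check these tools perform on server health can be sketched in a few lines of Python. This is only an illustration, not any reviewed product's logic: the `Thresholds` class, the `check_server` function and the 90%/10% limits are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical limits; a real tool would let the administrator set these.
    cpu_percent: float = 90.0            # alert at or above this CPU load
    memory_percent: float = 90.0         # alert at or above this memory use
    min_free_disk_percent: float = 10.0  # alert when free disk drops below this


def check_server(name, cpu_percent, memory_percent, free_disk_percent,
                 limits=Thresholds()):
    """Return a list of alert messages for one sample of server metrics."""
    alerts = []
    if cpu_percent >= limits.cpu_percent:
        alerts.append(f"{name}: CPU overloaded ({cpu_percent:.0f}%)")
    if memory_percent >= limits.memory_percent:
        alerts.append(f"{name}: memory nearly exhausted ({memory_percent:.0f}%)")
    if free_disk_percent < limits.min_free_disk_percent:
        alerts.append(f"{name}: free disk space low ({free_disk_percent:.0f}% free)")
    return alerts
```

In a real deployment the returned messages would be handed to whatever notification channel the tool supports, such as e-mail or a pager gateway.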
Using one of these tools, you get notification of problems earlier than if you waited for users to complain. The earlier you catch a problem, the smaller it tends to be.
Our evaluation focused on Windows-based products that notify network administrators about connectivity and server health problems. We tested nine products: Ipswitch's WhatsUp Gold 5.0, Ripple Technologies' LogCaster 2.6, NetIQ's AppManager 3.4, Breakout Technologies' MonitorIT 2.0, NTP Software's System Sentinel 2000, MediaHouse Software's ipMonitor 6.0, Heroix's RoboMon NT 7.5, Dartware's InterMapper 3.0 and Argent Software's Guardian 4.1A.
Our World Class Award goes to ipMonitor, which proved to be a superior and intelligent problem notification aid for our network. Its monitoring and reporting features, together with its slick user interface, impressed us greatly. AppManager and RoboMon NT deserve special mention. Their ability to closely monitor the Windows operating system rivaled ipMonitor's.
nIP those problems
MediaHouse's ipMonitor watches Windows NT/2000 machines for excessive resource consumption (via NT Performance Monitor data), failed NT services and specific event log entries. IpMonitor supervises particular applications, such as Oracle and SQL Server, and it also pings IP-addressable network devices to check for availability and polls them via SNMP to discover their status. The network scanning feature for discovering IP-addressable network devices, like the one WhatsUp Gold uses, is quick and accurate. IpMonitor further impressed us by noting changes to files on our servers, verifying Web page links and maintaining surveillance on our Win 2000 Active Directory data.
In addition to e-mail, pager and SNMP alerts, ipMonitor can also alert an ICQ chat account, issue a network broadcast message, issue a help desk trouble ticket via third-party software or insert an entry into the NT event log. It offered the most corrective action options, including rebooting a server, restarting an NT service or launching a program, batch file or third-party diagnostic utility.
IpMonitor's reports are highly useful and configurable. The supplied report templates include Availability, Response Time, Downtime, Diagnostic Analysis, and Health and Trouble reports. For two user-selectable time intervals (current and historical), the reports show historical activity and can predict trends. Administrators can designate one or more e-mail address destinations for each report. IpMonitor stores monitoring parameters and statistical performance data in its internal database.
The ipMonitor Web-based interface is well-designed and slick. IpMonitor's use of colors to indicate the status of a monitored device or parameter (green for OK, yellow for a new problem, red for a problem for which notifications have been issued and dark red for older problems for which ipMonitor has already transmitted several alerts) provides a great deal of information to anyone who glances at the Monitor Status window. Through ipMonitor's Web pages, we could choose monitoring tasks to run, indicate the schedule on which they should run, and view ipMonitor's reports and status displays.
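The four-color scheme amounts to a small escalation mapping from a monitor's state to a display color. A minimal Python sketch follows. The function name `status_color` and the cutoff of three alerts for "several" are assumptions made for illustration, since the product's actual threshold isn't stated here.

```python
def status_color(is_failing, alerts_sent, several=3):
    """Map a monitor's state to a display color.

    `several` is a hypothetical cutoff for when a problem counts as "older".
    """
    if not is_failing:
        return "green"       # device or parameter is OK
    if alerts_sent == 0:
        return "yellow"      # new problem, nobody notified yet
    if alerts_sent < several:
        return "red"         # notifications have been issued
    return "dark red"        # older problem, several alerts already sent
```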
Installation was a breeze. The printed documentation is a simple "Getting Started" booklet, but the online help is comprehensive, clear and context-sensitive.
An application perspective
The monitoring agents of NetIQ's AppManager impressed us by detecting broken trust relationships across Windows NT/2000 domains, identifying hung and terminated NT/2000 services, noting which applications caused excessive CPU or memory usage, and tracking the number of open shared files. It discovered which users consumed the most SQL Server resources (CPU, memory, disk I/O and locks), and it monitored Exchange throughput and Internet Information Server (IIS) connections and session timeouts. Via SNMP, AppManager discovered and monitored our network devices and non-Windows computers.
Like ipMonitor, AppManager could monitor Windows NT/2000's performance parameters, event logs, registry, services and specific applications running on NT/2000, such as Oracle, SQL Server, Exchange, Lotus Domino, Citrix MetaFrame, Citrix WinFrame and SAP R/3. Its easy-to-understand script language for creating rules and defining behaviors is a dialect of Microsoft's powerful and popular Visual Basic.
AppManager can notify administrators of problems via e-mail, pager calls or SNMP alerts. But it doesn't send network broadcast messages, nor does it issue help desk trouble tickets through third-party software. In addition to its other corrective actions, we liked AppManager's ability to interact with a database through Open Database Connectivity (ODBC).
We found setting up AppManager's real-time reporting of performance factors especially easy. By merely dragging and dropping script icons onto target computers in the Win32 console's tree view, we quickly defined several reports that AppManager immediately began running. The real-time graphs of system activity also let us drill down for details, such as which SQL statements were causing excessive CPU usage in SQL Server. AppManager's reports also showed system status and inventory data. For example, we could see exactly which files were being shared by each server, as well as the IP addresses of network adapters installed in the NT machines. AppManager can produce more than 200 reports.
The Win32 native console and browser-based interface offered the same functionality and were easy to navigate. Each displayed information clearly and intuitively. The multithreaded Win32 console, we noted, was amazingly responsive during our tests.
AppManager's installation process is almost child's play. However, it requires that you already have SQL Server 6.5 or 7.0, which adds considerably to the total cost of the product and caused us to lower the product's installation score-card entry. The printed documentation is clear and professionally done, as is the online help. However, the printed documentation defers to the online help in many places, especially with respect to setting up corrective actions.
Hey, mon! A network repair robot
Although architecturally different, NetIQ's AppManager and Heroix's RoboMon NT have similar features, strengths and weaknesses. We gave RoboMon NT lower scores because it didn't monitor quite as many applications as AppManager, its script language was less powerful and the printed documentation wasn't as good.
RoboMon NT monitored many aspects of our Windows NT/2000 servers and clients. It even took steps to correct some problems before they affected our network. For example, when we flooded a shared printer queue with print jobs in one test, RoboMon NT redirected a portion of the print jobs to an alternate printer queue to head off what otherwise would have been a stand-around-the-printer-waiting-for-my-printout bottleneck. (We enabled pop-up print job completion notifications to send people to the right destination printer.) However, RoboMon NT can't monitor Exchange 2000, SAP R/3 or Lotus Domino R5.
Configuring rules to direct RoboMon NT's behavior is simple and painless, but the script language for creating monitoring rules is weak and unsophisticated. The product includes useful default rules for diagnosing problems, checking system statistics and monitoring network performance.
In another test of RoboMon NT's ability to keep our network up and running, we overloaded an NT machine with several concurrent tasks. RoboMon NT, using one of its many configurable rules, clearly and descriptively displayed the message, "The number of processes on the system is too high." Clearly and unambiguously, it also told us our Dynamic Host Configuration Protocol address pool was almost empty.
For the overall network beyond NT servers, we were disappointed that RoboMon NT monitored just Cisco network devices. RoboMon also runs on Unix and OpenVMS, but we didn't test those versions.
Rule-triggered behaviors can consist of notification, corrective action, rule modification and variable modification.
Rule modification can enable, disable or reschedule other rules. Variable modification changes one or more of RoboMon NT's internal settings, which then dynamically affect other rules.
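To make the four behavior types concrete, here is a small Python sketch of a rule engine in which a triggered rule can notify, enable another rule and change a shared variable. It is not RoboMon NT's script language. Every name in it (`Rule`, `Engine`, `notify`, `enable_rule`, `set_variable`, `max_jobs`) is hypothetical, and corrective actions would simply be further callables of the same shape.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Rule:
    name: str
    condition: Callable[[Dict[str, float], Dict[str, float]], bool]
    actions: List[Callable[["Engine"], None]]
    enabled: bool = True


@dataclass
class Engine:
    rules: Dict[str, Rule] = field(default_factory=dict)
    variables: Dict[str, float] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)

    def add(self, rule):
        self.rules[rule.name] = rule

    def evaluate(self, sample):
        # Rules run in insertion order; each sees changes made by earlier rules.
        for rule in list(self.rules.values()):
            if rule.enabled and rule.condition(sample, self.variables):
                for action in rule.actions:
                    action(self)


def notify(message):                 # notification
    return lambda engine: engine.log.append(message)

def enable_rule(name, enabled):      # rule modification
    return lambda engine: setattr(engine.rules[name], "enabled", enabled)

def set_variable(key, value):        # variable modification
    return lambda engine: engine.variables.__setitem__(key, value)


over_limit = lambda sample, variables: sample["print_jobs"] > variables["max_jobs"]

engine = Engine(variables={"max_jobs": 20})
engine.add(Rule("queue_backlog", over_limit,
                [notify("print queue backlog"),
                 enable_rule("redirect_jobs", True),
                 set_variable("max_jobs", 10)]))
engine.add(Rule("redirect_jobs", over_limit,
                [notify("redirecting jobs to alternate queue")],
                enabled=False))
engine.evaluate({"print_jobs": 25})
```

After the single `evaluate` call, the first rule has logged its notification, switched on the second rule and lowered `max_jobs`, so the second rule fires in the same pass.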
We found we could have a rule keep an eye on virtually any NT activity we wished. These activities included NT Performance Monitor items, event log entries, Oracle or SQL Server relational database changes, application log file entries, Component Object Model changes and Windows Management Instrumentation events. RoboMon NT even distinguished between chronic problems, persistent problems and new ones. It stores rules and the network data it collects in a Microsoft Access database it creates by default. Alternatively, you can use SQL Server.
RoboMon NT's reporting tools, Report Manager and Graph Manager, present numerous ways to view current and historical network activity. The Graph Manager displays relationships between statistics that RoboMon NT collects, and Report Manager offers reports by group (disk, event, Exchange, Internet, process and system). For example, the system report group menu consists of summary, cache, physical memory, page file, CPU usage, CPU rates, file I/O, TCP/IP, client/server, Web server and SQL Server reports. However, RoboMon NT lacks a trend analysis reporting function.
RoboMon NT's user interface is intuitive and easy to master. Accessing RoboMon NT's dispersed network components via a central Enterprise Manager module makes a network administrator highly productive.
Installation takes between 5 and 10 minutes, during which you have to tell RoboMon NT what each agent is monitoring. The printed documentation, consisting of merely a start-up manual, is of poor quality. Paradoxically, the online documentation is comprehensive and clear.
A guardian angel
Like ipMonitor, RoboMon NT and AppManager, Argent Software's Guardian is essentially a Windows watcher that can monitor many different Windows NT/2000 parameters and applications. Like Ripple Technologies' LogCaster, Guardian can also keep an eye on Linux computers. To discover and monitor network devices such as routers and switches, it can also send IP pings and SNMP polling requests across the network.
However, Guardian's corrective action feature was more difficult to configure, its built-in reports are scarce and its documentation wasn't as good. And because it starts at $9,000 for 10 servers, Guardian is expensive for small networks.
The monitoring process is completely configurable and uses an object-oriented, rule-based design. We liked Guardian's scheduling feature for running recurring monitoring tasks. The scripting language is proprietary and quite unlike the script languages within RoboMon NT, AppManager and LogCaster. Each rule set specifies how Guardian should perform its problem detection. A rule set can have up to eight classes: event, performance, program, service, SNMP alert, command, system down and printer. Creating a working rule set is a matter of selecting rules within each class and choosing one or more notification methods.
The Guardian Predictor module's built-in reports consist of just a service-level agreement monitoring report and system resource report. However, Guardian stores its monitoring parameters, performance data and network device status information in its ODBC-accessible database. Guardian suggests the use of a reporting tool such as Seagate Software's Crystal Reports to depict its data. In contrast, NetIQ's AppManager has more than 200 built-in reports, which made AppManager much more useful out of the box.
The native Win32 user interface is easy to navigate and simple to operate. Guardian's Web interface can display server status information and the product's limited set of built-in reports, but it lacks the Win32 console's ability to configure Guardian. In addition, the Web interface requires IIS and doesn't work with Netscape or Apache Web servers.