FRAMINGHAM (10/01/2003) - Your company's transactions, queries, documents, intranet data and files are its lifeblood, and your network's connections are the arteries that carry that blood. Keeping those connections healthy is more than just prudent. It's critical. No company wants to see its network in intensive care - or the morgue.
Many vendors offer monitoring software, devices or combinations of both to help you maintain wide area network (WAN) links at the peak level. These vendors promise their tools will alert you when outages occur, pinpoint the root cause of the outage and help you reestablish communications immediately. They claim to produce useful reports showing utilization trends, outage statistics, service-level agreement (SLA) compliance and other information. Vendors say the tools are easy to use, scale well, integrate with network management systems, handle any and all protocols, and have lots of additional features, such as the ability to prioritize network data based on quality-of-service parameters you provide.
To find the best WAN monitoring tool for your network, we invited vendors to submit their products to our lab. We tested Network Instruments' Observer 8.3 software and rack-mountable WAN Probe with a pair of T-1/E-1 analyzer taps; Neon Software's CyberGauge 5.0 software; Adtran's IQ 710 with traffic shaping and N-Form 1.4 monitoring software; Visual Networks' Visual UpTime 7.1 and Analysis Service Elements (ASE - DSU/CSUs augmented with link monitoring capabilities); Concord's eHealth 5.6 software; and Allot's WiseWAN 401 Network Application Priority Switches (link monitoring, shaping and controlling devices) and WiseWAN Network Application 5.2 Enterprise software.
Visual UpTime was the best product for keeping WAN links up and running smoothly and wins our World Class award. Although it only works with Visual Networks' DSU/CSU devices, Visual UpTime's precise and accurate monitoring ability is unsurpassed. Its many reports are practical and well designed, the user interface is intuitive and responsive, and it scales well.
For heterogeneous networks, Concord's eHealth is a World Class winner for its superior reports and amazing breadth of recognized and supported devices.
All the products did well in our tests. They proved themselves worthy, reliable tools for monitoring critical WAN links.
Hardware vendors leverage their sales by bundling or offering software that works only with their devices. In contrast, software vendors work hard to support as many devices as possible. This can pose a dilemma for companies planning to expand or upgrade an existing network.
Not surprisingly, we saw the best and most-detailed monitoring of our WAN links from products that merged a vendor's software with its hardware devices. Visual UpTime gathered statistics from and sent control commands to Visual Networks' own DSU/CSUs; Adtran's N-Form software provided WAN monitoring for the company's IQ 710 DSU/CSUs; and the WiseWAN Network Application Enterprise software worked with the WiseWAN 401 monitoring and traffic-shaping devices. Similarly, Network Instruments' Observer WAN monitoring relied on the presence of a WAN Probe located at the other end of a monitored link.
On the other hand, Concord's eHealth and Neon Software's CyberGauge gave us support for a range of network devices, but didn't monitor as closely or deliver the level of detail that, for example, Visual UpTime did.
Discovering and reporting problems
The key to Visual UpTime's success is its close relationship with its ASE devices. The ASEs continually measure link availability and activity on a second-by-second basis for each data link connection identifier (DLCI), yet still used our network quite frugally to inform Visual UpTime of the network's current status. We found UpTime's calculations of round-trip delays very accurate. Those calculations excluded router serialization and insertion delay, and thus gave us a precise measurement of network delay for each permanent virtual circuit (PVC). We even found that for the sake of accuracy, we could exclude scheduled maintenance periods from Visual UpTime's calculations of uptime and bandwidth utilization. UpTime used the data from the ASEs to clearly show us outages and traffic levels. It also showed us several frame relay metrics, such as per-port and per-PVC throughput, overall utilization, by-protocol utilization, bursting above the committed information rate (CIR) and network congestion identified by the presence of frame relay internal throttling mechanism packets.
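To illustrate why excluding scheduled maintenance matters for uptime figures, here is a minimal Python sketch of the arithmetic. It is not Visual UpTime's method, which the vendor does not document here. The function names and the sample times are hypothetical, and the sketch assumes that maintenance windows do not overlap one another and that outages do not overlap one another.

```python
from datetime import datetime, timedelta

ZERO = timedelta(0)


def overlap(a_start, a_end, b_start, b_end):
    """Return how long two time ranges overlap (zero if they do not)."""
    return max(min(a_end, b_end) - max(a_start, b_start), ZERO)


def availability(period_start, period_end, outages, maintenance):
    """Percent availability for a period, ignoring scheduled maintenance.

    outages and maintenance are lists of (start, end) datetime pairs.
    Assumption: windows within each list do not overlap one another.
    """
    period = period_end - period_start
    excluded = sum(
        (overlap(period_start, period_end, s, e) for s, e in maintenance), ZERO
    )
    counted = period - excluded  # time the link was expected to be up
    if counted <= ZERO:
        return 100.0

    down = ZERO
    for o_start, o_end in outages:
        # Clip the outage to the reporting period.
        o_start, o_end = max(o_start, period_start), min(o_end, period_end)
        if o_end <= o_start:
            continue
        outage = o_end - o_start
        # Do not count downtime that fell inside a maintenance window.
        for m_start, m_end in maintenance:
            outage -= overlap(o_start, o_end, m_start, m_end)
        down += max(outage, ZERO)

    return 100.0 * (1 - down / counted)


if __name__ == "__main__":
    day = datetime(2003, 10, 1)
    pct = availability(
        day,
        day + timedelta(days=1),
        outages=[(day + timedelta(hours=3), day + timedelta(hours=5))],
        maintenance=[(day + timedelta(hours=2), day + timedelta(hours=4))],
    )
    print(f"{pct:.2f}% available")
```

In the sample run, a two-hour outage that falls half inside a two-hour maintenance window counts as one hour of downtime against 22 hours of expected service, rather than two hours against 24.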
The combination of Adtran's IQ 710 traffic-shaping DSU/CSUs and N-Form software not only monitors links for availability, but also recognizes application-specific traffic and prioritizes that traffic during busy periods. It can identify more than 300 kinds of application-level network datastreams, including Citrix WinFrame, HTTP, AOL Instant Messenger and Napster messages. Both the IQ 710s and N-Form track and display the same frame relay metrics as Visual UpTime, although not in quite as much fine detail.
While some Allot WiseWAN models include DSU/CSU functionality, the model 401 units we tested were pure monitoring devices. An Ethernet tap connected the 401 to the link between the local network and router. Allot also offers a WiseWAN unit for monitoring broadband (DSL) connections. The WiseWAN WANXplorer Server software collected and displayed (via a browser interface) a wealth of statistics on the health of the WAN link and showed us link utilization and who uses the most bandwidth. Network protocol distribution reports showed the relative traffic levels of WAN protocols. The primary reports reveal line availability and SLA breaches (both summary and detailed versions). Other WAN link-related reports show line statistics, DLCI traffic by bandwidth consumption, PVC by CIR load, DLCI performance and response times.
Network Instruments' Observer is more than just a protocol analyzer or packet decoder. It also can accumulate network activity statistics and display them in useful ways. When you put the vendor's hardware or software probes on remote network segments, Observer collects network activity statistics from those probes. Observer polls these probes every 5 seconds (by default), and you can shorten the polling interval to 2 seconds. Observer presents the latest, average and maximum overall bandwidth utilization statistics, maximum and average utilization by DLCI, top talkers and congestion metrics, which include notifications when congestion is occurring, even when bandwidth utilization is below the CIR. Observer also works with probes from other vendors, such as Netscout.
Concord's eHealth includes four modules - LiveHealth, Network Health, System Health and Application Health. Network Health monitors the performance and availability of WAN interfaces, routers, switches, frame relay circuits and remote access equipment. System Health monitors servers and selected (or all) clients to alert administrators to application performance problems, server crashes and disk space shortages. Application Health is a transaction-oriented collection of tools that help determine the cause of poor application response times. At a default of 5-minute intervals (or at a rate you can set), eHealth actively polls SNMP-aware devices to determine their status and displays the result in real time.
EHealth recognizes and understands more than 900 management information base (MIB) definitions. It uses these MIBs to determine device performance and availability. Initially, eHealth collects network activity and inventory data to build a normal network baseline. Thereafter, using a complex but configurable rules set, it detects and highlights exceptional activity patterns, such as excessively high or low traffic through a router or switch port. The Network Health frame relay module efficiently and accurately collected network statistics from the DSU/CSUs in our WAN links. EHealth's many reports showed us WAN link data such as top talkers, packet discards, congestion, overall utilization and utilization by DLCI (average, minimum and maximum). We found that eHealth also understands and can monitor DSL connections.
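The baseline idea described above can be sketched in a few lines of Python. This is only an illustration of the general technique, flagging samples that stray far from a learned average. It is not eHealth's rules set, whose details are not covered here. The function name, the three-standard-deviation threshold and the sample figures are all assumptions.

```python
from statistics import mean, stdev


def find_exceptions(baseline, samples, threshold=3.0):
    """Return (index, value) pairs for samples far from the baseline.

    baseline: utilization readings (percent) gathered while learning
              what normal looks like; needs at least two readings.
    samples:  new readings to check against that baseline.
    threshold: hypothetical cutoff, in standard deviations.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    exceptions = []
    for index, value in enumerate(samples):
        if sigma == 0:
            # A perfectly flat baseline: any change is exceptional.
            deviates = value != mu
        else:
            deviates = abs(value - mu) > threshold * sigma
        if deviates:
            exceptions.append((index, value))
    return exceptions


if __name__ == "__main__":
    learned = [40, 42, 38, 41, 39, 40, 43, 37]  # made-up port utilization
    print(find_exceptions(learned, [41, 95, 2, 44]))
```

Both unusually high and unusually low readings are flagged, which matches the article's point that excessively low traffic through a port can be as telling as excessively high traffic.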
Neon Software says you can use CyberGauge to monitor Internet connections, but we found it also can keep an eye on private WAN links. CyberGauge is well suited to small networks and Apple Macintosh-based networks. Using SNMP, CyberGauge queries an IP address (a router, for example) at the other end of a WAN link as often as every second, and collects MIB II data. However, we found that setting the polling interval to 10 or 15 seconds let CyberGauge gather useful statistics. It reported uptime and downtime in terms of the number of intervals the link was active, and showed uptime as a percentage. It showed total bytes inbound and total bytes outbound, as well as utilization billing information expressed as average traffic levels for 5-minute periods. CyberGauge also displayed bandwidth utilization for the reporting period in percentage ranges.
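Interval-based figures like these are usually derived from the difference between two readings of a MIB II octet counter such as ifInOctets, which is a 32-bit value that wraps around to zero. The Python sketch below shows that arithmetic under stated assumptions: the function names are hypothetical, the counter readings are supplied by the caller rather than fetched over SNMP, and nothing here is taken from CyberGauge's own implementation.

```python
COUNTER32_MODULUS = 2 ** 32  # MIB II octet counters are 32 bits wide


def counter_delta(previous, current, modulus=COUNTER32_MODULUS):
    """Octets counted between two polls, allowing for one counter wrap."""
    return (current - previous) % modulus


def utilization_percent(prev_octets, curr_octets, interval_seconds, link_bps):
    """Average link utilization over one polling interval, in percent."""
    bits = counter_delta(prev_octets, curr_octets) * 8
    return 100.0 * bits / (interval_seconds * link_bps)


def uptime_percent(interval_results):
    """Uptime as the share of polling intervals in which the link answered.

    interval_results: list of booleans, one per poll (True = link was up).
    """
    if not interval_results:
        return 0.0
    return 100.0 * sum(interval_results) / len(interval_results)


if __name__ == "__main__":
    T1_BPS = 1_536_000  # usable payload rate of a full T-1 (24 x 64K bit/sec)
    # 1,920,000 octets in 10 seconds fills a T-1 completely.
    print(utilization_percent(0, 1_920_000, 10, T1_BPS))
    # Nine answered polls out of 10.
    print(uptime_percent([True] * 9 + [False]))
```

The wrap handling also suggests why a moderate polling interval is a reasonable choice on links of this speed: the counter cannot wrap more than once between polls on a T-1, so nothing is lost by polling every 10 or 15 seconds instead of every second.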
Ease of use
Visual UpTime excels at helping administrators maintain WAN link details, locate link problems and track link activity. Clearly its designers carefully and thoughtfully focused on administrator productivity as they built UpTime's responsive and intuitive user interface to fit the workflow and individual tasks within a large network operations center. For example, the Network Configuration dialog is a central point for changing or adding networks, sites, access lines, ASEs and circuits. UpTime's ability to print a network configuration report that documented our work was icing on the cake. We never had to fumble around in the interface to locate the right window through which to update network details, troubleshoot a problem or produce (or schedule) the appropriate reports for our WAN links.
Adtran's N-Form user interface did not impress us. Its main administrative window distinguishes between user-oriented and server-oriented tasks. Selecting the Users tab in the administrative tool brought up windows in which we could create, change or delete users. The Servers tab similarly was a doorway into configuring N-Form's default SNMP settings, network utilization thresholds, e-mail identities, event history log and network event notifications.
N-Form's Network Manager interface displays a hierarchical tree of network segments that identifies devices by address, type and status. An administrator can attach comments to each device's N-Form data to help make the tree's entries more meaningful. Network Manager can discover and display non-Adtran devices, but the tree's "type" column is relevant only for Adtran devices. The tree's "status" column, whose information is only as recent as the last SNMP polling sweep, only shows either "offline" or whether an e-mail notification is associated with a specific device. N-Form's Network Manager tree can be collapsed or expanded to help drill down to specific segments and devices.
In contrast, we found that Allot's WiseWAN WANXplorer has a well-designed tree-view interface that contains an intuitive and clearly presented display of network devices. We could move objects via drag-and-drop and sort columns of data by clicking on the column header. Right-clicking an entry displayed WANXplorer's easy-to-understand pop-up menus. Best of all, WANXplorer color codes currently set alarms to show a rising status (red) or a falling status (gray).
Observer uses a tree-view main window and multiple concurrently open child windows to show devices and events. Drilling down to get more data is simply a matter of double-clicking an item in the tree. Observer also displays a window containing a graphical view of network conversations. Alongside each conversation pair are statistics showing packet-to-packet delay times, retransmissions and lost packets. Clicking on a conversation pair drills down to a list of packets exchanged by the nodes. Each display of network activity is a child window that updates in real time, and you can have as many concurrent windows open as you wish.
While the other tools presented native Windows interfaces, Concord's eHealth server console used The SCO Group's Xvision PC X server. But growing accustomed to PC X takes only a short while. EHealth's expandable combination of tree-view window and associated detail windows gave us quick access to network segment and device details, and current status. EHealth obviously is intended for large networks. For example, we found we could sort eHealth's display of network devices by IP address or class, which helped make working with populous segments much easier. Creating circuit-specific presentations of uptime and bandwidth utilization is easy with eHealth.
The CyberGauge interface defines simplicity. Entering device data involves choosing an interface type from a list (including frame relay, Ethernet, and serial) that CyberGauge detects on the router you point at. CyberGauge then lets you configure interface preferences and parameters, and how you want to display statistics. After you select one or more interfaces on a target router, clicking the Begin Monitoring button puts CyberGauge to work.
All six tools offer browser-based access to their reports and configuration data.
Entering details about each WAN link isn't a task you need to do every day, fortunately, but each tool takes a different approach to the job. You explicitly tell Visual UpTime, WANXplorer, CyberGauge and Observer about each IP-addressed device at either end of a WAN link. In contrast, using IP address ranges you specify, N-Form and eHealth automatically discover WAN link devices. In our tests, eHealth's discovery process occurred daily, on a schedule we could set or, if we wished, interactively. During each sweep of the network, eHealth automatically discovered new or changed device information. EHealth eased the process of identifying network devices by letting us categorize network elements by class or IP address grouping. It then performed a discovery process to find those elements on the network.
All the tools we tested handled the various protocols we threw at them. We went a step further, however. TCP has an internal throttling mechanism that classically fails in the presence of other protocols. The mechanism senses overall TCP traffic levels to know when to throttle itself back, but the traffic-level detection ignores other protocols as it decides how many packets it can send in a "window" before expecting a response from its session partner. How well would the traffic-shaping tools work when we mixed high levels of TCP and other traffic on the network? Working at the application layer, the Adtran IQ 710 and Allot WiseWAN 401 sorted out the traffic jam quite nicely as they prioritized, for example, database transactions over e-mail.
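As a rough picture of what prioritizing database transactions over e-mail means, the Python sketch below implements a simple strict-priority queue. It is a teaching example only. The Adtran and Allot devices classify and shape traffic in ways this article does not detail, and the class name, the application labels and the priority ranks are all hypothetical.

```python
import heapq
from itertools import count

# Hypothetical ranking: lower number means higher priority.
PRIORITY = {"database": 0, "http": 1, "email": 2}


class PriorityShaper:
    """Strict-priority queue: always send the most important packet first."""

    def __init__(self):
        self._heap = []
        self._sequence = count()  # keeps arrival order within a class

    def enqueue(self, app, payload):
        # Unrecognized applications go behind every known class.
        rank = PRIORITY.get(app, max(PRIORITY.values()) + 1)
        heapq.heappush(self._heap, (rank, next(self._sequence), app, payload))

    def dequeue(self):
        """Return the next (app, payload) to send, or None if idle."""
        if not self._heap:
            return None
        _, _, app, payload = heapq.heappop(self._heap)
        return app, payload


if __name__ == "__main__":
    shaper = PriorityShaper()
    shaper.enqueue("email", "msg-1")
    shaper.enqueue("database", "txn-1")
    shaper.enqueue("email", "msg-2")
    shaper.enqueue("database", "txn-2")
    while (item := shaper.dequeue()) is not None:
        print(item)
```

A strict scheme like this can starve low-priority traffic on a busy link, so a real shaper would more plausibly reserve some share of bandwidth for each class.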
We found that UpTime and eHealth scaled best. They both exhibited the capacity to handle a range of different network sizes, as well as a high degree of modular configurability.
All six products integrated well with HP's OpenView, emitting SNMP alerts (traps) that OpenView accepted and processed. They're also easy to install. And kudos to Visual Networks for sending customers Visual UpTime pre-installed on a fast server.
Visual, Concord and Network Instruments all supply professionally written, clear and comprehensive documentation, as well as useful online help. Allot Communications' documentation is a 96-page manual that explains the WiseWAN hardware, but leaves the bulk of the software's description to the online help files. Adtran's documentation consists entirely of online help files, while the CyberGauge documentation is simply a 44-page booklet augmented by some online help files.
Migrating to Visual UpTime when you install new or replacement DSU/CSUs can help you create a WAN environment that's conducive to keeping your WAN links running smoothly. Troubleshooting the problem of the hour is much easier when you have Visual UpTime's level of up-to-the-second detail to help. To avoid future problems, Visual UpTime's reports are a godsend to capacity planners who need to make intelligent judgments about network growth and changes.
For heterogeneous networks, eHealth is just what the doctor ordered. Its status indicators showing the condition of your network segments and devices - as well as its plethora of useful reports - make eHealth a necessity on large, diverse networks.
Nance, a software developer and consultant, is the author of Introduction to Networking, 4th Edition and Client/Server LAN Programming. He can be reached at firstname.lastname@example.org.