In a typical "real-life" network, there should be some kind of automatic network health monitoring and reporting system. The purpose of such a system is to give you at least a general idea of the network's current state of health.
One aspect of network health management is monitoring. With an automated system, you can receive automatic alerts about general network health, such as connections going up or down, bandwidth utilization, and network device status and utilization. When the system detects an issue, these alerts can arrive as email, SMS/text, or a flashing display on your PC monitor. Such alerts are helpful when there are too many network devices to manage or when you don't have the luxury of watching network health manually in real time.
As an illustration, let's say there is a T1/E1 circuit that is crucial to the business, whether as an Internet circuit, a private link, or the like. To keep business transactions over this circuit running smoothly, the circuit's bandwidth utilization should never go above 80%. As the network administrator, you would like to know if and when the utilization gets "too high" without spending time manually watching the circuit.
With an automated network health monitoring system, you can set the system to send you a "yellow" (warning) alert when the circuit's bandwidth utilization reaches 50% and a "red" (critical) alert when it reaches 80%.
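The two-level threshold idea can be sketched in a few lines of Python. This is only an illustration, not how any particular product implements it; the function name `alert_level` and its parameters are made up for this example, and the 50% and 80% defaults come from the scenario above.

```python
def alert_level(utilization_pct, warn_at=50.0, critical_at=80.0):
    """Return 'red', 'yellow', or None for a utilization percentage.

    The thresholds default to the 50% / 80% values used in the example;
    a real monitoring product would let you configure them per circuit.
    """
    if utilization_pct >= critical_at:
        return "red"
    if utilization_pct >= warn_at:
        return "yellow"
    return None


# A few hypothetical utilization samples for the T1/E1 circuit.
for sample in (35.0, 50.0, 79.9, 80.0, 93.5):
    print(f"{sample:5.1f}% -> {alert_level(sample)}")
```

Checking the critical threshold first ensures that a reading of 80% or more is reported as "red" and not as "yellow".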
As mentioned, the alert can arrive as email, SMS/text, or a flashing display on your PC monitor. You can therefore be physically away from the circuit doing other things and still not miss the moment the circuit becomes "over-utilized".
Knowing immediately that bandwidth is over-utilized is especially valuable when users complain of slow access, whether slow Internet access (if the circuit is an Internet circuit) or slow access to a private server (if the line is a private line). With that information in hand, your job as network administrator is less troublesome because you already have a likely cause for the slow access (a latency issue on a congested circuit).
The next aspect of network health management is reporting. With an automated system, you can receive reports of network usage over a certain time range. Depending on the features of the system, report types vary and can include how often a circuit goes up or down, how much bandwidth is used on a certain circuit or connection, how much memory and CPU is used on a certain network device, and how slowly or quickly a certain application or software responds.
As an illustration, let's say you would like to know what a circuit's bandwidth utilization has looked like since last month. With an automated network health reporting system, you can set the system to send a circuit bandwidth utilization report covering last month up to today.
The monthly report typically shows a bar graph of the daily use of the circuit's bandwidth. On the report you may see that on Day 1 the circuit bandwidth was used up to 40%, while on Day 15, let's say, it was used up to 80%.
This kind of report is useful for tracking the level of circuit bandwidth utilization. When the circuit is over-utilized too often (reaching 80% utilization too often, let's say), further action might be in order. One action could be an investigation of what kind of traffic is using the bandwidth and whether that traffic is legitimate. Another could be to consider upgrading the circuit to a larger bandwidth.
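As a rough sketch of how such a report could be summarized, the Python below takes a list of daily peak utilization figures and picks out the days that reached the 80% mark. The data and the function name `days_over_threshold` are invented for illustration; a real reporting system would pull these figures from its own database.

```python
def days_over_threshold(daily_peak_pct, threshold=80.0):
    """Return the 1-based day numbers whose peak utilization reached the threshold."""
    return [day for day, peak in enumerate(daily_peak_pct, start=1)
            if peak >= threshold]


# Hypothetical daily peaks (percent of circuit capacity) for a short window.
peaks = [40, 55, 62, 81, 78, 90, 45]

busy_days = days_over_threshold(peaks)
share = len(busy_days) / len(peaks)

# A crude text version of the bar graph: one '#' per 10% of utilization.
for day, peak in enumerate(peaks, start=1):
    print(f"Day {day:2d} {'#' * (peak // 10)} {peak}%")

print(f"Days at or above 80%: {busy_days} ({share:.0%} of the window)")
```

How many busy days count as "too often" is a judgment call for each network; the sketch only produces the numbers that feed that decision.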
What the Automatic Network Health System Looks Like
The automatic network health monitoring and reporting system itself is software installed on a server (typically Unix, Windows, vendor-specific, or proprietary). The software communicates with the monitored network devices using some kind of protocol, which is explained later. The devices that can be monitored vary; typically they are routers, switches, firewalls, servers, printers, and wireless access points.
Automatic Network Health System mechanism
Most common monitoring systems deal with IP-based network devices, meaning any device that can have an IP address. Some monitoring systems deal with non-IP-based devices. These are typically legacy or "old-school" devices such as analog PBXs or phones and legacy DAX in a telco environment.
As mentioned, an IP-based monitoring system communicates with the monitored network devices using some protocol that runs over IP. The most commonly used are ICMP and protocols based on TCP or UDP. Examples of TCP- or UDP-based protocols are Syslog, SNMP, and NetFlow (Cisco specific).
Note that more advanced IP-based monitoring systems can also monitor higher-level protocols and services such as HTTP and SQL databases. In addition, this kind of software or application monitoring system can detect and monitor IM (Instant Messaging) protocols and even peer-to-peer protocols such as Kazaa and eDonkey. Such a system is typically deployed when the performance of a specific software or application is crucial to the business.
There is a lot of software out there that does IP-based monitoring, from "free" versions to "premium-pay" versions. Following are some of the technology keywords that describe how such software is designed.
* ICMP (Internet Control Message Protocol)
* SNMP (Simple Network Management Protocol)
* NetFlow (Cisco specific)
* Software/Application performance monitoring: HTTP, SQL databases, IM, peer-to-peer protocols
Typical business-grade network devices (e.g. routers, firewalls, and switches from major vendors such as Cisco and Juniper) should be able to generate logs in response to events or incidents such as an interface going up or down, routing updates, and configuration changes. These logs are generally in the form of syslog messages. By default, these syslog messages are stored on the devices themselves.
When you have an automatic health monitoring system, it should include a syslog server that collects the syslog messages generated by all network devices. The general idea is as follows.
* Install a syslog server
* Configure the server to receive and to store syslog messages from your network devices
* Configure your network devices to send syslog messages to the syslog server
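To make the server side of these steps concrete, here is a minimal sketch of a UDP syslog receiver using only the Python standard library. It is an assumption-laden illustration, not a replacement for a real syslog server: the log file name `syslog-collected.log` and the port 5514 are hypothetical choices (the standard syslog port is UDP 514, which usually needs administrator rights to bind). The one protocol detail it relies on is that the number in angle brackets at the start of a message encodes facility times 8 plus severity.

```python
import re
import socketserver

LOG_PATH = "syslog-collected.log"  # hypothetical file name for this sketch

SEVERITIES = ["emerg", "alert", "crit", "err",
              "warning", "notice", "info", "debug"]
PRI_RE = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)


def parse_pri(line):
    """Split a raw syslog line into (facility, severity_name, message).

    Returns (None, None, line) when the line has no <PRI> prefix.
    """
    match = PRI_RE.match(line)
    if not match:
        return None, None, line
    facility, severity = divmod(int(match.group(1)), 8)
    return facility, SEVERITIES[severity], match.group(2)


class SyslogHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # For a UDP server, self.request is a (data, socket) pair.
        data = self.request[0]
        text = data.decode("utf-8", errors="replace").strip()
        facility, severity, message = parse_pri(text)
        with open(LOG_PATH, "a", encoding="utf-8") as log:
            log.write(f"{self.client_address[0]} facility={facility} "
                      f"severity={severity} {message}\n")


def serve(host="0.0.0.0", port=5514):
    """Receive syslog datagrams until interrupted (not called automatically)."""
    with socketserver.UDPServer((host, port), SyslogHandler) as server:
        server.serve_forever()
```

Calling `serve()` starts the receiver; each network device would then be configured, using its own vendor-specific commands, to send its syslog messages to this host and port.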
Note that you should be able to check syslog messages on the network devices themselves. However, those devices are not designed to store syslog messages for a long time; usually the logs are deleted after a short period. With a syslog server, you can keep syslog messages much longer (typically 1 to 3 months) and can even back them up to other media such as tape.
ICMP (Internet Control Message Protocol)
Often you need to see whether a certain circuit or Internet connection is up or down. One simple and common way to find out is to ping the Internet gateway (your ISP's equipment) or pretty much any device at the other side of the circuit. Ping works by sending an ICMP echo request and expecting an ICMP echo reply from the monitored device within a certain time frame. If no reply arrives in that time, the far-end device or the connection can reasonably be assumed to be either down or too busy to respond.
Most network devices are ping-able by default, meaning that the device sends an ICMP echo reply in response to an ICMP echo request it receives. Note, however, that certain firewalls are not ping-able by default. Should you choose to monitor network devices by ICMP, verify that the devices respond to ping.
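A simple up/down check along these lines can be sketched by calling the operating system's own `ping` command from Python. The helper names `build_ping_command` and `is_reachable` are invented for this example. The flags shown are the common Linux (`-c`, `-W` in seconds) and Windows (`-n`, `-w` in milliseconds) ones; other systems, macOS for instance, interpret the timeout flag differently, so treat this as a starting point and check your platform's `ping` manual.

```python
import platform
import subprocess


def build_ping_command(host, timeout_s=2, system=None):
    """Build a single-echo ping command for the given (or current) OS."""
    system = system or platform.system()
    if system == "Windows":
        # -n: number of echo requests, -w: reply timeout in milliseconds
        return ["ping", "-n", "1", "-w", str(timeout_s * 1000), host]
    # Linux-style flags: -c: count, -W: reply timeout in seconds
    return ["ping", "-c", "1", "-W", str(timeout_s), host]


def is_reachable(host, timeout_s=2):
    """Return True if one echo request to host gets a reply in time."""
    try:
        result = subprocess.run(
            build_ping_command(host, timeout_s),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 3,  # safety net if ping itself hangs
        )
    except (subprocess.TimeoutExpired, FileNotFoundError):
        return False
    return result.returncode == 0
```

A monitoring loop would call `is_reachable()` with the gateway's address every minute or so and raise an alert after several consecutive failures, since a single lost reply may only mean the device was busy.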
SNMP (Simple Network Management Protocol)
In some cases, having a syslog server to collect syslog messages is insufficient. Syslog messages often don't provide more specific information about a device, such as CPU or memory utilization, bandwidth utilization, and temperature. This is something that SNMP does provide.
SNMP is another essential part of your automatic health monitoring system. Similar to Syslog, an SNMP server collects SNMP traps from SNMP clients (in SNMP terminology, a manager collecting from agents). Besides receiving traps, the server can also poll the clients for current values, which is how utilization figures are usually gathered. These SNMP clients can be any IP-based network devices such as routers, firewalls, switches, printers, and production servers (e.g. web or mail). As mentioned, interface up/down state, CPU and memory utilization, port or bandwidth utilization, temperature, and low laser printer toner are just a few of the health conditions that SNMP data from a device can represent. Depending on the device's features, you may be able to configure it to generate a limited or a large selection of SNMP traps.
Once the SNMP server has received those SNMP traps, it can generate reports on those specific conditions. If you would like to see CPU and memory utilization on specific SNMP clients within a certain time range, for instance, you can pull a report on that. You can do the same for switch port utilization.
Further, you can link your SNMP server to your mail server. This way you (or anybody else in your company) can receive a mail alert when a specific condition occurs, such as a device temperature hitting 80 degrees Fahrenheit, CPU or memory utilization of a device hitting 80% or more, or a device going down.
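The mail-alert idea can be illustrated with Python's standard `email` and `smtplib` modules. Everything specific here is a placeholder assumed for the example: the device name `rtr1`, the addresses under `example.com`, and the mail server name. Commercial monitoring products have this built in; the sketch only shows the moving parts.

```python
import smtplib
from email.message import EmailMessage


def build_alert(device, metric, value, threshold, sender, recipient):
    """Compose an alert mail for one condition on one device."""
    msg = EmailMessage()
    msg["Subject"] = f"[ALERT] {device}: {metric} at {value} (threshold {threshold})"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(
        f"Device {device} reported {metric} = {value}, "
        f"which meets or exceeds the threshold of {threshold}."
    )
    return msg


def send_alert(msg, mail_host="mail.example.com", mail_port=25):
    """Hand the message to a mail server (host and port are placeholders)."""
    with smtplib.SMTP(mail_host, mail_port) as smtp:
        smtp.send_message(msg)


# Example: CPU utilization on a hypothetical router has reached 80%.
alert = build_alert("rtr1", "CPU utilization (%)", 85, 80,
                    "monitor@example.com", "noc@example.com")
print(alert["Subject"])
```

Only `build_alert()` runs here; `send_alert()` is left for you to call once the host and port match your own mail server.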
Typically only business-grade network devices support SNMP. Support means that the device generates SNMP traps and is capable of sending them to a designated SNMP server. Should you decide to monitor network device condition by SNMP, verify that the SNMP traps you are looking for (e.g. interface up/down, CPU and memory utilization, port or bandwidth utilization, temperature) are supported on the device.
Specifically for bandwidth utilization, an SNMP report only tells you how much a specific port or connection is utilized (e.g. 10% or 90%). However, the report does not tell you which traffic is using the bandwidth.
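For context, a utilization percentage like this is normally derived from two readings of an interface byte counter (for example `ifInOctets` from the standard interface MIB) taken some seconds apart. The sketch below shows the arithmetic only; fetching the counters over SNMP is left out, and the function name and sample numbers are assumptions made for illustration.

```python
COUNTER32_MAX = 2 ** 32  # 32-bit SNMP counters wrap back to zero at this value


def utilization_pct(octets_start, octets_end, interval_s, link_bps,
                    counter_max=COUNTER32_MAX):
    """Percent of link capacity used between two counter readings.

    The modulo handles the case where the counter wrapped once
    between the two readings.
    """
    delta_octets = (octets_end - octets_start) % counter_max
    bits_sent = delta_octets * 8
    return 100.0 * bits_sent / (interval_s * link_bps)


# A T1 carries 1,544,000 bits per second. Suppose the counter grew by
# 46,320,000 octets over a 5-minute (300 s) polling interval.
T1_BPS = 1_544_000
pct = utilization_pct(1_000_000, 47_320_000, 300, T1_BPS)
print(f"T1 utilization over the interval: {pct:.1f}%")
```

Note that the result is an average over the polling interval, so short bursts within those five minutes can be higher than the reported figure.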
When your network devices are Cisco devices that can provide NetFlow reports, you can use NetFlow to get those details. In a nutshell, NetFlow reports show which traffic is using the bandwidth in terms of source and destination IP address, TCP or UDP port, and how many IP packets are going through. For instance, a report might show that an internal user (let's say IP address 10.0.10.254) accessing your internal web server (let's say 10.0.0.2 on TCP port 80) and www.yahoo.com on the Internet is using 80% of the available bandwidth.
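To give a feel for what a flow-based report does with the data, the sketch below totals byte counts per source, destination, and port and lists the biggest consumers. The flow records are invented for illustration and are not real NetFlow export format; 203.0.113.10 is a documentation address standing in for an Internet web server, and the percentages are shares of the observed traffic, not of the circuit's capacity.

```python
from collections import Counter

# Hypothetical, already-decoded flow records (not actual NetFlow packets).
flows = [
    {"src": "10.0.10.254", "dst": "10.0.0.2", "dst_port": 80, "bytes": 500_000},
    {"src": "10.0.10.254", "dst": "203.0.113.10", "dst_port": 80, "bytes": 150_000},
    {"src": "10.0.10.254", "dst": "10.0.0.2", "dst_port": 80, "bytes": 300_000},
    {"src": "10.0.10.7", "dst": "10.0.0.25", "dst_port": 25, "bytes": 50_000},
]


def top_talkers(records, n=3):
    """Return the n largest (src, dst, port) conversations by bytes.

    Each result is ((src, dst, dst_port), total_bytes, share_pct).
    """
    totals = Counter()
    for rec in records:
        totals[(rec["src"], rec["dst"], rec["dst_port"])] += rec["bytes"]
    grand_total = sum(totals.values())
    return [(key, count, 100.0 * count / grand_total)
            for key, count in totals.most_common(n)]


for (src, dst, port), count, share in top_talkers(flows):
    print(f"{src} -> {dst}:{port}  {count} bytes  ({share:.0f}% of observed traffic)")
```

A real collector does the same kind of grouping over far more records and usually lets you slice the result by time range, interface, or protocol.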
More info on Cisco Netflow is available here
A lot of the time, network or Internet slowness is caused by software or applications running on a server or PC. This could be mail (SMTP), web (HTTP, HTTPS, SSL, TLS), FTP, SQL databases, or even peer-to-peer applications such as Kazaa and eDonkey. Besides monitoring the network, monitoring software and application performance is highly recommended, since software can be written poorly by its developers, causing poor performance.
There are many monitoring systems you can choose for software or application performance monitoring, among them OPNET and Ixia. Using OPNET, for example, you can find out exactly what happens during the client-server exchange of some software or application and whether those events happen as expected. The monitoring result should give you an idea of what is happening and whether the events you see may cause a performance problem.
Another example of an application monitoring system is Cisco MARS, which can detect the use of IM (Instant Messaging) protocols such as Yahoo! or AOL IM, in addition to peer-to-peer protocols such as Kazaa and eDonkey. In some organizations or companies, the use of such applications is forbidden, especially peer-to-peer applications, since they can use up the available bandwidth to the point where none is left for business-legitimate applications.
More info on Cisco MARS can be found here
Software To Choose as Automatic Monitoring and Reporting System
Note that you don't have to use the monitoring systems mentioned above. They were picked as illustrations (although they are proven to work and to be helpful on real-life production networks). As a rule of thumb, any monitoring system should do as long as it serves your needs.
There is plenty of software that can do Syslog, ICMP, SNMP, and NetFlow collection and reporting as described. A lot of companies like to use SolarWinds or WhatsUp products. Some companies like to use CiscoWorks, which may include Cisco MARS.
There is free ICMP and SNMP software that is widely used, such as MRTG and Cacti. One popular free Syslog program is Kiwi Syslog.
As mentioned, basically any software that works for you should do. Typically, "premium-pay" software is preferred when you have a large or complex network or you want detailed, thorough reports.