How do you prepare your security data for visualization?

Do you know how much traffic is transmitted on your networks? Do you know what protocols are in use and what machines are using them? Are there spyware-infected machines on your network that leak information?

Traffic flows are records that tell you what machines have communicated on the network, what services they used, and how much data they transmitted. These records can be used to answer a variety of questions about the behavior of machines and the traffic flowing on your networks.

In the next blog entry I will show how flows can be visualized to ease their analysis. To do so, we need to first collect them and do some initial processing.

In the following example, I am going to use NetFlow - one specific type of traffic flow. (Other traffic flow formats include sFlow and jFlow.) Traffic flows are just one representative kind of security data. A process similar to the one presented here can be used for other security data sources.

First we need to configure the source device - in my case a Cisco router - to generate NetFlow records. To do so, issue the following commands, where the export destination is the collector's IP address followed by the UDP port it listens on:

interface Ethernet0/0
  ip route-cache flow
ip flow-export destination 8888
ip flow-export version 9 bgp-nexthop

This assumes that you have a machine that can accept NetFlow records. To collect the records, we are going to use nfdump. Issue the following command to start nfcapd, the capture daemon that ships with nfdump, and record the flows being sent to our collection machine:

./nfcapd -w -D -p 8888

This will record the flows on disk, in a binary format. In order to read the recorded information, issue the following command:

./nfdump -r /var/tmp/nfcapd.200801021115 \
  -o "fmt:%ts %td %pr %sap -> %dap %pkt %byt %fl %out %in"

This tells nfdump both the location of the records and the format in which to output the information. The output of the previous command displays records in the following form:

2005-10-22 23:02:53.967  0.000  TCP 0>   1   60   1   0   1 

This form is not very useful for visualization. We need to generate CSV output of just the fields that we are interested in. To do so, we can use the following command:

./nfdump -r /var/tmp/nfcapd.200801021115 -o "fmt:%sa,%da"

This will output the source and destination IP addresses of every flow in CSV (comma-separated values) format, one source,destination pair per line.
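Each flow produces one line, so the same pair of machines can show up many times. Before visualizing, it can help to collapse the duplicates into a count. Here is a minimal sketch, assuming the CSV arrives on standard input. The name count_pairs is one I made up for illustration, and the whitespace stripping is there because I am assuming the fields may come out padded:

```shell
# Hypothetical helper (not part of nfdump): collapse repeated
# "source,destination" lines into "source,destination,count".
count_pairs() {
  awk -F, '
    { gsub(/[ \t\r]/, "") }           # assumption: fields may be padded with blanks
    NF == 2 { seen[$1 "," $2]++ }     # tally each pair, skip lines that are not two fields
    END { for (p in seen) print p "," seen[p] }
  ' | LC_ALL=C sort                   # awk iterates in no fixed order, so sort the result
}
```

You could pipe the nfdump command above into count_pairs to get lines of the form source,destination,count, which is still plain CSV.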

CSV is a format that is understood by various visualization tools. Unfortunately, unlike nfdump, a lot of security tools do not offer the capability to change the output format to CSV. In those cases, we need to parse the output with either a specialized parser or some type of UNIX script. Here is how we could parse the previous output with awk (pipe the output into the following command):

awk '{print $5,$7}' | awk -F'[: ]' -v OFS=, '{print $1,$3}'

There are many other ways to parse this, for example with Perl and a regular expression. If you don't know regular expressions, the previous awk line is probably the simplest way of parsing the output.
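As one of those other ways, here is a minimal sketch that does the same job in a single awk call. It assumes the long output format shown earlier, where the fifth field is source:port, the sixth is the arrow, and the seventh is destination:port. The name flows_to_csv is my own invention, not an nfdump feature:

```shell
# Hypothetical helper: turn long-format flow lines into "source,destination".
# Assumes the fields follow "%ts %td %pr %sap -> %dap ..." (the date and time
# count as two fields, so the source is field 5 and the destination field 7).
flows_to_csv() {
  awk '
    $6 == "->" {                      # only flow records have the arrow here
      src = $5; dst = $7
      sub(/:[^:]*$/, "", src)         # drop the trailing ":port"
      sub(/:[^:]*$/, "", dst)
      print src "," dst
    }
  '
}
```

Checking for the arrow in the sixth field means that any header or summary lines mixed into the output are skipped instead of turning into garbage rows.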

Stay tuned for the next blog entry where I will show how we can take the CSV output to visualize the communication patterns.

Terima Kasih from Jakarta
