November 30, 2008, 5:13 PM — Have you tried to visualize your traffic flows? Did you have the problem that your graphs got really cluttered? Did you find a way around that problem? Did you filter data? Did you aggregate multiple nodes into one?
Aggregation and filtering are two of the steps that can be used to reduce the number of data points in a graph. Oftentimes that is very useful. Other times the reduced amount of information is not desirable. In those cases, you need to switch your graph type and use something that can encode more information. One such graph type is Sparklines.
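To make the aggregation step concrete, here is a minimal sketch that collapses individual IP addresses into their subnets, so that many nodes become one. The function name `aggregate_by_subnet` and the default /24 prefix are assumptions for illustration, not something taken from a specific tool.

```python
import ipaddress
from collections import Counter

def aggregate_by_subnet(addresses, prefix_len=24):
    """Count how many addresses fall into each subnet of the given prefix length."""
    counts = Counter()
    for addr in addresses:
        # strict=False lets us pass a host address and get its enclosing network.
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        counts[str(net)] += 1
    return counts

# Hypothetical sample addresses: three nodes collapse into two subnets.
subnet_counts = aggregate_by_subnet(["10.0.0.5", "10.0.0.77", "192.168.1.1"])
```

The price of this reduction is exactly the one mentioned above: once the hosts are merged, you can no longer tell them apart in the graph.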
"Sparklines are data-intense, design-simple, word-sized graphics". [Edward Tufte (2006). Beautiful Evidence. Graphics Press. ISBN 0961392177]
The following is a sample Sparkline:
You can see that there are no units shown at all. They are not of interest; what matters is the general development of the measure. Generally the x-axis (the horizontal one) shows time. That way you see the development over time of your desired measure. An example would be a stock price. You can see how a specific stock - or a market - developed over time. You are not necessarily interested in the absolute stock value, but in its development over a fixed time period. If you do this for multiple stocks, you can compare them very nicely because the prices are normalized.
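The normalization idea can be sketched in a few lines. The following is one possible way to draw a text-only Sparkline with Unicode block characters. It assumes min-max scaling per series, and the name `sparkline` is made up for this example.

```python
# Eight block characters, from the lowest bar to the highest.
BARS = "▁▂▃▄▅▆▇█"

def sparkline(values):
    """Render a sequence of numbers as a word-sized text graphic."""
    values = list(values)
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series has no shape to show; use the lowest bar throughout.
        return BARS[0] * len(values)
    # Min-max normalize each value onto a bar index from 0 to 7.
    top = len(BARS) - 1
    return "".join(BARS[int((v - lo) / span * top)] for v in values)

# Two series with very different absolute values but the same shape
# produce the same Sparkline, which is what makes them comparable.
cheap_stock = sparkline([10, 20, 30])
pricey_stock = sparkline([100, 200, 300])
```

Because each series is scaled to its own minimum and maximum, the absolute values disappear and only the development over time remains.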
I am using Sparklines to analyze various types of log files, for example to monitor host configurations, such as the number of processes running on my servers or the number of open ports. For this post, I am going to stick to an easier example: I am going to analyze the number of flows captured in a traffic flow log.
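Counting flows over time boils down to putting each flow into a time bucket. Here is a rough sketch, assuming the flow log has already been parsed into a list of epoch timestamps in seconds. The function name `flows_per_bucket` and the one-minute bucket size are my own choices for the example.

```python
from collections import Counter

def flows_per_bucket(timestamps, bucket_seconds=60):
    """Count flows per fixed-width time bucket, keeping empty buckets as zeros."""
    timestamps = list(timestamps)
    if not timestamps:
        return []
    # Map every timestamp to the index of the bucket it falls into.
    buckets = [int(ts // bucket_seconds) for ts in timestamps]
    counts = Counter(buckets)
    # Walk the full range so that gaps in the traffic show up as zeros.
    return [counts.get(b, 0) for b in range(min(buckets), max(buckets) + 1)]

# Hypothetical timestamps: three flows in the first minute, one in the second,
# none in the third, one in the fourth.
counts = flows_per_bucket([0, 10, 59, 60, 185])
```

Keeping the empty buckets matters: a minute without any flows is information too, and it is what makes gaps visible later on.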
The graph shows - over time - which IP addresses and what services have been active. There are many things you can see in this visualization. You can find patterns that repeat for a set of IP addresses, or patterns that show up for pairs of sources and destinations, or destinations and ports. You can see periodic behavior, and gaps or spikes in that behavior.
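A view like this needs one series per source, destination or port, all on the same time axis. The sketch below shows one way to build them. It assumes each flow is a tuple of (timestamp, source IP, destination IP, destination port), which is a made-up layout for illustration, as is the name `series_by_key`.

```python
from collections import defaultdict

def series_by_key(flows, key_index, bucket_seconds=60):
    """Build one per-bucket flow count series for each distinct value of a field.

    flows: iterable of (timestamp, src_ip, dst_ip, dst_port) tuples (assumed layout).
    key_index: position of the field to group by, e.g. 1 for the source IP.
    """
    flows = list(flows)
    if not flows:
        return {}
    buckets = [int(f[0] // bucket_seconds) for f in flows]
    first = min(buckets)
    width = max(buckets) - first + 1
    # Every key gets a series of the same length, so rows line up in time.
    series = defaultdict(lambda: [0] * width)
    for flow, bucket in zip(flows, buckets):
        series[flow[key_index]][bucket - first] += 1
    return dict(series)

# Hypothetical flow records for illustration.
flows = [
    (0, "10.0.0.1", "10.0.0.9", 53),
    (30, "10.0.0.1", "10.0.0.9", 53),
    (130, "10.0.0.2", "10.0.0.9", 80),
]
by_source = series_by_key(flows, key_index=1)
by_port = series_by_key(flows, key_index=3)
```

Since all series share the same buckets, repeating patterns, gaps and spikes can be compared row by row across addresses and ports.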
The graph is even more insightful if you know more about the environment. For example, if you know the IP addresses of the DNS servers in this network, you can draw certain conclusions. You might find policy violations or misconfigurations in this graph. Maybe you find suspicious activity or machines on the network that should not be there.