Rediscovering sar

Monitoring system performance with sar is as easy as enabling the sar tasks. While sar is bundled into many versions of Unix, it's often disabled until you take steps to get it running. To find out whether sar is running on your Unix server, just type sar and see what happens. If you get a report that shows you performance measures for every ten minutes or so, it's collecting data for you.

# sar
Linux 2.6.18-128.el5 (boson)     07/22/2012

12:00:01 AM       CPU     %user     %nice   %system   %iowait    %steal     %idle
12:10:01 AM       all      0.15      0.00      0.06      0.01      0.00     99.78
12:20:02 AM       all      0.08      0.00      0.05      0.01      0.00     99.86
12:30:01 AM       all      0.09      0.00      0.05      0.01      0.00     99.85
12:40:01 AM       all      1.83      0.00      0.11      0.36      0.00     97.70
...
03:00:01 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
03:10:01 PM       all      0.22      0.00      0.07      0.02      0.00     99.69
03:20:01 PM       all      0.09      0.00      0.05      0.01      0.00     99.86
03:30:01 PM       all      0.10      0.00      0.05      0.01      0.00     99.84
03:40:01 PM       all      0.10      0.00      0.05      0.01      0.00     99.85
03:50:01 PM       all      0.09      0.00      0.05      0.01      0.00     99.86
04:00:01 PM       all      0.08      0.00      0.04      0.01      0.00     99.86
04:10:01 PM       all      0.10      0.00      0.05      0.01      0.00     99.84
Average:          all      0.12      0.00      0.05      0.01      0.00     99.82

Now, let's examine what these numbers are telling us. For one thing, the %idle -- a fairly obvious measure of how much of the time your CPU (or CPUs) are simply waiting for something to do -- is above 99%. This particular system is twiddling its proverbial thumbs. Scanning through the measurements, the 97.70 in the fourth line of measurements is the busiest it's been. But, of course, today is Sunday and it might have an altogether different profile on a weekday.

The other numbers tell us what is going on the rest of the time.

  • %user -- percentage of the time the CPU is performing user tasks (commands and applications)
  • %nice -- percentage of the time the CPU is performing user tasks with the priority set to "nice"
  • %system -- percentage of the time the CPU is executing kernel tasks
  • %iowait -- percentage of the time the CPU is idle with outstanding I/O requests
  • %steal -- percentage of the time the CPU spent in involuntary wait (applies to virtual CPUs)
  • %idle -- percentage of the time the CPU is idle without outstanding disk I/O requests
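
Since %idle is the last column on every data row, the busiest sample in a day's output can be picked out mechanically rather than by eye. The function below is a minimal sketch of that idea, not part of sar itself: the name busiest_interval is made up for illustration, and it assumes the 12-hour timestamp format (time followed by AM or PM) shown in the output above.

```shell
# busiest_interval (hypothetical helper): read default "sar" CPU output on
# stdin and print the timestamp and %idle of the sample with the lowest %idle.
# Assumes timestamps are printed as "HH:MM:SS AM|PM", as in this article.
busiest_interval() {
    awk '
        # Skip the banner, blank lines, "..." gaps, column headers, and the Average row.
        NF < 8 || /%idle/ || /^Average/ { next }
        # %idle is the last column; keep the smallest value seen so far.
        min == "" || $NF + 0 < min { min = $NF + 0; when = $1 " " $2 }
        END { if (min != "") printf "%s %.2f\n", when, min }
    '
}
```

Running sar | busiest_interval against the first report in this article would print the 12:40:01 AM sample, the one with 97.70 in the %idle column.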

If your server is not collecting sar data routinely, you can look at current performance by giving sar some arguments. In the command below, we are asking sar to give us three 10-second intervals' worth of data.

# sar 10 3
Linux 2.6.18-128.el5 (boson)     07/22/2012

04:49:21 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
04:49:31 PM       all      0.04      0.00      0.03      0.00      0.00     99.93
04:49:41 PM       all      0.23      0.00      0.09      0.01      0.00     99.67
04:49:51 PM       all      0.12      0.00      0.07      0.01      0.00     99.80
Average:          all      0.13      0.00      0.06      0.01      0.00     99.80

To look at performance for some other day, use the -f switch and specify the file you want to use:

# sar -f sa17
Linux 2.6.18-128.el5 (boson)     07/17/2012

12:00:01 AM       CPU     %user     %nice   %system   %iowait    %steal     %idle
12:10:01 AM       all      0.17      0.00      0.06      0.03      0.00     99.73
12:20:02 AM       all      0.09      0.00      0.05      0.01      0.00     99.86
12:30:01 AM       all      0.09      0.00      0.05      0.01      0.00     99.85
...
11:40:01 PM       all      0.08      0.00      0.04      0.01      0.00     99.87
11:50:01 PM       all      0.08      0.00      0.04      0.01      0.00     99.87
Average:          all      0.10      0.00      0.04      0.01      0.00     99.84

Now, we can see that this system is largely idle even on a weekday.
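
The data files are named for the day of the month (sa17 holds the data for the 17th), so the file for a given date can be worked out instead of typed by hand. The sketch below assumes GNU date (for its -d option) and the /var/log/sa directory used on this system; sa_file_for is a hypothetical helper name, not a sar feature.

```shell
# sa_file_for (hypothetical helper): print the path of the daily sar data
# file for a date. Assumes GNU date and files named saDD under /var/log/sa.
sa_file_for() {
    local when="${1:-yesterday}"     # any date string GNU date understands
    local dir="${2:-/var/log/sa}"    # directory holding the saDD files
    printf '%s/sa%s\n' "$dir" "$(date -d "$when" +%d)"
}
```

With this in place, sar -f "$(sa_file_for 2012-07-17)" reads the same file as the example above, and sar -f "$(sa_file_for '3 days ago')" picks a file relative to today. Keep in mind that only as many days as sar retains will be available.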

If you want to break out that data by CPU, you just have to add a -P ALL argument. If you want to focus on one particular CPU, substitute the CPU number for the word ALL (e.g., sar -P 2 10 1).

# sar -P ALL 10 1
Linux 2.6.18-128.el5 (boson)     07/22/2012

04:56:06 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
04:56:16 PM       all      0.04      0.00      0.03      0.01      0.00     99.92
04:56:16 PM         0      0.00      0.00      0.00      0.00      0.00    100.00
04:56:16 PM         1      0.10      0.00      0.00      0.00      0.00     99.90
04:56:16 PM         2      0.10      0.00      0.10      0.10      0.00     99.70
04:56:16 PM         3      0.00      0.00      0.00      0.00      0.00    100.00
04:56:16 PM         4      0.00      0.00      0.10      0.00      0.00     99.90
04:56:16 PM         5      0.10      0.00      0.10      0.00      0.00     99.80
04:56:16 PM         6      0.10      0.00      0.00      0.00      0.00     99.90
04:56:16 PM         7      0.10      0.00      0.00      0.00      0.00     99.90


Average:          CPU     %user     %nice   %system   %iowait    %steal     %idle
Average:          all      0.04      0.00      0.03      0.01      0.00     99.92
Average:            0      0.00      0.00      0.00      0.00      0.00    100.00
Average:            1      0.10      0.00      0.00      0.00      0.00     99.90
Average:            2      0.10      0.00      0.10      0.10      0.00     99.70
Average:            3      0.00      0.00      0.00      0.00      0.00    100.00
Average:            4      0.00      0.00      0.10      0.00      0.00     99.90
Average:            5      0.10      0.00      0.10      0.00      0.00     99.80
Average:            6      0.10      0.00      0.00      0.00      0.00     99.90
Average:            7      0.10      0.00      0.00      0.00      0.00     99.90

If I cd over to the /var/log/sa directory, I see that sar has been collecting performance data for a while and is retaining nine days' worth.

# cd /var/log/sa
# ls
sa14  sa16  sa18  sa20  sa22   sar14  sar16  sar18  sar20
sa15  sa17  sa19  sa21  sar13  sar15  sar17  sar19  sar21

While you might not be inclined to run the sar commands shown above every day, it's a good idea to monitor performance routinely. One way to do this is to set up a script that emails you some of this data every day. The script shown below will send you the daily averages from the nine sa files (modify the script if your installation of sar does not retain nine files).

#!/bin/bash
# showPerf

admin="YOUR-EMAIL@YOUR-COMPANY"

declare weekday=('Sunday   ' 'Monday   ' 'Tuesday  ' 'Wednesday' 'Thursday ' 'Friday   ' 'Saturday ')

case `date +%A` in
  Sunday) day=6;;
  Monday) day=0;;
  Tuesday) day=1;;
  Wednesday) day=2;;
  Thursday) day=3;;
  Friday) day=4;;
  Saturday) day=5;;
esac

echo "                    CPU     %user     %nice   %system   %iowait    %steal     %idle" > /tmp/sar-report

for file in `ls /var/log/sa | grep -v sar`
do
    sar -f /var/log/sa/$file | grep Average | sed "s/Average/${weekday[$day]}/" >> /tmp/sar-report
    day=`expr $day + 1`
    if [ $day -gt 6 ]; then
        day=0;
    fi
done

# email the report
cat /tmp/sar-report | mailx -s "Sar Report for `uname -n`" $admin

Some explanation is in order. You'll notice that we create an array containing the days of the week. I've added extra spaces so that the report columns will line up regardless of how many letters are in the names of the weekdays.

The case statement sets the starting point for changing the word "Average" in each line to the day of the week that the data represents. With nine files, the oldest one is eight days old, which falls on the weekday before today's. If today is Sunday, the first day we'll see in the report is Saturday and the last is Sunday (today). This will make the report easier to read than it would be if every line just said "Average". The output will look like this:

                    CPU     %user     %nice   %system   %iowait    %steal     %idle
Saturday :          all      0.11      0.00      0.04      0.02      0.00     99.83
Sunday   :          all      0.18      0.00      0.05      0.05      0.00     99.72
Monday   :          all      0.10      0.00      0.04      0.02      0.00     99.83
Tuesday  :          all      0.13      0.00      0.04      0.02      0.00     99.80
Wednesday:          all      0.14      0.00      0.05      0.03      0.00     99.78
Thursday :          all      0.12      0.00      0.05      0.03      0.00     99.80
Friday   :          all      0.12      0.00      0.05      0.02      0.00     99.81
Saturday :          all      0.10      0.00      0.04      0.02      0.00     99.83
Sunday   :          all      0.21      0.00      0.06      0.06      0.00     99.68
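
The starting index that the case statement hard-codes for nine files can also be computed, which helps if your installation keeps a different number of files. This is a sketch under that assumption, using the same Sunday-first array as the script; first_label_index is a made-up name.

```shell
# first_label_index TODAY COUNT (hypothetical helper)
# TODAY is the weekday number from "date +%w" (0 = Sunday ... 6 = Saturday).
# COUNT is the number of daily files retained, including today's.
# Prints the index into the weekday array for the oldest file.
first_label_index() {
    local today="$1"
    local count="$2"
    # Step back COUNT-1 days; the "+ 7) % 7" keeps the result from going negative.
    echo $(( ((today - (count - 1)) % 7 + 7) % 7 ))
}
```

In the script, the case statement could then be replaced with day=$(first_label_index "$(date +%w)" 9). Like the original, this assumes the files list in date order, which does not hold when the retained days straddle the end of a month.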

Every day, we'll receive email showing how busy the CPU has been on average for the prior nine days if we set up a cron job like this:

# send performance report
45 23 * * * /usr/local/bin/showPerf > /dev/null 2>&1

Notice that the report runs at 11:45 PM, after nearly all of the day's sar samples have been collected, so the averages cover what are effectively nine complete days.

The sar command can also provide you with memory statistics. The sar -r report below shows a number of statistics related to memory usage.

# sar -r 10 2
Linux 2.6.18-128.el5 (boson)     07/22/2012

05:01:32 PM kbmemfree kbmemused  %memused kbbuffers  kbcached kbswpfree kbswpused  %swpused  kbswpcad
05:01:42 PM  22267292  14770512     39.88    377512  11499048  16778232         0      0.00         0
05:01:52 PM  22267360  14770444     39.88    377512  11499052  16778232         0      0.00         0
Average:     22267326  14770478     39.88    377512  11499050  16778232         0      0.00         0

In this display, the columns are:

  • kbmemfree -- amount of free memory
  • kbmemused -- amount of memory in use
  • %memused -- percentage of memory in use
  • kbbuffers -- amount of memory used as buffers by the kernel
  • kbcached -- amount of memory used by the kernel to cache data
  • kbswpfree -- amount of free swap space
  • kbswpused -- amount of swap space in use
  • %swpused -- percentage of swap space in use
  • kbswpcad -- amount of cached swap memory

All measurements are in kilobytes except, of course, for the percentages. With -r 10 2, sar is showing us two intervals of ten seconds. We can see that we have a whopping 22 Gbytes of free memory. Further, the kbswpused and %swpused columns show us that the system is not swapping.

A script nearly identical to the one shown above will send you memory stats for the previous nine days.

#!/bin/bash
# showMem

admin="YOUR-EMAIL@YOUR-COMPANY"

declare weekday=('Sunday   ' 'Monday   ' 'Tuesday  ' 'Wednesday' 'Thursday ' 'Friday   ' 'Saturday ')

case `date +%A` in
  Sunday) day=6;;
  Monday) day=0;;
  Tuesday) day=1;;
  Wednesday) day=2;;
  Thursday) day=3;;
  Friday) day=4;;
  Saturday) day=5;;
esac

echo "              kbmemfree kbmemused  %memused kbbuffers  kbcached kbswpfree kbswpused  %swpused  kbswpcad" > /tmp/mem-report

for file in `ls /var/log/sa | grep -v sar`
do
    sar -r -f /var/log/sa/$file | grep Average | sed "s/Average/${weekday[$day]}/" >> /tmp/mem-report
    day=`expr $day + 1`
    if [ $day -gt 6 ]; then
        day=0;
    fi
done

cat /tmp/mem-report | mailx -s "Sar Memory Report for `uname -n`" $admin

The data from the memory report will look something like this:

              kbmemfree kbmemused  %memused kbbuffers  kbcached kbswpfree kbswpused  %swpused  kbswpcad
Saturday :     19635983  17401821     46.98    349700  14420850  16777208         0      0.00         0
Sunday   :     18772368  18265436     49.32    358554  15115530  16777208         0      0.00         0
Monday   :     19628533  17409271     47.00    362235  14421599  16777208         0      0.00         0
Tuesday  :     19003475  18034329     48.69    364537  14412879  16777208         0      0.00         0
Wednesday:     18061655  18976149     51.23    367138  14444925  16777208         0      0.00         0
Thursday :     18032747  19005057     51.31    369775  14533945  16777208         0      0.00         0
Friday   :     17992031  19045773     51.42    372410  14561208  16777208         0      0.00         0
Saturday :     17985615  19052189     51.44    375354  14593718  16777208         0      0.00         0
Sunday   :     14946674  22091130     59.64    384016  17453476  16777208         0      0.00         0
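
On this system most of the memory counted as used is actually buffers and cache (compare kbcached with kbmemused), so it can be useful to see how much is left once those are subtracted. The sketch below does that arithmetic. It assumes the nine-column sar -r layout shown in this article and assumes that kbmemused includes buffers and cache, as the numbers here suggest; other sysstat versions may use different columns or definitions. The name app_mem_pct is hypothetical.

```shell
# app_mem_pct (hypothetical helper): read "sar -r" output or the emailed
# memory report on stdin and print, for each data row, the percentage of
# memory in use after subtracting buffers and cache.
# Assumes the nine-column layout shown in this article.
app_mem_pct() {
    awk '
        # Data rows end in nine numeric columns. Count from the right so the
        # label on the left ("Average:", "Saturday :", a timestamp) does not matter.
        NF >= 10 && $NF ~ /^[0-9]+$/ {
            free = $(NF-8); used = $(NF-7); buf = $(NF-5); cache = $(NF-4)
            printf "%.2f\n", 100 * (used - buf - cache) / (free + used)
        }
    '
}
```

For example, sar -r | grep Average | app_mem_pct prints a single figure for the day, and app_mem_pct < /tmp/mem-report prints one figure per line of the report.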

Add the second script to your crontab file:

# send performance reports
45 23 * * * /usr/local/bin/showPerf > /dev/null 2>&1
45 23 * * * /usr/local/bin/showMem > /dev/null 2>&1

If you review reports like these every day, you're bound to notice when the numbers are "off" because your system is having performance problems.
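
If you would rather have the unusual days pointed out for you, a small filter can pull out the rows whose %idle falls below a limit you choose. This is only a sketch: flag_busy_days is a made-up name, the default limit of 90 is an arbitrary assumption, and it expects the CPU report format produced by showPerf above.

```shell
# flag_busy_days [LIMIT] (hypothetical helper): read a CPU report on stdin
# and print only the rows whose %idle (the last column) is below LIMIT.
# LIMIT defaults to 90, an arbitrary value chosen for illustration.
flag_busy_days() {
    awk -v limit="${1:-90}" '
        # Skip the column header row.
        /%idle/ { next }
        # Print rows that end in a number smaller than the limit.
        $NF ~ /^[0-9.]+$/ && $NF + 0 < limit + 0 { print }
    '
}
```

For example, flag_busy_days 95 < /tmp/sar-report prints nothing on a system as quiet as this one and lists any day that averaged less than 95% idle on a busier one.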
