Unix Tip: Tracking down disk usage
Whenever you run across a root file system that's 100% full, the first thing you are likely to do is ascertain which of its many directories are actually stored within /, rather than simply mounted there. Variations of the df and du commands are likely to come in handy. However, the process of detailing how disk space is being used and identifying files or subdirectories that can be removed often takes a lot longer than most of us would like.
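As a rough illustration of the kind of df variations meant here, the sketch below wraps two of them in small functions. The function names mount_point and pct_used are made up for this example, and the code assumes a df that supports the POSIX -P option (one line of output per file system) and mount points that contain no spaces.

```shell
#!/bin/bash
# Sketch only: mount_point and pct_used are hypothetical helper names.

# mount_point DIR: print the mount point of the file system that holds DIR.
# With -P, df prints a header line and then one line per file system,
# and the mount point is the last field of that line.
mount_point() {
    df -kP "$1" | awk 'NR == 2 { print $NF }'
}

# pct_used DIR: print the percentage of space in use on the file system
# that holds DIR (defaults to /), without the trailing % sign.
pct_used() {
    df -kP "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

A directory such as /usr is stored in root only if "mount_point /usr" prints /. Anything else means a separate file system is mounted there and its contents do not count against root.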
In this week's column, we'll look at a simple script that can help you determine which directories in / are abnormally large.
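#!/bin/bash
# chkSpace: report the size (in KB) of each directory that is
# stored on the root file system
for DIR in *
do
    if [ -d "$DIR" ]; then
        # move into the directory (in a subshell, so we never lose our place)
        # and take the mount point from the last field of df's last line
        FS=$(cd "$DIR" 2>/dev/null && df -k . | tail -1 | awk '{print $NF}')
        if [ "$FS" = "/" ]; then
            du -sk "$DIR"
        fi
    fi
done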
The chkSpace script is fairly straightforward. It moves into each directory within the directory from which it is run (in this case, /) and uses the "df -k" command to check whether that directory is part of the root file system. If it is, the script uses the "du -sk" command to report how much space, in kilobytes, that directory occupies. When the script completes, you'll have a listing that looks like this:
# /export/home/shs/chkSpace
1        bin
1        cdrom
1642     dev
41       devices
20968    etc
5089814  export
32587    kernel
1        lib
8        lost+found
1        mnt
1        nsmail
629002   opt-bkp
35149    platform
17893    sbin
1        space
1        u01
1        u03
1        u04
1        u05
3445206  usr
914106   var-bkp
In this example, we can see that someone has made a backup of /var in root and named the backup directory var-bkp. Examining the directory's contents confirmed that this is what it was, and the age of the files showed that none of them were current or of any value.
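One way to check the age of the files in a suspected leftover such as var-bkp is to ask find for anything modified recently. The following is only a sketch: the function name recent_files and the 30-day default are assumptions made for this example, not part of the original scripts.

```shell
#!/bin/bash
# Sketch only: recent_files is a hypothetical helper name.

# recent_files DIR [DAYS]: list regular files under DIR that were modified
# within the last DAYS days (default 30). No output means nothing in the
# directory has been touched recently.
recent_files() {
    find "$1" -type f -mtime "-${2:-30}" -print
}
```

Running "recent_files /var-bkp 90" and getting no output would suggest that nothing in the directory has changed in three months. Whether the directory is safe to remove is still a judgment call.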
Removing the old backup of /var brought the usage of / down to 91% with over a gigabyte of free space -- a much healthier condition for root.
The first probe of your overfull file system might not give you all of the detail you would like to see. For example, if some particular directory in root is using far too much space, you might still want to drill down further to get an idea of which particular subdirectory accounts for the extra disk usage. If this is the case, you can simply cd into the problematic directory and run the script again. Each time you find a directory that's abnormally large, you can repeat the process.
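The repeated drilling down described above can also be automated. The sketch below follows the largest subdirectory at each level until it reaches a directory with no subdirectories. The function names biggest_subdir and drill_down are invented for this example. Unlike chkSpace, this sketch does not check for mounted file systems, so it is best used on a directory you already know is stored on the file system in question.

```shell
#!/bin/bash
# Sketch only: biggest_subdir and drill_down are hypothetical helper names.

# biggest_subdir DIR: print "size-in-KB<TAB>path" for the largest immediate
# subdirectory of DIR, or nothing if DIR has no subdirectories.
biggest_subdir() {
    local d
    for d in "$1"/*/; do
        d=${d%/}                        # drop the trailing slash left by the glob
        # skip an unmatched glob and any symbolic links
        if [ -d "$d" ] && [ ! -L "$d" ]; then
            du -sk "$d"
        fi
    done | sort -n | tail -1
}

# drill_down DIR: report the largest subdirectory at each level, descending
# until a directory with no subdirectories is reached.
drill_down() {
    local dir=$1 line
    while line=$(biggest_subdir "$dir") && [ -n "$line" ]; do
        printf '%s\n' "$line"
        # du separates size and path with a tab; keep only the path
        dir=$(printf '%s\n' "$line" | cut -f2-)
    done
}
```

For example, "cd /export; drill_down ." prints one line per level, each showing the largest subdirectory found at that level and its size in kilobytes.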
Two useful modifications to this script would allow 1) any file system to be checked and 2) the output of the script to be listed in size order. The following version of the script incorporates both of these changes.
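#!/bin/bash
# optional argument: the directory (normally a mount point) to check
if [ $# -eq 1 ]; then
    cd "$1" || exit 1
fi
# mount point of the file system we are checking
TOP=$(df -k . | tail -1 | awk '{print $NF}')
TMP=/tmp/space$$
: > "$TMP"
for DIR in *
do
    if [ -d "$DIR" ]; then
        # only count directories stored on the file system being checked
        FS=$(cd "$DIR" 2>/dev/null && df -k . | tail -1 | awk '{print $NF}')
        if [ "$FS" = "$TOP" ]; then
            du -sk "$DIR" >> "$TMP"
        fi
    fi
done
# list the results in size order
sort -n "$TMP"
# clean up
rm "$TMP"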
We might also want to include large files in our output since they can account for an excessive amount of disk space usage as easily as the contents of directories. Adding files to our analysis, however, could mean we end up with far too much output to look at, so the following version of the script only displays the largest ten entries in the output.
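#!/bin/bash
# optional argument: the directory (normally a mount point) to check
if [ $# -eq 1 ]; then
    cd "$1" || exit 1
fi
# mount point of the file system we are checking
TOP=$(df -k . | tail -1 | awk '{print $NF}')
TMP=/tmp/space$$
: > "$TMP"
for FILE in *
do
    if [ -d "$FILE" ]; then
        # only count directories stored on the file system being checked
        FS=$(cd "$FILE" 2>/dev/null && df -k . | tail -1 | awk '{print $NF}')
        if [ "$FS" = "$TOP" ]; then
            du -sk "$FILE" >> "$TMP"
        fi
    elif [ -e "$FILE" ]; then
        # plain files always count
        du -sk "$FILE" >> "$TMP"
    fi
done
# show only the ten largest entries
sort -n "$TMP" | tail -10
# clean up
rm "$TMP"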
The output from this version of the script run on a troublesome /var might look like this:
# ./chkSpace4 /var
131     snmp
985     cron
1313    adm
1554    cpudiag
14425   apache
23512   logs
22822   analog
225094  tmp
266919  sadm
860401  spool
Two items in this particular output are suspicious: the 225 MB used up by /var/tmp and the /var/logs entry (not /var/log), which turned out to be a compressed access log file. It undoubtedly resulted from someone running a "mv access_log.gz /var/logs" command when they meant to use /var/log.
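When a directory such as /var/tmp looks too large, a natural follow-up is to list the individual files responsible. The sketch below is one way to do that with find. The function name big_files and the 10 MB default threshold are assumptions made for this example. It relies on the standard behavior of find's -size test, which counts 512-byte blocks, and on -xdev to stay within one file system.

```shell
#!/bin/bash
# Sketch only: big_files is a hypothetical helper name.

# big_files DIR [KB]: list regular files under DIR that are larger than
# KB kilobytes (default 10240, i.e. 10 MB), without crossing into other
# mounted file systems.
big_files() {
    # find -size counts 512-byte blocks, so KB kilobytes is KB * 2 blocks
    find "$1" -xdev -type f -size +$(( ${2:-10240} * 2 )) -print
}
```

For instance, "big_files /var/tmp 51200" would list every file in /var/tmp larger than 50 MB.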
Tracking down disk space problems can be very tedious work. I've found that scripts like those described in this column save me a lot of time.