July 08, 2009, 6:54 PM
This kind of problem can occur because a process can create a file and hold an open file descriptor to it, which prevents the file's inode from being released and the associated disk space from being freed, even after the file has been removed. If you can't kill the process, you might be stuck until you can, even if that means putting up with scrolling errors that make it nearly impossible to type commands on the console.
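The effect is easy to reproduce in a scratch directory. The following is a minimal sketch, not a recipe for production systems: the file name held.log and descriptor number 3 are arbitrary choices for illustration, and it assumes a system with mktemp. The shell itself plays the role of the process holding the file open.

```shell
#!/bin/sh
# Sketch: an unlinked file stays alive while a descriptor is open.
# held.log and fd 3 are illustrative choices, not from the article.
demo_dir=$(mktemp -d)
demo_file="$demo_dir/held.log"
printf 'still here\n' > "$demo_file"

# Open the file for reading on descriptor 3, then remove its name.
exec 3< "$demo_file"
rm "$demo_file"

# The directory entry is gone ...
if [ -e "$demo_file" ]; then name_state=present; else name_state=gone; fi

# ... but the data is still readable through the open descriptor,
# so the inode and its disk blocks have not been released.
held_data=$(cat <&3)

# Closing the descriptor is what finally lets the space be freed.
exec 3<&-
rmdir "$demo_dir"

echo "name: $name_state, data via fd 3: $held_data"
```

Until the descriptor is closed, the removed file's blocks still count as used space.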
A better way to proceed when you notice a file is hogging disk space is to empty it in place by redirecting /dev/null into it (cat /dev/null > file). This will generally free the space immediately without requiring the process to be killed and restarted.
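A small sketch of this, run against a scratch file rather than a real log: the shell holds demo.log open for appending on descriptor 4 (both names are illustrative), the file is emptied with cat /dev/null, and a later write through the same descriptor still lands in the file. It assumes the writer opened the file in append mode. A writer that keeps its own file offset may behave differently after truncation.

```shell
#!/bin/sh
# Sketch: empty a file in place while a writer still holds it open.
# demo.log and fd 4 are illustrative choices, not from the article.
work_dir=$(mktemp -d)
log_file="$work_dir/demo.log"
printf 'old entry 1\nold entry 2\n' > "$log_file"

# Stand-in for a daemon: keep the file open for appending.
exec 4>> "$log_file"

# $(( ... )) strips any padding that wc adds on some systems.
size_before=$(( $(wc -c < "$log_file") ))

# Truncate in place; the inode and the open descriptor are untouched.
cat /dev/null > "$log_file"
size_after=$(( $(wc -c < "$log_file") ))

# The writer carries on without a restart.
echo 'new entry' >&4
contents_after=$(cat "$log_file")

exec 4>&-
rm -r "$work_dir"

echo "before: $size_before bytes, after: $size_after bytes"
echo "file now contains: $contents_after"
```

Because the file is never unlinked, the writer keeps logging to the same name and nothing has to be restarted.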
The syslog file, for example, is held open by the syslog daemon, syslogd. If this file gets so large that the /var partition on your system is choking, you won't have much luck releasing the space it occupies with an rm command, but you can always cat /dev/null to the file to regain the space. Other files that might benefit from the same procedure include auth_log, messages, sulog and wtmpx.
If prior syslog files (i.e., syslog.0 through syslog.7) exist and are correspondingly large, you should clearly start with these files. Since they are historical "rolled over" files, syslogd is no longer using them and they can be removed without side effects.
# lsof syslog
COMMAND PID USER FD TYPE DEVICE SIZE/OFF  NODE NAME
syslogd 137 root 8w VREG  85,44        0 18059 syslog
Here we have an example of emptying a wtmpx file that has been ignored and has grown far too big over the span of several years. Notice that nearly 300 MB is recovered. That's a lot in a file system that's only 1 GB in size.
# lsof wtmpx
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
utmpd   235 root 3r VREG  85,44     1488 7760 wtmpx
# df -k .
Filesystem       kbytes   used  avail capacity Mounted on
/dev/md/dsk/d44 1021735 607960 352471    64%   /var
# cat /dev/null > wtmpx
# df -k .
Filesystem       kbytes   used  avail capacity Mounted on
/dev/md/dsk/d44 1021735 323688 636743    34%   /var
Now let's do it the wrong way on a similar system. This time we remove the file and no change in the available (or used) space is evident.
# lsof wtmpx
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
utmpd   230 root 3r VREG  85,44   493644 7760 wtmpx
# df -k .
Filesystem       kbytes   used  avail capacity Mounted on
/dev/md/dsk/d44 1021735 311676 648755    33%   /var
# rm wtmpx
# df -k .
Filesystem       kbytes   used  avail capacity Mounted on
/dev/md/dsk/d44 1021735 311676 648755    33%   /var
If a file isn't being held open by a process, the lsof command will not generate any output, as the first two commands in the example below show.
# lsof /var/log/syslog.?
# lsof /var/adm/lastlog
# ls /var/adm/mess*
/var/adm/messages    /var/adm/messages.1  /var/adm/messages.3
/var/adm/messages.0  /var/adm/messages.2
# lsof /var/adm/mess*
COMMAND PID USER FD TYPE DEVICE SIZE/OFF  NODE NAME
syslogd 241 root 9w VREG   32,0        0 14659 /var/adm/messages
Of all the files above, only the current messages file is open. The others can be removed with impunity. The current messages file is better emptied by redirecting /dev/null into it.
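The decision described above can be wrapped in a small helper. This is only a sketch under stated assumptions: reclaim_space is a hypothetical function name, it relies on lsof exiting with status 0 when it finds the named file open, and it falls back to truncating when lsof is not installed, since emptying a file in place reclaims the space whether or not anything holds it open. The demo runs against a scratch file, not a real log.

```shell
#!/bin/sh
# reclaim_space FILE: hypothetical helper, not from the article.
# Empties FILE in place if a process holds it open, otherwise removes it.
reclaim_space() {
    target=$1
    if ! command -v lsof >/dev/null 2>&1; then
        # Cannot tell whether the file is open; truncating is the safe choice.
        cat /dev/null > "$target"
        echo "emptied $target (lsof not available)"
    elif lsof "$target" >/dev/null 2>&1; then
        # Some process holds the file open: rm would not free the space.
        cat /dev/null > "$target"
        echo "emptied $target (held open)"
    else
        # Nothing holds it open, so removing it releases the space.
        rm -f "$target"
        echo "removed $target"
    fi
}

# Demo on a scratch file that this shell holds open on descriptor 5.
scratch_dir=$(mktemp -d)
scratch_file="$scratch_dir/messages"
printf 'some log data\n' > "$scratch_file"
exec 5>> "$scratch_file"

reclaim_space "$scratch_file"

# Record whether any data is left before cleaning up.
if [ -s "$scratch_file" ]; then still_has_data=yes; else still_has_data=no; fi

exec 5>&-
rm -rf "$scratch_dir"
```

On a real system this would be called as, for example, reclaim_space /var/adm/messages.3 for a rolled-over file or reclaim_space /var/adm/messages for the live one.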