Missing disk space Linux/Unix: when df disagrees with du -s

A common situation many admins find themselves in is having to clear down disk space in a hurry.  For instance, say /u01 is filling up.  The Oracle admin knows that the database will simply stop if he doesn’t act quickly.  With judicious use of du -s he finds some large directories and deletes a few temporary files he knows the database doesn’t immediately need.  He runs ‘df -h’ and finds that it hasn’t made any difference!  He then runs ‘du -s’ again, and it shows the space has been freed.  He doesn’t know it, but he has deleted at least one file that a process still has open, and its space won’t be released until that process closes the file or exits.  What he should have done is this:

echo "" > offendingfile

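where offendingfile is the huge file.  This truncates the file in place, so its blocks are released straight away even though a process still has it open.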
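To see the effect without touching a real file system, here is a minimal sketch.  The scratch directory and the use of descriptor 3 to stand in for the process holding the file open are assumptions made for the demonstration, not part of the scenario above.

```shell
#!/bin/sh
# Sketch only: the scratch directory stands in for a real, nearly full file system.
demo_dir=$(mktemp -d)
big="$demo_dir/offendingfile"

# Create a 10 MiB file and hold it open on descriptor 3, as a busy process would.
dd if=/dev/zero of="$big" bs=1024 count=10240 2>/dev/null
exec 3<"$big"

size_before=$(wc -c < "$big")

# Truncate in place: the name and the open descriptor survive, the data is gone.
: > "$big"

size_after=$(wc -c < "$big")
echo "before: $size_before bytes, after: $size_after bytes"

exec 3<&-          # close the descriptor
rm -rf "$demo_dir"
```

Note that ‘: > file’ leaves a zero-length file, whereas ‘echo "" > file’ leaves a single newline; either way the bulk of the space is released.  One caveat: if the writing process did not open the file in append mode, it will carry on writing at its old offset, so the file may look large again in ls -l while du stays small (a sparse file).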

In the case of the Oracle admin, it’s likely his only choice is to restart the database.  Consider a more general case where a Linux/Unix admin has deleted files but has lost track of where the files were and what might be using them.  Or one admin deleted the files and scarpered, leaving the other to clean up the mess.  He is left with the bigger challenge of finding which process is holding which files open.

A starting point: lsof

The lsof command can be a good starting point.  However, you are now looking for a needle in a smaller haystack, so you will have to do some further filtering.  On CentOS 6 it marks files which have been deleted, although it seems to throw up quite a few false positives.

To illustrate the problem of open files, I have written some C code which creates a big file and sleeps for 1,000 seconds.  Compiling and running the binary gives me a 10 MB file:

/var/tmp/SampleBigFile
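The C source is not reproduced here.  For anyone who wants to recreate the scenario without a compiler, a rough shell stand-in might look like the following.  The function name create_and_hold and its two arguments are made up for illustration, and the file it writes is exactly 10 MiB, which may differ slightly from what the C program produces.

```shell
#!/bin/sh
# Stand-in for the C test program, whose source is not shown here.
# create_and_hold PATH SECONDS: create a 10 MiB file, keep it open
# read/write on descriptor 3 for SECONDS, then close it.
create_and_hold() {
    target=$1
    seconds=$2
    dd if=/dev/zero of="$target" bs=1024 count=10240 2>/dev/null
    exec 3<>"$target"   # shows up as FD "3u" in lsof: descriptor 3, read/write
    sleep "$seconds"
    exec 3>&-           # once closed, a deleted file's space is finally freed
}

# Example, using the path and duration from the text:
# create_and_hold /var/tmp/SampleBigFile 1000
```

Run this way, lsof will report the command as the shell rather than the name of a compiled binary.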

If I then remove the file, I have created the situation described above.  On CentOS 6 I could run:

lsof | fgrep '(deleted)'

but that produces 24 results (among which are files that haven’t been deleted, like /usr/bin/gnome-screensaver), so it would be a good idea to narrow the search.  For instance, it’s likely in this situation that just one file system is full, so you could grep for its mount point.  That does it nicely in our example:

[root@centos6 ~]# lsof | fgrep '(deleted)' | fgrep /var
createope 11012 admin 3u REG 253,3 10485761 693 /var/tmp/SampleBigFile (deleted)
[root@centos6 ~]#
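On Linux the same information can be read straight from /proc, which is handy if lsof is not installed.  The following is a sketch under a few assumptions: the scratch directory is hypothetical, the shell itself plays the process holding the file open, and the final truncation is only appropriate when the file’s contents are disposable.

```shell
#!/bin/sh
# Linux-only sketch: relies on /proc/PID/fd, which macOS does not provide.
demo_dir=$(mktemp -d)                  # hypothetical scratch directory
big="$demo_dir/SampleBigFile"
dd if=/dev/zero of="$big" bs=1024 count=10240 2>/dev/null

exec 3<>"$big"     # this shell plays the process that keeps the file open
rm "$big"          # the name is gone, the blocks are still allocated

# Entries in /proc/PID/fd are symlinks to the open files; the kernel appends
# " (deleted)" to the target once the file has no name left.
deleted=$(ls -l /proc/$$/fd | grep '(deleted)')
echo "$deleted"

held_before=$(wc -c < /proc/$$/fd/3)

# Truncate the unlinked file through its descriptor entry. Only sensible
# when the contents are disposable, for example a runaway log or temp file.
: > /proc/$$/fd/3

held_after=$(wc -c < /proc/$$/fd/3)
echo "held before: $held_before bytes, after: $held_after bytes"

exec 3>&-
rm -rf "$demo_dir"
```

To search every process rather than one, the same listing can be run as root over /proc/[0-9]*/fd.  Where lsof is available, its +L1 option, which selects open files with a link count below one, is another way to narrow the list.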

On macOS (Darwin) there is no ‘(deleted)’ label, so go straight to checking for /var:

vger:~ root# lsof | egrep 'REG.*/var/tmp'
mysqld    346 _mysql 4u  REG 14,18        0 6217706 /private/var/tmp/ibu4Nw9X
mysqld    346 _mysql 5u  REG 14,18        0 6217707 /private/var/tmp/ib6jCfyT
mysqld    346 _mysql 6u  REG 14,18        0 6217708 /private/var/tmp/ibu9Zqxb
mysqld    346 _mysql 7u  REG 14,18        0 6217709 /private/var/tmp/iboukiVq
mysqld    346 _mysql 11u REG 14,18        0 6217710 /private/var/tmp/ibLRW39J
createope 42775 admin 3u REG 14,18 10485761 6308941 /private/var/tmp/SampleBigFile
vger:~ root#

(REG indicates a regular file.)  While our big file is clearly identifiable here, if it weren’t you could pipe the output through something like sort -k7,7n to sort numerically on file size, which is the seventh column.
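As a quick check of the sorting idea, the sketch below feeds three of the lines from the listing above through sort.  It restricts the key to the seventh field and compares numerically, since a plain text comparison would not order sizes correctly; the r flag puts the largest file first.

```shell
#!/bin/sh
# Lines copied from the lsof listing above.
# Columns: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
sample='mysqld    346 _mysql 4u  REG 14,18        0 6217706 /private/var/tmp/ibu4Nw9X
mysqld    346 _mysql 11u REG 14,18        0 6217710 /private/var/tmp/ibLRW39J
createope 42775 admin 3u REG 14,18 10485761 6308941 /private/var/tmp/SampleBigFile'

# -k7,7 limits the key to the size column; n sorts numerically, r reverses.
largest=$(printf '%s\n' "$sample" | sort -k7,7nr | head -n 1)
echo "$largest"
```

On a live system the same filter would go on the end of the lsof pipeline, for example lsof | egrep 'REG.*/var/tmp' | sort -k7,7nr | head.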
