Using find to locate files

Unix Insider –

The <font face="Courier">find</font> command is one of the most powerful tools available to a Unix system administrator, but its command syntax is awkward and often poorly explained. A typically cryptic description of <font face="Courier">find</font> goes something like this:

<font face="Courier">find path-list expression</font>

After chasing through the man pages you find that an expression can be either a criterion for selecting a file or an action to perform on a file.

It is possible to simplify the <font face="Courier">find</font> syntax. The <font face="Courier">find</font> command searches from a starting directory down through its subdirectories, locating files that match your search criteria. It then executes a command on each file found.

Though the man pages are technically correct in stating that the <font face="Courier">find</font> command has only three parts, it is useful to think of it as having four:

1      2                3                  4
find   starting where   find which files   do what

In more detail, the parts are:

  1. The <font face="Courier">find</font> command itself (the word "<font face="Courier">find</font>"), which is needed to start the program.
  2. The directory from which to start searching. This can be more than one directory, but you will usually use only one starting directory.
  3. Which files to find. The search criteria can be specified by file name, size, type, and many other categories which I will discuss in a moment.
  4. The last part of the command contains what to do with the file when found. There is almost no limit to what you can do to a file. This portion of the command may include any Unix command. When used as a file locator, this part of the command usually specifies that the file path, name, and other information are to be printed on the screen or to a file.
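
Item 2 notes that more than one starting directory may be given. Here is a minimal sketch of that form, assuming two throwaway directories created with mktemp. The directory and file names are invented for illustration.

```shell
#!/bin/sh
# Hypothetical scratch directories, created only for this demonstration.
dir_a=$(mktemp -d)
dir_b=$(mktemp -d)
touch "$dir_a/minutes.txt" "$dir_b/minutes.txt" "$dir_b/other.txt"

# Part 1: find | part 2: two starting directories
# Part 3: -name minutes.txt | part 4: -print
found=$(find "$dir_a" "$dir_b" -name minutes.txt -print)
echo "$found"

# Clean up the scratch directories.
rm -rf "$dir_a" "$dir_b"
```

Each starting directory is searched in turn, and the same criteria and action apply to both.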

Locating files

The following is an example of a <font face="Courier">find</font> command that will locate all files named "minutes.txt." The search starts at your own home directory and works its way down through subdirectories. On each 'hit' it prints the name of the file found.

<font face="Courier">find $HOME -name minutes.txt -print</font>

In this example part 1 of the command is the <font face="Courier">find</font> command itself. Part 2 is the starting directory, $HOME. Part 3 lists the files to find, and this part of the command is "-name minutes.txt." The "do what" portion of the command is "-print."

Here are the three most common starting directories.

<font face="Courier">$HOME	the user's home directory
.   	the current directory
/    	the root directory (searches the whole system)
</font>

Using <font face="Courier">find</font> to locate files by name is probably the most common use for the command, so a few quick points on the -name option are in order. This option is followed by the name you want to locate. The file name may contain wildcards, as in *.txt to search for all files ending in .txt. When a wildcard is included in a file name, the shell attempts to expand it before giving the arguments to <font face="Courier">find</font>. You want <font face="Courier">find</font> to receive the argument exactly as "*.txt" and not as an already expanded list of file names such as notes.txt, mydoc.txt, and so on.

In order to protect the asterisk from being expanded by the shell, it is necessary to use a backslash to escape the asterisk as in:

<font face="Courier">find $HOME -name \*.txt -print</font>

The backslash in front of the asterisk prevents *.txt from being expanded by the shell. Find receives the argument as "-name *.txt", which is what you wanted in the first place. The same applies to the "?" wildcard. The following will locate all files with a one-character extension and will print their names.

<font face="Courier">find $HOME -name \*.\? -print</font>

Note the escapes in front of both the asterisk and the question mark. This simple syntax allows you to create a file locating utility that can be used to track down lost files.
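
Single quotes protect wildcards from the shell just as the backslash does, and some people find them easier to read. The sketch below compares the two forms in a scratch directory. The directory and file names are made up for the example.

```shell
#!/bin/sh
# Hypothetical scratch directory and file names, for illustration only.
dir=$(mktemp -d)
touch "$dir/notes.txt" "$dir/mydoc.txt" "$dir/prog.c"

# Backslash form, as used in the article.
with_backslash=$(find "$dir" -name \*.txt -print | sort)

# Single quotes also stop the shell from expanding the asterisk.
with_quotes=$(find "$dir" -name '*.txt' -print | sort)

echo "$with_quotes"

# Clean up the scratch directory.
rm -rf "$dir"
```

Both commands hand the literal string *.txt to find, so they should list the same two files.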

Here I use a shell script named <font face="Courier">findfile</font>, which can be used as a quick way of entering a file search (see Listing 1). Use vi to create the file and then make it executable by entering:

<font face="Courier">chmod a+x findfile </font>

In fact most of this shell script is the error checking logic and usage information.

Listing 1 -- <font face="Courier">findfile</font>

<font face="Courier">#!/bin/sh
# -----------------------------------------------------
# findfile file locator
#
# syntax:
#    findfile filespec
# -----------------------------------------------------
# The number of command line arguments must be exactly 1. Otherwise
# display a usage message. If the arguments are OK run the find
# starting at the top of the directory tree /

if  [ $# -ne 1  ]
  then echo "syntax:"
    echo "   findfile file-specification"
    echo
    echo " The file-specification must be either"
    echo " a file name or a file spec containing"
    echo " wildcards. If wildcards are included,"
    echo " they must be preceded with a \ (backslash)."
    echo
    echo " examples:"
    echo
    echo "      findfile mfile.txt"
    echo
    echo "      findfile \*.c"
else
    # Quote $1 so the shell does not expand a wildcard a second time
    find / -name "$1" -print
fi</font>

The program tests that the user has entered one file specification with the test command:

<font face="Courier">if  [ $# -ne 1  ]</font>

Note that the opening bracket ([) is followed by a space, and the closing bracket is preceded by a space. The $# is a shell variable that contains the number of arguments on the command line used to start the shell script. The -ne condition tests for "not equal." If the number of arguments on the command line does not equal 1 then a syntax/usage type message is displayed, otherwise a <font face="Courier">find</font> request is launched using the root directory "/" as the starting point.
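
The argument test can be tried on its own. In the sketch below, check_args is a hypothetical helper and not part of findfile. Inside a shell function, $# counts the function's arguments in the same way it counts a script's command-line arguments.

```shell
#!/bin/sh
# check_args is a hypothetical helper, not part of findfile.
check_args() {
    if [ $# -ne 1 ]
    then
        echo "usage"
        return 1
    fi
    # Quoting $1 keeps a wildcard such as *.c from being expanded here.
    echo "ok: $1"
}

check_args '*.c'          # one argument: passes the test
check_args a b || true    # two arguments: prints the usage line
```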

Access denied

Security considerations prevent <font face="Courier">find</font> from having access to all directories. Unless you are root, <font face="Courier">find</font> will probably display error messages indicating that it could not gain access to various directories. Once you have seen what else <font face="Courier">find</font> can do, you will understand why users should not be allowed to run wild with it. Here is an example of what your screen might look like during a <font face="Courier">find</font> search (executed from a Bourne shell):

<font face="Courier">congo$ findfile \*.txt
find: cannot open /etc/auth
find: cannot chdir to /etc/ps
find: cannot chdir to /rcd0/rc
/usr/tom/minutes.txt
/usr/tom/logs.txt
/usr/sally/expense.txt
find: cannot open /usr/sam
find: cannot open /usr/theboss
 </font>

This display is a combination of two types of messages. The "<font face="Courier">find:</font> cannot open" and "<font face="Courier">find:</font> cannot chdir" messages are error messages caused by an inability to access a directory or a file.

The filename messages such as "/usr/tom/minutes.txt" are the output of the "find -print" option.

Error messages can clutter up a display. A user other than a system administrator often cannot get into every directory, which produces many instances of the "cannot open" error message. The easiest solution is to suppress the "cannot open" and "cannot chdir" messages by redirecting errors away from the screen. Errors can be sent to /dev/null by changing the <font face="Courier">find</font> line in <font face="Courier">findfile</font> (see Listing 1) to read:

<font face="Courier">find / -name "$1" -print 2>/dev/null</font>

The 2>/dev/null will cause messages that are supposed to be printed on stderr (error messages) to be sent to the null device. The null device is a long way of saying "nowhere." Those messages simply disappear. Listing 2 incorporates this change in the shell script.
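
To see that only the error stream is discarded, here is a small sketch. The function noisy is a hypothetical stand-in for a find run that prints one file name and one error message.

```shell
#!/bin/sh
# noisy is a hypothetical stand-in for a find run that reports errors.
noisy() {
    echo "/usr/tom/minutes.txt"              # normal output (stdout)
    echo "find: cannot open /usr/sam" >&2    # error message (stderr)
}

# Only the stderr line is sent to /dev/null; the file name still gets through.
kept=$(noisy 2>/dev/null)
echo "$kept"
```

The file name survives because -print writes to stdout, which the 2> redirection does not touch.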

Listing 2 -- <font face="Courier">findfile</font> revisited

<font face="Courier">#!/bin/sh
# ----------------------------------------------------------
# findfile file locator
#
# syntax:
#    findfile filespec
# ----------------------------------------------------------
# The number of command line arguments must be exactly 1. Otherwise
# display a usage message. If arguments are OK run the find starting
# at the top of the directory tree /
if  [ $# -ne 1  ]
  then echo "syntax:"
    echo "   findfile file-specification"
    echo
    echo " The file-specification must be either"
    echo " a file name or a file spec containing"
    echo " wildcards. If wildcards are included,"
    echo " they must be preceded with a \ (backslash)."
    echo
    echo " examples:"
    echo
    echo "      findfile mfile.txt"
    echo
    echo "      findfile \*.c"
else
    # Quote $1 so the shell does not expand a wildcard a second time
    find / -name "$1" -print 2>/dev/null
fi</font>

Multiple options

Now we will expand on <font face="Courier">find</font> further. What else can you do to a file after you have found it? The sky is the limit if you have the access privileges. You have seen -print as one option. The other powerful option is -exec.

The syntax for the exec option is:

<font face="Courier">     -exec a_command \;</font>

Note that the backslash and the semicolon must be included as shown. Usually you want to execute a command on the file just found. The <font face="Courier">find</font> program uses a left and right curly brace ({}) to represent the name of the file just found as in:

<font face="Courier">     -exec a_command {} \;</font>

Looking at a practical example makes this clearer. Going back to Listing 2, let's assume that instead of just printing the name of any file found, we want a full ls -l style listing for that file. The version of a <font face="Courier">find</font> command that would do this is:

<font face="Courier">     find / -name "$1" -exec ls -l {} \;</font>

In English this would read: start searching from the top of the directory tree for any file named as given in the passed argument. When a file is found, execute an ls -l filename command on that file. Listing 3 repeats Listing 1 using the new syntax and the /dev/null trick.

Listing 3 --

<font face="Courier">findfile redux</font>

<font face="Courier">#!/bin/sh
# ----------------------------------------------------------
# findfile file locator
#
# syntax:
#    findfile filespec
# ----------------------------------------------------------
# The number of command line arguments must be exactly 1.  Otherwise
# display a usage message. If arguments are OK run the find starting
# at the top of the directory tree /

if  [ $# -ne 1  ]
  then echo "syntax:"
    echo "   findfile file-specification"
    echo
    echo " The file-specification must be either"
    echo " a file name or a file spec containing"
    echo " wildcards. If wildcards are included,"
    echo " they must be preceded with a \ (backslash)."
    echo
    echo " examples:"
    echo
    echo "      findfile mfile.txt"
    echo
    echo "      findfile \*.c"
else
    # Quote $1 so the shell does not expand a wildcard a second time
    find / -name "$1" -exec ls -l {} \; 2>/dev/null
fi
</font>

Here is what your screen might look like when searching for \*.txt, using <font face="Courier">findfile</font> as in Listing 3:

<font face="Courier">congo$ findfile \*.txt

-rw-r--r-- 1   tom     group  1544  Jun 12 1997 /usr/tom/minutes.txt
-rw-r--r-- 1   tom     group  1087  Jan  1 1997 /usr/tom/logs.txt
-rw-rw-rw- 1   sally   group  1226  Jan  6 1997 /usr/sally/expense.txt

</font>

The syntax for an exec is awkward, but easy to follow once you have the hang of it. It is the -exec flag followed by a Unix command containing {} if the name of the file is used in the command, followed by \; to close the -exec command.
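
With \; the command runs once for every file found. The POSIX specification for find also defines a + terminator that collects file names and passes them to the command in batches. The sketch below assumes your find supports that form, which some older systems may not, and uses invented file names in a scratch directory.

```shell
#!/bin/sh
# Hypothetical scratch directory and file names, for illustration only.
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.txt"

# \; form: echo runs once per file, so two lines come back.
per_file=$(find "$dir" -name '*.txt' -exec echo found {} \; | grep -c '^found')

# + form (assumed supported): echo runs once with both names appended.
batched=$(find "$dir" -name '*.txt' -exec echo found {} + | grep -c '^found')

echo "per file: $per_file, batched: $batched"

# Clean up the scratch directory.
rm -rf "$dir"
```

For a command like ls -l the output is the same either way. The batched form simply starts fewer processes.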

User, size, atime: finding more than just names

We have only looked at -name as the method of matching searched files, but <font face="Courier">find</font> provides a set of other search flags that go beyond -name.

You are working peacefully at your desk one day when the intercom buzzes. N.E. Programmer, who was the lead programmer on the new Widget account, is no longer with the company. Your job: clean up any loose ends he may have left behind. After cleaning up his home directory and files, the next step is to search the system for other files that he owns. To do this use the -user flag to search for files owned by his logon name, nep.

<font face="Courier">find / -user nep -exec ls -l {} \; >nepfiles.txt</font>

In English: search from the root directory for any files owned by nep and execute an ls -l on the file when any are found. Capture all output in nepfiles.txt.

This follows the four-part command structure of <font face="Courier">find</font> that was discussed at the beginning of this article. The <font face="Courier">find</font> command itself is the first part, the root directory is the point at which to begin the search, the files to search for are "-user nep" and finally, the "do what" part says that when any matching files are found, execute an ls -l on the file.

Another useful search option is finding files by size. The -size flag allows you to do this. It is very useful for locating large files on your system. The default unit for -size is the block (traditionally 512 bytes), so by default you search by number of blocks. I find it easier to think in bytes, and there are options to allow searches using bytes. First we'll look at the defaults.

<font face="Courier">find . -size 4 -print</font>

This command will print the names of all files that are four blocks long, using the current directory as a starting point. You may add a + or - (minus) in front of the number, to specify greater than or less than. The following finds files larger than 20 blocks.

<font face="Courier">find . -size +20 -print</font>

If you add a "c" after the number, the number is interpreted as characters (bytes) instead of blocks. The following command will find all files larger than one million bytes.

<font face="Courier">find / -size +1000000c -print</font>
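
The byte form is easy to try on files of known size. This sketch builds one 2000-byte file and one 10-byte file in a scratch directory, with names invented for the example. The -type f test, which the article does not cover, is added as an assumption so that the directory itself, which also has a size, is not reported.

```shell
#!/bin/sh
# Hypothetical scratch directory and file names, for illustration only.
dir=$(mktemp -d)

# Build one 2000-byte file and one 10-byte file.
dd if=/dev/zero of="$dir/big.dat" bs=1000 count=2 2>/dev/null
printf '0123456789' > "$dir/small.dat"

# Regular files larger than 1000 bytes: only big.dat qualifies.
large=$(find "$dir" -type f -size +1000c -print)
echo "$large"

# Clean up the scratch directory.
rm -rf "$dir"
```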

This alone can be used to create a useful utility that will search your system for large files, but if we look at another <font face="Courier">find</font> option first we can create something more flexible.

The last access time can also be tested using the -atime switch.

<font face="Courier">find / -atime 2 -print</font>

This command finds files last accessed two days ago (the day before yesterday). A + or - (minus) can also be used here for greater than and less than.

<font face="Courier">find / -atime +30 -print</font>

This prints files that have not been accessed in the last 30 days.
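
To experiment with -atime without waiting a month, you can set a file's access time by hand. The sketch below assumes a touch that accepts the POSIX -a and -t options, and it uses invented file names in a scratch directory.

```shell
#!/bin/sh
# Hypothetical scratch directory and file names, for illustration only.
dir=$(mktemp -d)
touch "$dir/fresh.txt" "$dir/stale.txt"

# Set only the access time of stale.txt to 1 January 2000, 00:00.
touch -a -t 200001010000 "$dir/stale.txt"

# Regular files last accessed more than 30 days ago: only stale.txt qualifies.
old=$(find "$dir" -type f -atime +30 -print)
echo "$old"

# Clean up the scratch directory.
rm -rf "$dir"
```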

The <font face="Courier">find</font> search criteria can be combined. The following command will locate and list all files that were last accessed more than 100 days ago, and whose size exceeds 500,000 bytes.

<font face="Courier">find / -atime +100 -size +500000c -print</font>

Again the four-part syntax of <font face="Courier">find</font> holds here, but the search criteria in part 3 have become the combination "-atime +100 (and) -size +500000c."

By combining these two <font face="Courier">find</font> command options, you can track down large files that are not used: the files that uselessly chew up disk space.
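
Criteria written side by side are joined with an implied "and." For "or," find provides the -o operator, and escaped parentheses group the alternatives. A minimal sketch, with made-up file names in a scratch directory:

```shell
#!/bin/sh
# Hypothetical scratch directory and file names, for illustration only.
dir=$(mktemp -d)
touch "$dir/a.log" "$dir/b.tmp" "$dir/c.txt"

# (-name '*.log' OR -name '*.tmp') AND -type f
# The parentheses are escaped so the shell passes them through to find.
matches=$(find "$dir" -type f \( -name '*.log' -o -name '*.tmp' \) -print | sort)
echo "$matches"

# Clean up the scratch directory.
rm -rf "$dir"
```

Without the parentheses, the implied "and" binds more tightly than -o, so the grouping matters once further criteria or an action follow.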

The <font face="Courier">findfat</font> shell script listed below will accept age and bytes on the command line. The error handling for missing command arguments is more useful. If the arguments are missing, the script asks the user to enter the values. The disk is searched for these large old files, and a detailed directory entry is displayed for any found.

Listing 4 -- <font face="Courier">findfat</font> locates bloated files

<font face="Courier">#!/bin/sh
# ----------------------------------------------------------
# findfat file locator
#
# syntax:
#    findfat age bytes
# ----------------------------------------------------------
# if the number of arguments is not 2, then ask
# the user to enter the parameters.
# The parameters are number of days to use to consider a file
# old, and number of bytes to use to consider a file fat.
if  [ $# -ne 2  ]
      then
        echo "How many days make a file old?"
        read age
        echo "How many bytes make a file fat?"
        read bytes
else
        age=$1
        bytes=$2
fi

echo "Locating files older than $age days and larger than $bytes bytes"
find / -atime "+${age}" -size "+${bytes}c" -exec ls -l {} \; 2>/dev/null

</font>
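
The findfat script passes whatever is typed straight to find. One possible safeguard, which is not part of the original script, is to check that each value is a whole number first. The helper name is_number is hypothetical.

```shell
#!/bin/sh
# is_number is a hypothetical helper, not part of the original findfat.
# It succeeds only when its argument consists entirely of digits.
is_number() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *)           return 0 ;;
    esac
}

# Sample values standing in for what a user might enter.
for value in 100 500000 10x
do
    if is_number "$value"
    then
        echo "$value: ok"
    else
        echo "$value: not a whole number"
    fi
done
```

In findfat, such a check could run on age and bytes before the find line, with the prompt repeated when the check fails.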

I hope that I have provided you with some useful utilities and enough information to illustrate some of the basics of <font face="Courier">find</font>. Using the four-part command approach to understanding <font face="Courier">find</font> should also make it easier for you to read the <font face="Courier">find</font> man page entry and understand what it is doing. Happy hunting!
