Unix Tip: Reading and Writing with File Descriptors

By Sandra Henry-Stocker, ITworld.com

Have you ever wanted to read a file one line at a time in a shell script and found the task to be a lot more trouble than you imagined? If you use a "for line in `cat file`" loop, for example, each pass through the loop sets your $line variable not to the next line in the file but to the next whitespace-separated word. One handy way around this problem is to assign a file descriptor to your input file and then use that file descriptor to read the file line by line. Here's how it works.
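
To see the difference concretely, here is a minimal sketch. It assumes a throwaway two-line file created with mktemp (the file and its contents are illustrative, not part of the original tip) and simply counts how many passes each kind of loop makes.

```shell
#!/bin/sh
# Illustrative sample file; its name and contents are assumptions
sample=$(mktemp)
printf 'one two three\nfour five\n' > "$sample"

# for loop over the output of cat: the shell splits on whitespace,
# so the loop body runs once per word
words=0
for item in $(cat "$sample"); do
    words=$((words + 1))
done

# read through file descriptor 3: the loop body runs once per line
lines=0
exec 3< "$sample"
while IFS= read -r item <&3; do
    lines=$((lines + 1))
done
exec 3<&-

echo "for loop passes: $words"
echo "read loop passes: $lines"
rm -f "$sample"
```

With this sample file, the for loop makes five passes (one per word) while the read loop makes two (one per line).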


Reading from a File Descriptor



First, you can assign a file descriptor to an input file with this syntax:



$ exec 3< infile



where 3 is the file descriptor we have chosen to use and infile is the name of the file.
Under most circumstances, we should avoid file descriptors 0, 1 and 2, since these are assigned to standard input, standard output and standard error. In the example shown, each time we read from descriptor 3, our position in infile advances by one line.



To read from infile, we would then use a command like this:



$ read <&3 myline



The variable "$myline" will then contain the first line from the file. Repeat the read command and we get the second line, and so on. We might, therefore, use a script like the one shown below to read and process each line in a file. The comments inside the loop mark the spot where we would process each line in whatever way we want.

#!/bin/bash

if [ $# -ne 1 ]; then
    echo "Usage: $0 input-file" >&2
    exit 1
else
    infile=$1
fi

# assign file descriptor 3 to input file (quoted in case the name has spaces)
exec 3< "$infile" || exit 1

# read until the end of the file
finished=
until [ -n "$finished" ]
do
    # -r keeps backslashes literal; read returns non-zero at end of file
    read -r myline <&3
    if [ $? -ne 0 ]; then
        finished=1
        continue
    fi
    # process file data line-by-line
    # in here ...
    echo "$myline"
done

echo "The End"
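
The script above leaves descriptor 3 open until the shell exits. If a script goes on to do other work, the descriptor can be closed explicitly with "exec 3<&-". The following is a minimal sketch of that step, assuming a temporary sample file (hypothetical, created only for the illustration):

```shell
#!/bin/sh
# Illustrative sample file; its contents are assumptions
sample=$(mktemp)
printf 'first line\nsecond line\n' > "$sample"

exec 3< "$sample"
read -r myline <&3          # reads the first line of the file
first=$myline

exec 3<&-                   # close descriptor 3

# A further read fails because descriptor 3 no longer exists;
# the braces let us discard the shell's error message
if { read -r myline <&3; } 2>/dev/null; then
    status="still open"
else
    status="closed"
fi

echo "$first / $status"
rm -f "$sample"
```

Once the descriptor is closed, the same number can be reassigned to another file with a new exec command.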

This particular script is written in bash, but the exec command will work just the same in the Korn shell (/usr/bin/ksh) and Bourne shell (/usr/bin/sh). One difference between these shells with respect to user-defined file descriptors, however, is that the Korn shell (on Solaris anyway) only allows us to assign file descriptors 0 through 9 while bash and the Bourne shell work with file descriptors as high as 255.



If we issue a read command on a file descriptor without assigning a variable to hold the resultant data, we will find the data in $REPLY as shown in this interactive sequence of commands:

bash$ exec 3< infile
bash$ read <&3
bash$ echo $REPLY
One is the loneliest number that you'll ever do
bash$ read <&3
bash$ echo $REPLY
Two can be as bad as one
bash$ read <&3
bash$ echo $REPLY
It's the loneliest number since the number one

Notice how we are stepping through the lines in the file.



Writing to a File Descriptor



To assign a file descriptor to an output file, we use a syntax very similar to the exec command that we just used to read from a file:

$ exec 4> "$outfile"



In this command, file descriptor 4 is being assigned to $outfile. To write to this file, we would then use commands like these:



echo "File $file is unwritable" >&4
echo "Using /tmp/$file instead" >&4



When we examine the contents of $outfile, it will contain lines like these:

File data0506.log is unwritable
Using /tmp/data0506.log instead
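
Putting the pieces together, a self-contained sketch of writing through descriptor 4 might look like the following. The name data0506.log comes from the sample output above; the temporary output file created with mktemp is an assumption made so the example can run anywhere.

```shell
#!/bin/sh
outfile=$(mktemp)       # stand-in for the real output file (assumption)
file=data0506.log

exec 4> "$outfile"      # open descriptor 4 for writing (truncates the file)
echo "File $file is unwritable" >&4
echo "Using /tmp/$file instead" >&4
exec 4>&-               # close descriptor 4 when finished

contents=$(cat "$outfile")
echo "$contents"
rm -f "$outfile"
```

Opening the file with "exec 4>> $outfile" instead would append to any existing contents rather than truncating them.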



Listing File Descriptors



On Solaris systems, we can list the file descriptors associated with a particular process with the pfiles command. If we were to run the two exec commands shown above on the command line, for example, we would see something like this when we run pfiles on the current shell:

$ pfiles $$
26036:  -bash
  Current rlimit: 32768 file descriptors
   0: S_IFCHR mode:0620 dev:102,0 ino:394018 uid:1111 gid:7 rdev:24,1
      O_RDWR|O_LARGEFILE
   1: S_IFCHR mode:0620 dev:102,0 ino:394018 uid:1111 gid:7 rdev:24,1
      O_RDWR|O_LARGEFILE
   2: S_IFCHR mode:0620 dev:102,0 ino:394018 uid:1111 gid:7 rdev:24,1
      O_RDWR|O_LARGEFILE
   3: S_IFREG mode:0644 dev:0,2 ino:476081584 uid:0 gid:1 size:1531
      O_RDONLY|O_LARGEFILE
   4: S_IFREG mode:0644 dev:0,2 ino:479974080 uid:1111 gid:10 size:0
      O_WRONLY|O_LARGEFILE
 255: S_IFCHR mode:0620 dev:102,0 ino:394018 uid:1111 gid:7 rdev:24,1
      O_RDWR|O_LARGEFILE FD_CLOEXEC

Notice file descriptors 3 and 4 in this output.
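
The pfiles command is specific to Solaris. On Linux systems, assuming the /proc filesystem is mounted, a roughly similar view is available by listing /proc/$$/fd, where each entry is a symbolic link named after a descriptor number. The sketch below is an illustration under that assumption rather than part of the original tip, and the temporary file it opens is hypothetical.

```shell
#!/bin/sh
# Open a throwaway file on descriptor 3 so there is something to look for
sample=$(mktemp)
exec 3< "$sample"

if [ -d "/proc/$$/fd" ]; then
    # Each entry is a symlink from a descriptor number to the open file
    ls -l "/proc/$$/fd"
    if [ -e "/proc/$$/fd/3" ]; then
        found=yes
    else
        found=no
    fi
else
    found=unsupported
    echo "/proc/$$/fd is not available on this system"
fi

exec 3<&-
rm -f "$sample"
```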

2 comments

    Anonymous 1 year ago
    Wouldn't it be much simpler to use a while loop? This should be equivalent to the loop in the article:

    # read til the end of the file
    while read <&3 myline
    do
        # process file data line-by-line
        # in here ...
        echo $myline
    done
    Anonymous 3 years ago
    I searched 2 hours to find this example. It is clear to the point and helped me a lot. Thanks, Peter.
