Command line psychology 101

Unix Insider –

One of the mysteries of Unix (aside from Unix itself) is the command line, filled as it is with strange and cryptic characters. Now let's see, do I need a dot between two backslashes, or should it be a backwards quote followed by a hyphen?

One thing that will help sort out what is actually happening in a command line, and maybe even help you construct one of your own, is an understanding of how the command line is interpreted.

The command line is actually the input to the shell program. The shell program (sh, ksh, csh, or any other variant) reads the input line and untangles it before it attempts to execute the command. The sequence of steps the program goes through to untangle a command provides interesting insight into shell programming. By studying it you're sure to learn some new tricks.

We will cover each step in more detail in a moment, but first let's look at the order in which a command line is evaluated:

  1. History substitution (except for the Bourne shell)
  2. Splitting words, including special characters
  3. Updating the history list (except for the Bourne shell)
  4. Interpreting single and double quotes
  5. Alias substitution (except for the Bourne shell)
  6. Redirection of input and output (<font face="Courier">< ></font> and <font face="Courier">|</font>)
  7. Variable substitution (variables starting with <font face="Courier">$</font>)
  8. Command substitution (commands inside back quotes)
  9. File name expansion (file name wild cards)

Note that the Bourne shell skips the history and alias steps (1, 3, and 5), because it supports neither feature.

History substitution

If you have history set up in the Korn shell (ksh), C shell (csh), or any similar shell, command lines are saved in a history file before they are executed. You can review your previous commands by typing:

<font face="Courier">
$ history
</font>

Each command in the list is preceded by a number, as in:

<font face="Courier">
13 ls *.txt
14 cd $HOME
15 ls *.log
</font>

In the Korn shell you can usually rerun a command from the history list by typing <font face="Courier">r</font> followed by its number. For example, typing <font face="Courier">r 13</font> in the example above would repeat the command <font face="Courier">ls *.txt</font>.

In the C shell, use an exclamation point and no space instead of an <font face="Courier">r</font>: <font face="Courier">!13</font>.

When processing a command line, the shell first checks for these history references, looks each one up in the history file, and builds a new command line with the recovered command in its place. There is much more to history than these simple steps, but we'll save that for a separate article.

Splitting words

The next step is to split the command line into words. A word is basically a token that the shell recognizes as an element of a command; special characters such as the pipe count as words too. For example, the following command does a long listing of the current directory and searches for <font face="Courier">mjb</font> in any line of the directory information.

<font face="Courier">
ls -l|grep mjb
</font>

The words in the command are <font face="Courier">ls</font>, <font face="Courier">-l</font>, <font face="Courier">|</font>, <font face="Courier">grep</font>, and <font face="Courier">mjb</font>. A word can also be a quoted string. In the following command, a long directory listing is searched for files last modified on "Sep 07."

<font face="Courier">
ls -l|grep "Sep 07"
</font>

In this case the words are <font face="Courier">ls</font>, <font face="Courier">-l</font>, <font face="Courier">|</font>, <font face="Courier">grep</font>, and <font face="Courier">Sep 07</font>. Note that <font face="Courier">Sep 07</font> is treated as one word because it was quoted in the command.
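
One way to see how the shell has split a line is to hand the words to a small function that reports how many it received. The sketch below assumes a Bourne-compatible shell (sh, ksh, or bash); <font face="Courier">count_words</font> is a hypothetical helper written for this illustration, not a standard command.

```shell
# count_words is a hypothetical helper: it prints the number of
# words (arguments) the shell passed to it after splitting the line.
count_words() {
    echo $#
}

count_words grep Sep 07      # Sep and 07 arrive as separate words
count_words grep "Sep 07"    # the quoted string arrives as one word
```

The first call reports one more word than the second, because the space inside the quotes no longer separates words.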

Update the history list

Once the words are identified, the command is appended to the history file (assuming that you're using history).

Single and double quotes

Where a word has been surrounded by double or single quotes, the word is tagged so that variable expansion either does or does not occur within the quotes. Variables that are surrounded by single quotes will be left as is, and variables in double quotes will be expanded. To see this difference for yourself, enter the following commands with no quotes, single quotes, and double quotes.

<font face="Courier">
echo $PATH
echo '$PATH'
echo "$PATH"
</font>

The first will display the value of the <font face="Courier">$PATH</font> variable. The second will display the word <font face="Courier">$PATH</font>, and the third will again display the value of the <font face="Courier">$PATH</font> variable, as in the following examples:

<font face="Courier">
echo $PATH
/bin:/usr/bin:/my/bin:.

echo '$PATH'
$PATH

echo "$PATH"
/bin:/usr/bin:/my/bin:.
</font>

As an exercise, also try the following commands using double quotes around single quotes, and single quotes around double quotes. When in doubt about the effect of something on the command line, experiment.

<font face="Courier">echo $PATH
/bin:/usr/bin:/my/bin:.

echo '"$PATH"'
"$PATH"

echo "'$PATH'"
'/bin:/usr/bin:/my/bin:.'
</font>
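
The same experiment works with a variable of your own, which keeps the output short. This is a small sketch for a Bourne-compatible shell; the variable name <font face="Courier">fruit</font> and its value are made up for illustration.

```shell
# fruit is a made-up variable used only for this illustration.
fruit=apple

echo $fruit        # no quotes: the variable is expanded
echo '$fruit'      # single quotes: the text is left as is
echo "$fruit"      # double quotes: the variable is expanded
echo '"$fruit"'    # single quotes outermost: nothing is expanded
echo "'$fruit'"    # double quotes outermost: expanded, inner quotes kept
```

In the last two lines the outermost pair of quotes decides whether expansion happens; the inner pair is just ordinary text.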

Alias substitution

The first word of each command is checked against the alias list. In the example we have been following, <font face="Courier">ls</font> and <font face="Courier">grep</font> are checked in the alias list, and any alias substitution is performed. I haven't discussed aliases much in previous articles so I will elaborate a bit here. An alias is a method of substituting one command for another. For example, the following command (shown in C shell syntax; in the Korn shell, write <font face="Courier">alias ll='ls -l'</font>):

<font face="Courier">
alias ll 'ls -l'
</font>

creates a new command, so that if you type:

<font face="Courier">
ll *.txt
</font>

it is the equivalent of typing <font face="Courier">ls -l *.txt</font>.

Alias substitution is done at this stage of the shell processing.

Pipes and redirection

At this point the shell program looks through the words of the command for <font face="Courier">|</font>, <font face="Courier">></font>, <font face="Courier"><</font>, <font face="Courier">>></font>, and other redirection operators. When it finds one, it creates the pipe or establishes the redirection.
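
The four operators named above can be tried together in a few lines. This is a minimal sketch for a Bourne-compatible shell; it assumes the <font face="Courier">mktemp</font> utility is available to create a scratch file, and the variable names are made up for illustration.

```shell
# A scratch file for the demonstration; mktemp is assumed to be available.
scratch=`mktemp`

echo "first line"  > "$scratch"    # > creates or overwrites the file
echo "second line" >> "$scratch"   # >> appends to the file

# < feeds the file to wc as standard input; | passes wc's output to tr,
# which strips the padding some versions of wc add around the number.
line_count=`wc -l < "$scratch" | tr -d ' '`
echo "$line_count"

rm -f "$scratch"
```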

Variable expansion

Now at last the variables in the command line are expanded. Assuming the variable is not surrounded by single quotes, <font face="Courier">$PATH</font> (or any other variable in the command) is replaced with its value.

Command substitution

Command substitution involves looking for backward quotes, executing the command and arguments between the backward quotes, and then using the results of that execution as arguments within the command that is being executed.

Try this example:

<font face="Courier">
ls -l `ls -p|grep /`|more
</font>

The command <font face="Courier">ls -p</font> will produce a directory listing in which any directories are marked with a trailing slash. The sample listing shown below includes two directories, <font face="Courier">mystuff</font> and <font face="Courier">xdir</font>, each indicated by the trailing slash.

<font face="Courier">
ls -p

STARTUP
file.txt
file2.txt
mystuff/
xdir/
</font>

Adding the <font face="Courier">grep /</font> to the command line selects only those lines containing the trailing slash.

<font face="Courier">
ls -p|grep /

mystuff/
xdir/
</font>

When the whole of <font face="Courier">ls -p|grep /</font> is enclosed in back quotes, it is executed first and its results are handed to the preceding command as arguments. In the example shown,

<font face="Courier">
ls -l `ls -p|grep /`|more
</font>

is the equivalent of

<font face="Courier">
ls -l mystuff/ xdir/|more
</font>

which produces a page-by-page long listing of the contents of each subdirectory of the current directory. This is how the command substitution phase of command processing works.
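
To try this without depending on what happens to be in your current directory, the sketch below builds a scratch directory first. It assumes a Bourne-compatible shell and that <font face="Courier">mktemp -d</font> is available; <font face="Courier">list_subdirs</font> is a hypothetical helper, and the file names simply echo the sample listing above.

```shell
# Build a scratch directory holding one file and two subdirectories.
# mktemp -d is assumed to be available; the names are made up.
workdir=`mktemp -d`
mkdir "$workdir/mystuff" "$workdir/xdir"
touch "$workdir/file.txt"

# list_subdirs is a hypothetical helper: the back-quoted pipeline runs
# first, and its output becomes the arguments to echo.
list_subdirs() {
    echo `cd "$1"; ls -p | grep /`
}

subdirs=`list_subdirs "$workdir"`
echo "$subdirs"

rm -rf "$workdir"
```

Because the back-quoted commands run in their own subshell, the <font face="Courier">cd</font> inside them does not change the directory of the shell that called the function.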

Wild cards

The command processor looks for wild cards used in file names and expands them. These are the standard wild cards: <font face="Courier">*</font> and <font face="Courier">?</font>, as well as the bracket wild card,

<font face="Courier">
ls -l [abc]*
</font>

which provides a listing of any files or directories that start with a, b or c.
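
The three wild cards can be compared side by side in a scratch directory. This sketch assumes a Bourne-compatible shell (sh, ksh, or bash) and the availability of <font face="Courier">mktemp -d</font>; the file names and variable names are made up for illustration. The last part stores a pattern in a variable to show that the variable is substituted first and the resulting wild card is expanded afterward, in keeping with the order of steps 7 and 9.

```shell
# Scratch directory with made-up file names; mktemp -d is assumed available.
globdir=`mktemp -d`
touch "$globdir/apple" "$globdir/ax" "$globdir/banana" \
      "$globdir/cherry" "$globdir/date"

# Each back-quoted command changes into the directory in its own subshell.
bracket_match=`cd "$globdir"; echo [abc]*`   # names starting with a, b, or c
question_match=`cd "$globdir"; echo a?`      # a followed by exactly one character

# The single quotes keep the pattern from being expanded when it is stored.
pattern='d*'
var_match=`cd "$globdir"; echo $pattern`     # $pattern is replaced, then d* is expanded

echo "$bracket_match"
echo "$question_match"
echo "$var_match"

rm -rf "$globdir"
```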

Execution

Finally the command is executed, and this completes the steps of the shell processing.

One additional note worth mentioning: When the command processor encounters a command between back quotes at step 8, it separates out the command between the quotes and runs steps 1 through 9 on that command.

This also happens to commands separated by semicolons. Steps 1 through 9 are performed on each separate command. You can test this yourself by selecting a directory with only a few files and then issuing the command

<font face="Courier">
echo files in /chosen/directory are ; echo `cd /chosen/directory; ls *`
</font>

What will be echoed is the list of files in <font face="Courier">/chosen/directory</font>. The asterisk argument to <font face="Courier">ls</font> is obviously not expanded until after the <font face="Courier">cd</font> command. If this were not the case, the asterisk would be expanded using the list of files in the current directory instead of the target directory.

So commands within back quotes are processed as if they were separate commands with a full set of steps 1 through 9. The same applies to each of the commands within a line of commands separated by semicolons.
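
The semicolon case can be checked on its own, without back quotes. The sketch below assumes a Bourne-compatible shell and the availability of <font face="Courier">mktemp -d</font>; <font face="Courier">list_after_cd</font> is a hypothetical helper, and the file names are made up.

```shell
# list_after_cd is a hypothetical helper. The two commands sit on one line,
# separated by a semicolon; the parentheses run them in a subshell so the
# cd does not change the caller's directory. The * is expanded only when
# echo is processed, after cd has already taken effect.
list_after_cd() {
    ( cd "$1"; echo * )
}

# Scratch directory with made-up file names; mktemp -d is assumed available.
demo_dir=`mktemp -d`
touch "$demo_dir/one.txt" "$demo_dir/two.txt"

list_after_cd "$demo_dir"

rm -rf "$demo_dir"
```

If the asterisk were expanded before the <font face="Courier">cd</font> ran, the output would list the files of whatever directory you started in rather than the two files created here.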
