Unix How-To: awk is still a very handy tool

Awk does a lot more than select a column from a file or an input stream. It can select columns from selected rows. It can calculate totals, extract substrings, reverse the order of fields and perform a whole lot of other very handy manipulations. Whether you squeeze your awk commands onto the command line or prefer to build them into scripts, the language is clever and versatile and well worth using even as it joins the ranks of middle-aged computer utilities.

If you want to use awk to add up a bunch of numbers formulated as a column in a text file, you can use a one-liner like this:

$ awk '{ SUM+=$1 } END { print SUM }' < nums

That SUM+= operation adds the contents of column one to a running total for each line of input. The sum is only printed at the very end when input is exhausted. You can change $1 to the column of your choice or $NF if you want to add up the rightmost column.
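As a minimal sketch of the $NF variant, the commands below sum the rightmost column of some made-up input supplied by printf (the data and the shell variable names total and empty are illustrative, not from the original). The second command shows one way to get a 0 rather than an empty line when there is no input at all.

```shell
# Made-up sample input: three rows, two columns; $NF is the last field.
total=$(printf '1 10\n2 20\n3 30\n' | awk '{ SUM += $NF } END { print SUM }')
echo "$total"    # prints 60

# With no input, SUM is never set; adding 0 forces a numeric result.
empty=$(printf '' | awk '{ SUM += $1 } END { print SUM + 0 }')
echo "$empty"    # prints 0
```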

If you prefer scripts to the command line, you could use a script like the addcol script below to sum whatever column you choose. You would just pass the column you want to add on the command line as the COL parameter:

$ awk -f addcol COL=3 < numbers

The -f tells awk to run the designated script (addcol) and COL=3 passes "3" as the column number you want to sum.

# addcol
BEGIN { SUM=0 }
{
  print $COL
  SUM += $COL
}
END { print "Sum: " SUM }
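The sketch below runs an inline equivalent of addcol (minus the per-line print) on made-up input, then illustrates a detail worth knowing: an assignment given as a command-line operand, like COL=3 above, takes effect only after the BEGIN block has run, while the standard -v option assigns the variable before BEGIN. The input data and the shell variable names sum, late and early are assumptions for illustration.

```shell
# Inline equivalent of addcol, summing column 2 of made-up input.
sum=$(printf 'a 5 x\nb 7 y\nc 9 z\n' |
      awk '{ SUM += $COL } END { print "Sum: " SUM }' COL=2)
echo "$sum"      # prints Sum: 21

# An assignment operand is not yet applied when BEGIN runs...
late=$(awk 'BEGIN { print "COL=[" COL "]" }' COL=3 < /dev/null)
# ...whereas -v sets the variable before BEGIN.
early=$(awk -v COL=3 'BEGIN { print "COL=[" COL "]" }' < /dev/null)
echo "$late" "$early"
```

The addcol script is unaffected by this difference because its BEGIN block only initializes SUM.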

Awk has some grep-like features as well. If you want to operate only on lines that contain some particular text, you can specify that text as a pattern between slashes like this:

$ awk '/choose me/' textfile

You can also combine search patterns using && (and), || (or), and the negation operator (!). The command below selects lines that contain both "this" and "that". The patterns match anywhere in a line, so "this" would also match inside a longer word. Using '/this/ && !/that/' would select only lines that contain "this" without containing "that".

$ awk '/this/ && /that/' notes
# find the process that started this one
# make sure that this user provided an answer

You can also select content based on its position within your input. NR holds the number of the current input line, so in the command below, we only see line eleven onward.

$ awk 'NR > 10' counts
11 63
12 99
13 63
14 77
15 41
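NR conditions can be combined with the same && operator to pick out a window of lines, and a pattern can be paired with an action to print just one field. The sketch below assumes a hypothetical helper function, gen, that produces fifteen numbered lines so the example does not depend on a counts file.

```shell
# Hypothetical helper: emits 15 lines of the form "<n> <n squared>".
gen() { awk 'BEGIN { for (i = 1; i <= 15; i++) print i, i * i }'; }

# First field of every line after the tenth.
after_ten=$(gen | awk 'NR > 10 { print $1 }')
# Second field of lines five through seven only.
window=$(gen | awk 'NR >= 5 && NR <= 7 { print $2 }')

echo "$after_ten"
echo "$window"
```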

And, of course, there are a lot of other nice little tricks available.
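Two of the manipulations mentioned in the introduction, reversing the order of fields and extracting substrings, might look something like the sketch below. The input strings are made up; substr() is awk's standard substring function, taking a string, a 1-based start position and a length.

```shell
# Walk the fields from last to first, separating them with spaces.
reversed=$(echo 'one two three' |
           awk '{ for (i = NF; i > 0; i--) printf "%s%s", $i, (i > 1 ? " " : "\n") }')
echo "$reversed"   # prints three two one

# Take the first four characters of the line.
prefix=$(echo 'filename.txt' | awk '{ print substr($0, 1, 4) }')
echo "$prefix"     # prints file
```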

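Awk provides some quick and effective filtering and incorporates an easy syntax. Probably the only thing that takes some getting used to is referring to variables such as SUM and COL without putting dollar signs in front of them. In awk, a dollar sign marks a field reference, so $COL means the field whose number is stored in COL.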
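A one-line sketch of that distinction, using made-up input: COL on its own is the variable's value, $COL is the field that value points to, and $NF is the last field.

```shell
fields=$(echo 'alpha beta gamma' | awk '{ COL = 2; print COL, $COL, $NF }')
echo "$fields"   # prints 2 beta gamma
```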
