November 17, 2004, 11:27 AM
Regardless of how sophisticated Unix has become, one of its ever-endearing qualities is the ease with which the command line can be used to select, manipulate and display data. Clever little languages like sed and awk continue to remind me why it was so much fun to sit down with the early O'Reilly books, trying each new little "trick" that they taught me and discovering just how much processing I could do via simple scripts. And, while I rarely write awk scripts these days, there are occasions on which awk is the perfect tool for the task. In this week's column, we're going to look at a couple of tools for creating columnar output from a list of strings -- a simple task, but one that I hate to do "by hand".
In the first of two scripts, data arriving in a list such as this:
apricot
banana
chick peas
dill pickle
egg
french fries
grapes
hot dogs
ice cream
jello
ketchup
lemon
melon
would be processed to look like this:
| apricot | banana | chick peas |
| dill pickle | egg | french fries |
| grapes | hot dogs | ice cream |
| jello | ketchup | lemon |
| melon |
This rearrangement of data can be helpful if, for example, you are including a list of file names or numbers in a text file and want to include multiple items in each row to shorten the length of the inserted text.
For anyone who hasn't written awk scripts, it's useful to know that awk processes one line of input at a time and that it automatically assigns each whitespace-separated string on a line to the variables $1, $2 and so on ($0 represents the entire line of input).
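To make the field variables concrete, here is a minimal sketch that feeds awk a single line and prints what lands in $0, $1 and $2. The function name show_fields is hypothetical, chosen only for this illustration; NF is awk's built-in count of fields on the current line.

```shell
# show_fields is a hypothetical helper name used only for this example.
# It prints the whole input line, its first two fields and the field count.
show_fields() {
    awk '{
        print "whole line: " $0     # the entire input line
        print "first field: " $1    # first whitespace-separated string
        print "second field: " $2   # second whitespace-separated string
        print "field count: " NF    # number of fields on this line
    }'
}

echo "hot dogs" | show_fields
```

Because an item such as "hot dogs" is split into two fields, a script that wants to keep each list item intact has to work with $0 rather than $1.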
If there are certain things that you want to do in your script before it has processed any data or after it has processed all of the data, you can put those commands within BEGIN and END blocks like those you'll see in the scripts below. In the first script shown below (mkCols1), you'll notice that two variables have been assigned values in the BEGIN block. These values should not be set more than once. One (cols) is a constant, parameterized so as to make it easy for someone to change its value. The other (count) is initialized in the BEGIN block and then used to count the number of strings added to each output string before it is printed.
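# mkCols1 -- organize input into columns
BEGIN {
    count=0     # items added to the current output line
    cols=3      # items per output line
}
{
    len=length($0)
    # pad each item to 15 characters; the quoted string holds 15 blanks
    pad=substr("               ",1,15-len)
    LINE=LINE " " $0 pad
    count=count+1
    if (count == cols) {
        print LINE
        count=0
        LINE=""
    }
}
# print any items left over in a partial last line
END {
    if (count > 0) {
        print LINE
    }
}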
The mkCols1 script creates a series of 3-element lines by concatenating the input strings until it has three of them. It then prints the line, resets the counter and empties the variable used to hold the text. The END block prints any strings left over in a partial final line.
To create the columns, a string of blanks is appended to each element as it is added to the growing output string. This padding is adjusted to the size of each input string so that the columns will line up. For example, if you fed the script the list of food items shown above, you would get this:

 apricot         banana          chick peas
 dill pickle     egg             french fries
 grapes          hot dogs        ice cream
 jello           ketchup         lemon
 melon
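The padding step can also be handed to awk's printf, which pads a string to a fixed width when given a format such as %-15s. What follows is a sketch of that alternative, not the column's script: mk_cols is a hypothetical shell wrapper, and its two arguments (items per row and column width) are assumptions added for illustration.

```shell
# mk_cols is a hypothetical helper, not part of the original column.
# Usage: mk_cols [items_per_row] [column_width]   (defaults: 3 and 15)
mk_cols() {
    awk -v cols="${1:-3}" -v width="${2:-15}" '
    BEGIN {
        # Build a format such as " %-15s": a blank, then the item
        # left-justified and padded with blanks to the column width.
        fmt = " %-" width "s"
    }
    {
        printf fmt, $0
        # End the row after every cols-th item.
        if (NR % cols == 0) printf "\n"
    }
    END {
        # Finish a partial last row, if there is one.
        if (NR % cols != 0) printf "\n"
    }'
}

printf '%s\n' apricot banana "chick peas" "dill pickle" | mk_cols 3 15
```

Passing the row size and width with -v keeps them adjustable from the command line without editing the script. As with the substr approach, an item longer than the column width is printed in full and simply pushes the following columns to the right.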