July 19, 2005, 12:38 PM
Unix tools are great for finding particular text in an input stream and for selecting a portion of each line to display. Much as a query selects rows and columns from a table in a relational database, commands like grep and awk let us trim huge amounts of data down to just what we want to see. In fact, the ability to pipe commands together and get a useful answer from a single command line continues to be one of the great appeals of our favorite operating system (in all its many flavors).
But what should you do if you want to see not just some target text, but a portion of the text surrounding it? What if you need to see your search terms in context to know whether your hits are relevant? When one of you readers asked me this question, my first reaction was ... good question! Then I started wondering how hard it would be in Perl to capture and display the text surrounding my search terms and how much data I might have to load into memory.
But soon afterwards, I was trying my hand at GNU grep and found that its easy-to-use options for including lines before and after the target text turn what might have been a tricky scripting task into a veritable fait accompli!
Check out this GNU grep command run against the text of the Gettysburg Address. The -B (before) and -A (after) arguments ask grep to print two lines before and two lines after each line containing the search term. Notice, however, that the displayed text includes not five, but seven lines. This is because the target text (the word "consecrate") appears on two lines in this excerpt, with a single line of text between them. The two context ranges overlap, so grep merges them: we get two lines before the first appearance of the word, the line in between, and two lines after the second.
bash-2.03$ /opt/gnu/bin/grep -B 2 -A 2 consecrate gburg.txt
this.

But, in a larger sense, we can not dedicate -- we can not consecrate -- we
can not hallow -- this ground. The brave men, living and dead, who struggled
here, have consecrated it, far above our poor power to add or detract. The
world will little note, nor long remember what we say here, but it can never
forget what they did here. It is for us the living, rather, to be dedicated
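To see the merging behavior on something smaller than the Gettysburg Address, here is a minimal sketch. It assumes that the grep on your PATH is GNU grep (on the system above it lives in /opt/gnu/bin), and sample.txt is a made-up stand-in for gburg.txt, not a file from this column.

```shell
#!/bin/sh
# Assumption: "grep" on the PATH is GNU grep.
# sample.txt is a hypothetical 12-line file created just for this demo.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf '%s\n' one two 'target A' four 'target B' six \
    seven eight nine ten 'target C' twelve > "$sample"

# Lines 3 and 5 match and their context ranges overlap, so grep prints
# lines 2-6 as one run. Line 11 is far enough away that grep starts a
# new group, separated from the first by a line containing "--".
merged=$(grep -B 1 -A 1 target "$sample")
printf '%s\n' "$merged"
# two
# target A
# four
# target B
# six
# --
# ten
# target C
# twelve

# -C 1 is shorthand for -B 1 -A 1 and gives the same result.
same=$(grep -C 1 target "$sample")

rm -rf "$tmpdir"
```

The "--" separator appears only when two groups of context are not adjacent, which is why it does not show up in the "consecrate" output.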
How nice not to have had to think through the algorithm to display multiple hits in this seamless manner! The -B and -A options can also be specified as --before-context and --after-context, though I prefer to type as little as possible, so I use the one-letter versions.
Of course, where there's one good option, there may be another, so let's see what else GNU grep can provide. The -m option limits the number of matching lines that grep will report; it stops reading the file once it has found that many. Going back to our Gettysburg Address example, Abe Lincoln used the word "dedicate" (or a derivative) six times in his 275-word address. Let's say we want to see only the first three lines containing this word. No problem for GNU grep:
bash-2.03$ /opt/gnu/bin/grep -m 3 dedicate gburg.txt
a new nation, conceived in Liberty, and dedicated to the proposition that
nation so conceived and so dedicated, can long endure. We are met on a great
battle-field of that war. We have come to dedicate a portion of that field,
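One thing worth keeping in mind is that -m counts matching lines, not individual occurrences of the word. The following sketch illustrates the difference, again assuming GNU grep is the grep on your PATH and using a made-up sample file rather than gburg.txt.

```shell
#!/bin/sh
# Assumption: "grep" on the PATH is GNU grep.
# sample.txt is a hypothetical file created just for this demo.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf '%s\n' 'dedicate and dedicated' 'nothing here' \
    'dedicated again' 'rededicate' 'dedicate once more' > "$sample"

# -m 2 stops after the second matching line, even though the first
# line alone contains the search term twice.
first_two=$(grep -m 2 dedicate "$sample")
printf '%s\n' "$first_two"
# dedicate and dedicated
# dedicated again

# -c counts matching lines (4 here) ...
total_lines=$(grep -c dedicate "$sample")

# ... while -o prints each match on its own line, so counting those
# lines gives the number of individual occurrences (5 here).
total_hits=$(grep -o dedicate "$sample" | grep -c '')

rm -rf "$tmpdir"
```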
Or, maybe we want to print line numbers along with our located text:
bash-2.03$ /opt/gnu/bin/grep -n dedicate gburg.txt
2:a new nation, conceived in Liberty, and dedicated to the proposition that
6:nation so conceived and so dedicated, can long endure. We are met on a great
7:battle-field of that war. We have come to dedicate a portion of that field,
12:But, in a larger sense, we can not dedicate -- we can not consecrate -- we
16:forget what they did here. It is for us the living, rather, to be dedicated
18:nobly advanced. It is rather for us to be here dedicated to the great task
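Line numbers also combine nicely with the context options. In that case GNU grep marks matching lines with a colon after the line number and context lines with a hyphen, which makes it easy to tell the two apart. A minimal sketch, assuming GNU grep on your PATH and using three made-up input lines:

```shell
#!/bin/sh
# Assumption: "grep" on the PATH is GNU grep.
# The three input lines are invented for this demo.
out=$(printf '%s\n' alpha 'dedicate here' omega | grep -n -C 1 dedicate)
printf '%s\n' "$out"
# 1-alpha
# 2:dedicate here
# 3-omega
```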
And, if we have trouble locating our target words in the text, we can tell GNU grep to colorize our search terms:
bash-2.03$ /opt/gnu/bin/grep -n --color=always dedicate gburg.txt
2:a new nation, conceived in Liberty, and dedicated to the proposition that
6:nation so conceived and so dedicated, can long endure. We are met on a great
7:battle-field of that war. We have come to dedicate a portion of that field,
12:But, in a larger sense, we can not dedicate -- we can not consecrate -- we
16:forget what they did here. It is for us the living, rather, to be dedicated
18:nobly advanced. It is rather for us to be here dedicated to the great task
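A word of caution on --color=always: it writes the terminal escape sequences even when the output goes to a pipe or a file, so anything downstream sees them too. The sketch below compares the three settings that GNU grep accepts (never, always and auto). It assumes GNU grep on your PATH, and the input line is just a made-up sample.

```shell
#!/bin/sh
# Assumption: "grep" on the PATH is GNU grep.
line='we can not dedicate'

# Inside $( ... ) grep's output is a pipe, not a terminal.
plain=$(printf '%s\n' "$line" | grep --color=never dedicate)
colored=$(printf '%s\n' "$line" | grep --color=always dedicate)
auto=$(printf '%s\n' "$line" | grep --color=auto dedicate)

# "always" adds escape sequences around the match, so the captured
# text is longer than the plain version.
echo "plain:   ${#plain} characters"
echo "colored: ${#colored} characters"

# "auto" colors only when writing to a terminal, so here it matches
# the plain output.
[ "$auto" = "$plain" ] && echo "auto produced no escape sequences"
```

If the output is headed for another command rather than your screen, --color=auto is usually the safer choice.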