Unix How-To: Counting Anything

A reader recently asked whether it was possible on Unix systems to count how many times a particular character appears on each line of text. "On Unix", I answered, "not only is just about anything possible, but there are usually half a dozen ways to do it".

Counting how many times a particular character appears on a line of text, while not exactly straightforward, is easily done. If you take advantage of awk's ability to split lines into fields and its built-in field counter (NF), you can get very close. Tell awk that the character in question is the field separator and it will happily count the fields that the character separates. You just have to account for the fact that awk will report one more field than there are separators on each line. The string "ababa", for example, yields three fields if "b" is interpreted as the field separator. Reducing each count by one, therefore, tells you how many b's appear on the line.
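The "ababa" case can be checked directly from the command line. This is only a small sketch; fields_and_count is a made-up name used for illustration, and it prints both the field count and the adjusted count so the off-by-one is visible.

```shell
# fields_and_count is a hypothetical helper used only for this illustration.
# It prints NF and NF - 1 for each line of standard input, using the
# character given as $1 as the field separator.
fields_and_count() {
    awk -F"$1" '{ print NF, NF - 1 }'
}

printf 'ababa\n' | fields_and_count b    # prints: 3 2
```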

Here's a simple script that does this, asking first what character you want to count:

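#!/bin/bash

# Ask for the character to count and the file to examine
read -r -p "character to count> " char
read -r -p "file> " file

# Quote $file so that names containing spaces are handled correctly
if [ ! -f "$file" ]; then
    echo "OOPS: No file named $file" >&2
    exit 1
fi

# With the character as the field separator, a line holds NF - 1 of them.
# An empty line has NF = 0, so report 0 rather than -1.
awk -F"$char" '{ print (NF ? NF - 1 : 0) }' "$file"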
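One edge case is worth checking when subtracting one from NF: awk reports zero fields for an empty line, so a plain NF - 1 would give -1 there. The sketch below is a non-interactive variant that guards against that; count_per_line is a hypothetical name, and it assumes the character is an ordinary one (a space, for instance, has a special meaning as an awk field separator).

```shell
# count_per_line is a hypothetical helper, not part of the script above.
# It prints how many times the character $1 appears on each line of stdin.
count_per_line() {
    # NF is 0 for an empty line, so guard against reporting -1.
    awk -F"$1" '{ print (NF ? NF - 1 : 0) }'
}

# Three lines: two b's, an empty line, and a line with no b at all.
printf 'ababa\n\nxyz\n' | count_per_line b
# prints 2, 0 and 0 on separate lines
```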

Similarly, if you want a count of how many times a character appears in an entire file, you can "scrunch" the file down to a single line by removing the linefeeds. The whole file then becomes one record, so a single subtraction from the field count is all you need.

$ cat myfile
This is the start of a new era in the life of our members.  We can choose
to take a stand or we can hide in the shadows.
$ cat myfile | tr -d "\012" | awk -Fe '{print NF - 1}'
13
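In keeping with the "half a dozen ways" theme, here is a different technique that skips awk altogether: let tr delete everything except the character of interest and have wc count what is left. Because it counts the characters directly, no subtraction is involved. This is a sketch, not the method used above; count_char is a hypothetical name, and it assumes the character is an ordinary letter or digit rather than something tr treats specially (such as a backslash or a hyphen).

```shell
# count_char is a hypothetical helper name.
# Usage: count_char CHAR < file
count_char() {
    # tr -cd deletes every character that is NOT in the set "$1",
    # leaving only the occurrences; wc -c then counts them.
    # Some wc implementations pad the number with blanks, so strip those.
    tr -cd "$1" | wc -c | tr -d ' '
}

printf '%s\n' \
    'This is the start of a new era in the life of our members.  We can choose' \
    'to take a stand or we can hide in the shadows.' | count_char e
# prints 13
```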

If you want to count how many times a particular word shows up in a file, you can use a tactic that's more or less the opposite of the file scrunching shown above. While it's easy to count how many lines contain a particular string, this output doesn't necessarily tell you how many times the word appears:

$ cat myfile
This is the start of a new era in the life of our members.  We can choose
to take a stand or we can hide in the shadows.
$ grep the myfile | wc -l
       2

If, on the other hand, you turn the blanks into linefeeds so that grep sees the text one word at a time, you will get a count of how many words in the file contain your string.

$ cat myfile | tr " " "\012" | grep the | wc -l
       3
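Keep in mind that grep matches substrings, so a search for "the" also counts words like "other" and "theory". If only exact words should count, one possible refinement is to ask grep for whole-line, fixed-string matches once the text is one word per line. The sketch below assumes words are separated by spaces and that punctuation stays attached to its word (so "shadows." would not match "shadows"); count_word is a hypothetical name.

```shell
# count_word is a hypothetical helper name.
# Usage: count_word WORD < file
count_word() {
    # Turn spaces into newlines and squeeze repeated newlines (one word per line),
    # then count only lines that equal WORD exactly (-x), as a fixed string (-F).
    tr -s ' ' '\n' | grep -c -x -F -- "$1"
}

# Substring matching counts all three words here...
printf 'the other theory\n' | tr ' ' '\n' | grep -c the    # prints 3
# ...while the exact-word version counts only the first.
printf 'the other theory\n' | count_word the               # prints 1
```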

Clearly there are many ways to count letters and words in files, and many of them are more elegant than these. Still, these quick and dirty ways of working around command limitations might come in handy.
