May 28, 2012, 9:57 AM
There are a lot of really nice scripting languages available to Unix admins, but Perl is still one of my favorites for any work that involves regular expressions, that is, any text that you can describe with a pattern. If you want to locate or change chunks of text that match some particular specification, you can probably throw together a script in Perl that will do the work fairly easily. In this week's post, we're going to examine some easy fixes for common problems that take advantage of Perl's versatile nature.
Removing blank lines
The first trick is removing blank lines. In Perl, recognizing a blank line is easy. Since ^ represents the beginning of a line and $ represents its end, ^$ represents a line that begins and ends and has nothing in between. You can expand this to also match lines that contain white space by changing the expression to ^\s*$. The \s means "white space", so \s* matches zero or more characters of white space.
To skip over blank lines in a Perl script, you have several choices. You could use "next if /^$/" (skip if empty) or "next if /^\s*$/" (skip if empty or only white space). Alternatively, you could print only the lines you want with "print if (/\S/)" (print if there is any non-whitespace text) or "print if (!/^$/)" (print if not empty).
This script nugget shows only lines containing text by skipping blank lines:
while (<>) {
    next if /^\s*$/;    # skip blank lines
    print;
}
This one would only print lines containing text:
while (<>) {
    print if (/\S/);    # print only if the line contains non-whitespace text
}
This code displays only lines that aren't empty:
while (<>) {
    print if (!/^$/);   # print only if NOT empty
}
This handy one-liner removes blank lines and saves a copy of the original file with a .old extension. The new file (the one without blank lines) keeps the original filename.
perl -i.old -n -e "print if /\S/" filename
You can turn this into a script:
#!/bin/bash
if [ -f "$1" ]; then
perl -i.old -n -e "print if /\S/" "$1"
fi
or an alias:
alias deblank='perl -i.old -n -e "print if /\S/"'
Removing whitespace at the beginnings and ends of lines
Stripping whitespace from the beginnings and ends of lines makes later processing easier, because your patterns no longer have to allow for stray spaces or tabs.
To remove leading whitespace:
$string =~ s/^\s+//;
To remove trailing whitespace:
$string =~ s/\s+$//;
Removing non-ASCII characters
Removing characters that fall outside the traditional ASCII character set is a little tricky. Here we use Perl's tr operator to match the range of characters from hex 80 (decimal 128) to hex FF (decimal 255) and delete them (d).