Easy fixes using Perl

There are a lot of really nice scripting languages available to Unix admins, but Perl is still one of my favorites for doing any work that involves regular expressions -- any text that you can describe with a pattern. If you want to locate or change chunks of text that match some particular specification, you can probably throw together a script in Perl that will do the work fairly easily. In this week's post, we're going to examine some easy fixes for common problems that take advantage of Perl's versatile nature.

Removing blank lines

The first trick is removing blank lines. In Perl, recognizing a blank line is easy. Since ^ represents the beginning of a line and $ represents its end, ^$ represents a line that begins and ends and has nothing in between. You can expand this to also match lines that contain white space by changing the expression to ^\s*$. The \s means "white space", so \s* matches zero or more characters of white space.
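A quick way to get a feel for these two patterns is to run them against a few sample strings. The sketch below is only an illustration; the helper names is_empty and is_blank are made up for this example and are not part of Perl.

```perl
use strict;
use warnings;

# Hypothetical helpers, named only for this illustration.
sub is_empty { return $_[0] =~ /^$/    ? 1 : 0 }   # nothing between start and end
sub is_blank { return $_[0] =~ /^\s*$/ ? 1 : 0 }   # nothing but white space

for my $line ("", "\n", "   \n", "\t\n", "text\n") {
    (my $shown = $line) =~ s/\n/\\n/;   # make the newline visible
    $shown =~ s/\t/\\t/;                # make the tab visible
    printf "%-8s empty=%d blank=%d\n", "'$shown'", is_empty($line), is_blank($line);
}
```

Note that $ also matches just before a trailing newline, which is why /^$/ still matches an empty line read from a file with its newline attached.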

To skip over blank lines in a Perl script, you have several choices. You could use "next if /^$/" (skip if empty) or "next if /^\s*$/" (skip if empty or only white space). Alternatively, you could print only the lines you want with "print if /\S/" (print if there is text) or "print if !/^$/" (print if not empty).

This script nugget would show only lines containing text by skipping blank lines:

while (<>) {
    next if /^\s*$/;	# skip blank lines
    print;
}

This one would only print lines containing text:

while (<>) {
    print if /\S/;	# print only if there is text
}

This code displays only lines that aren't empty:

while (<>) {
    print if (!/^$/);	# print only if NOT empty
}

This handy one-liner removes blank lines and also saves the original file as filename.old. The new file (the one without blank lines) takes the original filename.

perl -i.old -n -e "print if /\S/" filename

You can turn this into a script:

#!/bin/bash

if [ -f "$1" ]; then
    perl -i.old -n -e "print if /\S/" "$1"
fi

or an alias:

alias deblank='perl -i.old -n -e "print if /\S/"'
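If you would rather stay inside Perl, the one-liner's behavior can be approximated in a standalone script. This is a minimal sketch, not the only way to do it: it assumes the special variable $^I (the in-script counterpart of the -i switch), and the subroutine name deblank_file is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: remove blank lines from $file in place,
# keeping the original as "$file$suffix" (for example, notes.txt.old).
sub deblank_file {
    my ($file, $suffix) = @_;
    local ($^I, @ARGV) = ($suffix, $file);   # $^I turns on in-place editing, like -i
    while (<>) {
        print if /\S/;                       # keep only lines containing text
    }
}

# Process every file named on the command line.
my @files = @ARGV;
deblank_file($_, '.old') for @files;
```

Saved under a name of your choosing and run with one or more filenames as arguments, it should leave each file without blank lines and a .old copy of the original beside it.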

Removing whitespace at the beginnings and ends of lines

Removing whitespace at the beginnings or ends of lines can facilitate later processing by reducing the options that you need to consider.

To remove leading whitespace:

$string =~ s/^\s+//;

To remove trailing whitespace:

$string =~ s/\s+$//;
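The two substitutions are often used together. Here is a small sketch that wraps them in a subroutine; the name trim is an assumption made for this example, not a Perl built-in.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of the string with
# leading and trailing whitespace removed.
sub trim {
    my ($string) = @_;
    $string =~ s/^\s+//;   # remove leading whitespace
    $string =~ s/\s+$//;   # remove trailing whitespace, including the newline
    return $string;
}

print "[", trim("   some text \t\n"), "]\n";   # prints [some text]
```

Whitespace inside the string is left alone; only the two ends are touched.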

Removing non-ASCII characters

Removing characters that don't fall within the range of the traditional ASCII character set is a little tricky. In the command below, we're using Perl's tr operator to select the range of characters from hex 80 (decimal 128) to hex FF (decimal 255) and delete them (the d modifier).
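As a minimal sketch of that tr expression in use, assuming the input is handled as raw bytes, something like the following would do the job. The subroutine name strip_non_ascii and the sample string are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of the string with every
# byte from hex 80 through hex FF deleted.
sub strip_non_ascii {
    my ($string) = @_;
    $string =~ tr/\x80-\xFF//d;   # empty replacement list plus d = delete
    return $string;
}

my $sample = "caf\xE9 na\xEFve";          # Latin-1 bytes for two accented letters
print strip_non_ascii($sample), "\n";     # prints "caf nave"
```

The same expression should also work in one-liner form in the style of the earlier examples, for instance perl -i.old -pe 'tr/\x80-\xFF//d' filename. Keep in mind that the characters are deleted outright, not replaced with their closest ASCII equivalents.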
