From: www.itworld.com
April 11, 2001 —
Part 1 of this series on simple email automations began with a look at outbound messages. We exhibited simple scripts that can send out items and even attachments.
This time, we will filter incoming items.
Forwarding facilities
It's possible to filter at several points in the transmission of a message. Most consumer email clients have proprietary filtering capabilities. System administrators generally have ways to filter traffic on a sitewide basis. They often use alias and :include facilities for some of the functions we describe below.
We'll concentrate today on the filtering an individual can write when a Unix machine hosts her mail spool. If your "mail server" is a specific Unix machine that you reach through POP3 or IMAP4 authorization, or by a direct log-in to the host, email is usually delivered by sendmail or a sendmail alternative that is sufficiently compatible to respect .forward.
.forward files tell your email service whether, and where, you want your email forwarded. Suppose you have an account, someuser, hosted on somemachine.com. Email addressed to someuser@somemachine.com goes straight to that machine's mail spool.
Someday, you might decide it's more convenient for you to forward someuser's email to a machine on a personal network -- say, me3@mymachine.elsewhere.com. If you create a file called .forward in someuser's home directory on somemachine.com, and put the single line
me3@mymachine.elsewhere.com
in that file, then all email addressed to someuser@somemachine.com will actually be delivered to me3@mymachine.elsewhere.com. Specific email services generally impose a few minimal security requirements: .forward must be owned by the account it serves, have its "user read bit" set, and so on.
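If forwarding silently fails, those requirements are the first thing to check. The following is only a rough sketch of such a check: forward_problems is a hypothetical helper, and the test for group or world write permission is an assumption about what a typical sendmail-compatible service enforces, not something every mailer requires.

```python
import os
import stat


def forward_problems(path, expected_uid):
    """Return a list of human-readable problems found with a .forward file."""
    problems = []
    info = os.stat(path)
    # The file should belong to the account whose mail it redirects.
    if info.st_uid != expected_uid:
        problems.append("not owned by the expected user")
    # The "user read bit" must be set.
    if not info.st_mode & stat.S_IRUSR:
        problems.append("owner read bit is not set")
    # Assumption: the mail service also rejects files others can modify.
    if info.st_mode & (stat.S_IWGRP | stat.S_IWOTH):
        problems.append("writable by group or others")
    return problems
```

Calling forward_problems(os.path.expanduser("~/.forward"), os.getuid()) while logged in as someuser would return an empty list when none of these problems is present.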
Many users take advantage of this simple forwarding capability. It's only the beginning of what a .forward can do for you, though. You can forward your email, but also leave copies where they "naturally" belong, with
\someuser, me3@mymachine.elsewhere.com
You can leave a permanent copy of all your traffic with
\someuser, /usr/users/someuser/backup/allmail
That creates the /usr/users/someuser/backup/allmail file as a huge spool that receives a duplicate of every item sent to someuser.
Your own filters
Most exciting of all is the ability to vector traffic through your own filter scripts. Start experimenting by using
\someuser, "|/usr/users/someuser/programs/mytest"
Then create /usr/users/someuser/programs/mytest as a shell script with the contents
#!/bin/sh
echo "Email received at `date`." >> /usr/users/someuser/logs/maillog
and set its execute bit. You'll soon see /usr/users/someuser/logs/maillog accumulating one timestamped line for every message you receive.
Several programs have been specifically developed to filter email traffic. We've written before about using procmail to filter out spam and sort incoming traffic into folders. While procmail's excellent performance and security ensure its widespread use, its arcane syntax makes it inconvenient for more general email chores.
Any of the common scripting languages -- Lua, Perl, Python, Rexx, Ruby, Tcl -- make good alternatives to procmail if you're setting up an autoresponder, analyzing your email traffic, or constructing a more sophisticated forwarding scheme. Suppose, for example, that you want to be paged whenever you get a message from your boss that does not have "Staff meeting" in the subject line, and arrives outside working hours. You might construct a filter like this:
#!/usr/local/bin/python
import os
import re
import sys
import time

# .forward delivers each entire e-mail message to the
# standard input of the filtering program.  That's
# where we pick up individual ones.
message = sys.stdin.read()
# Ignore staff-meeting announcements.
if re.search(r'Subject: Staff meeting', message) is not None:
    sys.exit()
# Ignore anything that is not from the boss.
if re.search(r'From: myboss', message) is None:
    sys.exit()
# Ignore messages that arrive during working hours.
hour = time.localtime(time.time())[3]
if (8 < hour) and (hour < 17):
    sys.exit()
os.system("sendmail mypager@pagerservice.com")
That code is actually slightly off: because it searches the whole message, it rejects messages that merely mention "Subject: Staff meeting" somewhere in the body, as well as those whose subject line really is "Staff meeting". The virtue of the example, though, is that it shows how readable and sophisticated a simple email filter can be.
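One way to avoid that false match is to test only the header block, which ends at the first blank line. The sketch below is an illustration rather than a drop-in replacement: should_page is a hypothetical name, the hour is passed in so the decision can be tried without waiting for evening, and the patterns are deliberately looser than the originals (they also match "Re: Staff meeting" and a display name ahead of the boss's address).

```python
import re


def should_page(message, hour):
    """Decide whether a raw message arriving at `hour` (0-23) warrants a page."""
    # Headers end at the first blank line; everything after it is the body.
    headers = re.split(r'\r?\n\r?\n', message, maxsplit=1)[0]
    # Anchor each pattern to the start of a header line.
    if re.search(r'^Subject:.*Staff meeting', headers,
                 re.MULTILINE | re.IGNORECASE):
        return False
    if not re.search(r'^From:.*myboss', headers,
                     re.MULTILINE | re.IGNORECASE):
        return False
    # Same working-hours window as the filter above: hours 9 through 16.
    return not (8 < hour < 17)
```

A filter built on this would read the message from standard input, call should_page(message, time.localtime()[3]), and invoke the pager only when the result is true.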
In practice, it's generally a good idea to use the special-purpose mail-parsing modules that most scripting languages enjoy. In Python, for example, the rfc822 module understands how to parse message header elements.
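As an illustration of what such a module buys you, here is a minimal sketch that pulls the sender and subject out of a raw message. It uses the email package, which superseded rfc822 in later Python releases (rfc822 is not available in Python 3); sender_and_subject is a hypothetical helper name.

```python
import email
from email.utils import parseaddr


def sender_and_subject(raw_message):
    """Return (sender_address, subject) taken from the message headers only."""
    msg = email.message_from_string(raw_message)
    # parseaddr splits 'Display Name <addr>' into (name, addr).
    _name, address = parseaddr(msg.get("From", ""))
    return address.lower(), msg.get("Subject", "")
```

Because the parser separates headers from body, text in the body that happens to look like a header line is never consulted.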
Citizenship considerations
In our previous column, we endorsed the use of standard email clients like sendmail, mutt, and pine for generating email messages. A great deal of email traffic "in the wild", including email from prominent companies like Amazon, Juno, and GE, is misformatted and will likely confuse some process along the way. By starting with well-established command-line clients, you improve the odds that you'll only generate well-formed messages that your recipients can read without difficulty.
When filtering your own email, though, feel free to go wild. The only damage you're likely to do is to your own mail spool. You can experiment with all sorts of languages and algorithms, and produce quite interesting results. For instance, if you want to send a form letter in response to a class of email you occasionally receive, scripting languages are ideal for rapid development of useful filters.
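A form-letter responder might be sketched as follows. Everything here is illustrative: build_form_reply is a hypothetical helper, and skipping bulk or automated mail is an assumption added to keep two autoresponders from answering each other indefinitely. The function only builds the reply; handing it to a mailer is left to the caller.

```python
import email
from email.message import EmailMessage
from email.utils import parseaddr


def build_form_reply(raw_message, my_address, form_text):
    """Return an EmailMessage replying to raw_message, or None if no reply is due."""
    original = email.message_from_string(raw_message)
    # Assumption: never answer bulk or machine-generated mail.
    if original.get("Precedence", "").lower() in ("bulk", "list", "junk"):
        return None
    if original.get("Auto-Submitted", "no").lower() != "no":
        return None
    # Prefer Reply-To when the sender supplied one.
    _name, recipient = parseaddr(
        original.get("Reply-To") or original.get("From", ""))
    if not recipient:
        return None
    # Collapse folded or repeated whitespace in the original subject.
    subject = " ".join(original.get("Subject", "").split())
    reply = EmailMessage()
    reply["From"] = my_address
    reply["To"] = recipient
    reply["Subject"] = "Re: " + subject
    reply["Auto-Submitted"] = "auto-replied"
    if original.get("Message-ID"):
        reply["In-Reply-To"] = original["Message-ID"]
    reply.set_content(form_text)
    return reply
```

A filter could then pass the result of reply.as_string() to a command such as sendmail -t, which takes the recipients from the message headers.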
In future installments of Regular Expressions, we'll take a closer look at how to handle email attachments and common administrative chores. Let us know what particular needs you have.
Unix Insider