Let's automate email.
We are continually surprised at how many moderately experienced computer users and developers don't know the basic facts about email. Throughout the coming year, we'll explain a few of the essentials and show working scripts that solve common problems.
Automating the original "killer app"
We start with the essentials. Email items travel across the Internet as plain-text byte streams, formatted according to RFC 822 (see Resources, below). This specifies a simple structure of a header followed by content, with a blank line separating the two. A minimal message might then look something like
From someone@somewhere.net Tue Mar 6 16:16:00 2001

This is a message.
Items typically have considerably more elaborate headers, including elements such as To:, From:, Subject:, and so on.
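To make the header/blank-line/body structure concrete, here is a minimal sketch that splits a message at its first blank line. The function names message_header and message_body are our own illustrative choices, not part of any standard tool, and the sketch ignores refinements such as folded header lines.

```shell
# Hypothetical helpers: both read one message on standard input.

# Print the header: every line before the first blank line.
message_header() {
    awk '/^$/ { exit } { print }'
}

# Print the body: every line after the first blank line.
# Later blank lines belong to the body and are kept.
message_body() {
    awk 'found { print } /^$/ { found = 1 }'
}
```

With a message saved in a file, `message_header < saved_message` would show only the header lines, and `message_body < saved_message` only the content.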
This is enough background to understand one of the questions we come across most often: "How can I automatically send out an email item with a Subject (or Cc, or Bcc, or ...) that says X?" In a typical Unix environment, all it takes is a command line invocation
sendmail -t << END
From: myaccount@myhost.com
Subject: This is the subject.
To: intended_recipient@somewhere.com
Cc: other_person@somewhere.com
Bcc: my_records@myhost.com

Hello. This is the message. Goodbye.
END
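When the addresses and subject vary from run to run, the same idea can be wrapped in a small function. The following is only a sketch: compose_message is a hypothetical name of our own, and the addresses in the usage comment are placeholders. It writes the message to standard output, so you can inspect the result before piping it to sendmail -t.

```shell
# compose_message FROM TO SUBJECT [EXTRA_HEADER ...]
# Hypothetical helper: prints a complete message on standard output.
# The body is read from standard input.
compose_message() {
    printf 'From: %s\n' "$1"
    printf 'To: %s\n' "$2"
    printf 'Subject: %s\n' "$3"
    shift 3
    # Any remaining arguments are complete header lines, such as "Cc: ...".
    for extra_header in "$@"; do
        printf '%s\n' "$extra_header"
    done
    printf '\n'   # the blank line that ends the header
    cat           # copy the body through unchanged
}

# Usage (not run here):
#   echo "Hello." |
#       compose_message me@example.com you@example.com "Test" "Cc: other@example.com" |
#       sendmail -t
```

Keeping the composition separate from the delivery makes it easy to check the header by eye, or to swap in a different mail agent later.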
You can also emit email messages directly from most languages, without an apparent need to invoke an external process such as the sendmail used here. We use this most often when testing email service, and especially in architecting unusual sites; by coding at a lower level, it's easy to access alternative network ports, request unusual relay topologies, and so on. Through the Resources below, you can locate ways to code email transmissions in such other common scripting languages as KornShell, Perl, Python, Rexx, Ruby, and Tcl.
However, email agents including sendmail add a great deal of value that's not immediately apparent. They will, for example, retry transmissions so that temporary breaks in connectivity don't thwart your attempts to get through. Therefore, our usual advice to developers automating most email operations is just to "shell out" and invoke a specialized, external email agent. All the most popular scripting languages make that easy.
Do you want to attach a file to your outbound message? Many correspondents do. While that takes only a few keystrokes more than a simple message, the concepts behind attachments are widely misunderstood. Let's take a few moments to be clear on the subject.
MIME packages payloads
We've already mentioned that as it travels around the Net from sender to recipient, all email traffic looks the same: header, blank line, body, all expressed in simple ASCII characters. A message with attachments follows the same rules. It just happens to have a body (plus, most likely, a small piece of the header) that can be interpreted so as to reconstitute the original contents of the attached files.
Standards documents such as RFC 1521 define how the Multipurpose Internet Mail Extensions (MIME) wrap up external files to fit them inside an email message body. This packaging is so successful, in fact, that MIME is also used for many applications, including the World Wide Web, that have no direct connection to email.
Therefore, your first instinct in automating messages with attachments should be as in the first section above: find a special-purpose mailing agent that knows the language. Sendmail doesn't, and none of the applications that do are as ubiquitous as sendmail. Most hosts, though, have at least one MIME-savvy processor. With a client like mutt installed, it's easy to
mutt -a "$ATTACHMENT_FILE" -i "$CONTENT_FILE" \
-s "$SUBJECT" "$ADDRESS"
This sends a message that looks roughly like
From: me@myhome.com
To: $ADDRESS
Subject: $SUBJECT
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="4Ckj6UjgE2iN1+kY"
X-Mailer: Mutt 1.0pre2i

--4Ckj6UjgE2iN1+kY
Content-Type: text/plain; charset=us-ascii

This is $CONTENT_FILE.

--4Ckj6UjgE2iN1+kY
Content-Type: image/jpeg
Content-Disposition: attachment; filename="$ATTACHMENT_FILE"
Content-Transfer-Encoding: base64

ZGVidWdnZXIvAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
.......... ...
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

--4Ckj6UjgE2iN1+kY--
Notice the extra Content-Type element of the header and its boundary parameter. Those give email clients the information they need to cut an email message into its parts. In this case, there are two parts, representing the text in $CONTENT_FILE and the entire contents of $ATTACHMENT_FILE.
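To see how little magic is involved, here is a rough sketch that assembles a comparable two-part message by hand. It is an illustration, not a substitute for a MIME-savvy agent: mime_message is a hypothetical name, the boundary string is one we made up, the attachment is labeled with the generic type application/octet-stream rather than its real type, and we assume a base64 command (as in GNU coreutils) is on your PATH.

```shell
# mime_message FROM TO SUBJECT TEXT_FILE ATTACHMENT_FILE
# Hypothetical helper: prints a multipart/mixed message on standard output.
mime_message() {
    # Made-up boundary; it must not appear at the start of any line in a part.
    boundary="=_sketch_boundary_$$"

    printf 'From: %s\n' "$1"
    printf 'To: %s\n' "$2"
    printf 'Subject: %s\n' "$3"
    printf 'Mime-Version: 1.0\n'
    printf 'Content-Type: multipart/mixed; boundary="%s"\n' "$boundary"
    printf '\n'

    # First part: the plain-text content.
    printf '%s\n' "--$boundary"
    printf 'Content-Type: text/plain; charset=us-ascii\n\n'
    cat "$4"
    printf '\n'

    # Second part: the attachment, encoded so it survives as plain text.
    # Some base64 implementations do not wrap their output; MIME expects
    # encoded lines of at most 76 characters.
    printf '%s\n' "--$boundary"
    printf 'Content-Type: application/octet-stream\n'
    printf 'Content-Disposition: attachment; filename="%s"\n' "$(basename "$5")"
    printf 'Content-Transfer-Encoding: base64\n\n'
    base64 < "$5"
    printf '\n'

    # The closing boundary carries two trailing hyphens.
    printf '%s\n' "--$boundary--"
}
```

The output could be piped to sendmail -t just like a simple message. In practice, a client such as mutt also chooses a safe boundary and a proper content type for you, which is why we recommend it.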
Can your favorite scripting language work directly with MIME? Sure; almost all of them have at least one auxiliary MIME-aware library or package. Don't start there, though. Practice your email technique and understanding with a few simple command-line exercises. You might discover that they meet all your requirements.
What if you're running Windows, or Mac OS, or OpenVMS, or some other operating system? All those also allow for more or less simple email automations. For specific details, join us in ITworld.com's Scripting Languages and Techniques forum.
In the next installment of Regular Expressions, we'll look at email from the other end: How do you automate the processing of messages you receive?
TkGS
We're excited enough to step outside our usual style: Regular Expressions generally confines itself to advocating for developers and users, not languages or tools. We do have a few favorites, though, and you should know about one in particular.
While we use several graphical user interface (GUI) toolkits, Tk is our favorite for quick, portable, and reliable construction. On the other hand, Tk isn't as "lively" as several of its younger peers; toolkits such as GTK+ and Qt enjoy considerably more "buzz" and innovation.
TkGS might change that comparison. TkGS is an open source reimplementation of Tk's internals, to make them faster, more flexible and capable, and even more easily portable and "themeable." A rough draft, or alpha version, that handles textual issues (including font control) is already done. As project leader Frédéric Bonnet explains,
the current version now supports similar features as Tk, that is,
- Full Unicode support, both in Unicode and UTF-8 format, as well as system-specific encoding support.
- A Tk-compatible font fallback mechanism. TkGS will always pick the best and closest font to the one requested.
- A Tk-compatible generic multifont Unicode system on non-Unicode-native systems. A Unicode string may be drawn using several fonts, depending on the glyphs they support.
- A higher-level interface to basic features: that is, a TkGS port of the current Tk font objects and commands.
What's exciting about TkGS is not just that it's good technology and widely useful -- anyone now maintaining or relying on Tcl/Tk, Tkinter, Perl/Tk, TkLua, and so on, should benefit from it -- but also that your own efforts can make a difference. Bonnet, who himself is operating on a purely volunteer basis, has arranged the work so that experts in many areas might cooperate. The project particularly needs help now from graphic format specialists (including those who know PDF), Mac OS developers, technical writers to document specifications, and those who know the existing Tk sufficiently well to write new implementations of the Tk canvas and other widgets and features.
Do you want to create a better GUI toolkit, or just watch others in the middle of the same metamorphosis? Check out the TkGS Website (see Resources below).
Resources