PDF mollifiers fill crucial role

Not all PDFs are created equal, but surgery can make 'em look that way

We often "mollify" PDF instances. I've never seen anyone else document this practice, so I'll explain how we do it.

First comes why, though. PDF is very widely used, for a wide range of results: some organizations produce financial statements by the megapage, which are immediately sent to high-speed printers and stuffed in envelopes; others focus on high-end color and layout effects, with subtle adjustments to render photographs in different styles; and still others juggle multiple languages--and sometimes multiple writing systems--while meeting statutory requirements for publication and archiving. Our own work often merges several distinct pagestreams. An official proposal, for instance, might combine computer-generated budget tables, illustrative diagrams, copies of participants' résumés, signature pages, organizational boilerplate, and so on. The crucial point here is that signature pages might have been scanned on different equipment, by different people, and rendered into PDF by considerably different technologies (scanner firmware vs. a desktop application, and so on). End-users expect that whatever they see in front of them will look equally good to others.

Similarly, someone who prepares a presentation at his desktop counts on PDF to solve all the problems involved in creating the same image on all other platforms.

The promise isn't the reality

It's not always so easy, though. We frequently encounter PDF instances that look great in one viewer (Adobe's software is generally, but not always, the most forgiving, the most standard in its renderings, and the most widely accepted by end-users), but bizarre in another: scanner vendors frequently appear to cut corners in their PDF-generating software, different applications are based on different versions of the PDF standard, and DRM tactics range from the stupid to the inscrutable. One strategy we could take is to tell our end-users they must submit standard-compliant, entirely unencumbered PDF. We do this, a little, but only a little; with some populations of end-users, it would be equally meaningful or effective to command them to date all correspondence by the Nahuatl calendar.

Instead, we've worked up a portfolio of "mollifiers": filters that consume PDF instances, and emit other instances that look the same, or nearly so, but conform better to the PDF standard, and render more consistently across a range of readers. While Acrobat is generally the most potent, it's a headache to automate. pdftk is the one mollifier we most often use: it's available for essentially all platforms, runs from the command-line, has liberal licensing, and is reasonably tolerant of the weirdness that turns up "in the field". The result: several of our applications include a step of this form:

pdftk $BEFORE.pdf output $AFTER.pdf
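
For readers who want to script this step, here is a minimal sketch of how that one-line invocation might be wrapped. It is an illustration, not our production code: the function name mollify and the MOLLIFIER variable are hypothetical names chosen for the example, and the sketch assumes pdftk reports failure through a non-zero exit status. The only pdftk syntax it relies on is the "input output destination" form shown above.

```shell
#!/bin/sh
# Sketch of a mollifier step: rewrite a PDF through pdftk and never
# touch the original. MOLLIFIER is a hypothetical override used only
# in this example; it is not a pdftk feature.
MOLLIFIER="${MOLLIFIER:-pdftk}"

mollify() {
    before="$1"   # path to the incoming PDF
    after="$2"    # path for the mollified copy

    if [ ! -r "$before" ]; then
        echo "mollify: cannot read $before" >&2
        return 2
    fi

    # Same invocation as in the article: read one PDF, emit a rewritten one.
    if "$MOLLIFIER" "$before" output "$after"; then
        return 0
    fi

    # The tool failed; remove any partial output so that callers never
    # mistake it for a good file.
    rm -f "$after"
    echo "mollify: $MOLLIFIER could not process $before" >&2
    return 1
}
```

A caller would then write something like mollify "$BEFORE.pdf" "$AFTER.pdf" and check the return status before passing the result downstream. Writing to a separate output path, rather than overwriting the input, keeps the original available if the mollified copy turns out to render worse.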

PDF promises, along with much else, absolute platform-independence: what you see on your screen will look the same on all other computers, or when printed, put up on a public-space sign, or otherwise rendered. The reality falls short of that ideal. With a few lightweight tools in our kit, though, we bring the reality close enough to the ideal that our customers don't notice the difference.
