Would the real, authentic copy of the document please stand up?

By Sean McGrath

A small thought experiment for you on this bright but chilly winter's morning. In your hand you have a 40-page document. On your computer screen you have an electronic document open in a word processor. You have been told that they are "the same document". How can you tell? What does it even mean to say that they are the "same"? Does it matter if there is no sure-fire way to prove it?

Let us start at the end of that list of questions and work backwards. Does it matter that there is no sure-fire way to prove it? Most of the time, it does not matter if you cannot prove they are the same. Over the years since the computerization of documents, we have devised various techniques for managing the risks of differences arising between what the computer says and what the sheets of paper say. However, when it does matter, it tends to matter a whole bunch. Examples are domains such as legal documents, mission-critical procedure manuals, that sort of thing.

A very common way of mitigating the risk of differences arising between paper and electronic texts is to declare the electronic version to be the real, authentic document and treat the paper as a "best efforts" copy or rendering of the authentic document. If the printing messes up and some text gets chopped off the right hand margin we think "No big deal". Annoying but not cataclysmic. The electronic copy is the real one and we can just go back to the source any time we want...

...Yes, as long as the electronic source is not, itself, an ambiguous idea. Again, we have developed practices to mitigate this risk. If I author a document in, say, FrameMaker but export RTF to send to you, the FrameMaker is considered the real, authentic electronic file. If anything happens to the RTF content - either as it is exported, transmitted or imported by you into some other application - we refer back to the original electronic file which is the FrameMaker incarnation....

...If we still have it up to date. The problem is that we do not print FrameMaker or Word or QuarkXPress. We tend to print frozen renderings of these things. Things like PostScript and PDF. On the way to paper, it is not uncommon for fixes to be required just prior to the creation of very expensive printing plates. If something small needs to be fixed, it will probably get fixed at 2 a.m. in the PostScript or PDF file...which is now out of sync with the original FrameMaker file...

...Which, come to think of it, might not have been as clear-cut an authoritative source as I made it out to be. It is not uncommon for applications like FrameMaker, Adobe CS2, Quark etc. to be used downstream of an authoring process that utilizes Microsoft Word or Corel WordPerfect or OpenOffice or some Web-y browser plug-in.

If (i.e. when) errors are found in document proofs, the upstream documents should really be fixed and the DTP versions re-constituted. Otherwise, the source documents get out of sync with the paper copy very quickly indeed. Worse, the differences between the source documents and the paper copy may be small errors. A period missing here, a dollar sign there...Small enough to be very hard to spot with proofreading but large enough to be very serious.
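To make the "period missing here, a dollar sign there" point concrete, here is a minimal sketch of a character-level comparison using Python's standard `difflib`. The function name `small_differences` and the two sample sentences are invented for illustration. It also assumes you have already managed to get plain text out of both versions, which is itself no small thing.

```python
import difflib


def small_differences(source: str, proof: str) -> list[tuple[str, str, str]]:
    """List (operation, source_text, proof_text) for every span that differs."""
    # autojunk=False keeps the matcher from ignoring frequent characters
    # in long texts, so punctuation is always compared.
    matcher = difflib.SequenceMatcher(None, source, proof, autojunk=False)
    return [
        (op, source[i1:i2], proof[j1:j2])
        for op, i1, i2, j1, j2 in matcher.get_opcodes()
        if op != "equal"
    ]


# Hypothetical sample text: the proof has lost a dollar sign and a period.
source = "The fee is $500 per annum. Payment is due in May."
proof = "The fee is 500 per annum Payment is due in May."

for change in small_differences(source, proof):
    print(change)
```

A human proofreader can easily read straight past both of those changes. A mechanical comparison will not, but only if both texts are available in comparable form.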

What to do? Well, we need to freeze-dry cuts of these documents to remove all ambiguity and then institute rigorous policies and procedures to ensure that changes are properly reflected everywhere along the document production toolchain...
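One possible reading of "freeze-drying", sketched under my own assumptions rather than as any established procedure: record a cryptographic digest of the file at the moment it is frozen, then check later copies against that digest. The names `fingerprint` and `same_cut` are hypothetical, and the byte strings stand in for real file contents.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest that identifies this exact byte sequence."""
    return hashlib.sha256(data).hexdigest()


def same_cut(frozen_digest: str, data: bytes) -> bool:
    """True if the bytes still match the digest recorded at freeze time."""
    return fingerprint(data) == frozen_digest


# Hypothetical stand-ins for the bytes of a frozen file and a 2 a.m. fix.
frozen = b"Total due: $1,000."
patched = b"Total due: 1,000."

digest = fingerprint(frozen)
print(same_cut(digest, frozen))   # True
print(same_cut(digest, patched))  # False
```

Note what this does and does not buy you. A digest tells you whether one file has changed byte for byte. It says nothing about whether a FrameMaker file and a PDF carry "the same" content, which is the harder question.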

...Which, these days, can be quite a complicated toolchain. For example, it is quite likely that web page production is feeding off the content before it goes into the DTP program. So, when a fix is needed you need to chase down all copies made and fix them, preferably all at the same time. Oh, and tools for editing (yes, I did say "editing") PDF documents are becoming more and more commonplace. So much for simple freeze-drying of content...

...Wait. This is getting too complex and has too many points of human intervention, which introduce costs and the potential for human error. Best to simplify the toolchain...

...Yes, that would be nice but unfortunately DTP packages do useful things - things that word processors do not do. Word processors do useful things - things that Web-y plug-ins cannot do. Layout formats like PDF, PostScript, SVG do things that authoring formats like ODF do not do. HTML can be both a layout format and an authoring format but only at the expense of leaving behind a lot of very useful stuff for large document publication...

...So, where does that leave us? Well, behind that beautifully produced 40-pager you hold in your hand we have, roughly speaking, umpteen different electronic variations of it. Each of which may or may not be "the same" as the paper in a variety of subtle and (to me anyway) interesting ways...

...We have a problem. Consider this: search around the Web for companies offering data capture services from paper. Lots to choose from, right? Now where do you think all that paper is coming from? Old, old content that pre-dates computerization? No. Some of it falls into that category but only some. Filled-in, paper-based forms that do not exist in computers at all? Yes, there is a bunch of that. But a lot of it is content that came into existence purely electronically, at some stage over the last 30 years. It passed through some complicated toolchain and workflow on its way to paper. The paper then became the only reliable incarnation of the content. Any electronic versions of it that the owners could dig out were found to be potentially flawed in some way...

...Thus the need to capture the content from paper. An exercise which, even with rigorous QA to, say, 99.998% accuracy, is guaranteed to introduce its own set of errors...
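A back-of-the-envelope sketch of what 99.998% means for our 40-pager. Two assumptions here are mine and not part of the figure above: that accuracy is measured per character, and that a page holds roughly 3,000 characters.

```python
# Assumed figures for illustration only.
PAGES = 40
CHARS_PER_PAGE = 3_000   # assumption: a fairly dense page of text
ACCURACY = 0.99998       # the 99.998% figure, read as per-character accuracy


def expected_errors(pages: int, chars_per_page: int, accuracy: float) -> float:
    """Expected number of wrong characters for a capture job of this size."""
    return pages * chars_per_page * (1.0 - accuracy)


print(round(expected_errors(PAGES, CHARS_PER_PAGE, ACCURACY), 2))  # 2.4
```

Under those assumptions, a couple of wrong characters per 40 pages is the expected outcome. Whether one of them lands on a period or a dollar sign is a matter of luck.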

So now, would the real, authentic copy of the document please stand up?
