PDF widely misunderstood

By Cameron Laird

PDF is in wide use, and plenty of developers bump into the standard on a daily basis. It's also widely, and sometimes deeply, misunderstood, if my reading of popular discussion sites for programmers is at all representative. "Smart Development" will open the first week of September 2009 by talking over a few of the commonest mistakes that turn up.

PDF shines at display, not persistence. I frequently run into programmers who report things such as, "my boss told me we have 23,000 PDF scans of resumes, and he assigned me to make a database by reading the names, telephone numbers and addresses from them." Some of these programmers are so inexperienced that they sink several weeks into such a project before they realize they must tell their bosses this particular project is a bad idea. Humans can read PDF images, and extract useful content. Software can do some of the same, but generally only with difficulty and many errors. PDF mostly has setters, but not getters.

If you ever receive such an assignment, your basic choices are:

  • Let your boss know that the task is much more expensive and less rewarding than he realizes;
  • Use some of the PDF-to-text tools already available to make the best of a bad situation; or
  • "Swim upstream", to discover the true home of the data you're after.
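
To make the second option concrete, here is a minimal sketch of what happens after a PDF-to-text or OCR tool has already produced plain text. It assumes North American-style telephone numbers; the pattern `PHONE_RE` and the function `find_phone_numbers` are hypothetical names invented for this illustration, not part of any PDF tool.

```python
import re

# Assumed pattern: NNN-NNN-NNNN, NNN.NNN.NNNN, or (NNN) NNN-NNNN.
# Real resumes will contain many formats this does not cover.
PHONE_RE = re.compile(r"\(?\b(\d{3})\)?[-. ]?(\d{3})[-. ]?(\d{4})\b")


def find_phone_numbers(text):
    """Return phone numbers found in text, normalized to NNN-NNN-NNNN."""
    return ["-".join(match.groups()) for match in PHONE_RE.finditer(text)]


# Text as a clean extraction might deliver it.
clean_text = "Jane Doe\nPhone: (555) 867-5309\n"

# The same line after a typical OCR slip: the digit 0 read as the letter O.
ocr_text = "Jane Doe\nPhone: (555) 867-53O9\n"

print(find_phone_numbers(clean_text))
print(find_phone_numbers(ocr_text))
```

The clean text yields one normalized number; the OCR-damaged text yields nothing at all, and nothing warns you that a record went missing. Multiply that by 23,000 scans and the "many errors" become a real cost.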

It can easily happen that, when a manager says, "get the telephone numbers off these PDFs", what he really means is, "See these telephone numbers? I need data like that. I don't care how you get it; I'm just showing you this particular representation, because you're a programmer, and we rarely understand each other." If the PDF images are generated from, say, an existing database, or correspond to a known XML feed, your best bet is to use the database or XML directly.
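
For contrast, here is a sketch of what "swimming upstream" can look like when the PDFs turn out to be generated from a database. Everything in it is an assumption made for illustration: the in-memory SQLite database, the `applicants` table, its columns, and the sample rows all stand in for whatever system actually holds the data.

```python
import sqlite3

# Hypothetical stand-in for the real source system behind the PDFs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE applicants (name TEXT, phone TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO applicants VALUES (?, ?, ?)",
    [
        ("John Roe", "555-201-0142", "12 Elm St"),
        ("Jane Doe", "555-867-5309", "34 Oak Ave"),
    ],
)


def contact_rows(connection):
    """Return (name, phone) pairs straight from the source table."""
    cursor = connection.execute(
        "SELECT name, phone FROM applicants ORDER BY name"
    )
    return cursor.fetchall()


for name, phone in contact_rows(conn):
    print(name, phone)
```

One query returns every field exactly as it was stored, with no OCR and no pattern matching in between.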

Later this week, I'll say a bit about PDF's security features, bookmarks, portfolios, how to do things with PDF you shouldn't do, and my favorite PDF automation. Have questions or criticism? Let me know; reader comments will largely determine how the rest of the week goes.
