ITworld.com

Master Foo and document interchange formats

ITworld 6/23/2007

Sean McGrath, ITworld.com

Master Foo - as is his habit - was sitting in the lotus position while gazing at his laptop screen. The serenity of the 6 a.m. air was punctuated only by the background mantra-like hum of the laptop's fan as it diligently cooled the (mostly idling) 8 CPUs in Master Foo's entry-level laptop. Master Foo visualized the CPUs, balanced atop Plum Blossom Poles, expertly dispatching any computational attacks from the system and then settling back into the peace and tranquility of the nirvana NOP loop.

The perfection of the scene was shattered by the all-too-familiar sound of cellphones, BlackBerrys and palmtops being carried up Pentementi Mountain by a group of visitors seeking an audience with Master Foo. "Sometimes I wonder," Master Foo thought to himself. "Perhaps it is the portable machines that are using humans as convenient evolutionary transport, rather than humans using the machines as convenient computational devices?"

The noise of the approaching group grew louder in Master Foo's ears as his internal thought continued. "Perhaps this group of people approaching me now are being carried up the mountain by their gadgets? Perhaps gadgets are really very smart after all. Their key genius is hiding how clever they really are. Are these gadgets acting like so many lancet flukes infecting so many ants?"

Master Foo parked that potentially interesting line of thought to attend to his now-arrived visitors.

"Good morning Master Foo," a spokesperson for the group said. "If you have a moment, we wish to ask your opinion about these two open XML-based formats for document interchange. We have full print-outs of the specifications in this wheelbarrow we have pushed up Pentementi Mountain. We know that you can read and retain entire volumes of information by just flicking the pages and we thought perhaps we would go get a cup of green tea while you read through the few thousand pages ... Basically, we would like to know your opinion as to which is better."

"I do not need to read them," Master Foo announced as he re-arranged the Mei flowers that adorned either side of his laptop monitor.

"Ah, great," said the spokesperson, gesturing to the sweat-covered wheelbarrow pusher to retreat, "so you have already read them? That is great because I have an 8:30 downtown and it would be really excellent if I had the answer for that meeting."

"I have not read them," replied Master Foo.

"Then how do you know that they are wrong?"

"'Interchange' is not an attribute of an English-language specification. The concept of interchange - the ready ability to move information from one place to another - does not work like that in the general case. It is like the blue of the sky or the sound of a sea wave. It cannot be separated from its environment without destroying it."

"I'm afraid I do not follow, Master Foo," the spokesperson said.

"Open one of the pages of the specification at random," instructed Master Foo.

The spokesperson rummaged through the wheelbarrow and opened one volume at a random page.

"Now what do you see?" asked Master Foo.

"I see XML tags and attributes with text that describe what the tags and attributes mean."

"That is not what you see at all," proffered Master Foo.

"Yes it is! It says here...," said the exasperated spokesperson.

"Ask yourself this question," Master Foo intoned, raising his hand to thwart objections. "Of the myriad of XML elements and attributes in one of those specifications, how many permutations and combinations are possible?"

"Lots. Millions I guess," said the spokesperson.

"Beyond millions," said Master Foo. "An uncountable number of permutations and combinations. Where is the meaning of each of these combinations specified?"

"Um...," said the spokesperson, absentmindedly stroking the Bluetooth gadget in his ear.

"It is not specified in the specifications," intoned Master Foo. "It exists in one place and one place only. That is, in the source code of the application that processes the myriad of combinations. Any other attempts at capturing what the combinations mean are, at best, gross oversimplifications and, at worst, completely wrong."

Master Foo arose from his lotus position and adopted a Single Whip Tai Chi posture. "A fully accurate specification," he continued, "for any interchange format can only be written down if what you write down is precisely the source code for the reference implementation of that specification."

The spokesperson ripped the Bluetooth ear-piece out of his ear in frustration. "But the whole point of a specification," he began, "is to try to make information interchangeable without relying on the exact source code of any particular implementation!"

"Precisely so," replied Master Foo. "A laudable goal but one that is beyond the current state of the art. It is simply not possible to specify with words and formulas and diagrams exactly what running software should do. The only comprehensive, unambiguous way to do that is through source code itself."

"This is troubling," said the spokesperson, looking to his comrades for support and encouragement.

"It gets worse, I'm afraid," said Master Foo. "Even if the specification were precisely the source code for a reference implementation, that would not do the trick either."

"Why not?"

"Because the word 'interchange' means different things to different people. A common interpretation is that a document that fills, say, 12 US legal pages at source should print out creating an identical set of 12 US pages once interchanged. Fonts would appear the same. Lines would break at the same points. Page boundaries would occur at the same points and so on."

"Yes! That is what I personally mean by 'interchange,'" said the spokesperson. "I thought getting an XML-based interchange format agreed internationally would do that for our industry."

"It will not and cannot," said Master Foo. "The precise meaning of the contents of a document is an attribute of the observer (the software) not the observed (the data). A whole host of factors ranging from installed fonts to printer drivers to default language to screen size play a part in defining how a document looks on screen and on paper. In the general case, these factors change from user to user and thus the precise presentation will change too, even if all users use exactly the same document processing software."

"This is deeply troubling," said the spokesperson. "Is the entire enterprise flawed, I wonder?"

"No," said Master Foo. "Interchange formats are most definitely a step in the right direction. The critical thing is to see them for what they are - not for what market forces may want you to see them as. They are a help, a step in the right direction, a stepping stone. They are not the end of the journey."

"But what is the end of this journey, Master Foo? Are we even headed in the right direction?" asked the spokesperson.

"The end of this road cannot be reached with more and more technology. There are only two theoretically possible end-points. Either (a) the entire world agrees to use exactly the same computer systems down to the Nth decimal place or (b) the world agrees on a workable definition of the word 'interchange' so that multiple implementations can be compliant without being identical in all respects."

"Which road should we travel, Master Foo?"

"Option (a) simply isn't practical in my opinion. It is like asking the entire world to standardize on exactly the same car or exactly the same house or exactly the same musical instrument. Of course, market forces will try this route first as it is the most lucrative. Option (b) is the only sensible option but people are terribly uninterested in bad news and I fear that it will be perceived (and regularly presented) as such."

"Suddenly, I find myself unable to find the energy to help bring this wheelbarrow down the mountain," said the spokesperson.

"Empty it. Leave the specifications here. I will use them as bedding for my farm animals. For many large English-language specifications, the wheelbarrow that carries them is more valuable."

"Thank you, Master Foo. We will leave you now and think about what you have said."

Master Foo returned to his laptop and his study of Model Theory and Z Notation.


Sean McGrath is CTO of Propylon. He is an internationally acknowledged authority on XML and related standards. He served as an invited expert to the W3C's Expert Group that defined XML in 1998. He is the author of three books on markup languages published by Prentice Hall. Visit his site at: http://seanmcgrath.blogspot.com.






Copyright © Computerworld, Inc. All rights reserved

Reproduction in whole or in part in any form or medium without express written permission of Computerworld Inc. is prohibited. Computerworld and Computerworld.com and the respective logos are trademarks of International Data Group Inc.