ITworld.com
The Only Hard Problem in XML

XML IN PRACTICE --- 01/24/2002

The harder we try to establish a hierarchical IT naming structure, the more untamable the task becomes.



Before I start, in the grandest tradition of us XML types, let me generalize. This article is about the only hard problem in computing in general -- never mind just XML. And that problem is?

Naming things.

These two simple words, added together, spell trouble all over the IT landscape. On your network you have files; files have names. You put them in folders that have names, on drives that have names. All the resources on the network are named at least once and probably twice -- once with a human-friendly name, like foo.baz.com, and once with a computer-friendly name, like http://194.125.145.37.

Names contain hierarchical layers, which is where the trouble starts. With DNS, some top-level domain names were created: .COM, .ORG, etc. These, in turn, break into sub-domains and so on. IP addresses split into segments separated by periods, and various routing techniques rely on being able to peel off layers from those segments.
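
To make the layering concrete, here is a minimal Python sketch that peels the layers off the two example names used above. The helper names dns_layers and ip_layers are illustrative inventions, and the 8/16/24-bit prefix boundaries are an assumption chosen to line up with the dotted segments; real routing uses whatever prefix lengths the network operators have configured.

```python
import ipaddress

def dns_layers(hostname):
    """Return the labels of a hostname, most general layer first."""
    return list(reversed(hostname.lower().rstrip(".").split(".")))

def ip_layers(address, prefix_lengths=(8, 16, 24)):
    """Return the enclosing networks of an IPv4 address, widest first.

    The prefix lengths are an assumption for illustration only.
    """
    ip = ipaddress.ip_address(address)
    return [
        str(ipaddress.ip_network(f"{ip}/{n}", strict=False))
        for n in prefix_lengths
    ]

print(dns_layers("foo.baz.com"))    # ['com', 'baz', 'foo']
print(ip_layers("194.125.145.37"))  # ['194.0.0.0/8', '194.125.0.0/16', '194.125.145.0/24']
```

Both names are hierarchies, but they run in opposite directions: the domain name puts its most general layer last, the address puts it first.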

The thing about naming conventions is that they are self-limiting. Creating one involves taking a view of the world of discourse, classifying it into a hierarchy of "things" that have names, and joining the name segments into longer names. In a word: taxonomies.

"Taxonomy" is a great word to describe the naming problem because it covers both giving things names and putting them into some sort of classification system. We can see taxonomy problems everywhere: the Domain Name System, Carl Linnaeus's organization of plant species, Niels Bohr's model of the atom, and so on.

And so to XML. XML is all about taxonomy problems. Every time we concoct a schema -- whatever the notation -- we are addressing the naming problem and creating a taxonomy. Every time we seek to interchange XML data with someone, we need to address the naming problem. Their data may overlap with yours in terms of its true meaning (that dreaded word "semantics") but if their system has grown up independently of yours, it almost certainly uses a different naming system.
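
As a rough illustration of two independently grown naming systems, the following Python sketch renames the elements of one party's XML into another's vocabulary. Everything in it is hypothetical: the element names, the NAME_MAP table, and the rename_elements helper are made up for this example, and a real interchange would more likely use XSLT and would have to handle structural differences, not just names.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from "their" element names to "ours".
# A person who understands both systems has to write this table.
NAME_MAP = {
    "client": "customer",
    "surname": "lastName",
    "zip": "postalCode",
}

def rename_elements(xml_text, name_map):
    """Rename every element whose tag appears in name_map; leave others alone."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        elem.tag = name_map.get(elem.tag, elem.tag)
    return ET.tostring(root, encoding="unicode")

theirs = "<client><surname>Smith</surname><zip>12345</zip></client>"
print(rename_elements(theirs, NAME_MAP))
# <customer><lastName>Smith</lastName><postalCode>12345</postalCode></customer>
```

The code is trivial; the mapping table is not. Deciding that "client" really means the same thing as "customer" is a judgment about semantics that the program cannot make for you.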

The blue sparks that fly out of XML people's ears when you put them into a room together are due, in no small part, to naming problems. The great holy grail of interchange is to be able to interchange the meaning of information, in XML form, without the enormous overhead in time and money of industry standard schemas and/or point-to-point XML transformations.

Tim Berners-Lee's vision of the Semantic Web has caused some XML people, me included, to say, "Good. Maybe now the world will see the enormous nature of this naming knowledge problem." Note that we did not shout, "Yippee. A solution cometh!"

This problem is hard, real hard. It is hard because it requires caging a wild animal called "knowledge". Nobody knows how to do that. Wrapping angle-bracketed tags around data does not magically transform it into knowledge. Knowledge refuses to be written down. It is not "declarable" in any syntax, XML included. It remains in the minds of people, transferred transiently and at much cost into computer programs.

The field of "naming things" -- variously referred to as mereology, ontology, epistemology, data modeling, RDF, Topic Maps, Meta-Architectures -- is fascinating to watch. Much work goes into coming up with ways to express the knowledge hidden in hierarchies, yet that knowledge itself is not hierarchical, which makes humans' quest to make it so all the more puzzling.

Perhaps the problem is unsolvable. Perhaps, thanks to the Gödelian insight into the incompleteness of knowledge of any field, we petty humans cannot hope to write down the knowledge inherent in the "system" of which we are a part. Which isn't to say we should give up, of course, but such musings can help keep you sane when your day job revolves around grinding out naming conventions for XML schemas and XSLT transformations to change data from one naming convention to another.

Copyright © Computerworld, Inc. All rights reserved

Reproduction in whole or in part in any form or medium without express written permission of Computerworld Inc. is prohibited. Computerworld and Computerworld.com and the respective logos are trademarks of International Data Group Inc.