The Only Hard Problem in XML
Before I start, in the grandest tradition of us XML types, let me
generalize. This article is about the only hard problem in computing in
general -- never mind just XML. And that problem is?
Naming things.
These two simple words, put together, spell trouble all over the IT
landscape. On your network you have files; files have names. You put
them in folders that have names, on drives that have names. All the
resources on the network are named at least once and probably twice --
once with a human-friendly name, like foo.baz.com, and once with a
computer-friendly name, like http://194.125.145.37.
Names contain hierarchical layers, which is where the trouble starts.
With DNS, some top-level domain names were created: .COM, .ORG, etc.
These, in turn, break into sub-domains and so on. IP addresses split
into segments separated by periods and various routing techniques rely
on being able to peel off various layers from the segments.
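As a rough illustration of that layering, here is a minimal Python sketch that peels a host name and an IP address apart. The helper names dns_layers and network_prefix are invented for this example, and the prefix lengths are arbitrary choices, not anything the network itself dictates.

```python
import ipaddress


def dns_layers(hostname):
    """Return the labels of a host name, top-level domain first."""
    return list(reversed(hostname.rstrip(".").split(".")))


def network_prefix(address, prefix_len):
    """Return the network that contains an IPv4 address at a given prefix length."""
    # strict=False lets us pass a host address and have the host bits masked off.
    return ipaddress.ip_network(f"{address}/{prefix_len}", strict=False)


print(dns_layers("foo.baz.com"))             # ['com', 'baz', 'foo']
print(network_prefix("194.125.145.37", 24))  # 194.125.145.0/24
print(network_prefix("194.125.145.37", 16))  # 194.125.0.0/16
```

Note that the two names peel in opposite directions: the host name is most general on the right, the IP address most general on the left.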
The thing about naming conventions is that they are self-limiting.
Creating one involves taking a view of the world of discourse,
classifying it into a hierarchy of "things" that have names, and
joining the name segments into longer names. In a word: taxonomies.
"Taxonomy" is a great word to describe the naming problem because it
covers both giving things names and putting them into some sort of
classification system. We can see taxonomy problems everywhere: the
Domain Name System, Carl Linnaeus's organization of plant species,
Niels Bohr's model of the atom, etc.
And so to XML. XML is all about taxonomy problems. Every time we
concoct a schema -- whatever the notation -- we are addressing the
naming problem and creating a taxonomy. Every time we seek to
interchange XML data with someone, we need to address the naming
problem. Their data may overlap with yours in terms of its true meaning
(that dreaded word "semantics") but if their system has grown up
independently of yours, it almost certainly uses a different naming
system.
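To make the mismatch concrete, here is a minimal Python sketch of the simplest possible point-to-point fix: a table that renames one party's element names to the other's. The element names, the NAME_MAP table, and the rename_elements helper are all hypothetical, made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from "their" element names to "ours".
NAME_MAP = {"client": "customer", "surname": "last_name", "zip": "postcode"}


def rename_elements(xml_text, name_map):
    """Rename every element that appears in name_map; leave the rest alone."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        elem.tag = name_map.get(elem.tag, elem.tag)
    return ET.tostring(root, encoding="unicode")


theirs = "<client><surname>Smith</surname><zip>12345</zip></client>"
print(rename_elements(theirs, NAME_MAP))
# <customer><last_name>Smith</last_name><postcode>12345</postcode></customer>
```

A table like this only works when the two vocabularies line up one to one. It says nothing about whether their "client" really means what your "customer" means, and that is the hard part.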
The blue sparks that fly out of XML people's ears when you put them
into a room together are due, in no small part, to naming problems. The
great holy grail of interchange is to be able to interchange the
meaning of information, in XML form, without the enormous overhead in
time and money of industry standard schemas and/or point-to-point XML
transformations.
Tim Berners-Lee's vision of the Semantic Web has caused some XML
people, me included, to say, "Good. Maybe now the world will see how
enormous this naming and knowledge problem is." Note that we did not
shout, "Yippee. A solution cometh!"
This problem is hard, real hard. It is hard because it requires caging
a wild animal called "knowledge". Nobody knows how to do that. Wrapping
angle-bracketed tags around data does not magically transform it into
knowledge. Knowledge refuses to be written down.