The Only Hard Problem in XML

By Sean McGrath, ITworld

Before I start, in the grandest tradition of us XML types, let me
generalize. This article is about the only hard problem in computing in
general -- never mind just XML. And that problem is?

Naming things.

These two simple words, added together, spell trouble all over the IT
landscape. On your network you have files; files have names. You put
them in folders that have names, on drives that have names. All the
resources on the network are named at least once and probably twice --
once with a human-friendly name, like foo.baz.com, and once with a
computer-friendly name, like http://194.125.145.37.

Names contain hierarchical layers, which is where the trouble starts.
With DNS, some top-level domain names were created: .COM, .ORG, etc.
These, in turn, break into sub-domains and so on. IP addresses split
into segments separated by periods and various routing techniques rely
on being able to peel off various layers from the segments.
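
As a rough illustration of "peeling off layers", the sketch below splits
a DNS name into successively broader domains and an IPv4 address into
its enclosing networks. The function names and the choice of prefix
lengths (24, 16, 8) are assumptions made for this example; real routing
uses whatever prefixes the network operators have configured.

```python
import ipaddress


def domain_layers(name: str) -> list[str]:
    """Return the successively broader domains contained in a DNS name."""
    labels = name.rstrip(".").split(".")
    # Drop one leading label at a time: foo.baz.com, baz.com, com.
    return [".".join(labels[i:]) for i in range(len(labels))]


def network_layers(address: str, prefixes=(24, 16, 8)) -> list[str]:
    """Return the enclosing networks of an IPv4 address.

    The prefix lengths are illustrative assumptions, not a routing rule.
    """
    ip = ipaddress.ip_address(address)
    # strict=False masks off the host bits instead of raising an error.
    return [
        str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))
        for prefix in prefixes
    ]


print(domain_layers("foo.baz.com"))
print(network_layers("194.125.145.37"))
```

Both names read as hierarchies, but in opposite directions: the domain
name gets more general toward the right, the address toward the left.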

The thing about naming conventions is that they are self-limiting.
Creating one involves taking a view of the world of discourse,
classifying it into a hierarchy of "things" that have names, and
joining the name segments into longer names. In a word: taxonomies.

"Taxonomy" is a great word to describe the naming problem because it
covers both giving things names and putting them into some sort of
classification system. We can see taxonomy problems everywhere: the
Domain Name System, Carl Linnaeus's organization of plant species,
Niels Bohr's model of the atom, etc.

And so to XML. XML is all about taxonomy problems. Every time we
concoct a schema -- whatever the notation -- we are addressing the
naming problem and creating a taxonomy. Every time we seek to
interchange XML data with someone, we need to address the naming
problem. Their data may overlap with yours in terms of its true meaning
(that dreaded word "semantics") but if their system has grown up
independently of yours, it almost certainly uses a different naming
system.
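
To make the mismatch concrete, here is a minimal sketch of the easy
case: two systems that mean the same things but name them differently.
The element names and the NAME_MAP table are hypothetical, invented for
this example. It uses Python's standard xml.etree.ElementTree module to
rename each element it recognizes.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from a partner's element names to our own.
NAME_MAP = {
    "client": "customer",
    "surname": "last_name",
    "zip": "postal_code",
}


def rename_elements(xml_text: str, name_map: dict[str, str]) -> str:
    """Rename every element found in name_map; leave the others as-is."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        element.tag = name_map.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")


theirs = "<client><surname>Bohr</surname><zip>2100</zip></client>"
print(rename_elements(theirs, NAME_MAP))
```

The renaming itself is trivial. The table is the expensive part: someone
who understands both systems has to decide that "client" and "customer"
really do mean the same thing, and no syntax makes that decision for
them.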

The blue sparks that fly out of XML people's ears when you put them
into a room together are due, in no small part, to naming problems. The
great holy grail of interchange is to be able to interchange the
meaning of information, in XML form, without the enormous overhead in
time and money of industry standard schemas and/or point-to-point XML
transformations.

Tim Berners-Lee's vision of the Semantic Web has caused some XML
people, me included, to say, "Good. Maybe now the world will see the
enormous nature of this naming knowledge problem." Note that we did not
shout, "Yippee. A solution cometh!"

This problem is hard, real hard. It is hard because it requires caging
a wild animal called "knowledge". Nobody knows how to do that. Wrapping
angle-bracketed tags around data does not magically transform it into
knowledge. Knowledge refuses to be written down. It is not "declarable"
in any syntax, XML included. It remains in the minds of people,
transferred transiently and at much cost into computer programs.

The field of "naming things" -- variously referred to as mereology,
ontology, epistemology, data modeling, RDF, Topic Maps, Meta-
Architectures -- is fascinating to watch. Much work goes into coming up
with ways to express the knowledge hidden in hierarchies, yet that
knowledge itself is not hierarchical, which makes humans' quest to make
it so all the more puzzling.

Perhaps the problem is unsolvable. Perhaps, thanks to the Gödelian
insight into the incompleteness of knowledge of any field, we petty
humans cannot hope to write down the knowledge inherent in the "system"
of which we are a part. Which isn't to say we should give up, of course,
but such musings can help keep you sane when your day job revolves
around grinding out naming conventions for XML schemas and XSLT
transformations to change data from one naming convention to another.
