If I had a dime for every time a developer has told me that they cannot
use XML in their application "because it would be too slow", I would be
a wealthy man. If I had another dime for every time this perception
proved to be ill-founded, I would be wealthier still. In fact, I'd have
accumulated nearly twice as many dimes.
Let's cut to the chase and say something critical about software
developers. Being a professional software developer myself, I am
qualified to perform this self-criticism. All software developers
suffer from varying degrees of Acute Premature Optimization Syndrome
(APOS). It does not matter how many times we read C.A.R. Hoare's comment
that "premature optimization is the root of all evil" or Michael
Jackson's two rules of software optimization -- First Rule: Don't do
it. Second Rule: Don't do it yet -- we think about optimization all the
time. From drawing board to final rollout or abandonment, we worry
about optimization.
Poor old XML often sits right in the firing line of APOS sufferers. We
just instinctively know it is going to slow our systems down to a
snail's pace. It's raw text, darn it! You have to parse that stuff with a
complicated thing called an XML parser. Parsing is nasty and slow.
Everyone knows parsing is nasty and slow, right?
Indeed, parsing is often nasty -- all the more reason to use an
off-the-shelf component to do the job. But slow? Speed is relative.
The fact is, I have yet to come across an XML system where the true
performance bottleneck is XML parser speed. Moreover, the people I meet
who are infatuated with parser speed are the same people who thought
the Web would never work because HTTP was too slow or that Java would
never succeed because it was too slow. Those old enough to have been
there used to claim that systems written in C rather than assembler
would be too slow.
Anybody making XML system design decisions based on XML parser
benchmarks is kicking the tires (i.e., focusing on the tires when
buying a car). The tires are an obvious and accessible part of the
system; their function is well understood and measurable. But frankly,
there are way more important things going on under the hood from a
performance perspective.
Time and again, in my own work, I have made premature optimization
decisions that have cost me dear. I worked with a company in the
Eighties that well nigh went under because of premature optimization.
Unless you are very lucky, premature optimization will result in you
bending your design out of shape for a perceived performance need that
is illusory. I have been writing software since 1982 and I have yet to
accurately guess where the real performance bottlenecks of any
moderately sized system really are.
As the years go by, I also note with considerable interest the extent
to which willfully banishing thoughts of optimization from my head at
design time leads to systems with better performance than I would have
imagined possible. In a paradoxical, Tao-like way, you'll achieve good
performance by ignoring performance issues during design. Let
performance look after itself. If the design is right, it will.
Time for another quote. This time from Jon Bentley: "Make it work
before you make it work fast". This is my all-time favourite quote
about software development. It is as applicable to the design of
XML-based systems as it is to any other field. So, here is how I tend to
build XML systems.
First, I design without regard to performance. I get something working --
a representative subset of the total functionality. Who cares if it
takes an hour to boot or two minutes to redraw a Web page? It does not
matter yet. When I have a working system, I stick my probes in,
exercising the system functionality and gathering metrics of where the
system is really spending its time. Armed with hard evidence of where
the performance bottlenecks are, I optimize those parts and only those
parts.
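To make "sticking the probes in" concrete, here is a minimal sketch in
Python using the standard library's cProfile and pstats modules. The
order document, the function names and the lookup_discount step are all
hypothetical, invented purely for illustration. In particular, the
sleep inside lookup_discount is a contrived stand-in for whatever slow
downstream work a real system does, so the resulting numbers prove
nothing about any real system; the point is only to show the mechanics
of measuring before optimizing.

```python
import cProfile
import io
import pstats
import time
import xml.etree.ElementTree as ET


def build_document(n_items):
    """Build a hypothetical XML order document as a string."""
    items = "".join(
        f'<item id="{i}"><price>{i % 100}.50</price></item>'
        for i in range(n_items)
    )
    return f"<order>{items}</order>"


def parse(xml_text):
    """Parse the raw text with an off-the-shelf parser."""
    return ET.fromstring(xml_text)


def lookup_discount(item_id):
    """Contrived stand-in for a slow downstream step (assumption:
    something like a database or network call in a real system)."""
    time.sleep(0.001)
    return 0.1 if item_id % 2 == 0 else 0.0


def total_price(root):
    """Business logic: sum the discounted price of every item."""
    total = 0.0
    for item in root.iter("item"):
        price = float(item.findtext("price"))
        total += price * (1.0 - lookup_discount(int(item.get("id"))))
    return total


def run(n_items=200):
    """The whole pipeline: build, parse, then process."""
    return total_price(parse(build_document(n_items)))


def profile_report(n_items=200, top=8):
    """Run the pipeline under cProfile and return a text report
    sorted by cumulative time, limited to the top entries."""
    profiler = cProfile.Profile()
    profiler.enable()
    run(n_items)
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_report())
```

Reading the report from the top down shows which functions account for
most of the cumulative time. Those, and only those, are the candidates
for optimization; the same probe can be rerun afterwards to confirm
that a change actually helped.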
If you follow this approach in your own XML systems, I can guarantee
you three things:
1. You will find performance "hotspots" that account for a
disproportionate amount of the runtime of your system.
2. The location of the hotspots will surprise you.
3. These hotspots will have nothing to do with parsing XML.