Spot the warning signs in configuration file design
ITworld 01/09/2008
Sean McGrath, ITworld.com
You have identified a set of parameters for your application and you are now
looking at how to store them, edit them, read them in and so on. For the sake
of illustration, let's say your application needs just two parameters called
v_height and v_width.
Well, what better place to start than a simple ini file in which you have something
like this:
[params]
v_height = 800
v_width = 1200
You might feel inclined to XML-ize this into something like this:
<params>
<v_height>800</v_height>
<v_width>1200</v_width>
</params>
So far so good. Both approaches have the benefit that you can grab off-the-shelf
bits'n'pieces to do most of the reading/writing/validating legwork. The simplicity
comes at a price. Two prices actually, readability and flexibility.
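As a rough illustration of that off-the-shelf tooling, here is a minimal sketch using Python's standard configparser and xml.etree.ElementTree modules. The helper names read_ini_params and read_xml_params are hypothetical, and the sample text simply mirrors the two examples above.

```python
import configparser
import xml.etree.ElementTree as ET

INI_TEXT = """
[params]
v_height = 800
v_width = 1200
"""

XML_TEXT = """
<params>
  <v_height>800</v_height>
  <v_width>1200</v_width>
</params>
"""


def read_ini_params(text):
    """Parse the [params] section of an ini document into a dict of ints."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {name: int(value) for name, value in parser["params"].items()}


def read_xml_params(text):
    """Parse the child elements of <params> into a dict of ints."""
    root = ET.fromstring(text.strip())
    return {child.tag: int(child.text) for child in root}


print(read_ini_params(INI_TEXT))
print(read_xml_params(XML_TEXT))
```

Note that both readers only get as far as converting literal values; neither knows anything about expressions.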
Let's start with readability. Imagine that setting the v_width parameter to
1.5 times the v_height parameter is a common idiom. A nice, self-documenting
way to write that would be:
v_height = 800
v_width = v_height * 1.5
Ah. But for that to work, your parameter file tools need to understand variables,
assignments and arithmetic. You could start coding it but gee, pretty soon you
find yourself down in the bowels of a mini-programming language in order to
handle this sort of thing:
v_height = 800
v_width = (v_height+100)/3.0 * 1.5
This is a slippery slope that gets steeper very quickly! Note that the slope
is exactly the same regardless of whether or not you start with a plain text
ini file approach or an XML approach.
<v_height>800</v_height>
<v_width>(v_height+100)/3.0 * 1.5</v_width>
You could, of course, add markup for the arithmetic expression but this rapidly
becomes unreadable and doesn't materially reduce the programming work involved
in evaluating the expressions.
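To give a feel for the work involved, here is a minimal sketch of an evaluator for the arithmetic above. It leans on Python's standard ast module for the parsing rather than a hand-written parser, and the function name eval_expr and the set of supported operators are assumptions made for illustration. Even with the parsing borrowed, this small subset takes a fair amount of code.

```python
import ast
import operator

# Arithmetic operators this sketch chooses to support (an assumption).
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def eval_expr(text, variables):
    """Evaluate an arithmetic expression over numbers and known variables."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return variables[node.id]  # raises KeyError for unknown names
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported syntax: " + type(node).__name__)

    return walk(ast.parse(text.strip(), mode="eval"))


params = {"v_height": 800}
params["v_width"] = eval_expr("(v_height+100)/3.0 * 1.5", params)
print(params)
```

Anything outside the whitelist, such as a function call, is rejected with a ValueError instead of being executed.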
The next big ramp up in the gradient of the slippery slope happens the day
your users say "If height is less than 100, width should always be 200.".
Now you end up wishing you could write something like this to keep everything
readable and self-documenting:
v_height = 800
if v_height < 100 then v_width = 200
else v_width = (v_height+100)/3.0 * 1.5
Looks familiar, doesn't it? Apart from syntactic sugar details, this is like
any number of Turing Complete programming languages. By the time your application's
parameter handling sub-system can handle the above you have created another
one! Note again, that XML-ifying this makes no material difference to the effort
involved. An <if> tag is not any easier to program against than an "if"
keyword. Moreover, XML-ifying the above leads to a very unpleasant tag soup...
What to do? There is another road. A road with great power but also great responsibility.
What if you captured the parameters in Python for example? Imagine a file called
params.py:
v_height = 800
if v_height < 100:
    v_width = 200
else:
    v_width = (v_height+100)/3.0 * 1.5
Done! Now all you need to do is load that up at run-time and off you go. The
same goes for Ruby or any number of interpreted programming languages. Perhaps
the language of most generic utility for this sort of thing at the moment is
Javascript. Parameterize your web application with Javascript. Send the Javascript
to the browser and eval it. Alternatively, use a server side Javascript implementation
(such as Rhino) to load your parameters into your application.
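In Python, the loading step can be as short as the sketch below, which uses runpy.run_path from the standard library. The helper name load_params and the temporary file standing in for params.py are assumptions made for illustration.

```python
import os
import runpy
import tempfile

PARAMS_SOURCE = """\
v_height = 800
if v_height < 100:
    v_width = 200
else:
    v_width = (v_height+100)/3.0 * 1.5
"""


def load_params(path):
    """Execute a Python parameter file and return its public names as a dict."""
    namespace = runpy.run_path(path)
    # Drop interpreter bookkeeping such as __name__ and __builtins__.
    return {k: v for k, v in namespace.items() if not k.startswith("_")}


# Write a stand-in for params.py to a temporary directory, then load it.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "params.py")
    with open(path, "w") as f:
        f.write(PARAMS_SOURCE)
    params = load_params(path)

print(params)
```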
What's not to like? This looks like a great way to get really powerful, expressive
and readable configuration files for little effort. Well, there is a catch.
There always is. Once you open up your parameter files to the full power of
a programming language, bad things can happen. For example, some variant of
the following is almost certainly possible to write inadvertently or with malice
aforethought regardless of the language you choose:
while true : x = x
The result will probably be a hung application with little in the way of useful
logging messages to indicate what is going on.
Sadly, finding a subset of a programming language so that this sort of nastiness
cannot occur is very hard to do without neutralizing the expressive power of
the language that is the whole point of the approach. To make matters worse,
it is provably impossible, in general, to look at a program fragment and automatically
detect whether it contains unpleasant things like infinite loops (the halting problem).
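Since such loops cannot be reliably spotted ahead of time, one pragmatic mitigation (my own illustration, not something this column prescribes) is to evaluate the parameter file in a separate process and give up after a time limit. The sketch below assumes a hypothetical helper called load_params_with_timeout, parameter values that can be serialized as JSON, and an arbitrary default limit of one second.

```python
import json
import os
import subprocess
import sys
import tempfile

# Child program: run the parameter file, print its public names as JSON.
_CHILD = (
    "import json, runpy, sys\n"
    "ns = runpy.run_path(sys.argv[1])\n"
    "print(json.dumps({k: v for k, v in ns.items() if not k.startswith('_')}))\n"
)


def load_params_with_timeout(path, seconds=1.0):
    """Load a Python parameter file in a child process; None on timeout."""
    try:
        done = subprocess.run(
            [sys.executable, "-c", _CHILD, path],
            capture_output=True, text=True, timeout=seconds, check=True,
        )
    except subprocess.TimeoutExpired:
        return None  # the child was killed; a real application should log this
    return json.loads(done.stdout)


# A parameter file that never finishes, standing in for the loop above.
with tempfile.TemporaryDirectory() as tmp:
    hung = os.path.join(tmp, "params.py")
    with open(hung, "w") as f:
        f.write("x = 0\nwhile True:\n    x = x\n")
    print(load_params_with_timeout(hung))  # gives up after the time limit
```

This turns a hung application into a reportable failure, but it does nothing about other kinds of mischief a parameter file could get up to.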
The approach I favor is to start with a design that is, in effect, too powerful.
It is easier to pare back from an overly powerful system than it is to expand
an overly restricted system.
Put the extra power of programming language-based parameter files into your
application. Then see how the parameterization works out in practice. You can
work back from there if necessary. For example, I like to use Python syntax
for parameterization from the get-go. If I need to, I will write a parser for
whatever subset of Python my application ends up using in the real world. But
I wait for real-world experience using the application to tell me what that
subset is. I don't try to second-guess it.
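As one possible shape for that later step, the sketch below uses Python's standard ast module to flag anything in a parameter file that strays outside a whitelist of syntax. The whitelist here (assignments, if/else, arithmetic, comparisons) is an assumption drawn from the examples in this column; in practice it would come from real-world usage. The function name check_subset is hypothetical.

```python
import ast

# Syntax the parameter files are assumed to need (taken from the examples).
ALLOWED_NODES = (
    ast.Module, ast.Assign, ast.If,
    ast.Name, ast.Load, ast.Store, ast.Constant,
    ast.BinOp, ast.Add, ast.Sub, ast.Mult, ast.Div,
    ast.Compare, ast.Lt, ast.LtE, ast.Gt, ast.GtE, ast.Eq, ast.NotEq,
)


def check_subset(source):
    """Return (line, node type) pairs for syntax outside the allowed subset."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ALLOWED_NODES):
            problems.append((getattr(node, "lineno", None), type(node).__name__))
    return problems


GOOD = """\
v_height = 800
if v_height < 100:
    v_width = 200
else:
    v_width = (v_height+100)/3.0 * 1.5
"""

BAD = """\
while True:
    x = x
"""

print(check_subset(GOOD))
print(check_subset(BAD))
```

A file that passes the check can then be loaded as before; one that fails is rejected before any of it runs.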
Sean McGrath is CTO of Propylon. He is an internationally
acknowledged authority on XML and related standards. He
served as an invited expert to the W3C's Expert Group that
defined XML in 1998. He is the author of three books on
markup languages published by Prentice Hall. Visit his
site at: http://seanmcgrath.blogspot.com.