Code is data, and data is code

By Sean McGrath, ITworld | Opinion

I like to give names to the concepts I deal with in IT. As soon as a
concept enters my head, the race is on for the correct word or phrase to
express it. I don't like it when I cannot give things names. I get
grumpy. Sometimes, I have visions of the ghost of Ludwig Wittgenstein,
feeding the birds outside his cottage in Renvyle[1] just
south of here. In these visions, he is laughing at my naming problems.
He is singing:

It's all just code and data.

Yes, it is true. At the moment I'm having trouble deciding what is code
and what is data. It happens to me about once a year. It passes without
medication if I get some rest and light exercise.

Having words for things in IT is a two-edged sword. It is in the nature
of human language that most words have multiple meanings and that the
ultimate meaning is determined inside a brain, in private. One person's
code can be another person's data. The ghost of Wittgenstein rings in my
ears:

It's all just code and data.

Sometimes I think it is a wonder we manage to converse at all in IT. Our
field is one with more than its fair share of people I call 'dyadic
generalists'. I'm one of them. We dyadic generalists like to do two
things. We like to generalize and we like to split things into two
opposing camps. RAM/ROM, software/hardware, documents/data, programming
language/scripting language, client/server and so on.

High on that list of juxtapositions is code/data. At least, I used to
think it was a juxtaposition. These days, I'm not so sure.

The evolution of the Web is a nice microcosm of the tension between code
and data and will hopefully serve to illustrate what I'm talking about.

In the beginning there was HTML. A form of data. Then CGI programs
flourished that generated HTML. These programs took the form of code
with embedded data. Then HTML browsers developed the ability to execute
code in the form of JavaScript. As a result, HTML became data with
embedded code. Then CGI went out of fashion, supplanted by technologies like
ASP, JSP and PHP - all examples of data with embedded code.

The latest twist in this merry dance for preeminence between code and
data is the recent focus on techniques such as Apache Struts[2], Zope's
TAL[3] and content management frameworks such as JPublish[4], all
striving to cleanly separate the code from the data.

We have been here before, of course, in the war for supremacy between code
and data. Ever since user interfaces advanced beyond the patch-panel, we
have struggled with how best to separate the melange of code and data
that together conspire to construct applications that have user
interfaces. Do you remember the heated debates about the merits/demerits
of embedding SQL in Cobol? The same debate rages about embedding
HTML/XML in Java. Remember the Model, View, Controller techniques
pioneered when Smalltalk[5] roamed the land? Echoes of Apache Struts?

It would appear that when it comes to organizing the relationship
between code and data, the right answer does not live at the extremities
of the spectrum. Code in data has problems - an entire shopping basket
application in a single JSP page. Ugh! Data in code has problems - an
entire shopping basket application constructed with print statements in
Perl. Ugh!
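
To make the contrast concrete, here is a minimal sketch in Python rather
than Perl or JSP. The basket contents and the function names are invented
for illustration. The first function is data in code: the markup is buried
in string literals. The second keeps the markup as plain template data,
using the standard library's string.Template, that could just as well live
in a file of its own.

```python
from string import Template

# Hypothetical basket contents, invented for illustration.
BASKET = [("tea", 2), ("biscuits", 1)]


def render_with_prints(items):
    """Data in code: the HTML lives inside the program as string literals."""
    out = ["<ul>"]
    for name, qty in items:
        out.append("<li>" + name + " x " + str(qty) + "</li>")
    out.append("</ul>")
    return "".join(out)


# Separated: the markup is plain data; the code only fills in the blanks.
ROW = Template("<li>$name x $qty</li>")
PAGE = Template("<ul>$rows</ul>")


def render_with_template(items):
    """Code and data kept apart, merged only at render time."""
    rows = "".join(ROW.substitute(name=n, qty=q) for n, q in items)
    return PAGE.substitute(rows=rows)


if __name__ == "__main__":
    print(render_with_prints(BASKET))
    print(render_with_template(BASKET))
```

Both functions produce the same page. The difference is only in who can
change the markup without touching the program.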

Now, as any self-respecting dyadic generalist will point out, one of two
things can happen here. Either we find a middle ground between code and
data that everyone can live with. Or, we fundamentally revisit the
problem.

I think we need to revisit the problem. In my mind's eye, I see the
ghost of Wittgenstein again, well fed birds on each shoulder, smiling.
He says:

Code and data? It's all just *text*.

Ah. Interesting. Perhaps I should have paid more attention to Pólya's
intuition[6] that the more general problem may be easier to solve. Let's
generalize code and data to be merely specific examples of text. Does
that help?

Well, the main reason for separating code and data is to better *manage*
each. We live in a world in which, from a development perspective, code
is code and data is data. Code lives in source code control systems,
data lives in databases or XML documents. East is East and West is West.
The twain ideally would not meet until deployment time.

Perhaps this is how we should revisit the problem - by revisiting the
very notion of *text* in our computer systems. What if every text editor
on the planet was a folding text editor[7] that could seamlessly
transclude[8] text from one location into another?

With such a capability we could manage code and data separately, but by
simply opening up a different 'view' on them, see them as a merged
entity consisting of both code and data.
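
As a rough sketch of what such a merged view might look like, assume a
made-up marker syntax, {{transclude NAME}}, and a simple in-memory store
of named texts. Neither comes from any real editor; they are assumptions
for illustration only. The code and the data are managed as separate
texts, and the view function assembles the merged entity on demand.

```python
import re

# Hypothetical store: code and data are managed as separate named texts.
STORE = {
    "greeting.txt": "Hello, world",
    "page.py": 'print("{{transclude greeting.txt}}")',
}

# Hypothetical marker syntax, invented for this sketch.
MARKER = re.compile(r"\{\{transclude ([^}]+)\}\}")


def view(name, store, depth=0):
    """Return the merged view of a text, expanding markers recursively."""
    if depth > 10:
        raise RecursionError("transclusion nested too deeply (possible cycle)")

    def expand(match):
        # Replace the marker with the merged view of the text it names.
        return view(match.group(1).strip(), store, depth + 1)

    return MARKER.sub(expand, store[name])


if __name__ == "__main__":
    print(view("page.py", STORE))
```

The stored texts are never modified. Only the view changes, which is the
whole point: separate management, merged presentation.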
