April 13, 2007, 1:02 PM — In one form or another, I do words for a living. I spend part of my time writing words (as I am doing right now). I also spend part of my time writing complex sets of words (known as 'computer programs') to manage other complex sets of words (known as 'enterprise content').
On an almost daily basis I must address the question of how best to organize content so that it can be managed effectively. On an almost daily basis I find myself oscillating between two distinct-yet-closely related worlds...
On one hand there is the classic word processor world. These things generally provide cozy, self-contained mechanisms for managing words, tables, and graphics up to about, say, 500 pages of stuff. Once the document size goes beyond that, things become more complicated. The good news is that most enterprise content can sensibly be managed in units of 500 pages or less. There seem to be two main reasons for this rule-of-thumb limit. First, if you are going to print something to paper you must worry about how to bind it. The thicker the book, the more complex that gets. Second, word processors tend to work by loading the entire document into memory, and therefore the memory size of your computer dictates what constitutes a comfortable document size.
On the other hand, there is the less cozy but utterly compelling world of the web and web-oriented editing tools. Web editing tools have a completely different feel to them. A completely different comfort zone. Web pages get unwieldy well south of 500 pages, and there is no self-contained environment for managing images along with words. Typographically speaking, web pages are rather limited compared to what modern word processors produce...
Yet few would dispute that the fundamentals of the care and feeding of content are present in both worlds. Why oh why is there such a dichotomy in the tools used? Why - in this day and age - do we have to 'convert' content to Web format? Why - in this day and age - do we have so many problems producing decent printed pages from web browsers?
In more technical language, why do we have XML-based markup languages like ODF and OOXML for managing words when we also have XML/SGML-based markup languages like XHTML/HTML for managing words? Does it make any sense to have two competing approaches? Are these approaches converging technologically or diverging?
Opinions differ, of course. My take is that convergence is inevitable. I also have a suggestion for a key piece of the jigsaw puzzle that is currently missing. If it existed, convergence would - in my opinion - proceed faster than it currently does.
The missing piece is that the Web has no concept of a 'collection of small documents that can be edited/browsed/searched as a unit'. What is a 450-page user manual really? It is a collection of smaller documents that have been hooked together into a hierarchical and sequential ordering to create a full work. How do we know this? Because the table of contents makes the internal boundaries/structure explicit. What do we do when we publish this 450-page manual on the Web? We explode the content into 'chunks' corresponding to the internal boundaries/structure. Why do we use word processors to create these things? Because they give us a cozy, self-contained world in which to organize content into sequential, hierarchical chunks. We can move stuff around, change levels, insert graphics etc. all in one tidy file called MyMagnumOpus.xyz.
The convenience of having it all in one file cannot be overestimated. In one fell swoop, it takes a horrible problem off the table: namely, how to name the individual chunks of content. Contrast this with a purely web-oriented environment. Each chunk of stuff has a URL (a fantastically useful thing!) but each URL must be cared for by the author. When content is re-arranged, the URLs all change ... If you have ever tried to write something non-trivial as a set of HTML pages directly using an HTML word processor, you know what I'm talking about.
This, I think, is the essential difference between current word processing tools and current Web tools. The Web is about smallish chunks of stuff managed as single units known as 'pages'. Word processors manage many smallish chunks of stuff in cohesive collections known as 'documents'.
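To make the chunk-naming problem concrete, here is a minimal sketch in Python. It assumes a purely hypothetical naming scheme in which each chunk's URL is derived from its position in the table of contents; the Chunk class, the positional_urls helper and the sample manual are all invented for illustration and do not describe any particular tool. The point is only that one small rearrangement of the outline changes the address of most of the chunks.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """One entry in the table of contents: a title plus nested sub-chunks."""
    title: str
    children: list = field(default_factory=list)


def positional_urls(chunks, prefix=""):
    """Map each chunk title to a URL derived from its position in the outline.

    The scheme is hypothetical: it stands in for any convention where the
    address of a chunk depends on where it sits in the ordering.
    """
    urls = {}
    for number, chunk in enumerate(chunks, start=1):
        path = f"{prefix}/{number}"
        urls[chunk.title] = f"{path}.html"
        # Sub-chunks inherit the parent's position as a path prefix.
        urls.update(positional_urls(chunk.children, path))
    return urls


# A tiny stand-in for the 450-page manual.
manual = [
    Chunk("Introduction"),
    Chunk("Installation", [Chunk("Requirements"), Chunk("Setup")]),
    Chunk("Troubleshooting"),
]

before = positional_urls(manual)

# Rearrange the outline: move "Troubleshooting" ahead of "Installation".
manual.insert(1, manual.pop(2))
after = positional_urls(manual)

# Every chunk whose address changed as a side effect of the move.
changed = sorted(title for title in before if before[title] != after[title])
print(changed)
```

In a single word processor file the same move costs nothing, because none of the chunks has a name of its own that the author must maintain.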
It is a race. I do not think that is too strong a term. The word processor tools need to grow better and better features for chopping stuff up for Web publication. Or, more radically, word processors need to be inverted to take cognizance of the fact that these days, web publication tends to come first, with paper publication coming later (if at all).
The web and the word processor