The decline and fall of the relational database


Many, many real-world systems have to fight the realities of time's arrow. How many systems do you know that have to store data that changes through time? Or report on how data has changed over time? Or allow modification to themselves over time? A sizable subset, I suspect. And yet the concept of time is not primary in the relational model. Of course, it is possible to model time in a relational database and implement a layer on top that adds the time dimension, but it is not the relational database's strong point. Indeed, data normalization, and the removal of duplication in general, has a nasty habit of making point-in-time reporting very problematic.

Consider the classic example of normalizing a design that contains customer information. You want to store a single copy of the customer's contact information, or so the standard wisdom holds. But what if you need to find out who used to be the contact before the current person took over? In many classically designed relational information models your only recourse is to backups or historical reports. In this day and age, when storage is effectively free and storing information deltas between time points can be done very efficiently, does it really make sense to throw any historical information away? Does it make sense to have to account manually for time's arrow in every data model?
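
To make the "layer on top" concrete, here is a minimal sketch of the manual bookkeeping involved, using Python's built-in sqlite3 module. The table name customer_contact, the valid_from and valid_to columns, and the helpers set_contact and contact_at are all hypothetical, invented for illustration; this is one common way to bolt validity periods onto a table, not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: each row records who the contact was during a period.
# valid_to is NULL for the row that is still current.
conn.execute("""
    CREATE TABLE customer_contact (
        customer_id  INTEGER NOT NULL,
        contact_name TEXT    NOT NULL,
        valid_from   TEXT    NOT NULL,  -- ISO date, inclusive
        valid_to     TEXT               -- ISO date, exclusive; NULL = current
    )
""")

def set_contact(conn, customer_id, contact_name, as_of):
    """Close the current row (if any) and open a new one, keeping history."""
    conn.execute(
        "UPDATE customer_contact SET valid_to = ? "
        "WHERE customer_id = ? AND valid_to IS NULL",
        (as_of, customer_id),
    )
    conn.execute(
        "INSERT INTO customer_contact "
        "(customer_id, contact_name, valid_from, valid_to) "
        "VALUES (?, ?, ?, NULL)",
        (customer_id, contact_name, as_of),
    )

def contact_at(conn, customer_id, when):
    """Return the contact name in effect on the given ISO date, or None."""
    row = conn.execute(
        "SELECT contact_name FROM customer_contact "
        "WHERE customer_id = ? AND valid_from <= ? "
        "AND (valid_to IS NULL OR valid_to > ?)",
        (customer_id, when, when),
    ).fetchone()
    return row[0] if row else None

# Illustrative data only.
set_contact(conn, 1, "Alice", "2008-01-01")
set_contact(conn, 1, "Bob", "2009-06-01")

print(contact_at(conn, 1, "2008-12-31"))  # the contact before Bob took over
print(contact_at(conn, 1, "2010-01-01"))  # the current contact
```

Note how much of this is plumbing: every table that needs history needs its own pair of date columns, its own "close the old row" update and its own date-range predicate on every query. That is the manual accounting for time's arrow in action.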

One of the reasons why this dimension of data modeling is under scrutiny is that software developers are increasingly used to the highly time-oriented data management approaches of source code control systems such as Subversion, Git, Mercurial, Darcs, Microsoft Visual SourceSafe and Perforce. Managing a complex corpus of source code has much in common with managing a complex corpus of product data, manufacturing data, personnel data and so on. The similarities are not lost on developers, who are becoming increasingly used to being able to mix and match structured and unstructured information and manage it all under a system that makes the time dimension easy to access and exploit.

Tentative conclusions

I do not think that any one of the above camps can deliver a killer blow to the pre-eminence of the relational database but, taken together, I think they have enough momentum to topple the giant. For years I thought that the relational database was unassailable. After all, the last time a challenger entered the fray (the object database that accompanied the object-oriented analysis and design revolution) it was summarily dismissed. This time it is different: the enemy is diverse and attacking from all sides. People are revisiting the writings of the early heretics. Terms like NoSQL are being coined. The term "schema free" is gaining acceptance. Open source projects in this space are appearing at a rate of knots: MongoDB, Cassandra and CouchDB, to name but three. The term "non-relational persistence" now Googles very well, and I am detecting the use of the word "legacy" in connection with relational databases!

As the saying often attributed to Gandhi goes, "first they ignore you, then they laugh at you, then they fight you, then you win". I think we are now in stage 3 of that progression.

Pass the popcorn.
