November 29, 2006, 11:25 AM - Sometimes, typos can be useful. Earlier today I wrote "rush of codd to the hand" in an e-mail when I should have written "rush of code to the hand". The e-mail concerned a not-very-pretty system of my acquaintance (I wrote it), in which the programmer (moi) had started coding way too early in the development cycle out of sheer youthful enthusiasm.
In my experience, a rush of code to the hand is a very common problem in software development, and one that is only adequately tackled through the application of large amounts of experience. Looking back now on the programmer I was then (we are talking about the 1992-93 time frame here), I simply was not experienced enough to make the right call. You live, you make mistakes, you learn, you move on.
Speaking of moving on, it is about time that I re-vectored this article to its proper subject matter which is, believe it or not, relational databases. The "codd" in the serendipitous phrase "rush of codd to the hand" is Edgar F. Codd[1]. Codd was a British computer scientist who is best remembered as one of the founding fathers of the science of relational databases. Most students of relational databases will, sooner or later, come across his name, most likely cloaked in acronym form such as BCNF. BCNF is short for Boyce-Codd Normal Form[2]. The acronym BCNF pops up a lot in database design and, in particular, in an important soul-cleansing endeavor known as database normalization[3].
In my experience, sheer enthusiasm can lead designers to introduce relational databases into their designs way too early in the development cycle. The thought process seems to go something like this:
"We need code for the algorithms. Ok, let's use Java/C#/Python/PHP (whatever). We have data that the algorithms will work on. We need a database. Ok, let's use Oracle/SQL Server/MySQL/Postgres (whatever)."
The key word here is the word "data". There are many forms of data that fit the relational database model like a glove. These forms of data are extremely commonplace. Things like customer details, line items of invoices, product inventories etc. etc. It is no wonder that relational databases are as popular as they are.
However, the problem starts when the word "data" is used as a catch-all for every type of data there is. Take documents for example. Documents are clearly data. Does it follow that they fit the relational database model like a glove? Not at all. In fact, the opposite tends to be the case. And yet, all over the map, in my travels through the software development world, I see documents bludgeoned into databases. One system of my acquaintance has to manage 120 small HTML documents in a hierarchy. The developer spent a significant amount of time - starting on day one of the project - figuring out how to represent the hierarchy he needed inside a set of relational database tables. Each record then contained a CLOB field[4] into which the HTML was placed. As part of the system design, the developer then had to ensure that he coded all the necessary CRUD functions[5] so that mere mortals could create new content, edit existing content, delete existing content and so on.
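To make the cost of that approach concrete, here is a minimal sketch of one common way to model such a hierarchy: an adjacency-list table in which each row points at its parent. The article does not say which schema the developer actually chose, so the table name, column names and helper functions below are all assumptions made for illustration. SQLite (via Python's standard sqlite3 module) stands in for whatever database was really used.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real RDBMS (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE document (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES document(id),  -- NULL for the root
        name      TEXT NOT NULL,
        body      CLOB                              -- the HTML itself
    )
    """
)

def create(parent_id, name, body=None):
    """Insert one node of the hierarchy and return its id."""
    cur = conn.execute(
        "INSERT INTO document (parent_id, name, body) VALUES (?, ?, ?)",
        (parent_id, name, body),
    )
    return cur.lastrowid

def read(doc_id):
    """Return the stored HTML for one document."""
    return conn.execute(
        "SELECT body FROM document WHERE id = ?", (doc_id,)
    ).fetchone()[0]

def path_of(doc_id):
    """Rebuild a document's path by walking parent links up to the root."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(id, parent_id, name, depth) AS (
            SELECT id, parent_id, name, 0 FROM document WHERE id = ?
            UNION ALL
            SELECT d.id, d.parent_id, d.name, chain.depth + 1
            FROM document AS d JOIN chain ON d.id = chain.parent_id
        )
        SELECT name FROM chain ORDER BY depth DESC
        """,
        (doc_id,),
    ).fetchall()
    return "/".join(name for (name,) in rows)

# Hypothetical content, purely for illustration.
root = create(None, "site")
products = create(root, "products")
widget = create(products, "widget.html", "<h1>Widget</h1>")
print(path_of(widget))
```

Even in this toy form, simply asking "where does this document live?" takes a recursive query, and update, delete and an editing front end for non-programmers are still to be written.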
These days, I tend to store documents in a file system - at least until the volume gets into the many tens of thousands of individual documents. I use plain vanilla folders to represent hierarchy. I use plain vanilla naming conventions on filenames to provide simple "views" over the data. Where necessary, I use a search engine of some form to provide full text search. I use plain vanilla file system utilities for the CRUD functions and so on.
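The file system approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the folder names, the ".draft.html" naming convention and the helper functions are hypothetical, and the naive text scan merely stands in for a real search engine.

```python
import tempfile
from pathlib import Path

# A temporary directory stands in for the real content root (assumption).
root = Path(tempfile.mkdtemp())

def create(rel_path, html):
    """Create or overwrite a document; folders supply the hierarchy."""
    target = root / rel_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(html, encoding="utf-8")
    return target

def read(rel_path):
    """Return the contents of one document."""
    return (root / rel_path).read_text(encoding="utf-8")

def delete(rel_path):
    """Remove one document."""
    (root / rel_path).unlink()

def view(pattern):
    """A filename convention plus a glob pattern acts as a simple 'view'."""
    return sorted(p.relative_to(root).as_posix() for p in root.rglob(pattern))

def search(term):
    """Naive case-insensitive scan; a search engine would replace this."""
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*.html")
        if term.lower() in p.read_text(encoding="utf-8").lower()
    )

# Hypothetical content, purely for illustration.
create("products/widget.draft.html", "<h1>Widget</h1>")
create("products/gadget.html", "<h1>Gadget</h1>")
create("about/team.html", "<h1>Team</h1>")

print(view("*.draft.html"))  # all drafts, wherever they sit in the tree
print(search("gadget"))      # documents whose text mentions "gadget"
```

In day-to-day use the helper functions would not even be needed: a file manager, a text editor and the usual command-line utilities already cover create, read, update and delete.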
To some developers I meet on my travels, this feels wrong. "Surely", the thought goes, "in order to be properly managed, the content needs to be in a database?" Not so. It is absolutely true in many cases but not in all cases. Some of the best managed content I have ever seen sits on a Unix file system and some of the worst managed content I have ever seen sits in a big honking, breathtakingly expensive, relational database.
Data management is as much a people/process thing as it is a technology thing. It is hard to keep that in mind when the marketing materials for relational databases keep piling up on your desk, especially since no such marketing material for the power of your file system (a system that comes for free with your operating system) ever piles up on your desk.
Remember that managing data through a simple file system does not make you a bad person. Beware of a rush of Codd to the hand. It is as dangerous as a rush of code to the hand but not as easily detected.
A rush of codd to the hand