A rush of codd to the hand

By Sean McGrath, ITworld.com | Development

Sometimes, typos can be useful. Earlier today I wrote "rush of codd to the hand" in an e-mail when I should have written "rush of code to the hand". The e-mail concerned a not-very-pretty system of my acquaintance (I wrote it), in which the programmer (moi) had started coding way too early in the development cycle out of sheer youthful enthusiasm.



In my experience, a rush of code to the hand is a very common problem in software development. A problem that is only adequately tackled through the application of large amounts of experience. Looking back now on the programmer I was then (we are talking 1992-93 time frame here), I simply was not experienced enough to make the right call. You live, you make mistakes, you learn, you move on.



Speaking of moving on, it is about time that I re-vectored this article to its proper subject matter which is, believe it or not, relational databases. The "codd" in the serendipitous phrase "rush of codd to the hand" is Edgar F. Codd[1]. Codd was a British computer scientist who is best remembered as one of the founding fathers of the science of relational databases. Most students of relational databases will, sooner or later, come across his name, most likely cloaked in acronym form such as BCNF. BCNF is short for Boyce-Codd Normal Form[2]. The acronym BCNF pops up a lot in database design and, in particular, in an important soul-cleansing endeavor known as database normalization[3].



In my experience, sheer enthusiasm can lead designers to introduce relational databases into their designs way too early in the development cycle. The thought process seems to go something like this:



"We need code for the algorithms. Ok, let's use Java/C#/Python/PHP (whatever). We have data that the algorithms will work on. We need a database. Ok, let's use Oracle/SQL Server/MySQL/Postgres (whatever)."



The key word here is the word "data". There are many forms of data that fit the relational database model like a glove. These forms of data are extremely commonplace. Things like customer details, line items of invoices, product inventories etc. etc. It is no wonder that relational databases are as popular as they are.



However, the problem starts when the word "data" is used as a catch-all for every type of data there is. Take documents for example. Documents are clearly data. Does it follow that they fit the relational database model like a glove? Not at all. In fact, the opposite tends to be the case. And yet, all over the map, in my travels through the software development world, I see documents bludgeoned into databases. One system of my acquaintance has to manage 120 small HTML documents in a hierarchy. The developer spent a significant amount of time - starting on day one of the project - figuring out how to represent the hierarchy he needed inside a set of relational database tables. Each record then contained a CLOB field[4] into which the HTML was placed. As part of the system design, the developer then had to ensure that he coded all the necessary CRUD functions[5] so that mere mortals could create new content, edit existing content, delete existing content and so on.



These days, I tend to store documents in a file system - at least until the volume gets into the many tens of thousands of individual documents. I use plain vanilla folders to represent hierarchy. I use plain vanilla naming conventions on filenames to provide simple "views" over the data. Where necessary, I use a search engine of some form to provide full text search. I use plain vanilla file system utilities for the CRUD functions and so on.
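To make that concrete, here is a minimal sketch of the idea in Python. It is an illustration under my own assumptions, not a description of any particular system: the function names, the ".draft.html" naming convention and the example paths are all hypothetical, and the substring search is a crude stand-in for a real search engine.

```python
from pathlib import Path
import tempfile


def create(root: Path, rel_path: str, html: str) -> Path:
    """Create or overwrite a document. The folders in rel_path ARE the hierarchy."""
    target = root / rel_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(html, encoding="utf-8")
    return target


def read(root: Path, rel_path: str) -> str:
    """Read a document back as text."""
    return (root / rel_path).read_text(encoding="utf-8")


def delete(root: Path, rel_path: str) -> None:
    """Delete a document."""
    (root / rel_path).unlink()


def view(root: Path, pattern: str) -> list[str]:
    """A 'view' is just a glob over a filename convention (hypothetical: *.draft.html)."""
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob(pattern) if p.is_file()
    )


def search(root: Path, term: str) -> list[str]:
    """Naive case-insensitive substring search; a real search engine would replace this."""
    needle = term.lower()
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*.html")
        if needle in p.read_text(encoding="utf-8").lower()
    )


if __name__ == "__main__":
    # Build a throwaway document tree and exercise the helpers.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        create(root, "products/widgets/faq.html", "<p>Widget FAQ</p>")
        create(root, "products/widgets/pricing.draft.html", "<p>Widget pricing</p>")
        create(root, "about/history.html", "<p>Company history</p>")
        print(view(root, "*.draft.html"))  # the drafts "view"
        print(search(root, "widget"))      # documents mentioning widgets
```

Every helper is a few lines over the standard library, and each has an equivalent in the ordinary tools (mkdir, cp, rm, find, grep) that ship with the operating system. There is no schema to design and no CRUD layer to write before the first document can be saved.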



To some developers I meet on my travels, this feels wrong. "Surely", the thought goes, "in order to be properly managed, the content needs to be in a database?" Not so. It is absolutely true in many cases but not in all cases. Some of the best managed content I have ever seen sits on a Unix file system and some of the worst managed content I have ever seen sits in a big honking, breathtakingly expensive, relational database.



Data management is as much a people/process thing as it is a technology thing. It is hard to keep that in mind when the marketing materials for relational databases keep piling up on your desk. Especially since no such marketing material for the power of your file system (a system that comes for free with your operating system) ever piles up on your desk.



Remember that managing data through a simple file system does not make you a bad person. Beware of a rush of Codd to the hand. It is as dangerous as a rush of code to the hand but not as easily detected.



[1] http://en.wikipedia.org/wiki/Edgar_F._Codd
