Why is the NoSQL choice so difficult?

Choosing a NoSQL database is agonizing



The last time I was evaluating NoSQL databases I ended up sticking with a relational database. I’m evaluating them again today and this time I’m pretty sure I’ll have to actually choose one. The choice is really hard for a number of reasons.

Conventional wisdom says that NoSQL databases are a great fit for certain types of data, namely non-relational data. At the same time, NoSQL is touted as a superior platform for modern web applications. The reality, though, is that most data, especially when it comes to web applications, is relational. Is that enough reason to stick with an RDBMS then? Not necessarily, but it’s going to make the choice even harder.

A big part of the problem when evaluating NoSQL is the huge amount of conflicting theory on the topic. Some people say (re: document stores) that you need to store all related data within a single document and that doing joins in code is blasphemy. Others say storing document references and doing in-code joins is sensible. The vendors don’t agree either: some databases recommend limiting the amount of nested data in a document, while others encourage document references. This is a fundamental part of data modeling in NoSQL and there is no clear consensus.
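To make the two camps concrete, here is a minimal sketch in Python. It uses plain dicts as stand-ins for documents rather than any particular database’s API, and every name in it (`order_embedded`, `customers`, `orders`, `join_orders`) is made up for illustration.

```python
# Plain dicts stand in for documents; no real database client is used here.

# Camp 1: embed everything the order needs inside a single document.
order_embedded = {
    "_id": "order-1",
    "customer": {"name": "Ada", "email": "ada@example.com"},
    "items": [{"sku": "A1", "qty": 2}],
}

# Camp 2: keep customers in their own documents and store only a reference.
customers = {
    "cust-1": {"_id": "cust-1", "name": "Ada", "email": "ada@example.com"},
}
orders = [
    {"_id": "order-1", "customer_id": "cust-1", "items": [{"sku": "A1", "qty": 2}]},
]


def join_orders(orders, customers):
    """In-code join: swap each customer_id for the referenced customer document.

    A reference that points at nothing resolves to None instead of raising.
    """
    joined = []
    for order in orders:
        doc = {key: value for key, value in order.items() if key != "customer_id"}
        doc["customer"] = customers.get(order["customer_id"])
        joined.append(doc)
    return joined
```

The embedded form reads back in one lookup but duplicates the customer on every order. The referenced form keeps a single copy of the customer at the cost of a second lookup and a join you have to write and maintain yourself.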

Then there are the top ranking posts titled “Why you should never use XYZ”, of which at least one will exist for the engine you’re considering. The legitimacy of those articles varies of course and the blanket recommendation in the title doesn’t help. What’s certain however is that someone will google your choice and that will be the first thing they read, then forward to you. Further skewing your perception, there are way more negative articles than there are success stories. It’s tough to know who is providing a valid technical concern and who has simply misunderstood a capability (or lack thereof).

Then there is the sheer number of choices. In the RDBMS world, the choice is pretty easy. You have your 4 or 5 usual suspects, which generally work in a similar way, and you usually choose the platform that your environment (and budget) supports and that your dev/ops people are familiar with. There is little risk with these mature products. In the NoSQL world there are dozens of database engines to choose from. Each has its specific strengths and each has its crippling weaknesses. Making things even harder, NoSQL projects tend to come and go rather quickly, which makes it risky to try something new or something less popular. Last time I looked I had about settled on CouchDB. Today that appears to be a project circling the drain (although it’s hard to know).

The major reason I’ve been agonizing over this decision is that it’s probably a case where you won’t know you’ve made a bad choice until you’ve done a bunch of work. You can mock up your data models and get a sense of how you’ll work with the system, but it’s only when you hit a solid wall that you find the real flaw. In my case, the application we’re building has data that is relational. The major factor in moving to a document store is that we need a schemaless design to achieve our goals. Using a NoSQL database to house relational data isn’t something that’s really talked about, but it’s definitely happening a lot.

Currently I’m down to Couchbase and MongoDB. I’m not really into Mongo so far, but its massive popularity is a big positive for the engine. Of course, it could be popular in the same way PHP is popular: because it’s accessible, not because it’s good. I’m working through some test projects in both and I’m leaning toward Couchbase. If anyone has experience with a NoSQL platform and wants to offer up some tips, I’m all ears. Likewise, if you’ve done work with relational data in NoSQL, speak up! I know you’re out there.
