Twice now I’ve deeply investigated NoSQL for an upcoming project, and twice now I’ve come to the conclusion that it would be a bad decision. Following up on my last post about why the decision to go NoSQL is so difficult, we’ve decided that once again, we’ll abandon the notion in favor of a good old relational database.
Many of the comments I received on that post illustrate one of the big problems with evaluating NoSQL - there are a million niche solutions and the common guidance is “it depends on what you need”. Even if you know what you need, it takes significant research and understanding to know if a particular NoSQL engine is right or wrong for that need. You can’t possibly evaluate them all; there are far too many. To make matters worse, you have to wade through engine-specific documentation to read about how you would accomplish your goals, and most of the suggested approaches seem like workarounds if you have relational data or would like ACID transactions.
Compare that to a relational SQL database where, for the most part, you know how the engine will work regardless of the particular product. You also have far fewer choices and they are mature and proven. The chances of you making a poor choice are much lower with an RDBMS.
The big draw to NoSQL is its ability to scale out with ease and to provide very high throughput. While it would be really nice to have the same scalability with an RDBMS, the real-world fact is that 99% of applications will never operate at a scale where it matters. Look at Stack Exchange. They are one of the most trafficked sites on the planet and they run on MSSQL using commodity servers. Add to that the ability to buy a 60-core server with up to 6TB of RAM basically off the shelf today, and it’s difficult to imagine most applications outgrowing a single machine. So what’s the real benefit of NoSQL once you come to grips with that fact?
Initially I thought the schemaless nature of a document store would be a benefit, but I’ve come to change my view on that, at least for a relational web application. Schemaless is only going to increase code complexity, because every check the schema would have enforced moves into application code. Furthermore, I like structure, especially in data. I could see schemaless being helpful if you were building a very specific type of database for something like data archival, file storage, or event logs, but not so much for a web application that’s not a social network.
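To make the complexity point concrete, here is a minimal sketch of what reading from a schemaless store can look like once documents written by different versions of an application coexist. The field names (`name`, `first_name`, `last_name`) and the two document shapes are hypothetical, invented purely for illustration, and plain Python dicts stand in for stored documents.

```python
from typing import Any, Dict, List


def display_name(doc: Dict[str, Any]) -> str:
    """Return a display name from a document of unknown shape."""
    # Hypothetical newer shape: separate first/last name fields.
    if "first_name" in doc or "last_name" in doc:
        parts = [doc.get("first_name", ""), doc.get("last_name", "")]
        return " ".join(p for p in parts if p)
    # Hypothetical older shape: a single "name" field.
    if "name" in doc:
        return str(doc["name"])
    # Nothing in the store guarantees that either shape is present.
    return "(unknown)"


# Three documents that could all live in the same collection.
users: List[Dict[str, Any]] = [
    {"_id": 1, "name": "Ada Lovelace"},
    {"_id": 2, "first_name": "Grace", "last_name": "Hopper"},
    {"_id": 3, "email": "nobody@example.com"},
]

names = [display_name(u) for u in users]
print(names)
```

With a fixed schema, a migration would bring every row to one shape and the function would collapse to a single column read. Here the branching has to be repeated, or shared somehow, in every piece of code that touches these documents.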
Every part of your software becomes more complicated with a document store vs. a relational database. It’s helpful to think of NoSQL as a flat file storage system where the filename is the key and the file contents are the value. You can store whatever you want in these files and you can read/write to them very quickly, but there are no brains behind the storage. (Of course I’m generalizing here; NoSQL engines are very sophisticated as far as managing and optimizing those files, but they know nothing about the data itself.) All of the brains of a relational database are gone and you’re left to implement everything you’ve taken for granted with SQL in your code...for every application. The overhead is not justifiable for most applications.
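Taking the flat-file analogy literally, the sketch below contrasts the two approaches on one small task: total the orders for a customer and keep orphaned orders out. A plain Python dict of JSON strings stands in for the key/value store, and the standard library’s `sqlite3` module stands in for the relational engine. The key names, tables, and sample data are assumptions made up for the example, and real NoSQL engines offer more than a bare dict (many have secondary indexes and query languages), so treat this as an illustration of the generalization rather than a benchmark of any product.

```python
import json
import sqlite3

# --- Key/value side: the "filename" is the key, the "file contents" the value.
kv = {
    "customer:1": json.dumps({"name": "Acme"}),
    "order:10": json.dumps({"customer_id": 1, "total": 25.0}),
    "order:11": json.dumps({"customer_id": 1, "total": 40.0}),
    # There is no customer:2, and nothing in the store objects to this order.
    "order:12": json.dumps({"customer_id": 2, "total": 5.0}),
}


def order_total_kv(store, customer_id):
    """The 'join' written by hand: scan, decode, and filter in application code."""
    total = 0.0
    for key, raw in store.items():
        if not key.startswith("order:"):
            continue
        order = json.loads(raw)
        if order["customer_id"] == customer_id:
            total += order["total"]
    return total


# --- Relational side: the engine knows the shape of the data and its relations.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id),"
    " total REAL NOT NULL)"
)
conn.execute("INSERT INTO customers VALUES (1, 'Acme')")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [(10, 1, 25.0), (11, 1, 40.0)])

# The orphaned order is rejected by the database, not by our code.
try:
    conn.execute("INSERT INTO orders VALUES (12, 2, 5.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

sql_total = conn.execute(
    "SELECT SUM(o.total) FROM orders o"
    " JOIN customers c ON c.id = o.customer_id WHERE c.id = ?",
    (1,),
).fetchone()[0]
kv_total = order_total_kv(kv, 1)

print(kv_total, sql_total, orphan_rejected)
```

Both sides arrive at the same total, but on the key/value side the filtering, the type assumptions, and the missing integrity check all live in application code, which is the overhead described above.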
Even the people building NoSQL engines have a hard time describing use cases for their own products. Just read through some of the comments. Many of them are advertisements for their own products but they don’t offer any particularly compelling reason to choose NoSQL. It’s rare that a SaaS application is going to have non-relational data. That being the case, you’ll get a lot more out of an RDBMS than you will out of a NoSQL system. I think it’s time NoSQL engines start embracing what they aren’t as much as what they are. All of the talk about how you can achieve high performance relational applications in NoSQL is just clouding everyone’s judgement. There is a narrow segment of software which can benefit from NoSQL more than it can from SQL. Once all of the hype dies down and the number of NoSQL engines available is reduced to a reasonable number, NoSQL will be a useful tool for those scenarios. I also think that NoSQL will become a useful companion to SQL systems in the future more than it will become a replacement. For now I’m sticking with SQL.