What AaaS needs to really succeed
BigQuery is the latest analytics as a service, but is it a complete solution?
Call it "Analytics as a Service" (AaaS) or "Query as a Service"… whatever you want to call it, today's announcement of a newly public BigQuery analytics tool from Google is sure to intrigue anyone who wants to use a pay-as-you-go model for data analysis.
But is BigQuery really useful? And what does AaaS need to really deliver on its promises?
Here's what we know: the BigQuery tool is a hosted Google service that is advertised as being able "to run SQL-like queries against very large datasets, with potentially billions of rows. This can be your own data, or data that someone else has shared for you. BigQuery works best for interactive analysis of very large datasets, typically using a small number of very large, append-only tables."
After uploading your data to the BigQuery service, you gain the advantage of Google's massive infrastructure to run your queries on a lot of data, analyzing "billions of rows in seconds."
According to the BigQuery site, "users can tap into the Big Query service via several methods, including a Web-based user interface, a REST API and a command-line tool. Data can be imported into the Google BigQuery servers in CSV format."
The pricing for the service has two parts: first, there's storage, which is $0.12 per GB/month, up to 2 TB. For queries, it's $0.035 per GB processed, with up to 1000 queries a day (or TB of data). The good news is that whole tables don't get processed, just the columns of data referenced in your query.
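To get a feel for what those two rates add up to, here is a rough back-of-the-envelope sketch in Python. The rates come from the pricing quoted above; the table layout, the column sizes, and the function names are all made up for illustration, and the storage and daily-query caps are not modeled.

```python
# Rates as quoted in the article; everything else here is a hypothetical example.
STORAGE_USD_PER_GB_MONTH = 0.12
QUERY_USD_PER_GB_PROCESSED = 0.035


def monthly_storage_cost(stored_gb):
    """Storage charge for one month, given the total GB kept in the service."""
    return stored_gb * STORAGE_USD_PER_GB_MONTH


def query_cost(column_sizes_gb, columns_used):
    """Charge for one query: only the columns the query touches are billed."""
    processed_gb = sum(column_sizes_gb[name] for name in columns_used)
    return processed_gb * QUERY_USD_PER_GB_PROCESSED


if __name__ == "__main__":
    # Hypothetical 500 GB log table, broken down by column size in GB.
    table = {"timestamp": 50, "user_id": 50, "url": 300, "status": 100}

    storage = monthly_storage_cost(sum(table.values()))
    narrow = query_cost(table, ["timestamp", "status"])  # touches 150 GB
    full = query_cost(table, list(table))                # touches all 500 GB

    print(f"storage per month: ${storage:.2f}")
    print(f"narrow query:      ${narrow:.2f}")
    print(f"full-table query:  ${full:.2f}")
```

The gap between the narrow query and the full-table query is the point: under per-column billing, selecting only the columns you need is the main lever on cost.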
All of this sounds pretty reasonable, but it's also important to raise some questions if you're thinking about doing this. First off, if you're working with unstructured data, this is very likely not going to be your best option. Having to deal with the data in tabular format pretty much nixes the use of unstructured data.
Also, if your company is burdened with a standard commercial pipe to the Internet, you will need to factor in the amount of time you're going to spend uploading terabytes of data--especially in the beginning. That may be a non-issue, particularly if you use the API to feed the data store automatically.
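How much time are we talking about? A quick estimate is easy to sketch in Python. The uplink speed and the efficiency factor below are assumptions picked purely for illustration, not measurements of any real connection or of BigQuery's import path.

```python
def upload_hours(data_gb, uplink_mbps, efficiency=0.8):
    """Estimate hours needed to push data_gb over an uplink of uplink_mbps.

    efficiency is an assumed fraction of the nominal line rate that is
    actually usable after protocol overhead and contention.
    """
    bits_to_send = data_gb * 8 * 1000**3            # decimal GB to bits
    usable_bits_per_second = uplink_mbps * 1_000_000 * efficiency
    return bits_to_send / usable_bits_per_second / 3600


if __name__ == "__main__":
    # Hypothetical case: 1 TB (1000 GB) of data over a 10 Mbps uplink.
    hours = upload_hours(1000, 10)
    print(f"{hours:.0f} hours, or about {hours / 24:.1f} days")
```

Under those assumptions the initial load runs well past a week, which is why feeding the data store incrementally makes more sense than one giant upload.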
Another issue: this is just an analysis tool, not a full-bore database, so if you want your queries to perform any transactions, forget it. Google is very clear on this, and recommends you use Google Cloud SQL instead if you need these kinds of features.
It's not that BigQuery is completely useless… it's just important that you consider what you're getting into. There are other trillion-record tabular hosted services out there--1010data's comes to mind--so those may be worth checking out and comparing.
One thing is certain: you can bet your bottom dollar that AaaS is something that you're going to see a lot more of, as businesses strive to jump on the big data bandwagon--and quickly find that they can't afford to host the kind of infrastructure it takes.
BigQuery is also going to be perceived as a shot across the bows of Amazon Web Services--more than one pundit today has already noted Google's slight undercutting of AWS's $0.125 per GB/month pricing. As far back as January, analysts were predicting the coming of AaaS from AWS, and Google's entry today will certainly force Amazon's hand soon.
One thing that must happen as AaaS starts making its mark on the world is the arrival of app stores built around services like BigQuery. Right now BigQuery still requires knowledge of SQL-like queries to perform analysis, which leaves a lot of smaller companies on the other side of the knowledge gap. If you say "so what," remember that for this sort of service to take off, Google's sweet spot is not the enterprise users--they will be more likely to have the wherewithal to create their own home-grown big data solution. Therefore it makes a lot more sense that Google (and Amazon and whoever else jumps into this space) will be targeting the SMBs and SMEs of the world. And that means knowledge scarcity.
A "big data" app store with tools that provide canned and easily customizable analysis and visualization would be very much needed to address this gap. Given Amazon's latest foray with their EC2 Marketplace, I have a feeling that something similar will be in the works for any big data services they end up offering.
Which will ultimately make AaaS very useful.
Read more of Brian Proffitt's Zettatag and Open for Discussion blogs and follow the latest IT news at ITworld. Drop Brian a line or follow Brian on Twitter at @TheTechScribe. For the latest IT news, analysis and how-tos, follow ITworld on Twitter and Facebook.