John Ternent, CIO at Island One Resorts, says that whether his analytic challenges are driven by Big Data "depends on how capital your B and D are." But he's seriously considering using Hadoop instances in the cloud as an economical way of running complex mortgage portfolio analytics for the company, which manages eight timeshare resort properties across Florida. "That's a potential solution to a very real problem we have now," he says.
2. Business Analytics Get Faster
Big Data technologies are one element of a larger trend toward faster analytics, says University of Kentucky CIO Vince Kellen. "What we really want is advanced analytics on a hell of a lot of data," Kellen says. How much data one has is less critical than how efficiently it can be analyzed, "because you want it fast."
The capacity of today's computers to process much more data in memory allows for faster results than when searching through data on disk, even if you're crunching only gigabytes of it.
Although databases have, for decades, improved performance with caching of frequently accessed data, now it's become more practical to load entire large datasets into the memory of a server or cluster of servers, with disks used only as a backup. Because retrieving data from spinning magnetic disks is partly a mechanical process, it is orders of magnitude slower than processing in memory.
Rotella says he can now "run analytics in seconds that would take us overnight five years ago." His firm does predictive analytics on large data sets, which often involves running a query, looking for patterns, and making adjustments before running the next query. Query execution time makes a big difference in how quickly an analysis progresses. "Before, the run times would take longer than the model building, but now it takes longer to build the model than to run it," he says.
Columnar database servers, which invert the traditional row-and-column organization of relational databases, address another category of performance requirements. Instead of reading entire records and pulling out selected columns, a query can access only the columns of interest, dramatically improving performance for applications that group or measure a few key columns.
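To make the row-versus-column distinction concrete, here is a minimal Python sketch. It is an illustration only, not how any particular database product is built; the order records, field names, and function names are all invented for the example.

```python
# Hypothetical order records; every field name and value here is invented.
rows = [
    {"order_id": 1, "region": "FL", "amount": 120.0, "notes": "gift"},
    {"order_id": 2, "region": "FL", "amount": 80.0, "notes": "rush"},
    {"order_id": 3, "region": "GA", "amount": 200.0, "notes": "none"},
    {"order_id": 4, "region": "GA", "amount": 55.0, "notes": "none"},
]


def total_row_store(records):
    """Row layout: every whole record is visited to pull out one field."""
    return sum(record["amount"] for record in records)


# Column layout: the same data, pivoted so each column is stored together.
columns = {name: [record[name] for record in rows] for name in rows[0]}


def total_column_store(cols):
    """Column layout: only the 'amount' column is read; the rest is untouched."""
    return sum(cols["amount"])


row_total = total_row_store(rows)
column_total = total_column_store(columns)
```

Both functions return the same total. The difference is how much data each one has to touch: the row version walks every field of every record, while the column version reads a single list.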
Ternent warns that the performance benefits of a columnar database come only with the right application and query design. "You have to ask it the right question the right way for it to make a difference," he says. Meanwhile, he says, columnar databases only really make sense for applications that must handle over 500 gigabytes of data. "You have to get a certain scale of data before columnar makes sense because it relies on a certain level of repetition" to achieve efficiencies.
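Ternent does not name the mechanism behind that reliance on repetition. One common technique in column stores is run-length encoding, which collapses consecutive repeated values into a value and a count. The sketch below assumes that technique purely for illustration; the column contents and function names are hypothetical.

```python
from itertools import groupby


def run_length_encode(column):
    """Collapse each run of consecutive equal values into (value, count)."""
    return [(value, len(list(group))) for value, group in groupby(column)]


def run_length_decode(runs):
    """Expand (value, count) pairs back into the original column."""
    return [value for value, count in runs for _ in range(count)]


# Hypothetical 'region' column, stored in sorted batches so values repeat.
region = ["FL"] * 5 + ["GA"] * 3 + ["FL"] * 2

runs = run_length_encode(region)
restored = run_length_decode(runs)
```

Here ten stored values shrink to three pairs. A column with little repetition would gain almost nothing, which is consistent with Ternent's point that the approach pays off only once the data is large and repetitive enough.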