April 18, 2012, 6:11 AM — The floods that devastated the hard disk industry in Thailand are now half a year old, and the prices per terabyte are finally dropping once again. That means data will start piling up and people around the office will wonder what can be done with it. Perhaps there are some insights in those log files? Perhaps a bit of statistical analysis will find some nuggets of gold buried in all of that noise? Maybe we can find enough change buried in the couch cushions of these files to give us all a raise?
The industry now has a buzzword, "big data," for how we're going to do something with the huge amount of information piling up. "Big data" is replacing "business intelligence," which subsumed "reporting," which put a nicer gloss on "spreadsheets," which beat out the old-fashioned "printouts." Managers who long ago studied printouts are now hiring mathematicians who claim to be big data specialists to help them solve the same old problem: What's selling and why?
It's not fair to suggest that these buzzwords are simple replacements for each other. Big data is a more complicated world because the scale is much larger. The information is usually spread out over a number of servers, and the work of compiling the data must be coordinated among them. In the past, that work was largely delegated to the database software, which would use its magical JOIN mechanism to combine tables, then add up the columns before handing off the rectangle of data to the reporting software that would paginate it. This was often harder than it sounds. Database programmers can tell you stories about complicated JOIN commands that locked up their databases for hours while trying to produce a report for the boss who wanted his columns just so.
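For readers who have never written one of those reports, here is a minimal sketch of the old single-database approach, using Python's built-in sqlite3 module. The table names, column names, and sample rows are hypothetical and exist only for illustration. The query joins orders to products, adds up a revenue column, and hands back the "rectangle" that reporting software would then paginate.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two small, made-up tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price INTEGER);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, product_id INTEGER, quantity INTEGER);
        INSERT INTO products VALUES (1, 'widget', 5), (2, 'gadget', 20);
        INSERT INTO orders VALUES (1, 1, 3), (2, 2, 1), (3, 1, 4);
    """)
    return conn


def sales_report(conn):
    """JOIN orders to products, then total the revenue per product."""
    query = """
        SELECT p.name, SUM(o.quantity * p.price) AS revenue
        FROM orders AS o
        JOIN products AS p ON p.id = o.product_id
        GROUP BY p.name
        ORDER BY revenue DESC
    """
    return conn.execute(query).fetchall()


# The "rectangle of data": one row per product, best seller first.
for name, revenue in sales_report(build_demo_db()):
    print(f"{name:10} {revenue:>6}")
```

On two tiny tables this returns instantly. The trouble described above starts when the same JOIN has to run over tables too large for one machine, which is where the coordination among servers comes in.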