Photo illustration: Steve Traynor, IDGE
The definition of Big Data is linked to the three Vs: volume, velocity, and variety. While structured data, such as you'll find in databases and spreadsheets, certainly can be impressive in meeting the three-V definition, it pales in comparison to unstructured data.
Most observers apply the 80-20 rule to unstructured information. That is, 80% of the data flowing through a business today is unstructured: memos, emails, sales proposals, images, PDFs, webinars, videos, and much more. Some analysts put the figure as high as 85-90%. No matter how you look at it, unstructured data dominates the information glut in your company.
Until recently, applying analytics to unstructured data was considered more theoretical than practical. Even industries that use the most advanced technologies available seldom applied BI tools to their unstructured data. For example, the Aite Group estimated that in 2008 a mere 2% of firms in the capital markets used analytics on unstructured information. According to Aite, however, by 2011 that number had jumped to 35%.
What suddenly changed? For one thing, vendors such as Sybase began delivering products specifically designed to glean analytical insight from massive volumes of unstructured data. More importantly, our customers began to understand the competitive advantage of that insight, which in turn prompted vendors to deliver those products.
For example, quantitative analysts on Wall Street discovered the value of querying unstructured data to determine whether a company's new product was a hit or a miss in the marketplace. While product shipment data, channel sell-through, and other structured data sources are extremely important, sifting through social media, such as the opinions of influential bloggers, can also clarify how a product is being received by potential buyers. Without taking unstructured information into account, investors would not get a true picture of how well a product and its company will fare in the market.
Analyzing unstructured data is not, of course, just for Wall Street quants. Retailers can learn an enormous amount about their customers from unstructured data in social media. Even old-school manufacturers can gain a deeper understanding of the health of their supply chain by analyzing unstructured data about their suppliers. Nearly every industry stands to benefit from analyzing unstructured information.
Given its sheer size, the availability of tools, and the extreme value it holds, unstructured data analytics is going to be one of the most exciting areas in business intelligence in the coming decade.