March 15, 2010, 4:49 PM — It's never been easy to access -- much less analyze -- the vast amount of data available to, for example, determine changes in customer behavior or sentiment, optimize routing of telephone switches based on call patterns, or analyze financial portfolio pricing or risk. Traditional business intelligence systems rely on highly structured (and usually transactional) data stored in massive data cubes and data warehouses, an approach that requires significant upfront work to decide what is being analyzed and to ensure all the data is consistent with that goal -- that is, you must already know what you are looking for. That approach isn't useful for exploring trends or patterns, especially in external data that wasn't formatted for your needs.
And there is lots and lots of that data -- recently dubbed "big data" -- out there that could provide insight if only you could access and analyze it. To address this need, Appistry today announced CloudIQ Storage, a new addition to its CloudIQ Platform analytics service. CloudIQ Storage lets you store vast amounts of unstructured and semistructured data -- the company claims its architecture can handle petabytes of information -- on which you can then run exploratory analytics.
Appistry says its distributed-file-system approach reduces the risk of bottlenecks in accessing and storing the data. When used with its CloudIQ Platform, Appistry says, CloudIQ Storage lets analytics application "workloads" run wherever the data is stored. By moving the workloads to the data, rather than taking the traditional approach of moving the data to the application, Appistry says the analytics results are available faster.
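To illustrate the general idea of moving workloads to the data, here is a minimal Python sketch. It is not Appistry's implementation: the node names, the sample call records, and the functions `count_locally` and `run_where_data_lives` are all hypothetical, and the "cluster" is simulated in a single process. The point is only that each node summarizes its own records, so small summaries travel across the network instead of the raw data.

```python
from collections import Counter

# Hypothetical cluster: each node holds its own shard of call records locally.
NODE_SHARDS = {
    "node-a": ["NYC", "BOS", "NYC"],
    "node-b": ["SFO", "NYC"],
    "node-c": ["BOS", "SFO", "SFO"],
}


def count_locally(records):
    """Workload that would run on the node where the records are stored."""
    return Counter(records)


def run_where_data_lives(shards):
    """Apply the workload to each shard and merge only the small summaries."""
    total = Counter()
    for node, records in shards.items():
        # In a real cluster this call would execute on `node` itself;
        # here it is simulated in-process.
        total += count_locally(records)
    return total


result = run_where_data_lives(NODE_SHARDS)
```

In the traditional approach, every record in `NODE_SHARDS` would first be copied to the machine running the application; in this sketch only one `Counter` per node is merged.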
The company also announced a version of CloudIQ Storage for use with Apache Hadoop, an open source "big data" analytics project that relies on MapReduce, the software framework for distributed processing created by Google. Appistry says the Hadoop version of CloudIQ Storage replaces the Hadoop "namenode" -- the metadata repository -- with a clustered version that distributes the repository across multiple nodes, both to reduce the chance of a repository error bringing down the whole system and to allow for distributed workload processing. (Hadoop is used by several database providers, including IBM, Teradata, and Sybase, and by cloud analytics provider Cloudera.)
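As a rough illustration of why spreading a metadata repository across several nodes removes a single point of failure, consider the Python sketch below. It is an assumption-laden toy, not Appistry's or Hadoop's design: the `ShardedMetadata` class, its hash-based placement, and the replica count are all hypothetical. Each file's metadata is written to more than one node, so a lookup can still succeed when one of those nodes is down.

```python
import hashlib


class ShardedMetadata:
    """Toy metadata repository spread across several nodes (hypothetical)."""

    def __init__(self, nodes, replicas=2):
        self.nodes = list(nodes)
        # Never ask for more replicas than there are nodes.
        self.replicas = min(replicas, len(self.nodes))
        self.store = {node: {} for node in self.nodes}

    def owners(self, path):
        """Pick the nodes responsible for a path from a hash of the path."""
        digest = hashlib.sha256(path.encode("utf-8")).digest()
        start = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return [self.nodes[(start + i) % len(self.nodes)]
                for i in range(self.replicas)]

    def put(self, path, block_locations):
        """Write the metadata to every owner node."""
        for node in self.owners(path):
            self.store[node][path] = block_locations

    def get(self, path, down=()):
        """Read from the first owner node that is not marked as down."""
        for node in self.owners(path):
            if node not in down and path in self.store[node]:
                return self.store[node][path]
        raise KeyError(path)


meta = ShardedMetadata(["meta-1", "meta-2", "meta-3"])
meta.put("/logs/calls.dat", ["node-a", "node-c"])
primary = meta.owners("/logs/calls.dat")[0]
# The lookup still succeeds with the primary owner unavailable.
locations = meta.get("/logs/calls.dat", down={primary})
```

With a single metadata server, losing that one machine would make every file unreachable; here a lookup fails only if all owners of a given path are down at once.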
CloudIQ Storage is expected to be available by July for both private cloud and public cloud environments.