RainStor releases next-gen big data repository

New RainStor 4.5 structured data repository adds 'intra-column' deduplication capabilities

RainStor today announced the next generation of its online data repository. The update adds data deduplication capabilities and improved optimization for storing computer-generated historical data.

The new RainStor 4.5 version can run on a storage area network (SAN) or network-attached storage (NAS) system as a repository for structured data. The latest generation of the software is aimed at capturing and then serving up online transaction processing (OLTP) data sets, user log data and metadata.

The software comes with a Resource Description Framework (RDF) interface to automatically join data from relational databases to the repository.

RainStor 4.5 adds "intra-column" deduplication, a single-instance storage feature that stores one copy of repetitive data and points back to it when answering search queries. For example, online transaction databases may capture the same URL over and over, filling millions of column entries with repetitive data. RainStor stores only a single copy of the URL and reuses it whenever it retrieves online transactions related to that site.

"We're able to reduce the data footprint by 95% because of deduplication," said Ramon Chen, RainStor's vice president of product management.
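The single-instance idea can be illustrated with a minimal Python sketch. This is not RainStor's implementation, whose internals are not described here; the class name DedupColumn, its methods and the example URLs are all hypothetical, and the toy data says nothing about the 95% figure Chen cites.

```python
from dataclasses import dataclass, field


@dataclass
class DedupColumn:
    """Toy single-instance column (hypothetical, for illustration only)."""
    values: list = field(default_factory=list)    # one copy of each distinct value
    index: dict = field(default_factory=dict)     # value -> position in `values`
    pointers: list = field(default_factory=list)  # one small integer per row

    def append(self, value):
        """Store the value once; every later repeat only adds a pointer."""
        pos = self.index.get(value)
        if pos is None:
            pos = len(self.values)
            self.values.append(value)
            self.index[value] = pos
        self.pointers.append(pos)

    def get(self, row):
        """Rebuild the original value for a row by following its pointer."""
        return self.values[self.pointers[row]]

    def rows_matching(self, value):
        """Answer a query by comparing pointers instead of full strings."""
        pos = self.index.get(value)
        if pos is None:
            return []
        return [r for r, p in enumerate(self.pointers) if p == pos]


# Made-up URLs standing in for a repetitive transaction column.
urls = ["http://example.com/checkout"] * 4 + ["http://example.com/cart"] * 2
col = DedupColumn()
for u in urls:
    col.append(u)
```

In this sketch six rows are held as two stored strings plus six integer pointers, and a lookup such as col.rows_matching("http://example.com/cart") never touches the repeated text.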

The product's user interface mimics that of a standard relational database management system rather than a data warehouse, so administrators won't need additional training, Chen said.

Unlike an Oracle or SQL database, which is optimized to find a single record among millions, RainStor's repository pre-analyzes the data it stores. The product places millions of related records in large blocks that can be quickly loaded into a computer system's memory for faster search results.

"It's like a global positioning system. In the search window, you can type in the city or type in the exact address. An Oracle database will immediately search for an exact address, which can take a long time. With RainStor, it first gets you to the city, then it narrows the search down to the exact address," Chen said.
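Chen's city-then-address analogy resembles a generic coarse-to-fine lookup, which a short Python sketch can make concrete. It assumes each block keeps a minimum and maximum key as its summary; that summary format, the function names build_blocks and find, and the sample records are assumptions made for illustration, not details of RainStor's storage layout.

```python
def build_blocks(records, block_size):
    """Sort records by key and group them into blocks with a (min, max) summary."""
    ordered = sorted(records, key=lambda r: r[0])
    blocks = []
    for i in range(0, len(ordered), block_size):
        chunk = ordered[i:i + block_size]
        blocks.append({"min": chunk[0][0], "max": chunk[-1][0], "rows": chunk})
    return blocks


def find(blocks, key):
    """Coarse step: skip blocks whose key range cannot contain `key`.
    Fine step: scan only the rows of the blocks that remain."""
    hits = []
    for block in blocks:
        if block["min"] <= key <= block["max"]:
            hits.extend(row for row in block["rows"] if row[0] == key)
    return hits


# Made-up transaction records keyed by an integer ID.
records = [(n, f"txn-{n}") for n in range(1, 13)]
blocks = build_blocks(records, block_size=4)
```

Here a call such as find(blocks, 7) inspects three block summaries and then scans only the one block covering keys 5 through 8, rather than comparing against all twelve records.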

Lucas Mearian covers storage, disaster recovery and business continuity, financial services infrastructure and health care IT for Computerworld. Follow Lucas on Twitter at @lucasmearian, or subscribe to Lucas's RSS feed. His e-mail address is lmearian@computerworld.com.

This story, "RainStor releases next-gen big data repository," was originally published by Computerworld.
