NSA's Accumulo NoSQL store offers role-based data access

Unlike other NoSQL data stores, Accumulo provides role-based access to data

By IDG News Service | Software

With its much-discussed enthusiasm for collecting large amounts of data, the NSA naturally took a keen interest in highly scalable NoSQL databases.

But the U.S. intelligence agency needed some security of its own, so it developed a NoSQL data store called Accumulo, with built-in policy enforcement mechanisms that strictly limit who can see its data.

At the O'Reilly Strata-Hadoop World conference this week in New York, one of the former National Security Agency developers behind the software, Adam Fuchs, explained how Accumulo works and how it could be used in fields other than intelligence gathering. The agency contributed the software's source code to the Apache Software Foundation in 2011.

"Every single application that we built at the NSA has some concept of multilevel security," said Fuchs, who is now the chief technology officer of Sqrrl, which offers a commercial edition of the software.

The NSA started building Accumulo in 2008. Much like Facebook did with its Cassandra database around the same time, the NSA used Google's Bigtable architecture as a starting point.

In the parlance of NoSQL databases, Accumulo is a simple key/value data store, built on a shared-nothing architecture that allows for easy expansion to thousands of nodes able to hold petabytes' worth of data. It features a flexible schema that allows new columns to be added quickly, and comes with some advanced data analysis features as well.
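As a rough illustration of what a key/value store with a flexible schema looks like, the sketch below keeps entries sorted by a (row, column) key, in the style of Bigtable-like stores. The `TinyTable` class and its methods are hypothetical names invented for this example and are not Accumulo's API; the point is only that a new column needs no schema change, because the column name is part of each key.

```python
import bisect


class TinyTable:
    """Toy in-memory model of a sorted key/value table (not Accumulo's API)."""

    def __init__(self):
        self._keys = []    # (row, column) keys, kept in sorted order
        self._values = {}  # maps (row, column) -> value

    def put(self, row, column, value):
        key = (row, column)
        if key not in self._values:
            bisect.insort(self._keys, key)  # keep keys sorted on insert
        self._values[key] = value

    def scan_row(self, row):
        """Return every (column, value) pair stored under one row, in key order."""
        start = bisect.bisect_left(self._keys, (row,))
        result = []
        for key in self._keys[start:]:
            if key[0] != row:
                break
            result.append((key[1], self._values[key]))
        return result


table = TinyTable()
table.put("user42", "name", "example name")
table.put("user42", "email", "user42@example.com")
# A column nobody declared in advance: no schema migration is needed.
table.put("user42", "last_login", "2013-10-30")
```

Because the keys are sorted, all columns of a row sit next to each other, so reading a row is a short range scan rather than a lookup per column.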

Accumulo's killer feature, however, is its "data-centric security," Fuchs said. When data is entered into Accumulo, it must be accompanied by tags specifying who is allowed to see that material. Each key/value entry carries a visibility label specifying the roles within an organization that can access it, and those labels can map back to specific organizational security policies.

It adheres to the RBAC (role-based access control) model. This approach allowed the NSA to categorize data into its multiple levels of classification -- confidential, secret, top secret -- as well as who in an organization could access the data, based on their official role within the organization. The database is accompanied by a policy engine that decides who can see what data.
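The tagging scheme described above can be sketched in a few lines. This is a toy model under stated assumptions, not Accumulo's actual API: the `can_see` and `scan` functions, the label names, and the sample entries are all hypothetical. Each entry carries a visibility expression built from role labels joined by `&` (all required) and `|` (any one suffices), and a reader sees an entry only if the authorizations they hold satisfy that expression.

```python
import re


def can_see(expression, auths):
    """Return True if the set of authorizations satisfies the expression.

    Toy grammar (an assumption of this sketch): labels joined by & and |,
    evaluated left to right, with parentheses for grouping.
    An empty expression means the entry is visible to everyone.
    """
    tokens = re.findall(r"[A-Za-z0-9_]+|[&|()]", expression)
    if "".join(tokens) != expression.replace(" ", ""):
        raise ValueError("unsupported characters in expression")
    if not tokens:
        return True
    pos = 0

    def parse_term():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = parse_expr()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return value
        if token in ("&", "|", ")"):
            raise ValueError("expected a label")
        return token in auths  # a bare label: does the reader hold it?

    def parse_expr():
        nonlocal pos
        result = parse_term()
        while pos < len(tokens) and tokens[pos] in ("&", "|"):
            op = tokens[pos]
            pos += 1
            rhs = parse_term()
            result = (result and rhs) if op == "&" else (result or rhs)
        return result

    value = parse_expr()
    if pos != len(tokens):
        raise ValueError("unexpected token in expression")
    return value


def scan(entries, auths):
    """Return only the entries whose visibility label the reader satisfies."""
    return [e for e in entries if can_see(e[2], auths)]


# Hypothetical entries: (row, column, visibility expression, value)
entries = [
    ("report1", "summary", "confidential", "..."),
    ("report1", "details", "secret&analyst", "..."),
    ("report2", "source", "topsecret&(analyst|auditor)", "..."),
]
visible = scan(entries, {"confidential", "secret", "analyst"})
```

The filtering happens at read time against labels stored with the data, which is what makes the approach data-centric: the same table can serve readers with different clearances, and each sees only the entries their roles permit.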
