February 12, 2013, 8:31 PM — When you hear the expression "big data," the next word you will often hear is "Hadoop." That's because the underlying technology that has made massive amounts of data accessible is based on the open source Apache Hadoop project.
From the outside looking in, you would rightly assume then that Hadoop is big data and vice versa; that without one the other cannot be. But there is a Hadoop competitor that in many ways is more mature and enterprise-ready: High Performance Computing Cluster.
Like Hadoop, HPCC is open-sourced under the Apache 2.0 license and is free to use. Both likewise leverage commodity hardware and local storage interconnected through IP networks, allowing for parallel data processing and/or querying across the architectures. This is where most of the similarities end, says Flavio Villanustre, vice president of information security and the lead for the HPCC Systems initiative within LexisNexis.
HPCC Older, Wiser Than Hadoop?
HPCC has been in production use for more than 12 years, though the HPCC open source version has been available for only a little more than a year. Hadoop, on the other hand, was originally part of the Nutch project that Google put together to parse and analyze log files and wasn't even its own Apache project until 2006. Since that time, though, it has become the de facto standard for big data projects, far outpacing HPCC's 60 or so enterprise users. Hadoop is also supported by an open source community in the millions and an entire ecosystem of start-ups springing up to take advantage of this leadership position.