- Giving data analysts, scientists and developers the capability to "cut and paste" existing ANSI SQL code from traditional data warehouses and instantly access data locked on a Hadoop cluster
- Giving developers the capability to use a standard Java JDBC interface to create new Hadoop applications or use any of the Cascading APIs and languages, like Scalding and Cascalog
- Giving companies the capability to query and export data from Hadoop directly into traditional BI tools
"We are very excited about the prospect of using standard SQL to provide seamless access to the billions of events that we track daily," says Zack Shapiro, director of engineering at Kontagent, a Concurrent customer.
"Rather than filtering through events and exporting them to MySQL, our customer support staff and data scientists will finally be able to work with tools they already know to query the raw data directly within our Hadoop cluster through the use of Lingual and Cascading," Shapiro says.
Cascading has already been adopted by some of the biggest and best-known Big Data companies, including eBay, Etsy and Twitter. Twitter uses Cascading to streamline its data processing, data filtering and workflow optimization for large volumes of unstructured and semi-structured data. It is also the driving force behind three popular open source language extensions: PyCascading (Python + Cascading), Scalding (Scala + Cascading) and Cascalog (Clojure + Cascading).
"eBay has picked that up and is running it as well," Wensel notes. "All of eBay's search is now running on Scalding."
Thor Olavsrud covers IT Security, Big Data, Open Source, Microsoft Tools and Servers for CIO.com.