SAS performs 18-hour analytics job in 2.5 minutes

By Jennifer Kavur, ComputerWorld Canada | Data Center, Analytics, SAS

SAS Institute Inc. has developed a high-performance computing model that uses HP BladeSystem Infrastructure servers configured as a private grid to process massive amounts of data. Analytics jobs that previously took an entire day to process can now be reduced to a few hours -- or minutes.

Jim Goodnight, CEO of SAS, and Chris Bailey, director of the Advanced Computing Lab at SAS, demonstrated the patent-pending technology on stage at the SAS Global Forum in Seattle. Goodnight was personally involved in the development of the code.

The demo used five racks of blades, with 196 blades in total. Each blade had eight CPU cores, providing a total of 1,664 cores. The model processed roughly one billion records of stock market data -- 100,000 market states, two horizons and 4,000 instruments -- in two minutes and 26 seconds.

This is normally an 18-hour job, said Bailey. "We are splitting it across 1,664 processors ... In this particular problem, not only are we using thousands of CPU cores, we are also not using any disk I/O at all. It has totally become an in-memory problem and we have terabytes of memory to solve it with," he said.
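To put the quoted figures in perspective, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope check. It assumes the record count is the product of the three quoted dimensions, and the variable names are illustrative rather than anything taken from SAS.

```python
# Figures quoted in the article.
market_states = 100_000
horizons = 2
instruments = 4_000

baseline_seconds = 18 * 60 * 60   # the "18-hour job"
grid_seconds = 2 * 60 + 26        # two minutes and 26 seconds

# ASSUMPTION: one record per (market state, horizon, instrument) combination.
combinations = market_states * horizons * instruments

# How many times faster the grid run was than the 18-hour baseline.
speedup = baseline_seconds / grid_seconds

# Average combinations handled per second of wall-clock time on the grid.
throughput = combinations / grid_seconds

print(f"combinations: {combinations:,}")
print(f"speedup: about {speedup:.0f}x")
print(f"throughput: about {throughput:,.0f} combinations per second")
```

Under that assumption the three dimensions multiply out to 800 million combinations, in line with the "roughly one billion records" described, and the run works out to a speedup of about 440 times over the 18-hour baseline.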

Bailey demonstrated getting an OLAP cube view on the fly without actually building an OLAP cube. "When we do our reporting, we are going to take the data we've just generated, which is just the leaf level data ... and we are not going to pre-aggregate it," he said.

This used to take three or four hours and now it can be done before you take your fingers off the keyboard, said Goodnight. "My advice is, don't ever build a cube if you've got that many dimensions ... put the data in memory, build the cube on the fly, build what you need on screen," he said.
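The idea of building the cube on the fly can be illustrated with a minimal Python sketch. It is not SAS's implementation. The rows, field names and the `rollup` helper are all hypothetical, and the point is only that leaf-level data held in memory can be aggregated along whichever dimensions a report asks for, with nothing pre-aggregated.

```python
from collections import defaultdict

# Hypothetical leaf-level rows held in memory; field names are illustrative only.
LEAF_ROWS = [
    {"desk": "rates", "region": "EMEA", "horizon": 1, "value": 10.0},
    {"desk": "rates", "region": "APAC", "horizon": 1, "value": 5.0},
    {"desk": "equity", "region": "EMEA", "horizon": 2, "value": 7.5},
    {"desk": "equity", "region": "EMEA", "horizon": 1, "value": 2.5},
]


def rollup(rows, dims, measure="value"):
    """Sum `measure` over `rows`, grouped by the requested dimensions.

    Nothing is pre-aggregated: each call scans the leaf rows and builds
    only the view that was asked for.
    """
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dims)
        totals[key] += row[measure]
    return dict(totals)


# Two different "cube views" computed on demand from the same leaf data.
by_desk = rollup(LEAF_ROWS, ["desk"])
by_desk_region = rollup(LEAF_ROWS, ["desk", "region"])

print(by_desk)
print(by_desk_region)
```

The trade-off in this sketch is that every view costs a full scan of the leaf data, which is only practical when the data sits in memory and the scan can be spread across many cores, as in the demo described here.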

The model is an example of "the exciting work" SAS is doing in its Advanced Computing Lab, said Goodnight. "The amount of data in the world, the amount of data we are having to deal with, continues to grow year after year. The size of the problems that we need to solve continues to grow. We've got to come up with a new computing paradigm we could use and we believe we've found it," he said.

In an interview the next day, Goodnight told a group of Canadian reporters that the high-performance model is probably one and a half years away from mass adoption. People will find it hard to believe you can do a 24-hour job in 15 minutes, he said.

"Adoption of a technology of this nature is always very slow. People don't believe you can do this ... We'll see if we can convince people to give this stuff a try," he said.

Most people would call this a grid, but the computing methods SAS is using do not treat it as a grid, said Goodnight. "It treats it as a single unit so it is extremely tightly tied together ... I think of a grid as boxes scattered all over the place and we think of this as essentially one single computer," he said.
