From: www.itworld.com
May 20, 2008 —
Startup Aster Data Systems
is coming to market with what it calls an "Internet-scale" cluster
database, and has landed a major initial customer, MySpace,
to back up the claim.
Cluster databases spread workloads over a number of servers, or nodes, and
use an architecture that makes it easy to add more machines for additional processing
power. The practice of running large numbers of computers in tandem for the
same computing task is also known as MPP (massively parallel processing).
However, Aster adds a new wrinkle to the concept. Its nCluster product splits
up the various components of a data-analysis workload into discrete pieces.
The "loader" tier handles data loading and export to and from external
sources. Nodes in the "worker" layer store data on locally attached
disks for querying.
Meanwhile, a layer of "queen" nodes sits on top, "doing the
intelligent query planning and processing," said Jack Norris, vice president
of marketing.
The ability to selectively scale segments of the cluster according to demand
provides efficiency because users can add resources in areas where they're needed
most, the company said.
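To make the idea of scaling one tier at a time concrete, here is a minimal sketch in Python. It is not Aster's software or API: the `Cluster` class, its `scale` method, and the node counts are all hypothetical, and only the tier names (queen, worker, loader) come from the company's description.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model of a tiered cluster (hypothetical, not Aster's API)."""

    # Node count per tier. Tier names follow the article; the counts are invented.
    tiers: dict = field(
        default_factory=lambda: {"queen": 1, "worker": 4, "loader": 1}
    )

    def scale(self, tier: str, extra_nodes: int) -> None:
        """Add nodes to a single tier, leaving the other tiers untouched."""
        if tier not in self.tiers:
            raise KeyError(f"unknown tier: {tier}")
        if extra_nodes < 0:
            raise ValueError("extra_nodes must be non-negative")
        self.tiers[tier] += extra_nodes


cluster = Cluster()
cluster.scale("loader", 2)  # a heavy data-loading period: grow only the loader tier
print(cluster.tiers)
```

In this toy model, a spike in loading work is met by adding loader nodes alone, while the queen and worker tiers stay the same size.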
Meanwhile, users and their business intelligence tools interact with the cluster
as if it were a single entity, Norris said.
"An end-user doesn't need to see there's a queen tier, or worker tier
or a loader. They're just seeing an Aster database," he said.
Aster's system also provides a management capability that shifts workloads
automatically if a node fails, the company said.
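The article does not describe how nCluster redistributes work after a failure. As a rough illustration only, the sketch below assumes one simple policy: the failed node's data partitions are handed out round-robin to the surviving nodes. The function name, the node names, and the partition labels are all hypothetical.

```python
from itertools import cycle


def reassign_on_failure(assignments: dict, failed: str) -> dict:
    """Return a new node -> partitions mapping with the failed node removed.

    Assumed policy (not from the article): the failed node's partitions are
    spread round-robin across the surviving nodes.
    """
    survivors = [node for node in assignments if node != failed]
    if not survivors:
        raise RuntimeError("no surviving nodes to take over the workload")

    # Copy the survivors' lists so the caller's mapping is not modified.
    result = {node: list(assignments[node]) for node in survivors}
    targets = cycle(survivors)
    for partition in assignments.get(failed, []):
        result[next(targets)].append(partition)
    return result


before = {"w1": ["p1"], "w2": ["p2"], "w3": ["p3", "p4"]}
after = reassign_on_failure(before, "w3")
print(after)
```

A production system would also have to keep replicas of the data available on the surviving nodes, which this sketch leaves out.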
MySpace's implementation, which it is using to study Web traffic, consists
of a 100-node cluster that can analyze 360 terabytes of data, according to a
case study provided by Aster.
Aster's tactic of breaking up the workload into multiple tiers is "much
smarter" than others, said Boris Evelson of Forrester Research. "This
approach addresses one important limitation that other MPP vendors have: query
response time is only as fast as the slowest node," he said.
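The limitation Evelson describes comes down to simple arithmetic: when every node must finish its share before a result can be returned, the query takes as long as the slowest node. The short Python sketch below illustrates this with invented per-node timings; the function name and the figures are assumptions for illustration, not measurements from any vendor.

```python
def parallel_query_time(node_seconds: list) -> float:
    """Time for a query that must wait for every node to finish its share."""
    return max(node_seconds)


# Hypothetical timings: three healthy nodes and one overloaded node.
node_seconds = [2.0, 2.1, 1.9, 9.5]

query_time = parallel_query_time(node_seconds)
average_time = sum(node_seconds) / len(node_seconds)

print(f"query completes in {query_time} s")
print(f"average node time is {average_time} s")
```

With these figures the average node finishes in under four seconds, yet the query as a whole takes 9.5 seconds because of the single slow node.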
The software is available now. It is priced based on the amount of user data,
and starts at US$100,000, Aster said.
IDG News Service