Really big data

February 16, 2001, 03:18 PM —  Network World — 

If you think you have storage issues, consider this: IBM is working with Seitel, a leading provider of seismic data for the oil and gas industry, to put together a storage-area network with an initial size of one petabyte. Wow, one petabyte of data!

To give you an idea just how big a petabyte is, it equals 1 billion (1,000,000,000) megabytes, or roughly 1 quadrillion (1,000,000,000,000,000) bytes. Another way to look at this is that it would take about 15,000 of the hard drives found in most home PCs these days to store that much data.
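For readers who like to check the arithmetic, here is a minimal sketch of those conversions. It assumes decimal units (1 megabyte = 1,000,000 bytes). The per-drive figure is only what the 15,000-drive comparison implies, not a number quoted from any drive vendor.

```python
# Decimal units, matching the round numbers used in the text.
BYTES_PER_MB = 10**6
BYTES_PER_GB = 10**9
BYTES_PER_PB = 10**15

megabytes_per_petabyte = BYTES_PER_PB // BYTES_PER_MB

# Capacity per home PC drive implied by "about 15,000 drives per petabyte".
HOME_DRIVES = 15_000
implied_drive_gb = BYTES_PER_PB / HOME_DRIVES / BYTES_PER_GB

print(f"{megabytes_per_petabyte:,} megabytes per petabyte")
print(f"about {implied_drive_gb:.0f} GB per home PC drive")
```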

Roy Williams of California Institute of Technology maintains a Web page that tries to give examples of some of these large amounts of information:

http://webpages.shepherd.edu/TMCGIL01/one.htm

Taking one of his examples, the entire printed collection of the U.S. Library of Congress contains about 10 terabytes of information. A petabyte is 100 times larger than that!

While most companies are not currently trying to access petabytes of information, there are many industries that are getting very close. With the amount of information moving to digital formats, it won’t be long before companies of any size will have to wrestle with these problems. So what does it mean to manage and access a petabyte of data?

First of all, let’s look at the physical space it would take to house a petabyte. If we use Seagate’s highest-capacity Fibre Channel Cheetah drives, soon to be available with formatted capacities of 180G bytes per drive, it would take approximately 5,600 of those drives to store a petabyte. If you stacked them all on top of each other, you would have a tower about 467 feet high. Or if you stacked them 10 high, you would need about 1,120 square feet of floor space. That is just to house the disk drives. Most storage systems come with power and cooling as well.
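The drive count and tower height can be reproduced with a short sketch. The 1-inch drive height and the footprint per stack are not stated in the text. They are assumptions, back-calculated here from the 467-foot and 1,120-square-foot figures.

```python
import math

BYTES_PER_PB = 10**15
DRIVE_BYTES = 180 * 10**9     # 180G bytes formatted capacity, from the text
DRIVE_HEIGHT_IN = 1.0         # assumption: inferred from the 467-foot tower
ROUNDED_DRIVES = 5_600        # the text's rounded drive count
FLOOR_SQ_FT = 1_120           # the text's floor-space figure

# Minimum whole drives needed to hold one petabyte.
exact_drives = math.ceil(BYTES_PER_PB / DRIVE_BYTES)

# Height of a single stack of all drives, in feet.
tower_feet = ROUNDED_DRIVES * DRIVE_HEIGHT_IN / 12

# Stacking 10 high: number of stacks and the floor area each one is allotted.
stacks = ROUNDED_DRIVES // 10
sq_ft_per_stack = FLOOR_SQ_FT / stacks

print(f"{exact_drives:,} drives minimum, rounded up to {ROUNDED_DRIVES:,}")
print(f"single tower: about {tower_feet:.0f} feet")
print(f"{stacks} stacks at {sq_ft_per_stack:.0f} square feet each")
```

The 2 square feet per stack that falls out of this is far larger than a bare drive, so the floor-space figure presumably allows room around each stack.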

The next question is: How much time would it take to move a petabyte of data on a SAN? I have decided to take a leap of faith and say, for simplicity’s sake, we can get transfer rates on and off the disk and through a Fibre Channel SAN at 100 megabytes/second. I will note, however, that most applications running on Unix or Windows NT systems today cannot drive data that fast, so your results certainly will differ. To move a petabyte of data at a rate of 100M byte/sec will take nearly 116 days of constant, uninterrupted streaming.
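The transfer-time estimate works out as in the sketch below. It assumes decimal units and a perfectly sustained 100M byte/sec, which, as noted above, is optimistic for real systems.

```python
BYTES_PER_PB = 10**15
RATE_BYTES_PER_SEC = 100 * 10**6   # the assumed 100M byte/sec sustained rate
SECONDS_PER_DAY = 86_400

# Time to stream one petabyte with no pauses or protocol overhead.
seconds = BYTES_PER_PB / RATE_BYTES_PER_SEC
days = seconds / SECONDS_PER_DAY

print(f"{seconds:,.0f} seconds, or about {days:.1f} days")
```

Halving the sustained rate doubles the time, so an application that can only drive 50M byte/sec would need roughly 231 days.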

Finally, I thought about how much it would cost to buy a petabyte of simple capacity (no fancy RAID or management software). If we say we can get capacity at $0.04 a megabyte, it would cost $40 million for one petabyte. Most "managed" storage costs between $0.25 and $0.35 per megabyte. At $0.30 per megabyte, that petabyte of storage would cost $300 million -- again, just for the storage -- in, perhaps, a RAID configuration.
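The cost figures follow from a single multiplication, sketched below. The function name is made up for illustration, and prices are kept in whole cents so the arithmetic stays exact.

```python
MB_PER_PB = 10**9   # decimal megabytes in one petabyte

def petabyte_cost_usd(cents_per_mb: int) -> int:
    """Cost in dollars of one petabyte at a given price in cents per megabyte."""
    return MB_PER_PB * cents_per_mb // 100

raw_cost = petabyte_cost_usd(4)        # $0.04 per megabyte, simple capacity
managed_low = petabyte_cost_usd(25)    # low end of the "managed" range
managed_mid = petabyte_cost_usd(30)    # the $0.30 example
managed_high = petabyte_cost_usd(35)   # high end of the "managed" range

print(f"raw capacity: ${raw_cost:,}")
print(f"managed: ${managed_low:,} to ${managed_high:,}")
```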

As disk capacities continue to increase and the price of storage continues to fall, the cost of purchasing petabytes of storage is not out of the realm of possibility. As this simple analysis shows, performance of the storage infrastructure will quickly become the bottleneck in managing and accessing those large amounts of data. While there are some who may read this and be amazed at the amounts of data being stored and accessed today and in the near future, others will chuckle and say, "so tell me something I didn’t already know!" For those of you who work with large data sets on a daily basis, the good news is that some of the problems you have been struggling with for years are becoming mainstream.

» posted by ITworld staff
