Next-generation supercomputers face disk failure
The next generation of supercomputers could be crippled by hard drive failures every few minutes, the U.S. Department of Energy has warned, and so it is funding a Petascale Data Storage Institute to solve the problem.
The Los Alamos National Laboratory has commissioned RoadRunner, a 32,000-CPU supercomputer from IBM that will operate at petaflop levels: a sustained speed of 1,000 trillion calculations per second. Put another way, that is a quadrillion (a million billion) operations per second.
Thousands of hard disks will be needed to keep the thousands of CPUs supplied with data. Garth Gibson, an associate professor of computer science at Carnegie Mellon University, who will lead the new Institute, has warned that this system "likely will require up to hundreds of thousands of magnetic hard disks to handle the data required to run simulations, provide checkpoint/restart fault tolerance and store the output of these modeling experiments. With such a large number of components, it is a given that some component will be failing at all times."
Current teraflop-level supercomputers, operating at trillions of operations per second, have disk failures once or twice a day, according to Gary Grider, a co-principal investigator at the Los Alamos National Laboratory. Once supercomputers are built out to the scale of multiple petaflops, he said, the failure rate could jump to once every few minutes.
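The scaling behind that warning can be illustrated with a rough back-of-the-envelope sketch. It assumes that disks fail independently at a constant annual failure rate; the 3% rate and the disk counts below are illustrative assumptions, not figures from the Department of Energy or Los Alamos, and the function name is hypothetical. Under those assumptions the expected time between failures shrinks in direct proportion to the number of disks.

```python
# Rough estimate of how often a disk fails somewhere in a large storage system.
# Assumption: disks fail independently at a constant annual failure rate (AFR).

MINUTES_PER_YEAR = 365 * 24 * 60


def mean_minutes_between_failures(num_disks: int, annual_failure_rate: float) -> float:
    """Expected minutes between disk failures across the whole system."""
    if num_disks <= 0 or annual_failure_rate <= 0:
        raise ValueError("num_disks and annual_failure_rate must be positive")
    expected_failures_per_year = num_disks * annual_failure_rate
    return MINUTES_PER_YEAR / expected_failures_per_year


# Illustrative disk counts and an assumed 3% AFR (not figures from the article).
for disks in (10_000, 100_000, 1_000_000):
    minutes = mean_minutes_between_failures(disks, 0.03)
    print(f"{disks:>9,} disks: one failure roughly every {minutes:,.0f} minutes")
```

Whatever failure rate is assumed, multiplying the disk count by ten divides the interval between failures by ten, which is why a move from thousands of disks to hundreds of thousands or more changes failures from a daily event into a near-continuous one.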
Storage systems for such machines will need to tolerate many failures, mask their effects, and continue to operate reliably. "It's beyond daunting," Grider said of the challenge facing the new institute. "Imagine failures every minute or two in your PC and you'll have an idea of how a high performance computer might be crippled." He emphasized: "For simulations of phenomena such as global weather or nuclear stockpile safety, we're talking about running for months and months and months to get meaningful results."
» posted by abennett
Techworld.com







