According to a report published earlier this month, only 10 percent of large companies plan to use external clouds for storage, "even for lower tiers, including archive."
The report (PDF), from market researcher TheInfoPro, projects a big increase in storage spending among the Fortune 1000: a net 24 percent rise in dollar terms, even though 44 percent of the companies plan to increase spending and 31 percent plan to keep it flat.
Most of that increase will go toward SAN gear and virtual-server backup, along with automated tiering, which is designed to improve both cost efficiency and performance by using SSD as the fast-access top tier for files that many people use all the time.
Files used less frequently shift to cheaper levels of storage ranging from arrays of disk in a SAN to NAS boxes, to storage appliances, optical disk, paper, clay tablets and eventually the vague memory of an older guy pushing a dolly around the data warehouse who tells you long stories about his grandkids before admitting whatever data you're looking for is either lost or deleted.
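For anyone who wants the tiering idea in concrete form, here is a minimal sketch. The tier names and the reads-per-day thresholds are made up for illustration; real tiering products use their own policies and far more signals than a single access count.

```python
# Hypothetical tiers, fastest first, with the minimum reads per day
# a file needs to earn a place there. All thresholds are assumptions.
TIERS = [
    ("ssd", 100),
    ("san", 10),
    ("nas", 1),
    ("archive", 0),
]


def pick_tier(reads_per_day):
    """Return the fastest tier whose threshold the file meets."""
    for name, min_reads in TIERS:
        if reads_per_day >= min_reads:
            return name
    return TIERS[-1][0]  # fall back to the cheapest tier


def place_files(access_counts):
    """Map each file path to a tier based on its reads per day."""
    return {path: pick_tier(reads) for path, reads in access_counts.items()}
```

Calling `place_files({"sales.db": 150, "2004-memo.doc": 0})` would put the first file on SSD and the second in the archive tier, which in this sketch stops short of clay tablets.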
(The reason for a 24 percent increase in storage spending, of course, is that no corporate data is ever lost or deleted, no matter what the sad old guy says. Most big companies are so afraid of being unable to produce a piece of data a court unexpectedly demands that they ignore end-of-life issues, so they let data hang around for years after it has outlived its usefulness, except for its potential to be embarrassing or expensive in court. You're allowed to delete this stuff; check with the records-management specialists you decided were too expensive to hire last year.)
"Server virtualization transformed storage architectures and cloud computing is having the same impact," according to a statement in the release from TheInfoPro storage analyst Marco Coulter.
He's right, but not in a good way.
Virtual-server snapshots have all but replaced regular server backups at half the companies surveyed, the report shows. The only real problem with that is the potential to lose access to data if the standardized formats for VM snapshots change, which is a good bet in a market that is still immature and whose data formats are changing quickly.
A bigger problem could be duplication and wasted space, as identical data and system files show up in VMs, in VM snapshots used for failover, in VM snapshots used for backup, and in VMs launched and never killed off that haunt the grid as pointless wasters of compute power and space, all of which are then replicated to and from the cloud to be accessible to end users.
Each of those VMs has an OS, hypervisor, VM management code and metadata, standard applications, junk files associated with the OS and the applications installed as standard with every VM – by policy – in your shop, as well as the data you're actually trying to save.
That's an awful lot of cruft to save with your data.
Luckily, most IT organizations have realized that not only is storage space in the cloud vastly more expensive per gigabyte than internally maintained systems, but getting large chunks of data to and from cloud providers is a problem, too.
Ever try to download 500GB worth of games and movies all at the same time? Think you'd be old before it would finish? Think about uploading several 500GB databases from your data center to Amazon's cloud a couple of times per week. Upgrade your Internet connections first. Rather than the OC-3s most data centers use, you'll probably want a couple of OC-500s. You might have to wait before they become available.
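To put a rough number on that upload, here is a back-of-the-envelope sketch. It assumes decimal gigabytes and a dedicated OC-3 running at its nominal 155.52 Mbit/s line rate; the `efficiency` parameter is a hypothetical knob for protocol overhead and link sharing, not a measured figure.

```python
OC3_BITS_PER_SECOND = 155.52e6  # nominal OC-3 line rate


def transfer_hours(size_gb, link_bps, efficiency=1.0):
    """Hours needed to move size_gb (decimal GB) over a link.

    efficiency is the assumed fraction of the line rate that is
    actually available for payload (1.0 = the whole pipe).
    """
    bits = size_gb * 1e9 * 8
    return bits / (link_bps * efficiency) / 3600


one_database = transfer_hours(500, OC3_BITS_PER_SECOND)
print(f"One 500GB database over an ideal OC-3: {one_database:.1f} hours")
# prints: One 500GB database over an ideal OC-3: 7.1 hours
```

That is the best case, with the entire pipe devoted to one upload and nothing else in the data center touching the Internet. Multiply it by several databases a couple of times per week, then lower `efficiency` to something realistic, and the hours add up quickly.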
That whole thing with Amazon's EC2 being down for a week and a bunch of companies completely losing the data they had stored up there might serve as a warning, too. It would be a shame to wait that long for an upload and then have it just disappear.
It would also be a shame to have to explain to your CFO why you're buying Amazon storage at 14 cents/GB/month ($1.68/GB/year) compared to anywhere from $1.24/GB/year upward for internal systems, depending on your levels of backup, availability and other features.
The cloud, in general, is just not a great place to build big chunks of storage, any more than huge bags of pet food, small but heavy construction materials or large items like desks or doorframes are good candidates for delivery through FedEx or UPS. You can still do some things with them online (research their features, reliability and components), but they're not the kind of thing that would fit neatly through that particular delivery medium or in your mailbox without making everything else back up just a bit.
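The arithmetic behind that CFO conversation can be sketched in a few lines. The two prices come from the figures above; the 10,000GB archive size is a hypothetical chosen only to make the totals concrete, and the internal price is the low end of the quoted range, so the gap shrinks as internal costs climb.

```python
CLOUD_PER_GB_MONTH = 0.14    # Amazon price quoted above, in dollars
INTERNAL_PER_GB_YEAR = 1.24  # low end of the internal range quoted above


def annual_cost(size_gb, price_per_gb_year):
    """Yearly cost in dollars for size_gb of storage."""
    return size_gb * price_per_gb_year


cloud_per_gb_year = CLOUD_PER_GB_MONTH * 12
premium = (cloud_per_gb_year - INTERNAL_PER_GB_YEAR) / INTERNAL_PER_GB_YEAR

archive_gb = 10_000  # hypothetical archive size
cloud = annual_cost(archive_gb, cloud_per_gb_year)
internal = annual_cost(archive_gb, INTERNAL_PER_GB_YEAR)

print(f"Cloud: ${cloud_per_gb_year:.2f}/GB/year, {premium:.0%} over internal")
print(f"For {archive_gb:,}GB: ${cloud:,.0f} vs ${internal:,.0f} per year")
```

Bandwidth charges and the cost of the fatter Internet connection are left out of this sketch; both would push the cloud figure higher.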
Cloud is a great platform for some things: apps, small chunks of data like personal directories or contact lists, enough data to keep your online CRM app running.
Using it for storage – especially lower-tier storage items that would normally be archived on clay tablets or the memory of old warehouse managers? No.