Windows Server 8: Massive storage enhancements ahead

Microsoft makes big investments in data deduplication, live disk repair and more...

By , ITworld |  Storage, Data Deduplication, Microsoft

In the first part of my two-part review of Windows Server 8, I looked at some of the best of the more than 300 new features Microsoft packed into the upcoming server OS. Now it's time to turn our attention to some massive storage enhancements.

[ See also: Windows Server 8: Highlights of the upcoming server OS ]


Data deduplication

My personal highlight of the entire three days of the Windows Server 8 reviewers workshop was the set of talks about storage. The killer feature to me is the new, built-in data deduplication, which detects duplicate data in files and folders, moves it into a separate store (under System Volume Information) and simply gets rid of the redundant bytes. The file itself still appears 100% intact; when it gets accessed, the (now missing) data is pulled back from that one single data store.
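To make the idea concrete, here is a minimal sketch of a deduplicating store. It is not Microsoft's algorithm, which the workshop did not detail: the fixed-size chunks, the SHA-256 fingerprints and every name in it (`ChunkStore`, `add_file`, `read_file`, `savings`) are assumptions chosen purely for illustration.

```python
import hashlib


class ChunkStore:
    """Toy dedup store: each unique chunk is kept once, files keep references.

    Illustrative only. Fixed-size chunking and SHA-256 are assumptions,
    not a description of what Windows Server 8 does internally.
    """

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # fingerprint -> chunk bytes (the single data store)
        self.files = {}   # file name -> ordered list of fingerprints

    def add_file(self, name, data):
        """Split data into chunks and store only chunks not seen before."""
        refs = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(fingerprint, chunk)  # duplicates are skipped
            refs.append(fingerprint)
        self.files[name] = refs

    def read_file(self, name):
        """Rebuild the original bytes by pulling chunks back from the store."""
        return b"".join(self.chunks[fp] for fp in self.files[name])

    def logical_size(self):
        """Total bytes of all files as the user sees them."""
        return sum(len(self.chunks[fp]) for refs in self.files.values() for fp in refs)

    def stored_size(self):
        """Bytes actually kept in the chunk store."""
        return sum(len(chunk) for chunk in self.chunks.values())

    def savings(self):
        """Fraction of space saved, between 0.0 and 1.0."""
        logical = self.logical_size()
        return 0.0 if logical == 0 else 1 - self.stored_size() / logical
```

Adding two files that share most of their content stores the shared chunks only once, while `read_file` still returns each file byte for byte. That is the "100% intact" behavior described above, just at toy scale.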

Now, deduplication isn’t something groundbreaking. It's been done before, and it's been done well, but until now dedup has never found its way into the Windows Server OS itself. Building it in means it's deeply integrated and highly manageable. Microsoft Research spent two years on this algorithm and came up with techniques to minimize the performance impact caused by reading one piece of data from one part of the disk and fetching the other parts from the deduplication store (fragmentation!). According to Microsoft's server team, dedup has an impact of less than 3-4% on overall performance when accessing the data, although only performance tests will tell the true story.

However, the benefit greatly outweighs the possible downsides. Generally, you can expect space savings of between 30% and 90%, which is absolutely amazing. On day 3 of the Windows Server 8 reviewers workshop, I had the chance to catch up with the development and program management team behind data deduplication and found out a couple of interesting tidbits:

  • Deduplication automatically runs on "idle". Say you've enabled deduplication on drive E and copy 20 gigs of files over: deduplication wouldn't start immediately. Instead, it would wait until the server isn't quite as busy and then perform the deduplication process. Keep in mind that going through files and detecting duplicate data is quite an I/O eater.

  • Admins can determine which files get deduplicated based on their age. Maybe you don’t want to dedup files that the server just created.
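The two policies above (wait for idle time, skip files that are too new) can be sketched in a few lines. This is a hypothetical illustration, not the actual scheduler: `FileInfo`, `select_candidates`, the `server_busy` flag and the day-based age threshold are all assumed names and units, since the team did not describe the real mechanism.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86400


@dataclass
class FileInfo:
    path: str
    modified_at: float  # last-modified time, in seconds since the epoch


def select_candidates(files, now, min_age_days, server_busy):
    """Return paths that a dedup pass could process right now.

    Hypothetical policy sketch:
    - do nothing while the server is busy (dedup is I/O heavy),
    - otherwise pick only files at least min_age_days old.
    """
    if server_busy:
        return []
    cutoff = now - min_age_days * SECONDS_PER_DAY
    return [f.path for f in files if f.modified_at <= cutoff]
```

In this sketch a freshly copied file is simply left alone until it passes the age threshold and the server has quieted down, which matches the behavior the team described.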
