Decentralized Deduplication in SAN Cluster File Systems
USENIX Annual Technical Conference (ATC), June 2009
Abstract
File systems hosting virtual machines typically contain many duplicated blocks of data resulting in wasted storage space and increased storage array cache footprint. Deduplication addresses these problems by storing a single instance of each unique data block and sharing it between all original sources of that data. While deduplication is well understood for file systems with a centralized component, we investigate it in a decentralized cluster file system, specifically in the context of VM storage. We propose DEDE, a block-level deduplication system for live cluster file systems that does not require any central coordination, tolerates host failures, and takes advantage of the block layout policies of an existing cluster file system.
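
To make the general idea in the abstract concrete, here is a minimal sketch of block-level deduplication: each unique block is stored once and shared by every file that contains it. This is not DEDE's algorithm and says nothing about its decentralized coordination. The names `BlockStore`, `store_file`, and `read_file`, the 4 KB block size, and the choice of SHA-1 are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for this sketch


class BlockStore:
    """Hypothetical in-memory store that keeps one copy of each unique block."""

    def __init__(self):
        self.blocks = {}    # content digest -> block bytes (stored once)
        self.refcount = {}  # content digest -> number of references to it

    def put(self, block: bytes) -> str:
        # Identify the block by a hash of its content. A production system
        # might also compare bytes to guard against hash collisions.
        digest = hashlib.sha1(block).hexdigest()
        if digest not in self.blocks:
            self.blocks[digest] = block
            self.refcount[digest] = 0
        self.refcount[digest] += 1
        return digest


def store_file(store: BlockStore, data: bytes) -> list:
    """Split data into fixed-size blocks and return the list of digests."""
    return [
        store.put(data[i:i + BLOCK_SIZE])
        for i in range(0, len(data), BLOCK_SIZE)
    ]


def read_file(store: BlockStore, recipe: list) -> bytes:
    """Rebuild a file from its list of block digests."""
    return b"".join(store.blocks[d] for d in recipe)
```

Two disk images that share a block would each hold a reference to the same stored copy, so the store grows only with the number of distinct blocks rather than with the total number of blocks written.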



This is really interesting. I wonder how much DEDE has improved in the year and a half since this publication was released.