Rosenfeld, Eitan and Amit, Nadav and Tsafrir, Dan

Workshop on the Interaction amongst Virtualization, Operating Systems and Computer Architecture (WIVOSCA), 2013

Contemporary storage systems that rely on replication often maintain more than two replicas of each data item, reducing the risk of permanent data loss due to simultaneous disk failures. The price of the additional copies is less usable storage space, more network traffic, and higher power consumption. We propose to alleviate this problem with SIMFAIL, a storage system that maintains only two replicas and utilizes per-disk “add-ons”: simple hardware devices, equipped with a relatively small amount of memory, that proxy disk I/O traffic. SIMFAIL can significantly reduce the risk of data loss due to temporally adjacent disk failures by quickly copying at-risk data from disks to their add-ons. SIMFAIL can further eliminate the risk entirely by maintaining local parity information of disks on their add-ons (such that each add-on holds the parity of its own disk’s data chunks). We postulate that SIMFAIL may open the door for cloud providers to reduce the number of data replicas they use from three to two.
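
The local-parity idea can be illustrated with a minimal sketch. The abstract does not say which parity code SIMFAIL uses, so the bytewise XOR below is an assumption chosen for simplicity, and the names `xor_chunks`, `recover_chunk`, `disk_chunks`, and `addon_parity` are hypothetical. The sketch also assumes that exactly one chunk of the disk has lost both of its replicas, while the disk's remaining chunks can still be read from their surviving replicas on other disks.

```python
from functools import reduce


def xor_chunks(chunks):
    """Return the bytewise XOR of equal-length chunks (illustrative parity)."""
    if not chunks:
        raise ValueError("need at least one chunk")
    size = len(chunks[0])
    if any(len(c) != size for c in chunks):
        raise ValueError("chunks must have equal length")
    # zip(*chunks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def recover_chunk(parity, surviving_chunks):
    """Rebuild the single missing chunk from the parity and all other chunks."""
    return xor_chunks([parity, *surviving_chunks])


# Hypothetical disk holding three 2-byte chunks.
disk_chunks = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]

# Parity that the disk's add-on would hold under the XOR assumption.
addon_parity = xor_chunks(disk_chunks)

# Suppose chunk 1 lost both replicas; chunks 0 and 2 are fetched elsewhere.
recovered = recover_chunk(addon_parity, [disk_chunks[0], disk_chunks[2]])
```

Under these assumptions, `recovered` equals the lost chunk because XOR-ing the parity with every surviving chunk cancels them out and leaves only the missing one. A single XOR parity of this kind tolerates one lost chunk per parity group; how SIMFAIL groups chunks and handles multiple losses is not stated in the abstract.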