AI Chat Paper
Note: Please note that the following content is generated by AMiner AI. SciOpen does not take any responsibility related to this content.
{{lang === 'zh_CN' ? '文章概述' : 'Summary'}}
{{lang === 'en_US' ? '中' : 'Eng'}}
Chat more with AI
PDF (4 MB)
Collect
Submit Manuscript AI Chat Paper
Show Outline
Outline
Show full outline
Hide outline
Outline
Show full outline
Hide outline
Open Access

Erasure Coding for Cloud Storage Systems: A Survey

Department of Electrical and Computing Engineering, University of Toronto, Toronto M5S 3G8, Canada
Show Author Information

Abstract

In the current era of cloud computing, data stored in the cloud is being generated at a tremendous speed, and thus the cloud storage system has become one of the key components in cloud computing. By storing a substantial amount of data in commodity disks inside the data center that hosts the cloud, the cloud storage system must consider one question very carefully: how do we store data reliably with a high efficiency in terms of both storage overhead and data integrity? Though it is easy to store replicated data to tolerate a certain amount of data losses, it suffers from a very low storage efficiency. Conventional erasure coding techniques, such as Reed-Solomon codes, are able to achieve a much lower storage cost with the same level of tolerance against disk failures. However, it incurs much higher repair costs, not to mention an even higher access latency. In this sense, designing new coding techniques for cloud storage systems has gained a significant amount of attention in both academia and the industry. In this paper, we examine the existing results of coding techniques for cloud storage systems. Specifically, we present these coding techniques into two categories: regenerating codes and locally repairable codes. These two kinds of codes meet the requirements of cloud storage along two different axes: optimizing bandwidth and I/O overhead. We present an overview of recent advances in these two categories of coding techniques. Moreover, we introduce the main ideas of some specific coding techniques at a high level, and discuss their motivations and performance.

References

[1]
Amazon simple storage service (Amazon S3), http://aws.amazon.com/s3/, 2013.
[2]
Worl’ data more than doubling every two years — Driving big data opportunity, new IT roles, http://www.emc.com/about/news/press/2011/20110628-01.htm, 2013.
[3]
IDC says world’s storage is breaking Moore’s law, more than doubling every two years, http://enterprise.media.seagate.com/2011/06/inside-it-storage/idc-says-worlds-storage-is-breaking-moores-law-more-than-doubling-every-two-years/, 2012.
[4]
D. Beaver, S. Kumar, H. C. Li, J. Sobel, and P. Vajgel, Finding a needle in Haystack: Facebook’s photo storage, in Proc. 9th USENIX Conference on Operating Systems Design and Implementation (OSDI), 2010.
[5]
M. Sathiamoorthy, M. Asteris, D. Papailiopoulos, A. G. Dimakis, R. Vadali, S. Chen, and D. Borthakur, XORing elephants: Novel erasure codes for big data, Proc. VLDB Endowment, to appear.
[6]
A. W. Kosner, Amazon cloud goes down Friday night, taking Netflix, Instagram And Pinterest with it, http://www.forbes.com/sites/anthonykosner/2012/06/30/amazon-cloud-goes-down-friday-night-taking-netflix-instagram-and-pinterest-with-it/, Forbes, 2012, June 30.
[8]
M. Al-Fares, A. Loukissas, and A. Vahdat, A scalable, commodity data center network architecture, in Proc. ACM SIGCOMM Conference on Data Communication (SIGCOMM), 2008.
[9]
A. G. Dimakis, P. B. Godfrey, M. J. W. Y. Wu, and K. Ramchandran, Network coding for distributed storage system, IEEE Trans. Inform. Theory, vol. 56, no. 9, pp. 4539-4551, 2010.
[10]
I. Reed and G. Solomon, Polynomial codes over certain finite fields, Journal of the Society for Industrial and Applied Mathematics, vol. 8, no. 2, pp. 300-304, 1960.
[11]
C. Huang, H. Simitci, Y. Xu, A. Ogus, B. Calder, P. Gopalan, J. Li, and S. Yekhanin, Erasure coding in windows azure storage, in Proc. USENIX Annual Technical Conference (USENIX ATC), 2012.
[12]
S.-Y. R. Li, R. W. Yeung, and N. Cai, Linear network coding, IEEE Trans. on Inform. Theory, vol. 49, no. 2, pp. 371-381, 2003.
[13]
T. Ho, R. Koetter, M. Medard, D. Karger, and M. Effros, The benefits of coding over routing in a randomized setting, in Proc. IEEE International Symp. Inform. Theory, 2003, pp. 442-442.
[14]
A. Duminuco and E. Biersack, A practical study of regenerating codes for peer-to-peer backup systems, in Proc. IEEE International Conference on Distributed Computing Systems (ICDCS), 2009, pp. 376-384.
[15]
K. Rashmi, N. Shah, and P. Kumar, Optimal exact-regenerating codes for distributed storage at the MSR and MBR points via a product-matrix construction, IEEE Trans. on Inform. Theory, vol. 57, no. 8, pp. 5227-5239, 2011.
[16]
C. Suh and K. Ramchandran, Exact-repair MDS code construction using interference alignment, IEEE Trans. on Inform. Theory, vol. 57, no. 3, pp. 1425-1442, 2011.
[17]
N. B. Shah, K. V. Rashmi, P. V. Kumar, and K. Ramchandran, Interference alignment in regenerating codes for distributed storage: Necessity and code constructions, IEEE Trans. on Inform. Theory, vol. 58, no. 4, pp. 2134-2158, 2012.
[18]
V. R. Cadambe, S. A. Jafar, and H. Maleki, Distributed data storage with minimum storage regenerating codes - Exact and functional repair are asymptotically equally efficient, http://arxiv.org/abs/1004.4299, 2010.
[19]
C. Suh and K. Ramchandran, On the existence of optimal exact-repair MDS codes for distributed storage, http://arxiv.org/abs/1004.4663, 2010.
[20]
D. S. Papailiopoulos and A. G. Dimakis, Distributed storage codes through hadamard designs, in Proc. IEEE International Symposium on Information Theory (ISIT), 2011, pp. 1230-1234.
[21]
Y. Wu, A construction of systematic MDS codes with minimum repair bandwidth, IEEE Trans. on Inform. Theory, vol. 57, no. 6, pp. 3738-3741, 2011.
[22]
I. Tamo, Z. Wang, and J. Bruck, MDS array codes with optimal rebuilding, in Proc. IEEE International Symposium on Information Theory (ISIT), 2011, pp. 1240-1244.
[23]
V. R. Cadambe, C. Huang, S. A. Jafar, and J. Li, Optimal repair of MDS codes in distributed storage via subspace interference alignment, http://arxiv.org/abs/1106.1250, 2011.
[24]
R. Bhagwan, K. Tati, Y.-C. Cheng, S. Savage, and G. M. Voelker, Total recall: System support for automated availability management, in Proc. USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2004.
[25]
Y. Hu, Y. Xu, X. Wang, C. Zhan, and P. Li, Cooperative recovery of distributed storage systems from multiple losses with network coding, IEEE Journal on Selected Areas in Communications, vol. 28, no. 2, pp. 268-276, 2010.
[26]
K. Shum, Cooperative regenerating codes for distributed storage systems, in IEEE International Conference on Communications (ICC), 2011.
[27]
A. Kermarrec and N. Le, Repairing multiple failures with coordinated and adaptive regenerating codes, in IEEE International Symposium on Network Coding (NetCod), 2011.
[28]
K. Shum and Y. Hu, Exact minimum-repair-bandwidth cooperative regenerating codes for distributed storage systems, in Proc. IEEE International Symposium on Information Theory (ISIT), 2011.
[29]
A. Wang and Z. Zhang, Exact cooperative regenerating codes with minimum-repair-bandwidth for distributed storage, http://arxiv.org/abs/1207.0879, 2012.
[30]
N. Le Scouarnec, Exact scalar minimum storage coordinated regenerating codes, in Proc. IEEE International Symposium on Information Theory (ISIT), 2012.
[31]
K. Shum and Y. Hu, Functional-repair-by-transfer regenerating codes, in Proc. IEEE International Symposium on Information Theory (ISIT), 2012.
[32]
Y. Hu, P. P. C. Lee, and K. W. Shum, Analysis and construction of functional regenerating codes with uncoded repair for distributed storage systems, http://arxiv.org/abs/1208.2787v1, 2012.
[33]
N. B. Shah, K. V. Rashmi, P. V. Kumar, and K. Ramchandran, Distributed storage codes with repair-by-transfer and nonachievability of interior points on the storage-bandwidth tradeoff, IEEE Trans. on Inform. Theory, vol. 58, no. 3, pp. 1837-1852, 2012.
[34]
K. Rashmi, N. Shah, and P. Kumar, Explicit construction of optimal exact regenerating codes for distributed storage, in Forty-Seventh Annual Allerton Conference on Communication, Control, and Computing, 2009.
[35]
J. Li, X. Wang, and B. Li, Pipelined regeneration with regenerating codes for distributed storage systems, in Proc. IEEE International Symposium on Network Coding (NetCod), 2011.
[36]
J. Li, X. Wang, and B. Li, Cooperative pipelined regeneration in distributed storage systems, in Proc. IEEE INFOCOM, to appear.
[37]
A. Duminuco and E. W. Biersack, Hierarchical codes: A flexible trade-off for erasure codes in peer-to-peer storage systems, Peer-to-Peer Networking and Applications, vol. 3, no. 1, pp. 52-66, 2010.
[38]
Z. Huang, E. Biersack, and Y. Peng, Reducing repair traffic in P2P backup systems: Exact regenerating codes on hierarchical codes, ACM Transactions on Storage (TOS), vol. 7, no. 3, p. 10, 2011.
[39]
F. Oggier and A. Datta, Self-repairing homomorphic codes for distributed storage systems, in Proc. IEEE INFOCOM, 2011.
[40]
F. Oggier and A. Datta, Self-repairing codes for distributed storage—A projective geometric construction, in IEEE Information Theory Workshop (ITW), 2011.
[41]
D. S. Papailiopoulos, J. Luo, A. G. Dimakis, C. Huang, and J. Li, Simple regenerating codes: Network coding for cloud storage, in Proc. IEEE INFOCOM, 2012, pp. 2801-2805.
[42]
P. Gopalan, C. Huang, H. Simitci, and S. Yekhanin, On the locality of codeword symbols, IEEE Transactions on Information Theory, vol. 58, no. 11, pp. 6925-6934, 2012.
[43]
D. S. Papailiopoulos and A. G. Dimakis, Locally repairable codes, in Proc. IEEE International Symposium on Information Theory (ISIT), 2012, pp. 2771-2775.
Tsinghua Science and Technology
Pages 259-272
Cite this article:
Li J, Li B. Erasure Coding for Cloud Storage Systems: A Survey. Tsinghua Science and Technology, 2013, 18(3): 259-272. https://doi.org/10.1109/TST.2013.6522585

766

Views

43

Downloads

52

Crossref

N/A

Web of Science

75

Scopus

0

CSCD

Altmetrics

Received: 09 March 2013
Accepted: 15 March 2013
Published: 03 June 2013
© The author(s) 2013
Return