Abstract
Cloud storage systems are now mature enough to handle a massive volume of heterogeneous and rapidly changing data, commonly known as Big Data. However, failures are inevitable in cloud storage systems because they are composed of large-scale hardware components. Improving fault tolerance in cloud storage systems for Big Data applications is therefore a significant challenge. Replication and erasure coding are the most important data reliability techniques employed in cloud storage systems. Each technique involves its own trade-offs across parameters such as durability, availability, storage overhead, network bandwidth and traffic, energy consumption, and recovery performance. This survey explores the challenges of employing both techniques in cloud storage systems for Big Data applications with respect to these parameters. We also introduce a conceptual hybrid technique to further improve the reliability, latency, bandwidth usage, and storage efficiency of Big Data applications in cloud computing.
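As a rough illustration of the storage-overhead and repair-traffic trade-off mentioned in the abstract, the following Python sketch compares triple replication with a (k = 6, m = 3) Reed-Solomon-style erasure code. The replica count, code parameters, and block size are illustrative assumptions, not figures from the survey.

```python
# Illustrative sketch: replication vs. erasure coding trade-offs.
# The replica count, (k, m) values, and block size below are assumptions
# chosen for illustration, not parameters from the surveyed systems.

def replication_overhead(replicas: int) -> float:
    """Bytes stored per byte of user data under n-way replication."""
    return float(replicas)

def erasure_overhead(k: int, m: int) -> float:
    """Bytes stored per byte of user data under a (k, m) MDS erasure code."""
    return (k + m) / k

def repair_traffic_mb(block_size_mb: float, k: int) -> dict:
    """Approximate network traffic (MB) to rebuild one lost block."""
    return {
        "replication": block_size_mb,          # copy one surviving replica
        "erasure_coding": k * block_size_mb,   # read k blocks to re-encode
    }

if __name__ == "__main__":
    print("3-way replication overhead:", replication_overhead(3), "x")   # 3.0x
    print("(6, 3) erasure code overhead:", erasure_overhead(6, 3), "x")  # 1.5x
    print("Repair traffic for a 64 MB block:", repair_traffic_mb(64, k=6))
```

The sketch captures the basic tension the survey discusses: erasure coding halves the storage overhead of triple replication in this setting, but repairing a single lost block requires reading k surviving blocks rather than one replica.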
Original language | English |
---|---|
Pages (from-to) | 35-47 |
Number of pages | 13 |
Journal | Journal of Network and Computer Applications |
Volume | 97 |
DOIs | |
Publication status | Published - 1 Nov 2017 |
Bibliographical note
Publisher Copyright: © 2017 Elsevier Ltd
Keywords
- big data
- cloud computing
- fault-tolerant computing
- replication