A METHOD OF ADJUSTING THE NUMBER OF REPLICA DYNAMICALLY IN HDFS

Bing Li, Ke Xu

Abstract

With the growth of cloud computing, Hadoop, an open source Infrastructure as a Service project from Apache, has become increasingly widely used. As the foundation of Hadoop, the Hadoop Distributed File System (HDFS) provides the basic file storage function. HDFS, an open source implementation of the Google File System (GFS), was designed for a specific set of demands, and it does not adapt well when those demands change. In particular, when access demand differs from file to file, hot spots appear and the existing replicas of heavily accessed files are no longer sufficient, which lowers the efficiency of the whole system. This paper introduces a system-level strategy that dynamically adjusts the number of replicas of specific files. The experiments show that this mechanism prevents the decline in user experience caused by hot spots and improves overall efficiency.
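The abstract does not spell out the adjustment policy, so the following is only a minimal sketch of what a per-file replica adjustment rule could look like, not the authors' method. It assumes a simple threshold rule: each replica is expected to serve at most a fixed number of accesses per monitoring interval. The function names, `per_replica_capacity`, and `max_replicas` are hypothetical; the base value of 3 matches the HDFS default replication factor.

```python
import math


def target_replication(accesses, base=3, per_replica_capacity=100, max_replicas=10):
    """Return a replica count for one file given its accesses in the last interval.

    Assumed rule (hypothetical): no replica should serve more than
    per_replica_capacity accesses per interval.
    """
    needed = math.ceil(accesses / per_replica_capacity)
    # Never go below the default factor and never exceed the cap.
    return max(base, min(max_replicas, needed))


def plan_adjustments(access_counts, current, base=3, per_replica_capacity=100, max_replicas=10):
    """Return {path: new_replica_count} for files whose count should change."""
    plan = {}
    for path, accesses in access_counts.items():
        target = target_replication(accesses, base, per_replica_capacity, max_replicas)
        # Files with no recorded count are assumed to be at the base factor.
        if target != current.get(path, base):
            plan[path] = target
    return plan


if __name__ == "__main__":
    counts = {"/logs/hot.dat": 450, "/logs/cold.dat": 10, "/logs/cooled.dat": 20}
    replicas = {"/logs/hot.dat": 3, "/logs/cold.dat": 3, "/logs/cooled.dat": 6}
    print(plan_adjustments(counts, replicas))
```

In this sketch a hot file gains replicas while a file whose traffic has dropped is returned to the base factor, so storage is not held by replicas that are no longer needed. In a real cluster the resulting plan would be applied through the per-file replication setting that HDFS exposes; how the paper monitors accesses and triggers changes is not stated in the abstract.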

Paper Citation


in Harvard Style

Li B. and Xu K. (2011). A METHOD OF ADJUSTING THE NUMBER OF REPLICA DYNAMICALLY IN HDFS. In Proceedings of the 13th International Conference on Enterprise Information Systems - Volume 1: SSE, (ICEIS 2011) ISBN 978-989-8425-53-9, pages 529-533. DOI: 10.5220/0003587005290533


in Bibtex Style

@conference{sse11,
author={Bing Li and Ke Xu},
title={A METHOD OF ADJUSTING THE NUMBER OF REPLICA DYNAMICALLY IN HDFS},
booktitle={Proceedings of the 13th International Conference on Enterprise Information Systems - Volume 1: SSE, (ICEIS 2011)},
year={2011},
pages={529-533},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003587005290533},
isbn={978-989-8425-53-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 13th International Conference on Enterprise Information Systems - Volume 1: SSE, (ICEIS 2011)
TI - A METHOD OF ADJUSTING THE NUMBER OF REPLICA DYNAMICALLY IN HDFS
SN - 978-989-8425-53-9
AU - Li B.
AU - Xu K.
PY - 2011
SP - 529
EP - 533
DO - 10.5220/0003587005290533