Abstract: As an append-only, read-optimized, open-source distributed file system, HDFS (Hadoop Distributed File System) provides portability, high fault tolerance, and massive horizontal scalability. Over the past decade, HDFS has been widely used for big data storage, managing diverse data such as text, graphs, and key-value pairs. Moreover, big data systems based on or compatible with HDFS are prevalent in many application scenarios, such as complex SQL analysis, ad-hoc queries, interactive analysis, key-value storage, and iterative computation. HDFS has become the universal underlying file system for storing massive data and supporting manifold analytical applications. Therefore, optimizing the storage performance and data access efficiency of HDFS is of great significance. In this study, the principles and features of HDFS are summarized, and a survey of storage and optimization techniques for HDFS is carried out along three dimensions: logical file structure, hardware, and application scenarios. It is also proposed that storage over heterogeneous hardware, workload-guided adaptive storage optimization, and storage optimization combined with machine learning techniques could be the most appealing research directions in the future.