HDFS Data Consistency Modelling and Analysis Based on Colored Petri Net

CLC Number: TP311

Fund Project: National Natural Science Foundation of China (71690231, 61802224)

    Abstract:

    As one of the core components of Apache Hadoop, the Hadoop Distributed File System (HDFS) is widely used in industry. HDFS uses a multi-replica mechanism to ensure data reliability, but node failures, network partitions, and write failures can leave the replicas inconsistent. HDFS is generally considered to offer weaker data consistency than traditional file systems, and it is difficult for users to know when inconsistency will occur. To date, no existing work has verified its consistency mechanism. Inconsistent data increases uncertainty for upper-layer applications, so a study of the HDFS data consistency model is needed. The large scale of HDFS makes such analysis difficult. This work therefore combines code reading, abstraction, colored Petri net (CPN) modeling, and state-space analysis to understand the system. The contributions are as follows. (1) Colored Petri nets are used to model the HDFS file read and write processes; the model describes in detail the functions of the internal components and how they cooperate. (2) Data-layer consistency and operation-layer consistency of HDFS are analyzed with state-space tools on the CPN model, identifying the data consistency that the system guarantees. (3) A time-point repeatable read method is proposed to verify operation-layer consistency, and a serial repeatable strategy is used to reduce state-space complexity. Based on these contributions, guidelines for HDFS application development are proposed to help improve data consistency. The CPN modeling method and techniques can also be applied to the analysis of other distributed information systems.
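To make the idea of state-space analysis on a replicated write more concrete, the following is a minimal sketch in Python. It is not the CPN model from the paper: it is a toy net with two replicas and a single in-flight write, and every name in it (`INITIAL`, `enabled_transitions`, `explore`, `live_replicas_agree`) is hypothetical. Tokens carry a version number as their "color", each replica either applies the write or fails, and the explorer enumerates every reachable marking to check whether the live replicas agree.

```python
from collections import deque

# A marking is (replicas, pending):
#   replicas: version stored on each replica, None means the replica failed
#   pending:  indices of replicas that still have to handle the in-flight write
# Assumption: two replicas, both at version 0, one write of version 1.
INITIAL = ((0, 0), (0, 1))
NEW_VERSION = 1


def enabled_transitions(marking):
    """Yield (transition_name, next_marking) for each enabled transition."""
    replicas, pending = marking
    if not pending:
        return  # no write in flight: the marking is terminal
    head, rest = pending[0], pending[1:]
    if replicas[head] is not None:
        # "apply": the next replica in the pipeline stores the new version
        updated = list(replicas)
        updated[head] = NEW_VERSION
        yield f"apply_{head}", (tuple(updated), rest)
    # "fail": the replica crashes instead of applying the write
    failed = list(replicas)
    failed[head] = None
    yield f"fail_{head}", (tuple(failed), rest)


def explore(initial):
    """Breadth-first construction of the reachable state space."""
    seen = {initial}
    edges = []
    queue = deque([initial])
    while queue:
        marking = queue.popleft()
        for name, nxt in enabled_transitions(marking):
            edges.append((marking, name, nxt))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen, edges


def live_replicas_agree(marking):
    """Data-layer check: all non-failed replicas hold the same version."""
    replicas, _ = marking
    return len({v for v in replicas if v is not None}) <= 1


if __name__ == "__main__":
    states, _ = explore(INITIAL)
    for marking in sorted(states, key=repr):
        status = "consistent" if live_replicas_agree(marking) else "INCONSISTENT"
        print(marking, status)
```

In this toy model the only marking where live replicas disagree is the intermediate one in which the first replica has applied the write and the second has not yet handled it; every terminal marking is consistent. A real analysis of HDFS would involve far more components and interleavings, which is why reducing state-space size matters.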

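Citation:

Qiao Jialin, Huang Xiangdong, Yang Yifan, Wang Jianmin, Wu Kai. HDFS Data Consistency Modelling and Analysis Based on Colored Petri Net. Journal of Software (软件学报), 2021, 32(10): 2993-3013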

History
  • Received: November 14, 2018
  • Revised: January 18, 2020
  • Online: October 09, 2021
  • Published: October 06, 2021
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4