Abstract: As one of the core components of Apache Hadoop, the Hadoop Distributed File System (HDFS) is widely used in industry. HDFS adopts a multi-replica mechanism to ensure data reliability, which may introduce inconsistency in the presence of node failures, network partitions, and write failures. HDFS is considered to provide weaker data consistency than traditional file systems, and it is difficult for users to understand when data will become inconsistent. To date, no work has formally verified its consistency mechanism, and inconsistent data increases the uncertainty faced by upper-layer applications. Thus, research on the data consistency model is required. The large scale of HDFS makes such analysis difficult, so code reading, abstraction, colored Petri net (CPN) modeling, and state-space analysis are conducted to comprehend the system. The contributions are as follows. (1) Colored Petri nets are used to model HDFS's file reading and writing processes; the model describes the functions of the inner components and their cooperation mechanisms in detail. (2) The data-layer consistency and operation-layer consistency of HDFS are analyzed with state-space tools based on the CPN model, identifying the data consistency guarantees provided by the system. (3) A time-point repeatable-read method is proposed to verify operation-layer consistency, and a serial repeatable strategy is used to reduce state-space complexity. Based on these contributions, guidelines for HDFS application development are proposed to help improve data consistency. The CPN modeling method and techniques can also be applied to the analysis of other distributed information systems.