Microservice Resilience Risk Identification and Analysis Based on Chaos Engineering
Authors:
Author biographies:

YIN Kang-Lin (1992-), male, Ph.D., CCF student member. His research interests include software engineering and intelligent IT operations (AIOps).
DU Qing-Feng (1968-), male, Ph.D., professor, doctoral supervisor. His research interests include software engineering and quality control, machine learning, and intelligent IT operations (AIOps).

Corresponding author:

DU Qing-Feng, E-mail: du_cloud@tongji.edu.cn

Fund projects:

National Natural Science Foundation of China (U1934212); National Key Research and Development Program of China (2020YFB2103300)



Abstract:

Microservice architecture has become the mainstream architectural pattern for Internet applications in recent years. However, compared with traditional software architectures, the more complex deployment structure of microservice architecture exposes a system to more potential threats that can cause failures, and the failure symptoms of microservice systems are also more diverse. Since traditional software measurements such as reliability can no longer fully capture a microservice system's ability to cope with failures, microservice developers have begun to use the term "resilience" to describe this ability. To improve the resilience of a microservice system, developers usually need to design coping mechanisms for specific system environment disturbances. How to determine whether a given environment disturbance is a risk factor affecting microservice resilience, and how to find as many of these potential resilience risks as possible before the system is released, are open questions in microservice development. Building on the microservice resilience measurement model proposed in the authors' previous research and combining it with chaos engineering, this study proposes resilience risk identification and analysis approaches for microservice systems. The risk identification approach continuously injects random environment disturbances into the microservice system and observes the resulting changes in service performance to discover potential resilience risks, which greatly reduces the human effort of software risk identification. For each identified resilience risk, by collecting system performance monitoring data during the chaos engineering experiments, the risk analysis approach applies a causality search algorithm to construct influence chains among system performance metrics and presents the most likely chains to operators as a reference for further analysis. Finally, a case study on a microservice system demonstrates the effectiveness of the proposed risk identification and analysis approaches.
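The identification loop summarized in the abstract (inject a random disturbance, watch service performance, flag significant deviations) can be sketched in a few lines. The sketch below is a hypothetical illustration only: the disruption catalogue, the latency-based steady-state check, the 3-sigma tolerance, and the stub monitoring function are all assumptions for the example, not details taken from the paper.

```python
import random
import statistics

# Illustrative disruption catalogue; real chaos tooling injects faults such
# as CPU stress, added network latency, or container kills.
DISRUPTIONS = ["cpu_stress", "memory_stress", "network_delay", "pod_kill"]

def measure_latency_ms(disruption=None):
    """Stub monitor: baseline latency around 100 ms, with some disruptions
    degrading it. A real setup would query a monitoring backend instead."""
    latency = random.gauss(100, 5)
    if disruption in ("network_delay", "pod_kill"):
        latency += random.gauss(80, 10)  # simulated performance degradation
    return latency

def identify_risks(rounds=40, tolerance=3.0):
    """Flag disruptions whose observed impact on service performance
    repeatedly deviates from the steady-state baseline."""
    baseline = [measure_latency_ms() for _ in range(50)]
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    hits = {}
    for _ in range(rounds):
        d = random.choice(DISRUPTIONS)  # one random chaos experiment
        if abs(measure_latency_ms(d) - mu) > tolerance * sigma:
            hits[d] = hits.get(d, 0) + 1  # significant deviation observed
    # Require repeated evidence before declaring a resilience risk.
    return {d for d, n in hits.items() if n >= 2}

print(identify_risks())
```

In the paper's approach the steady-state check is based on the authors' resilience measurement model rather than a single latency threshold; this sketch only conveys the overall identify-by-perturbation idea.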


Cite this article:

YIN Kang-Lin, DU Qing-Feng. Microservice resilience risk identification and analysis based on chaos engineering. Journal of Software, 2021,32(5):1231-1255 (in Chinese).

History
  • Received: 2020-07-10
  • Revised: 2020-12-15
  • Published online: 2021-02-07
  • Published in print: 2021-05-06