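National Key Research and Development Program of China (2019YFB1802504, 2019YFE0105500); General Program of the National Natural Science Foundation of China (62072264)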
IT operations face several challenges: the rapid growth of IT scale, increasingly complex system architectures, and a growing demand for autonomy and controllability. Artificial intelligence for IT operations (AIOps), which applies big data and machine learning to the analysis of massive operations data, can help operators run and maintain IT systems more efficiently. However, enterprises often encounter difficulties when putting AIOps into engineering practice, and they need AIOps standards to guide the building of AIOps capabilities. To promote the standardization of AIOps, this study conducts a questionnaire survey of organizations implementing AIOps across multiple industries and summarizes the current state of AIOps practice in China. It then reviews existing domestic and international standards on operations, artificial intelligence, and AIOps to assess the current progress of AIOps standardization. Based on the findings on practice and existing standards, the study proposes AIOps-OSA, a standard framework for AIOps capability building. The framework lists the key points in organization, scenarios, and abilities from the perspective of an enterprise building AIOps capabilities. In the actual drafting of standards, specifying concrete requirements for each key point in AIOps-OSA can yield AIOps standards that give enterprises practical guidance.
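Bao Hangyu, Yin Kanglin, Cao Li, Li Shining, Sun Yongjing, Yin Huifeng, Tang Ruming, Hou Yue, Wang Shiqiang, Pei Dan, Yang Xiaoqin, Wang Lixin. AIOps in practice: status quo and standardization. Ruan Jian Xue Bao/Journal of Software, 2023, 34(9): 4069-4095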