一种预测流程剩余时间的可解释特征分层方法
作者:
作者单位:

作者简介:

通讯作者:

中图分类号:

TP311

基金项目:

国家自然科学基金(61902222);山东省泰山学者工程专项基金(ts20190936,tsqn201909109);山东省自然科学基金优秀青年基金(ZR2021YQ45);山东省高等学校青创科技计划创新团队项目(2021KJ031);山东科技大学领军人才与优秀科研团队计划(2015TDJH102)


Explainable Feature-based Hierarchical Approach to Predict Remaining Process Time
Author:
Affiliation:

Fund Project:

  • 摘要
  • |
  • 图/表
  • |
  • 访问统计
  • |
  • 参考文献
  • |
  • 相似文献
  • |
  • 引证文献
  • |
  • 资源附件
  • |
  • 文章评论
    摘要:

    流程剩余时间预测对于业务异常的预防和干预有着重要的价值和意义.现有的剩余时间预测方法通过深度学习技术达到了更高的准确率,然而大多数深度模型结构复杂难以解释预测结果,即不可解释问题.此外,剩余时间预测除了活动这一关键属性还会根据领域知识选择若干其他属性作为预测模型的输入特征,缺少通用的特征选择方法,对于预测的准确率和模型的可解释性存在一定的影响.针对上述问题,提出基于可解释特征分层模型(explainable feature-based hierarchical model,EFH model)的流程剩余时间预测框架.具体而言,首先提出特征自选择策略,通过基于优先级的后向特征删除和基于特征重要性值的前向特征选择,得到对预测任务具有积极影响的属性作为模型输入.然后提出可解释特征分层模型架构,通过逐层加入不同特征得到每层的预测结果,解释特征值与预测结果的内在联系.采用LightGBM (light gradient boosting machine)和LSTM (long short-term memory)算法实例化所提方法,框架是通用的,不限于选用算法.最后在8个真实事件日志上与最新方法进行比较.实验结果表明所提方法能够选取出有效特征,提高预测的准确率,并解释预测结果.

    Abstract:

    Remaining process time prediction is important for preventing and intervening in abnormal business operations. For predicting the remaining time, existing approaches have achieved high accuracy through deep learning techniques. However, most of these techniques involve complex model structures, and the prediction results are difficult to be explained, namely, unexplainable issues. In addition, the prediction of the remaining time usually uses the key attribute, namely activity, or selects several other attributes as the input features of the predicted model according to the domain knowledge. However, a general feature selection method is missing, which may affect both prediction accuracy and model explainability. To tackle these two challenges, this study introduces a remaining process time prediction framework based on an explainable feature-based hierarchical (EFH) model. Specifically, a feature self-selection strategy is first proposed, and the attributes that have a positive impact on the prediction task are obtained as the input features of the model through the backward feature deletion based on priority and the forward feature selection based on feature importance. Then an EFH model is proposed. The prediction results of each layer are obtained by adding different features layer by layer, so as to explain the relationship between input features and prediction results. The study also uses the light gradient boosting machine (LightGBM) and long short-term memory (LSTM) algorithms to implement the proposed approach, and the framework is general and not limited to the algorithms selected in this study. Finally, the proposed approach is compared with other methods on eight real-life event logs. The experimental results show that the proposed approach can select effective features and improve prediction accuracy. In addition, the prediction results are explained.

    参考文献
    相似文献
    引证文献
引用本文

郭娜,刘聪,李彩虹,陆婷,闻立杰,曾庆田.一种预测流程剩余时间的可解释特征分层方法.软件学报,2024,35(3):1341-1356

复制
分享
文章指标
  • 点击次数:
  • 下载次数:
  • HTML阅读次数:
  • 引用次数:
历史
  • 收稿日期:2022-07-18
  • 最后修改日期:2022-09-12
  • 录用日期:
  • 在线发布日期: 2023-05-10
  • 出版日期:
您是第位访问者
版权所有:中国科学院软件研究所 京ICP备05046678号-3
地址:北京市海淀区中关村南四街4号,邮政编码:100190
电话:010-62562563 传真:010-62562533 Email:jos@iscas.ac.cn
技术支持:北京勤云科技发展有限公司

京公网安备 11040202500063号