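Counterfactual Interpretation Method for Database Configuration Optimization
Authors: Zhu Xiao (朱霄), Shao Xinyue (邵心玥), Zhang Yan (张岩), Wang Hongzhi (王宏志)
Author biographies:

Zhu Xiao (born 2000), male, master's student; research interests: machine learning interpretability, counterfactual explanation, and intelligent databases. Shao Xinyue (born 1996), female, PhD student; research interests: interpretability of black-box algorithms and counterfactual explanation. Zhang Yan (born 1965), male, associate professor, CCF senior member; research interests: databases, information usability management, and algorithm theory. Wang Hongzhi (born 1978), male, PhD, professor, doctoral supervisor, CCF distinguished member; research interests: database management systems and big data analysis and governance.

Corresponding author:

Wang Hongzhi, E-mail: wangzh@hit.edu.cn

CLC number:

TP311

Funding:

National Natural Science Foundation of China (62232005); Sichuan Science and Technology Program (2020YFSY0069)
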
    Abstract:

    Database performance depends on configuration parameters, and the quality of the parameter settings is directly reflected in performance, so the quality of the tuning method is critical. Traditional database tuning methods, however, have many limitations: for example, they cannot make full use of historical tuning data, and they consume considerable time and human effort. A counterfactual interpretation method makes small modifications to the original data so that the original prediction changes to the desired prediction; in effect, it offers a suggestion. This property can be applied to database configuration optimization: small modifications to a database configuration can improve database performance. This study therefore proposes a counterfactual interpretation method for database configuration optimization. For a database that performs poorly under a specific workload, the method modifies the configuration and generates corresponding configuration counterfactuals that optimize performance. Two kinds of experiments are conducted, one to evaluate the quality of the counterfactual interpretation method and one to verify its effect on database optimization. The results show that, taking all evaluation indicators together, the proposed method outperforms other classic counterfactual interpretation methods and that the generated counterfactuals effectively improve database performance.
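
    The general idea of a configuration counterfactual can be illustrated with a minimal sketch. This is not the method proposed in the paper: the surrogate `predict_throughput`, the knob names, and the candidate values are all hypothetical stand-ins for a performance model learned from historical tuning data. The greedy search changes at most one knob per step and stops as soon as the predicted throughput reaches the target, so the counterfactual stays close to the original configuration.

```python
from typing import Callable, Dict, List

Config = Dict[str, float]

# Hypothetical knobs and candidate values; not taken from the paper.
CANDIDATES: Dict[str, List[float]] = {
    "buffer_pool_mb": [128, 256, 512, 1024],
    "max_connections": [50, 100, 200],
    "log_file_mb": [48, 96, 192],
}


def predict_throughput(cfg: Config) -> float:
    """Toy surrogate for a learned performance model (an assumption)."""
    return (
        0.5 * cfg["buffer_pool_mb"]
        + 1.0 * cfg["max_connections"]
        + 0.25 * cfg["log_file_mb"]
    )


def greedy_counterfactual(
    original: Config,
    model: Callable[[Config], float],
    target: float,
) -> Config:
    """Change one knob per step until the predicted value reaches target."""
    current = dict(original)
    untouched = set(CANDIDATES)
    while model(current) < target and untouched:
        best_cfg, best_knob, best_score = None, None, model(current)
        for knob in sorted(untouched):
            for value in CANDIDATES[knob]:
                trial = dict(current, **{knob: value})
                score = model(trial)
                if score > best_score:
                    best_cfg, best_knob, best_score = trial, knob, score
        if best_cfg is None:  # no single change improves the prediction
            break
        current = best_cfg
        untouched.discard(best_knob)
    return current


def changed_knobs(original: Config, counterfactual: Config) -> List[str]:
    """List the knobs whose values differ (a simple sparsity measure)."""
    return sorted(k for k in original if original[k] != counterfactual[k])


original = {"buffer_pool_mb": 128, "max_connections": 100, "log_file_mb": 48}
cf = greedy_counterfactual(original, predict_throughput, target=600.0)
print(changed_knobs(original, cf), predict_throughput(cf))
```

    With these toy numbers the search changes only `buffer_pool_mb`. A real setting would replace the linear surrogate with a trained model and would likely add a distance term so that small value changes are preferred over large ones.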

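Cite this article

Zhu X, Shao XY, Zhang Y, Wang HZ. Counterfactual interpretation method for database configuration optimization. Journal of Software, 2024, 35(9): 4469–4492 (in Chinese with English abstract).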

History
  • Received: 2022-11-14
  • Last revised: 2023-02-15
  • Published online: 2023-10-18
  • Publication date: 2024-09-06