Survey on Convolutional Neural Network Interpretability (卷积神经网络的可解释性研究综述)
Authors: 窦慧 (Dou Hui), 张凌茗 (Zhang Lingming), 韩峰 (Han Feng), 申富饶 (Shen Furao), 赵健 (Zhao Jian)

Author biographies:

窦慧 (Dou Hui, 1992-), female, Ph.D. candidate; main research interests: neural networks and data visualization. 张凌茗 (Zhang Lingming, 1997-), male, master's student; main research interests: neural networks and computer vision. 韩峰 (Han Feng, 1994-), male, Ph.D. candidate; main research interests: computer vision and multimodal learning. 申富饶 (Shen Furao, 1973-), male, Ph.D., professor, doctoral supervisor, and CCF Distinguished Member; main research interests: neural computing and robot intelligence. 赵健 (Zhao Jian, 1979-), male, Ph.D., associate professor; main research interests: communication networks and neural computing.

Corresponding authors:

申富饶 (Shen Furao), E-mail: frshen@nju.edu.cn; 赵健 (Zhao Jian), E-mail: jianzhao@nju.edu.cn

Funding:

Science and Technology Innovation 2030 Major Project of the Ministry of Science and Technology of China (2021ZD0201300); National Natural Science Foundation of China (61876076)



Abstract:

With the increasingly powerful performance of neural network models, they are widely used to solve various computer-related tasks and show excellent capabilities. However, a clear understanding of the operating mechanism of neural network models is still lacking. This study therefore reviews and summarizes current research on the interpretability of neural networks and discusses in detail the definition, necessity, classification, and evaluation of model interpretability research. Starting from the focus of interpretation algorithms, a new classification method for neural network interpretation algorithms is proposed, which provides a novel perspective for understanding neural networks. According to the proposed classification, this study sorts out the current interpretable methods for convolutional neural networks and comparatively analyzes the characteristics of interpretation algorithms in different categories. Moreover, it introduces the evaluation principles and methods of common interpretation algorithms and outlines the research directions and applications of interpretable neural networks. Finally, the challenges confronting interpretable neural networks are discussed, and possible directions for addressing them are given.
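As a concrete illustration of the kind of per-sample interpretation algorithm such a survey covers, the sketch below computes a vanilla gradient saliency map for an off-the-shelf CNN classifier: the gradient of the predicted class score with respect to the input image indicates which pixels most influence the prediction. This is an illustrative example only, not a method proposed in the paper; the pretrained ResNet-18 and the input file name are assumptions.

    # Minimal sketch (PyTorch): gradient-based saliency for a CNN image classifier.
    # The pretrained ResNet-18 and the file "example.jpg" are illustrative assumptions.
    import torch
    from torchvision import models, transforms
    from PIL import Image

    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    model.eval()

    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    # Hypothetical input image; any RGB picture works.
    x = preprocess(Image.open("example.jpg").convert("RGB")).unsqueeze(0)
    x.requires_grad_(True)

    logits = model(x)                      # shape (1, 1000): ImageNet class scores
    cls = logits.argmax(dim=1).item()      # index of the predicted class
    logits[0, cls].backward()              # d(predicted class score) / d(input pixels)

    # Saliency map: per-pixel maximum absolute gradient over the color channels.
    saliency = x.grad.abs().max(dim=1).values.squeeze(0)   # shape (224, 224)

Overlaying a heatmap of this saliency tensor on the input is the visual explanation produced by the earliest gradient-based methods; gradient-based explanations of this kind form one of the method families such surveys typically organize, alongside activation-based (e.g., CAM/Grad-CAM) and perturbation-based approaches.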

Cite this article:

Dou H, Zhang LM, Han F, Shen FR, Zhao J. Survey on convolutional neural network interpretability. Ruan Jian Xue Bao/Journal of Software, 2024, 35(1): 159-184 (in Chinese with English abstract).

History:
  • Received: 2022-01-20
  • Revised: 2022-04-01
  • Published online: 2023-02-22
  • Published: 2024-01-06