Reverse Engineering of Artificial Intelligence: A Survey on Reverse Intelligence
Authors:
Author biographies:

LI Changsheng (李长升, 1985-), male, Ph.D., professor, Ph.D. supervisor, CCF professional member; his research interest is machine learning. WANG Shiye (汪诗烨, 1995-), female, Ph.D. candidate, CCF student member; her research interest is machine learning. LI Yanming (李延铭, 1997-), male, master's student; his research interest is machine learning. ZHANG Chengzhe (张成喆, 2000-), male, B.S.; his research interest is artificial intelligence security. YUAN Ye (袁野, 1981-), male, Ph.D., professor, Ph.D. supervisor, CCF senior member; his research interest is databases. WANG Guoren (王国仁, 1966-), male, Ph.D., professor, Ph.D. supervisor, CCF distinguished member; his research interests include uncertain data management, data-intensive computing, visual media data analysis and management, unstructured data management, distributed query processing and optimization, and bioinformatics.

Corresponding author:

LI Changsheng (李长升), lcs@bit.edu.cn

Funding:

Excellent Young Scientists Fund of the National Natural Science Foundation of China (62122013); Key Project of the NSFC-Guangdong Joint Fund (U2001211); Young Faculty Academic Start-up Program of Beijing Institute of Technology (3070012222010)


Abstract:

In the era of big data, artificial intelligence has developed rapidly, and its representative technologies, machine learning and deep learning, have made breakthrough progress. As artificial intelligence is widely applied in real-world scenarios, its security and privacy problems have gradually been exposed and have attracted extensive attention from both academia and industry. Taking machine learning as a representative, many researchers have studied the security of models in depth from the perspectives of attack and defense and have proposed a series of methods. However, current research on machine learning security still lacks a complete theoretical framework and system architecture. This survey summarizes and analyzes existing work from the angles of reverse recovery of training data, reverse inference of model structure, and analysis of model defects, and establishes an abstract definition of reverse intelligence together with its taxonomy. On this basis, machine learning security is briefly reviewed as an application of reverse intelligence. Finally, the current challenges and future research directions of reverse intelligence are discussed. Building a theoretical system of reverse intelligence is of great theoretical significance for promoting the healthy development of artificial intelligence.

Cite this article:

LI Changsheng, WANG Shiye, LI Yanming, ZHANG Chengzhe, YUAN Ye, WANG Guoren. Reverse engineering of artificial intelligence: A survey on reverse intelligence. Journal of Software (软件学报), 2023, 34(2): 712-732 (in Chinese).

Article history:
  • Received: 2022-01-25
  • Revised: 2022-04-12
  • Available online: 2022-12-30
  • Published: 2023-02-06