Abstract: Deep learning models are integral components of artificial intelligence systems and are widely deployed in critical real-world scenarios. Research has shown that the low transparency and weak interpretability of deep learning models make them highly sensitive to perturbations, exposing artificial intelligence systems to multiple security threats, among which backdoor attacks on deep learning models are a significant concern. This study provides a comprehensive overview of research progress on backdoor attacks and defenses in mainstream deep learning systems, including computer vision and natural language processing. Based on the attacker's capabilities, backdoor attacks are categorized into full-process controllable backdoors, model-modification backdoors, and data-poisoning backdoors, which are further classified by backdoor construction method. Defense strategies are divided into input-based defenses and model-based defenses according to the target of the defensive measures. This study also summarizes commonly used datasets and evaluation metrics in this domain. Finally, open challenges in backdoor attack and defense research are discussed, along with recommendations and future directions focusing on security application scenarios of backdoor attacks and the efficacy of defense mechanisms.