Security and Privacy of Machine Learning Models: A Survey
Authors: JI Shou-Ling, DU Tian-Yu, LI Jin-Feng, SHEN Chao, LI Bo
About the authors:

JI Shou-Ling (1986-), male, Ph.D., research professor, Ph.D. supervisor, CCF professional member. His research interests include artificial intelligence and security, data-driven security, IoT security, software and system security, and big data analytics.
DU Tian-Yu (1996-), female, B.S. Her main research interest is artificial intelligence security.
LI Jin-Feng (1994-), male, M.S. His main research interest is artificial intelligence security.
SHEN Chao (1985-), male, Ph.D., professor, Ph.D. supervisor, CCF professional member. His research interests include network and system security, artificial intelligence, and systems engineering.
LI Bo (1987-), female, Ph.D., assistant professor, Ph.D. supervisor. Her research interests include machine learning, security and privacy, and game theory.

Corresponding author:

JI Shou-Ling, E-mail: sji@zju.edu.cn

Funding:

National Key Research and Development Program of China (2018YFB0804102); Zhejiang Provincial Natural Science Foundation of China (LR19F020003); Provincial Key Research and Development Program of Zhejiang, China (2019C01055); National Natural Science Foundation of China (61772466, U1936215, U1836202, 61822309, 61773310, U1736205)




    Abstract:

    In the era of big data, breakthroughs in the theories and technologies of deep learning, reinforcement learning, and distributed learning have provided strong support for machine learning at both the data and the algorithm level, and have driven its large-scale, industrial development. However, although machine learning models perform well in many real-world applications, they still face numerous security and privacy threats at the data, model, and application levels, and these threats are diverse, stealthy, and dynamically evolving. The security and privacy issues of machine learning have therefore attracted extensive attention from academia and industry. A large number of researchers have studied these issues in depth from the perspectives of both attack and defense, and have proposed a series of attack and defense methods. This survey reviews the security and privacy issues of machine learning, systematically summarizes the existing research, and clarifies the strengths and limitations of current work. Finally, it discusses the open challenges and potential future directions in the security and privacy of machine learning models, aiming to guide follow-up researchers in further advancing the development and application of this field.

Cite this article:

Ji SL, Du TY, Li JF, Shen C, Li B. Security and privacy of machine learning models: A survey. Journal of Software, 2021, 32(1): 41-67 (in Chinese).

History
  • Received: 2019-06-10
  • Revised: 2019-10-01
  • Published online: 2020-09-10
  • Published: 2021-01-06