Feature Map Poisoning Attack and Dual Defense Mechanism for Federated Prototype Learning

Authors: Wang RJ, Wang JB, Zhang FL, Li JW, Li ZP, Chen T

CLC number: TP309

Funding: National Key R&D Program of China (2022YFB4501200, 2022YFB3304303); National Natural Science Foundation of China (62271128, 61972073, U2333207); Chengdu Key R&D Support Program "Open Competition" Project (2022-JB00-00013-GX); Key R&D Projects of the Sichuan Science and Technology Program (2022ZDZX0004, 2023YFG0029, 2023YFG0150, 2022YFG0212, 2021YFS0391); "Open Competition" Projects of the Sichuan Science and Technology Program (2023YFG0374, 2023YFG0373); Natural Science Foundation of Shandong Province (ZR2023MF045)
    Abstract:

    Federated learning is a framework for collaboratively training a global machine learning model through distributed, iterative cooperation without requiring users to share their private data. FedProto, a popular federated learning method, aggregates abstract class prototypes (termed feature maps) to improve model convergence speed and generalization ability. However, it does not verify the correctness of the aggregated feature maps, and incorrect feature maps can cause model training to fail. This study therefore first explores a feature map poisoning attack against FedProto, demonstrating that an attacker can reduce the model's inference accuracy by up to 81.72% simply by shuffling the labels of its training data. To resist this attack, a dual defense mechanism is further proposed, which filters out incorrect feature maps through full knowledge distillation and feature map screening, respectively. Experiments on real-world datasets show that the defense mechanism improves the inference accuracy of an attacked model by a factor of 1 to 5 while adding only 2% to the system's running time.
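The label-shuffling attack described in the abstract can be sketched in a few lines. This is a toy reconstruction, not FedProto's actual implementation: the data, class centers, and function names are all synthetic, and "feature maps" are modeled simply as per-class mean feature vectors. It shows why shuffled labels are so damaging — each poisoned prototype collapses toward the global feature mean, and averaging it into the server's prototype pulls every class away from its true center.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_CLASSES, DIM = 3, 4

def local_prototypes(features, labels):
    """Per-class mean feature vectors -- the 'feature maps' a client uploads."""
    return {int(c): features[labels == c].mean(axis=0) for c in np.unique(labels)}

def aggregate(prototype_sets):
    """Server side: average each class prototype across clients."""
    pooled = {}
    for protos in prototype_sets:
        for c, p in protos.items():
            pooled.setdefault(c, []).append(p)
    return {c: np.mean(ps, axis=0) for c, ps in pooled.items()}

# Toy client data: samples of class c cluster tightly around a fixed center.
centers = np.eye(NUM_CLASSES, DIM) * 5.0
labels = rng.integers(0, NUM_CLASSES, size=300)
features = centers[labels] + rng.normal(scale=0.1, size=(300, DIM))

honest = local_prototypes(features, labels)                     # near the true centers
poisoned = local_prototypes(features, rng.permutation(labels))  # labels shuffled

# Shuffling decouples features from classes, so every poisoned prototype
# drifts toward the global feature mean; one bad client skews every class.
server = aggregate([honest, poisoned])
for c in range(NUM_CLASSES):
    print(c,
          round(float(np.linalg.norm(honest[c] - centers[c])), 3),
          round(float(np.linalg.norm(poisoned[c] - centers[c])), 3))
```

The attacker never touches the model weights or the aggregation protocol — corrupting only its local labels is enough, which is what makes the attack cheap to mount and hard to detect at the server.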

    Abstract:

    Federated learning, a framework for training global machine learning models through distributed iterative collaboration without sharing private data, has gained wide adoption. FedProto, a widely used federated learning approach, employs abstract class prototypes, termed feature maps, to enhance model convergence speed and generalization capacity. However, this approach overlooks verifying the accuracy of the aggregated feature maps, risking model training failures due to incorrect feature maps. This study investigates a feature map poisoning attack on FedProto, revealing that malicious actors can degrade inference accuracy by up to 81.72% merely by shuffling the training data labels. To counter such attacks, we propose a dual defense mechanism that combines full knowledge distillation with feature map validation. Experimental results on real-world datasets demonstrate that this defense strategy can enhance the compromised model's inference accuracy by a factor of 1 to 5, with only a marginal 2% increase in running time.
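The feature map validation half of the defense can be illustrated with a simple robust-statistics rule. The paper's exact discriminator is not reproduced here; the sketch below is an assumed stand-in that rejects any client's prototype whose distance to the cross-client median is a robust-z outlier, then averages only the survivors. All names, thresholds, and the toy data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 4
center = np.array([5.0, 0.0, 0.0, 0.0])  # true class prototype (toy setup)

# Five honest clients report prototypes near the center; one poisoned client
# reports a collapsed prototype (the effect of label shuffling).
protos = np.stack([center + rng.normal(scale=0.05, size=DIM) for _ in range(5)]
                  + [np.full(DIM, 2.0)])

def screen(stack, z=3.0):
    """Reject prototypes whose distance to the cross-client median is a
    robust-z outlier (MAD-based); average the rest. Illustrative rule only."""
    med = np.median(stack, axis=0)                 # coordinate-wise median
    d = np.linalg.norm(stack - med, axis=1)        # distance of each client to it
    mad = np.median(np.abs(d - np.median(d))) + 1e-9
    keep = np.abs(d - np.median(d)) / mad <= z     # robust z-score cutoff
    return stack[keep].mean(axis=0), keep

screened, keep = screen(protos)
naive = protos.mean(axis=0)  # what plain averaging would accept
```

The design point is that the honest majority anchors the median, so a single poisoned prototype cannot shift the reference point the way it shifts a plain mean; screening before averaging is what keeps the aggregated feature map near the true center.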

Cite this article:

Wang RJ, Wang JB, Zhang FL, Li JW, Li ZP, Chen T. Feature map poisoning attack and dual defense mechanism for federated prototype learning. Ruan Jian Xue Bao/Journal of Software, 2025, 36(3): 1355–1374 (in Chinese).
History
  • Received: 2023-09-14
  • Revised: 2024-01-12
  • Published online: 2024-11-20