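[Keywords]
[Abstract]
Federated learning is a mechanism for coordinating multiple participants to jointly train a model, which has emerged in response to the big data era and the development of artificial intelligence. It allows each participant to keep its data local, breaking down data silos while preserving the participant's control over its data. However, federated learning introduces a large number of parameter exchanges, so it is not only threatened by model users, as centralized training is, but may also be attacked by untrusted participating devices; stronger privacy measures are therefore urgently needed to protect the data held by each party. This paper analyzes and looks ahead to the research progress and trends of privacy-preserving techniques in federated learning. It briefly introduces the architectures and types of federated learning, analyzes the privacy risks faced during the federated learning process, and summarizes two attack strategies, reconstruction and inference. It then categorizes privacy-preserving techniques according to the privacy protection mechanisms used in federated learning, surveys in depth the privacy-preserving algorithms that apply these techniques, and summarizes existing protection strategies at 3 levels: central, local, and combined central and local. Finally, it discusses the challenges facing privacy preservation in federated learning and looks ahead to future directions.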
[Keywords]
[Abstract]
With the arrival of the big data era and the development of artificial intelligence, federated learning (FL) has emerged as a distributed machine learning approach. It allows multiple participants to collaboratively train a global model while keeping their training datasets on their local devices. FL was created to break down data silos while preserving the privacy and security of data. However, a large number of privacy risks remain in the parameter exchange steps, where local data is threatened not only by model users, as in centralized training, but also by dishonest participants. Stronger privacy-preserving approaches are therefore needed to protect the data held by each party. This paper surveys the research progress and trends of privacy-preserving techniques for FL. First, the architectures and types of FL are introduced; then the privacy risks and attacks are analyzed, covering reconstruction and inference strategies. The main privacy protection technologies are categorized according to their privacy-preserving mechanisms. The privacy defense strategies that apply these technologies are then presented and summarized at 3 levels: local, central, and combined local and central. Finally, the challenges and future directions of privacy preservation in federated learning are discussed.
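[CLC number]
[Funding]
National Key Research and Development Program of China (2018YFB1004401); National Natural Science Foundation of China (62072460, 62076245, 61772537, 61772536, 62172424); Beijing Natural Science Foundation (4212022); Research Funds of Renmin University of China (supported by the Fundamental Research Funds for the Central Universities) (21XNH180)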