Supported by the National Natural Science Foundation of China (62272103, 62272102, 61872090)
The training of high-precision federated learning models consumes a large amount of users’ local resources, and participating users can gain illegal profits by selling the jointly trained model without authorization. To protect the property rights of federated learning models, this study proposes a federated learning watermark based on backdoor (FLWB) scheme, which exploits the property of deep learning backdoors that they preserve main-task accuracy while causing misclassification only on a small set of trigger samples. FLWB allows each participating user to embed a private watermark in its local model; the model aggregation in the cloud then maps these private backdoor watermarks into the global model as the global watermark of the federated learning system. A stepwise training method is further designed to strengthen the expression of the private backdoor watermarks in the global model, so that FLWB can accommodate the private watermarks of all participating users without degrading the accuracy of the global model. Theoretical analysis proves the security of FLWB, and experiments verify that, with the stepwise training method, the global model can effectively accommodate the participants’ private watermarks at a cost of only 1% main-task accuracy. Finally, FLWB is tested against model compression and fine-tuning attacks. The results show that more than 80% of the watermarks are retained even when the model is compressed to 30% of its original size, and more than 90% are retained under four different fine-tuning attacks, which demonstrates the excellent robustness of FLWB.
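The two mechanisms the abstract describes, namely server-side aggregation carrying each client's private backdoor watermark into the global model, and trigger-set verification of watermark retention, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation; all function names, the toy weights, and the stub predictor are hypothetical.

```python
def fedavg(client_weights):
    """Server-side FedAvg aggregation: elementwise mean of the
    clients' model weights. This averaging step is what maps each
    client's privately embedded backdoor watermark into the global
    model (hypothetical flat-list representation of weights)."""
    n = len(client_weights)
    return [sum(ws) / n for ws in zip(*client_weights)]

def watermark_retention(predict, trigger_set):
    """Fraction of a client's trigger samples still classified as
    the watermark target label -- the retention measure used to
    report robustness against compression/fine-tuning attacks."""
    hits = sum(1 for x, target in trigger_set if predict(x) == target)
    return hits / len(trigger_set)

# Toy demo: three clients, each holding a two-parameter "model"
# already fine-tuned on its own main-task data plus trigger set.
clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
global_model = fedavg(clients)  # -> [3.0, 4.0]

# Stub predictor standing in for the aggregated global model:
# even inputs are (hypothetically) trigger samples hitting the
# watermark label, odd inputs fall through to a clean label.
predict = lambda x: "wm" if x % 2 == 0 else "clean"
triggers = [(0, "wm"), (2, "wm"), (3, "wm")]
print(watermark_retention(predict, triggers))  # -> 2 of 3 retained
```

In the full scheme each client would obtain its local weights by training on main-task data mixed with its private trigger set, and verification would run the real global model over the trigger samples instead of a stub.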