Abstract: Data privacy protection has become one of the major challenges for recommendation systems. With the release of the Cybersecurity Law of the People's Republic of China and the General Data Protection Regulation (GDPR) in the European Union, data privacy and security have become a worldwide concern. Federated learning can train a global model without exchanging user data, thus protecting users' privacy. Nevertheless, federated learning still faces many issues, such as the small size of local data on each device, over-fitting of local models, and data sparsity, which make it difficult to reach high accuracy. Meanwhile, with the advent of the 5G (5th generation mobile communication technology) era, the data volume and transmission rate of personal devices are expected to be 10 to 100 times higher than current levels, which demands higher model efficiency. Knowledge distillation can transfer knowledge from a teacher model to a more compact student model so that the student model can approach or even surpass the performance of the teacher model, thus effectively alleviating the problems of large model size and high communication cost. However, after knowledge distillation the accuracy of the student model is typically lower than that of the teacher model. Therefore, a federated distillation approach with an attention mechanism is proposed for recommendation systems. First, the method introduces a Kullback-Leibler divergence term and a regularization term into the objective function of federated distillation to reduce the impact of heterogeneity between the teacher network and the student network; then it introduces a multi-head attention mechanism to improve model accuracy by enriching the embeddings with additional information. Finally, an improved adaptive learning-rate training mechanism is introduced to automatically switch optimizers and choose appropriate learning rates, thus increasing the convergence speed of the model.
Experimental results validate the effectiveness of the proposed method: compared with the baselines, the training time of the proposed model is reduced by 52%, the accuracy is increased by 13%, the average error is reduced by 17%, and the NDCG is increased by 10%.
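The distillation objective outlined in the abstract can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the temperature `temperature`, the weighting coefficients `alpha` and `reg_lambda`, and all function names are illustrative assumptions; it combines a hard-label loss, a Kullback-Leibler divergence term toward the teacher's softened outputs, and an L2 regularization term on the student's parameters.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete probability distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def distillation_loss(student_logits, teacher_logits, hard_loss, student_params,
                      temperature=2.0, alpha=0.5, reg_lambda=1e-4):
    """Sketch of a distillation objective with a KL term and L2 regularization.

    All hyperparameters here are assumed defaults for illustration; the
    paper's exact weighting and regularizer may differ.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL term pulls the student's softened distribution toward the teacher's;
    # the T^2 factor is the standard gradient-scale correction.
    soft_loss = kl_divergence(p_teacher, p_student) * temperature ** 2
    # L2 regularization on the student's parameters.
    reg = reg_lambda * sum(w * w for w in student_params)
    return alpha * soft_loss + (1 - alpha) * hard_loss + reg
```

When the student's outputs match the teacher's, the KL term vanishes and only the hard-label loss and the regularizer remain, which is the intended behavior: the soft term penalizes heterogeneity between the two networks' output distributions.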