Recommendation Approach Based on Attentive Federated Distillation
Author: Chen Ming, Zhang Lei, Ma Tianyi
Affiliation:

CLC Number: TP18

Fund Project:

Abstract:

Data privacy protection has become one of the major challenges for recommendation systems. With the release of the Cybersecurity Law of the People's Republic of China and the General Data Protection Regulation (GDPR) in the European Union, data privacy and security have become a worldwide concern. Federated learning can train a global model without exchanging user data, thus protecting users' privacy. Nevertheless, federated learning still faces many issues, such as the small size of local data on each device, over-fitting of local models, and data sparsity, which make it difficult to reach higher accuracy. Meanwhile, with the advent of the 5G (5th-generation mobile communication technology) era, the data volume and transmission rate of personal devices are expected to be 10 to 100 times higher than today's, which demands higher model efficiency. Knowledge distillation can transfer knowledge from a teacher model to a more compact student model so that the student model approaches or even surpasses the performance of the teacher model, thereby effectively mitigating the problems of large model parameters and high communication costs. However, the accuracy of the student model after knowledge distillation is lower than that of the teacher model. Therefore, a federated distillation approach with an attention mechanism is proposed for recommendation systems. First, the method introduces Kullback-Leibler divergence and a regularization term into the objective function of federated distillation to reduce the impact of heterogeneity between the teacher network and the student network. Then, it introduces a multi-head attention mechanism that enriches the embeddings with additional information to improve model accuracy. Finally, an improved adaptive training mechanism for the learning rate is introduced to automatically switch optimizers and choose appropriate learning rates, thus increasing the convergence speed of the model. Experimental results validate the efficiency of the proposed method: compared with the baselines, the training time of the proposed model is reduced by 52%, the accuracy is increased by 13%, the average error is reduced by 17%, and the NDCG is increased by 10%.
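    As a rough illustration of the distillation objective described in the abstract, the sketch below combines a supervised task loss with a Kullback-Leibler term against the teacher's softened outputs and a regularization term over the student parameters. The temperature T, the weighting coefficients alpha and beta, and the PyTorch framing are illustrative assumptions, not the paper's exact formulation.

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, targets,
                          student_params, T=2.0, alpha=0.5, beta=1e-4):
        # Hard-label task loss for the student model.
        task_loss = F.cross_entropy(student_logits, targets)
        # KL divergence between the softened teacher and student
        # distributions, scaled by T^2 as is conventional in distillation.
        kl_loss = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T * T)
        # L2 regularization over the student parameters, intended to curb
        # over-fitting on small local datasets.
        reg = sum(p.pow(2).sum() for p in student_params)
        return task_loss + alpha * kl_loss + beta * reg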

Get Citation

Chen Ming, Zhang Lei, Ma Tianyi. Recommendation approach based on attentive federated distillation. Journal of Software, 2021, 32(12): 3852-3868 (in Chinese).

History
  • Received: January 18, 2020
  • Revised: April 18, 2020
  • Accepted:
  • Online: December 02, 2021
  • Published: December 06, 2021