Click-through Rate Prediction Based on Deep Belief Nets and Its Optimization

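About the authors:

Chen Jiehao (陈杰浩, 1984-), male, born in Chaozhou, Guangdong, Ph.D., senior experimentalist; research interests: complex information systems and big data applications. Zhang Qin (张钦, 1994-), male, M.S.; research interest: computational advertising. Wang Shuliang (王树良, 1974-), male, postdoctoral, professor, doctoral supervisor; research interest: spatial data mining. Shi Jiyun (史继筠, 1991-), female, M.S.; research interest: computational advertising. Zhao Ziqian (赵子芊, 1993-), male, M.S.; research interest: computational advertising.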

Corresponding author:

Chen Jiehao, E-mail: cjh@bit.edu.cn

Chinese Library Classification (CLC) number:

TP18

    Abstract:

    With the rapid development of Internet advertising, predicting a target user's click-through rate (CTR) on Internet advertisements has become a key technology for precise ad recommendation and delivery, as well as a hot topic both in computational advertising and in the application of deep neural networks. To improve the accuracy of CTR prediction, this work proposes a prediction model based on deep belief nets (DBN) and studies how the number of hidden layers and the number of units in each layer affect the prediction results, using experiments on 10 million randomly selected samples from a dataset provided by the Kaggle data mining platform. To address the training efficiency of deep belief nets in large-scale industrial solutions, the experiments show that, in CTR prediction, the loss function of a deep belief net contains a large number of stationary points and that these stationary points severely degrade training efficiency. To improve efficiency, starting from the characteristics of the network loss function, this work further proposes a fusion algorithm that combines stochastic gradient descent with an improved particle swarm optimization algorithm to optimize network training. When the iteration step size falls below a threshold, the fusion algorithm can jump off the stationary plateau and resume normal iteration. The experimental results show that, compared with the traditional CTR prediction model based on gradient boosting decision trees and logistic regression, and with a fuzzy deep neural network model, the DBN-based model achieves better prediction accuracy, improving mean squared error, area under the curve (AUC), and logarithmic loss (LogLoss) by 2.39%, 9.70%, and 2.46%, and by 1.24%, 7.61%, and 1.30%, respectively. Training the deep belief net with the fusion method improves training efficiency by 30%~70%.
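    The three evaluation metrics named in the abstract (mean squared error, area under the ROC curve, and logarithmic loss) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy click labels, and the predicted probabilities are hypothetical and do not come from the paper, and the pairwise AUC is a simple O(n^2) formulation, not necessarily the implementation the authors used.

```python
import math


def mean_squared_error(y_true, p_pred):
    """Average squared gap between predicted click probability and the 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def log_loss(y_true, p_pred, eps=1e-15):
    """Binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def auc(y_true, p_pred):
    """Probability that a random clicked sample is scored above a random
    non-clicked one (ties count as half). Pairwise, so O(n^2): fine for a
    toy example, too slow for millions of samples."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                wins += 1.0
            elif pp == pn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = clicked, 0 = not clicked.
clicks = [1, 0, 0, 1, 0]
preds = [0.8, 0.3, 0.6, 0.4, 0.1]

print("MSE:    ", mean_squared_error(clicks, preds))
print("AUC:    ", auc(clicks, preds))
print("LogLoss:", log_loss(clicks, preds))
```

    Lower is better for MSE and LogLoss, while higher is better for AUC, so an "improvement" in the reported percentages means a decrease for the first and third metrics and an increase for the second.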

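Cite this article

Chen Jiehao, Zhang Qin, Wang Shuliang, Shi Jiyun, Zhao Ziqian. Click-through rate prediction based on deep belief nets and its optimization. Journal of Software (软件学报), 2019, 30(12): 3665-3682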

History
  • Received: 2018-06-22
  • Last revised: 2018-08-10
  • Published online: 2019-12-05