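[Keywords (translated from Chinese)]
[Abstract (translated from Chinese)]
In social networks, predicting message outbreaks falls within the analysis of popularity dynamics and is one of the research hotspots in social computing. By using a deep recurrent neural network to model the propagation process of social messages, this paper proposes the SMOP (social messages outbreak prediction model based on recurrent neural network) model. Compared with traditional machine-learning-based models, SMOP directly models the arrival process of message retweets, avoiding the tedious feature engineering of traditional methods. Compared with models based on stochastic point processes, SMOP automatically learns the rate function of the message propagation process and does not require a manually defined characteristic function for the propagation rate, so it adapts well to different data scenarios. In addition, SMOP adopts time vectors and user vectors as its input representation, building the periodicity of time and users' interest preferences into the propagation process, which improves its prediction performance. Experimental results on Twitter and Sina Weibo data sets both show that SMOP has excellent data adaptability and can predict, in the early stage of message propagation (0.5 h), whether a social message will break out with a high F1 score, which verifies the effectiveness of the model.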
[Keywords]
[Abstract]
Outbreak prediction for messages in social networks is part of the analysis of popularity dynamics and is an active research topic in social computing. This study proposes a social messages outbreak prediction model based on recurrent neural network (SMOP), which uses a deep recurrent neural network to model the message propagation process. Compared with traditional machine-learning models, SMOP directly models the arrival process of message retweets and so avoids the tedious feature engineering those methods require. Compared with point-process models, SMOP learns the rate function of the propagation process automatically, without a manually defined rate function, which makes it adaptable to a variety of data scenarios. Moreover, time vectors and user vectors, which capture the periodicity of time and users' interest preferences, are used as input to improve outbreak prediction performance. Experimental results on real-world data sets from Twitter and Sina Weibo show that SMOP adapts well to different data and can predict, early in a message's spread (within 0.5 h), whether the message will break out with a high F1 score.
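[CLC number]
[Funding]
National Basic Research Program of China (973 Program) (2012CB316303, 2014CB340401); National High Technology Research and Development Program of China (863 Program) (2015AA015803, 2014AA015204); Key Deployment Project of the Chinese Academy of Sciences (KGZD-EW-T03-2); National Natural Science Foundation of China (61232010, 61572473, 61303156, 61502447); National 242 Information Security Program (2015F028); Shandong Province Independent Innovation and Achievement Transformation Special Project (2014CGZH1103); European Union Seventh Framework Programme (FP7) (PIRSES-GA-2012-318939)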
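
The abstract describes feeding the retweet arrival process, together with a periodic time representation, into a recurrent network and scoring early (0.5 h) outbreak predictions by F1. The following is a minimal sketch of that data preparation and metric, not the SMOP implementation: the sine/cosine time-of-day encoding, the function names, and the use of Unix-second timestamps are all assumptions made for illustration, and the user vector and the network itself are omitted.

```python
import math
from typing import List, Sequence, Tuple

OBSERVATION_WINDOW_S = 0.5 * 3600  # early window named in the abstract (0.5 h)
SECONDS_PER_DAY = 24 * 3600


def time_vector(timestamp_s: float) -> Tuple[float, float]:
    """Encode time of day as a point on the unit circle.

    Assumption: this cyclic encoding only stands in for the paper's
    time vector, whose actual form is not given in the abstract.
    """
    angle = 2.0 * math.pi * (timestamp_s % SECONDS_PER_DAY) / SECONDS_PER_DAY
    return (math.sin(angle), math.cos(angle))


def early_retweet_sequence(
    post_time_s: float, retweet_times_s: Sequence[float]
) -> List[Tuple[float, float, float]]:
    """Build one (inter-arrival gap, sin, cos) step per retweet in the window."""
    steps = []
    previous = post_time_s
    for t in sorted(retweet_times_s):
        # Keep only retweets that arrive inside the early observation window.
        if t < post_time_s or t - post_time_s > OBSERVATION_WINDOW_S:
            continue
        sin_t, cos_t = time_vector(t)
        steps.append((t - previous, sin_t, cos_t))
        previous = t
    return steps


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for binary outbreak labels (1 = outbreak, 0 = no outbreak)."""
    tp = sum(1 for a, p in zip(y_true, y_pred) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(y_true, y_pred) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(y_true, y_pred) if a == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Each step of the resulting sequence would be one input to a recurrent network; a retweet arriving after the 0.5 h window is simply dropped, so the model only ever sees the early part of the cascade.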