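Signed Network Prediction Method Based on the Client-to-Client Distributed Framework
Author biographies:

ZHAO Kankan (赵衎衎, born 1991), male, from Weinan, Shaanxi, Ph.D. candidate; research interests: recommender systems and big data analysis. LI Cuiping (李翠平, born 1971), female, Ph.D., professor, doctoral supervisor, CCF distinguished member; research interests: social network analysis, social recommendation, and big data analysis and mining. ZHANG Jing (张静, born 1984), female, Ph.D., lecturer, CCF professional member; research interests: data mining and social network mining. CHEN Hong (陈红, born 1965), female, Ph.D., professor, doctoral supervisor, CCF distinguished member; research interests: database technology and high-performance computing on new hardware platforms. ZHANG Liangfu (张良富, born 1991), male, Ph.D. candidate; research interests: machine learning and data mining.

Corresponding author:

LI Cuiping, E-mail: licuiping@ruc.edu.cn

CLC number:

TP311

Fund project:

National Natural Science Foundation of China (61772537, 61772536, 61702522, 61532021); National Key Research and Development Program of China (2016YFB1000702)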

Abstract:

Links in a social network can be divided into positive and negative relationships according to their underlying meaning. Labeling each link with a plus or minus sign yields a signed network. Signed networks are widely used in sociology, informatics, biology, and other fields, and predicting the sign of a link has become an active research topic. As signed networks grow in the big-data setting, the scalability of sign prediction algorithms becomes an increasingly prominent problem. Distributed sign prediction methods have partly eased it, but most of them rely on a server/client distributed framework, so the problem has not been fundamentally solved. This paper proposes a client-to-client distributed framework (C2CDF). In contrast to the centralized communication of the traditional server/client architecture, all nodes in C2CDF are peers that communicate with one another directly, with no server node, which relieves the bandwidth bottleneck of the cluster. Experiments on three real-world datasets, covering sign prediction in signed social networks, advertisement click-through rate prediction, and forest type prediction, show that C2CDF generalizes beyond sign prediction to other prediction tasks. On all three datasets, C2CDF outperforms a factorization machine (FM) trained with the traditional SGD algorithm. It also achieves a 2.3x to 3.3x speed-up over the method implemented under the server/client framework while attaining higher accuracy.

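Cite this article

Zhao KK, Zhang J, Zhang LF, Li CP, Chen H. Signed network prediction method based on the client-to-client distributed framework. Ruan Jian Xue Bao/Journal of Software (软件学报), 2018,29(3):614-626 (in Chinese with English abstract).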

History
  • Received: 2017-07-31
  • Last revised: 2017-09-05
  • Published online: 2017-12-05