一种融合节点先验信息的图表示学习方法
作者:
作者简介:

温雯(1981-),女,江西赣州人,博士,副教授,CCF专业会员,主要研究领域为数据挖掘,机器学习;郝志峰(1968-),男,博士,教授,博士生导师,主要研究领域为数据挖掘,机器学习;黄家明(1992-),男,硕士,CCF学生会员,主要研究领域为数据挖掘,机器学习;王丽娟(1978-),女,博士,副教授,CCF专业会员,主要研究领域为数据挖掘,机器学习;蔡瑞初(1983-),男,博士,教授,博士生导师,CCF高级会员,主要研究领域为因果关系,数据挖掘,机器学习.

通讯作者:

黄家明,E-mail:hjm_dmir@hotmail.com

中图分类号:

TP18

基金项目:

NSFC-广东联合基金(U1501254);国家自然科学基金(61472089,61572143,61502108)


Graph Embedding by Incorporating Prior Knowledge on Vertex Information
Author:
Fund Project:

NSFC-Guangdong Joint Found (U1501254);Natural Science Foundation of China (61472089, 61572143, 61502108)

  • 摘要
  • | |
  • 访问统计
  • |
  • 参考文献 [34]
  • |
  • 相似文献 [20]
  • |
  • 引证文献
  • | |
  • 文章评论
    摘要:

    图表示学习是实现各类图挖掘任务的基础.现实中的图数据不仅包含复杂的网络结构,还包括多样化的节点信息.如何将网络结构和节点信息更加有效地融入图的表示学习中,是一个重要的问题.为了解决这一问题,基于深度学习,提出了融合节点先验信息的图表示学习方法.该方法将节点特征作为先验知识,要求学习到的表示向量同时保持图数据中的网络结构相似性和节点特征相似性.该方法的时间复杂度为O(|V|),其中,|V|为图节点数量,表明该方法适用于大规模图数据分析.同时,在多个数据集上的实验结果表明:所提出的方法相比目前流行的几种基线方法,在分类任务上能够获得良好而稳定的优势.

    Abstract:

    Graph embedding is a fundamental technique for graph data mining. The real-world graphs not only consist of complex network structures, but also contain diverse vertex information. How to integrate the network structure and vertex information into the graph embedding procedure is a big challenge. To deal with this challenge, a graph embedding method, which is based on deep leaning technique while taking into account the prior knowledge on vertices information, is proposed in this paper. The basic idea of the proposed method is to regard the vertex features as the prior knowledge, and learn the representation vector through optimizing an objective function that simultaneously keeps the similarity of network structure and vertex features. The time complexity of the proposed method is O(|V|), where|V|is the count of vertices in the graph. This indicates the proposed method is suitable for large-scale graph analysis. Experiments on several data sets demonstrate that, compared with the state-of-art baselines, the proposed method is able to achieve favorable and stable results for the task of node classification.

    参考文献
    [1] Sen P, Namata G, Bilgic M, Getoor L, Galligher B, Eliassi-Rad T. Collective classification in network data. AI Magazine, 2008, 29(3):93.[doi:10.1609/aimag.v29i3.2157]
    [2] Zhou DY, Huang JY, Scholkopf B. Learning from labeled and unlabeled data on a directed graph. In:Raedt LD, Wrobe S, eds. Proc. of the 22nd Int'l Conf. on Machine Learning. 2005. 1036-1043.[doi:10.1145/1102351.1102482]
    [3] Bhagat S, Cormode G, Muthukrishnan S. Node classification in social networks. In:Aggarwal C, ed. Proc. of the Social Network Data Analytics. Boston:Springer-Verlag, 2011. 115-148.[doi:10.1007/978-1-4419-8462-3_5]
    [4] Tu C, Liu Z, Sun M. Inferring correspondences from multiple sources for microblog user tags. In:Huang H, Liu T, Zhang HP, Tang J, eds. Proc. of the Chinese National Conf. on Social Media Processing. Berlin, Heidelberg:Springer-Verlag, 2014. 1-12.[doi:10. 1007/978-3-662-45558-6_1]
    [5] Xing QL, Liu L, Liu YQ, Zhang M, Ma SP. Study on user tags in Weibo. Ruan Jian Xue Bao/Journal of Software, 2015,26(7):1626-1637(in Chinese with English abstract). http://www.jos.org.cn/1000-9825/4655.htm[doi:10.13328/j.cnki.jos.004655]
    [6] Bhuyan MH, Bhattacharyya DK, Kalita JK. Network anomaly detection:Methods, systems and tools. IEEE Communications Surveys & Tutorials, 2014,16(1):303-336.[doi:10.1109/SURV.2013.052213.00046]
    [7] Lü L, Zhou T. Link prediction in complex networks:A survey. Physica A:Statistical Mechanics and Its Applications, 2011,390(6):1150-1170.[doi:10.1016/j.physa.2010.11.027]
    [8] Liben-Nowell D, Kleinberg J. The link-prediction problem for social networks. Journal of the Association for Information Science and Technology, 2007,58(7):1019-1031.[doi:10.1002/asi.20591]
    [9] Yu X, Ren X, Sun Y, Gu Q, Sturt B, Khandelwal U, Norick B, Han J. Personalized entity recommendation:A heterogeneous information network approach. In:Carterette B, ed. Proc. of the 7th ACM Int'l Conf. on Web Search and Data Mining. New York:ACM Press, 2014. 283-292.[doi:10.1145/2556195.2556259]
    [10] Meng XW, Liu SD, Zhang YJ, Hu X. Research on social recommender systems. Ruan Jian Xue Bao/Journal of Software, 2015, 26(6):1356-1372(in Chinese with English abstract). http://www.jos.org.cn/1000-9825/4831.htm[doi:10.13328/j.cnki.jos.004831]
    [11] Liu ZY, Sun MS, Lin YK, Xie RB. Knowledge representation learning:A review. Ji Suan Ji Yan Jiu Yu Fa Zhan/Journal of Computer Research and Development, 2016,53(2):247-261(in Chinese with English abstract).[doi:10.7544/issn1000-1239.2016. 20160020]
    [12] Roweis ST, Saul LK. Nonlinear dimensionality reduction by locally linear embedding. Science, 2000,290(5500):2323-2326.[doi:10.1126/science.290.5500.2323]
    [13] Belkin M, Niyogi P. Laplacian eigenmaps and spectral techniques for embedding and clustering. In:Suzanna B, Sebastian T, Klaus O, eds. Proc. of the Advances in Neural Information Processing Systems. Cambridge:MIT Press, 2002. 585-591.
    [14] Chen M, Yang Q, Tang X. Directed graph embedding. In:Rajeev S, Mehta H, Bagga RK, eds. Proc. of the IJCAI. San Francisco:Morgan Kaufmann Publishers Inc., 2007. 2707-2712.
    [15] Tang L, Liu H. Relational learning via latent social dimensions. In:Osmar RZ, ed. Proc. of the 15th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining. New York:ACM Press, 2009. 817-826.[doi:10.1145/1557019.1557109]
    [16] Le TM, Lauw HW. Probabilistic latent document network embedding. In:Kumar R, ed. Proc. of the IEEE Int'l Conf. on Data Mining (ICDM). Shenzhen:IEEE, 2014. 270-279.[doi:10.1109/ICDM.2014.119]
    [17] Jacob Y, Denoyer L, Gallinari P. Learning latent representations of nodes for classifying in heterogeneous social networks. In:Carterette B, ed. Proc. of the 7th ACM Int'l Conf. on Web Search and Data Mining. New York:ACM Press, 2014. 373-382.[doi:10.1145/2556195.2556225]
    [18] Bourigault S, Lagnier C, Lamprier S, Denoyer L, Gallinari P. Learning social network embeddings for predicting information diffusion. In:Carterette B, ed. Proc. of the 7th ACM Int'l Conf. on Web Search and Data Mining. New York, 2014. 393-402.[doi:10.1145/2556195.2556216]
    [19] Nallapati RM, Ahmed A, Xing EP, Cohen WW. Joint latent topic models for text and citations. In:Li Y, Liu B, Sarawagi S, eds. Proc. of the 14th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining. New York:ACM Press, 2008. 542-550.[doi:10.1145/1401890.1401957]
    [20] Chang J, Blei D. Relational topic models for document networks. In:Dyk DV, Welling M, eds. Proc. of the Int'l Conf. on Artificial Intelligence and Statistics. Florida:PMLR, 2009. 81-88.
    [21] Le T, Lauw HW. Probabilistic latent document network embedding. In:Kumar R, ed. Proc. of the 2014 IEEE Int'l Conf. on Data Mining (ICDM). Shenzhen:IEEE, 2014. 270-279.[doi:10.1109/ICDM.2014.119]
    [22] Perozzi B, Al-Rfou R, Skiena S. Deepwalk:Online learning of social representations. In:Perlich C, Saka E, Shen D, Lee K, Li Y, eds. Proc. of the 20th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining. New York:ACM Press, 2014. 701-710.[doi:10.1145/2623330.2623732]
    [23] Grover A, Leskovec J. node2vec:Scalable feature learning for networks. In:Proc. of the 22nd ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining. ACM Press, 2016. 855-864.[doi:10.1145/2939672.2939754]
    [24] Tang J, Qu M, Wang M, Zhang M, Yan J, Mei Q. Line:Large-Scale information network embedding. In:Gangemi A, ed. Proc. of the 24th Int'l World Wide Web Conf. on Steering Committee Republic and Canton of Geneva. 2015. 1067-1077.[doi:10.1145/2736277.2741093]
    [25] Yang C, Liu Z, Zhao D, Sun M, Chang EY. Network representation learning with rich text information. In:Yang Q, Wooldridge M, eds. Proc. of the IJCAI. 2015. 2111-2117.
    [26] Yang Z, Cohen WW, Salakhutdinov R. Revisiting semi-supervised learning with graph embeddings. In:Balcan MF, Weinberger KQ, eds. Proc. of the 33rd Int'l Conf. on Machine Learning (ICML 2016). New York:JMLR.org, 2016. 40-48.
    [27] Chojnacki W, Brooks MJ. A note on the locally linear embedding algorithm. Int'l Journal of Pattern Recognition and Artificial Intelligence, 2009,23(8):1739-1752.[doi:10.1142/S0218001409007752]
    [28] Newman ME. Modularity and community structure in networks. Proc. of the National Academy of Sciences, 2006,103(23):8577-8582.[doi:10.1073/pnas.0601602103]
    [29] Iwata T, Saito K, Ueda N, Stromsten S, Griffiths TL, Tenenbaum JB. Parametrice mbedding for class visualization. Neural Computation, 2007,19(9):2536-2556.[doi:10.1162/neco.2007.19.9.2536]
    [30] Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J. Distributed representations of words and phrases and their compositionality. In:Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ, eds. Proc. of the Advances in Neural Information Processing Systems. MIT Press, 2013. 3111-3119.
    附中文参考文献:
    [5] 邢千里,刘列,刘奕群,张敏,马少平.微博中用户标签的研究.软件学报,2015,26(7):1626-1637. http://www.jos.org.cn/1000-9825/4655.htm[doi:10.13328/j.cnki.jos.004655]
    [10] 孟祥武,刘树栋,张玉洁,胡勋.社会化推荐系统研究.软件学报,2015,26(6):1356-1372. http://www.jos.org.cn/1000-9825/4831.htm[doi:10.13328/j.cnki.jos.004831]
    [11] 刘知远,孙茂松,林衍凯,谢若冰.知识表示学习研究进展.计算机研究与发展,2016,53(2):247-261.[doi:10.7544/issn1000-1239. 2016.20160020]
    引证文献
引用本文

温雯,黄家明,蔡瑞初,郝志峰,王丽娟.一种融合节点先验信息的图表示学习方法.软件学报,2018,29(3):786-798

复制
相关视频

分享
文章指标
  • 点击次数:
  • 下载次数:
  • HTML阅读次数:
  • 引用次数:
历史
  • 收稿日期:2017-07-24
  • 最后修改日期:2017-09-05
  • 在线发布日期: 2017-12-05
文章二维码
您是第位访问者
版权所有:中国科学院软件研究所 京ICP备05046678号-3
地址:北京市海淀区中关村南四街4号,邮政编码:100190
电话:010-62562563 传真:010-62562533 Email:jos@iscas.ac.cn
技术支持:北京勤云科技发展有限公司

京公网安备 11040202500063号