Heterogeneous Network Representation Learning Method Fusing Mutual Information and Multiple Meta-paths
Authors:
Author biographies:

贾霄生 (1996–), male, master's student; his research interests include academic network analysis and heterogeneous information network representation learning. 赵中英 (1983–), female, Ph.D., associate professor, doctoral supervisor, senior member of CCF; her research interests include network representation learning and social network analysis and mining. 李超 (1984–), male, Ph.D., associate professor, doctoral supervisor, senior member of CCF; his research interests include heterogeneous graph neural network analysis, natural language processing, and representation learning. 栾文静 (1987–), female, Ph.D., lecturer; her research interests include location-based social networks, recommender systems, and machine learning. 梁永全 (1983–), male, Ph.D., professor, doctoral supervisor, senior member of CCF; his research interests include distributed artificial intelligence, data mining, machine learning, and intelligent multimedia information processing.

Corresponding author:

赵中英, Email: zzysuin@163.com

CLC number:

TP18

Funding:

National Natural Science Foundation of China (62072288, 61702306)


Abstract:

Heterogeneous information networks can model many complex real-world application scenarios, and their representation learning has received extensive attention from researchers. Most existing heterogeneous network representation learning methods capture the structural and semantic information of the network through meta-paths and have achieved good results in downstream network analysis tasks. However, such methods ignore the internal node information of meta-paths and the differing importance of meta-path instances, and they capture only local node information. This study therefore proposes a heterogeneous network representation learning method that fuses mutual information and multiple meta-paths. First, an intra-meta-path encoding scheme called relational rotation encoding captures the structural and semantic information of the heterogeneous information network from adjacent nodes and meta-path context nodes, and an attention mechanism models the importance of each meta-path instance. Then, an unsupervised heterogeneous network representation learning method fusing mutual information maximization with multiple meta-paths is proposed, in which mutual information captures global information as well as the relationship between global and local information. Finally, experiments on two real-world datasets compare the proposed method with current mainstream algorithms. The results show that it improves performance on both node classification and clustering tasks and remains competitive even against some semi-supervised algorithms.
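The abstract names two technical components: a relational rotation encoding of meta-path instances with instance-level attention, and an unsupervised objective based on mutual information maximization. The two sketches below illustrate one plausible way to realize them. They are assumptions, not the paper's implementation: all identifiers (complex_rotate, encode_instance, attend_instances, MutualInfoObjective) are hypothetical, PyTorch is assumed, and the formulations follow the widely used RotatE-style rotation and Deep-Graph-Infomax-style objective that such methods typically build on.

```python
import torch
import torch.nn.functional as F

def complex_rotate(x, r):
    # Treat each vector as complex numbers: first half real part, second half
    # imaginary part. Element-wise complex multiplication rotates x by r.
    x_re, x_im = x.chunk(2, dim=-1)
    r_re, r_im = r.chunk(2, dim=-1)
    return torch.cat([x_re * r_re - x_im * r_im,
                      x_re * r_im + x_im * r_re], dim=-1)

def encode_instance(node_embs, rel_embs):
    # node_embs: [L, d] embeddings of the L nodes on one meta-path instance.
    # rel_embs:  [L-1, d] embeddings of the edge types (relations) along it.
    state = node_embs[0]
    states = [state]
    for i in range(1, node_embs.size(0)):
        # Rotate the accumulated state by the incoming relation, add the node.
        state = node_embs[i] + complex_rotate(state, rel_embs[i - 1])
        states.append(state)
    return torch.stack(states).mean(dim=0)        # instance embedding, [d]

def attend_instances(instance_embs, target_emb, att_vec):
    # instance_embs: [K, d] embeddings of the K instances ending at the target
    # node; target_emb: [d]; att_vec: learnable [2d] attention vector.
    scores = F.leaky_relu(
        torch.cat([target_emb.expand_as(instance_embs), instance_embs],
                  dim=-1) @ att_vec)
    alpha = torch.softmax(scores, dim=0)          # importance of each instance
    return (alpha.unsqueeze(-1) * instance_embs).sum(dim=0)
```

For the unsupervised part, a minimal mutual-information objective in the Deep Graph Infomax family could look as follows, again only a sketch under the same assumptions: a readout compresses the node embeddings of one meta-path view into a global summary, and a bilinear discriminator learns to separate real (node, summary) pairs from pairs built on a corrupted view.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MutualInfoObjective(nn.Module):
    # Hypothetical module: scores node embeddings against a global summary.
    def __init__(self, dim):
        super().__init__()
        self.discriminator = nn.Bilinear(dim, dim, 1)

    def forward(self, pos_embs, neg_embs):
        # pos_embs / neg_embs: [N, d] node embeddings from the real and the
        # corrupted (e.g., feature-shuffled) version of one meta-path view.
        summary = torch.sigmoid(pos_embs.mean(dim=0))            # global readout
        s = summary.expand_as(pos_embs)
        pos_logits = self.discriminator(pos_embs, s).squeeze(-1)  # should be high
        neg_logits = self.discriminator(neg_embs, s).squeeze(-1)  # should be low
        logits = torch.cat([pos_logits, neg_logits])
        labels = torch.cat([torch.ones_like(pos_logits),
                            torch.zeros_like(neg_logits)])
        # Binary cross-entropy between real and corrupted pairs acts as a
        # tractable surrogate for maximizing mutual information.
        return F.binary_cross_entropy_with_logits(logits, labels)
```

Under these assumptions, a model fusing multiple meta-paths would compute one such loss per meta-path view and sum or average them, so that every local node embedding is pushed to agree with the global summary of each view; this is how the global information and the global-local relationship mentioned in the abstract enter the training signal.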

Cite this article:

贾霄生, 赵中英, 李超, 栾文静, 梁永全. 互信息与多条元路径融合的异质网络表示学习方法. 软件学报 (Journal of Software), 2023, 34(7): 3256–3271.

History:
  • Received: 2021-06-05
  • Revised: 2021-09-05
  • Available online: 2022-03-24
  • Published: 2023-07-06