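Node Embedding Research over Temporal Graph
Authors: Wu Anbiao, Yuan Ye, Ma Yuliang, Wang Guoren
About the authors:

Wu Anbiao (1993-), male, Ph.D. candidate, CCF student member. Research interests: graph databases, graph neural networks.
Yuan Ye (1981-), male, Ph.D., professor, doctoral supervisor, CCF senior member. Research interests: big data management, database theory and systems.
Ma Yuliang (1990-), male, Ph.D. Research interests: graph databases, location-based social network (LBSN) mining.
Wang Guoren (1966-), male, Ph.D., professor, doctoral supervisor, CCF distinguished member. Research interests: uncertain data management, data-intensive computing, visual media data management and analysis, unstructured data management, distributed query processing and optimization, bioinformatics.

Corresponding author:

Yuan Ye, E-mail: yuanye@mail.neu.edu.cn

Funding:

National Natural Science Foundation of China (61932004, 62002054, 61732003, 61729201); Fundamental Research Funds for the Central Universities (N181605012); China Postdoctoral Science Foundation (2020M670780)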



Abstract:

Compared with traditional graph data analysis methods, graph embedding is a node-oriented strategy for analyzing graph data. It encodes graph nodes as vectors so that neural network techniques can be applied to those vectors to analyze or mine graph data more effectively, and it has produced significant gains on classic problems such as node classification, link prediction, and traffic flow prediction. Although graph embedding has been studied extensively, node embedding over temporal graphs has received little attention. Building on previous work and on the way information propagates in temporal graphs, this study proposes ATGEB (adaptive temporal graph embedding), a method for adaptively embedding the nodes of a temporal graph. First, because node activity levels differ across types of temporal graphs, an adaptive method is designed to cluster the moments at which nodes are active. Second, on top of this clustering, a walk model is designed to preserve the temporal relationships between node pairs; the walk sequences are stored in a bidirectional multi-way tree so that the time-related walk sequences of a node can be retrieved quickly. Finally, based on the walk characteristics of nodes and the graph topology, important nodes are sampled so that a network model meeting the requirements can be trained in as little time as possible. Extensive experiments show that the proposed temporal graph embedding strategy outperforms currently popular embedding methods on node clustering, temporal reachability prediction between nodes, and node classification in temporal graphs.
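The abstract does not spell out the walk model, so the following is only a minimal sketch of the general idea of a walk that preserves temporal order, not the authors' method. It assumes a temporal graph given as a list of (source, target, timestamp) edges, a uniform random choice among the edges that are still usable, and a non-strict time comparison; the function names and the toy edge list are hypothetical.

```python
import random
from collections import defaultdict


def build_adjacency(edges):
    """Group temporal edges (u, v, t) by source node, sorted by timestamp."""
    adj = defaultdict(list)
    for u, v, t in edges:
        adj[u].append((t, v))
    for u in adj:
        adj[u].sort()
    return adj


def temporal_walk(adj, start, max_len, rng):
    """Return a list of (node, arrival_time) pairs along a time-respecting walk.

    Each step may only follow an edge whose timestamp is not earlier than the
    time at which the walk arrived at the current node (assumption: ties are
    allowed). The start node has arrival_time None.
    """
    walk = [(start, None)]
    node, now = start, float("-inf")
    for _ in range(max_len - 1):
        # Keep only outgoing edges that are still reachable in time.
        candidates = [(t, v) for t, v in adj.get(node, []) if t >= now]
        if not candidates:
            break  # dead end: no later edge leaves this node
        now, node = rng.choice(candidates)  # assumption: uniform choice
        walk.append((node, now))
    return walk


# Hypothetical toy temporal graph: (source, target, timestamp)
edges = [("a", "b", 1), ("b", "c", 2), ("b", "d", 5), ("c", "a", 3), ("d", "a", 4)]
adj = build_adjacency(edges)
walk = temporal_walk(adj, "a", max_len=4, rng=random.Random(0))
print(walk)
```

In this toy graph a walk that goes from b to d at time 5 stops there, because the only edge leaving d has timestamp 4. Walks produced this way could then be fed to a skip-gram style model; how ATGEB clusters active moments, stores walks in its tree, or samples important nodes is not shown here.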

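Cite this article

Wu Anbiao, Yuan Ye, Ma Yuliang, Wang Guoren. Node Embedding Research over Temporal Graph. Journal of Software (软件学报), 2021, 32(3): 650-668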

History
  • Received: 2020-07-19
  • Last revised: 2020-09-03
  • Published online: 2021-01-21
  • Publication date: 2021-03-06