Community Detection Algorithm Based on Node Embedding Vector Representation
Author:
Affiliation:

Fund Project:

National Natural Science Foundation of China (61170112, 61532006); Natural Science Foundation of Beijing, China (4172016, KZ201410011014)

  • Article
  • | |
  • Metrics
  • |
  • Reference [67]
  • |
  • Related
  • | | |
  • Comments
    Abstract:

    Community detection is very important in theoretical and practical for complex research. According to the principle of distributed word vector, a community detection algorithm based on node embedding vector (CDNEV) is proposed in this study. In order to construct the distributed vector of network nodes, a heuristic random walk model is put forward. The node sequence obtained by the heuristic random walk model is used as the context for nodes, and the distributed vector of nodes is learned by SkipGram model. Based on the distributed vector of nodes that are selected from the local node as the center of the K-Means clustering algorithm center, all nodes in a network are clustered with K-Means algorithm, and the community structure are conclude by clustering result. Based on real complex networks and artificial networks used in other state-of-the-art algorithms, comprehensive experiments are conducted. For comparison purpose, typical community detection algorithms are selected to be evaluated. On real networks, the F1 value of CDNEV algorithm is increased 19% on average. The F1 value can be increased by 15% on artificial networks. Experimental results demonstrate that both accuracy and efficiency of CDNEV algorithm outperform other state-of-the-art algorithms.

    Reference
    [1] Newman MEJ, Watts DJ. Renormalization group analysis of the small-world network model. Physics Letters A, 1999,263(4):341-346.
    [2] Shao F, Jiang GP. Optimal traffic routing strategy based on community structure. Acta Physica Sinica, 2011,60(7):078902(in Chinese with English abstract).
    [3] Perozzi B, Al-Rfou R, Skiena S. Deepwalk:Online learning of social representations. In:Proc. of the 20th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining. ACM, 2014. 701-710.
    [4] Bedi P, Sharma C. Community detection in social networks. Wiley Interdisciplinary Reviews Data Mining & Knowledge Discovery, 2016,496-500(3).
    [5] Fortunato S, Hric D. Community detection in networks:A user guide. Physics Reports, 2016,659:1-44.
    [6] Newman MEJ. Modularity and community structure in networks. Proc. of the National Academy of Sciences, 2006,103(23):8577-8582.
    [7] Shiga M, Takigawa I, Mamitsuka H. A spectral clustering approach to optimally combining numericalvectors with a modular network. In:Proc. of the 13th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining. ACM, 2007. 647-656.
    [8] Fu LD, Gao L, Ma XK. A centrality measure based on spectral optimization of modularity density. Science China Information Sciences, 2012,42(5):550-560(in Chinese with English abstract).
    [9] Riolo MA, Newman MEJ. First-principles multiway spectral partitioning of graphs. Journal of Complex Networks, 2014,2(2):121-140.
    [10] Liu R, Feng S, Shi R, et al. Weighted graph clustering for community detection of large social networks. Procedia Computer Science, 2014,31:85-94.
    [11] Wang Z, Chen Z, Zhao Y, et al. A community detection algorithm based on topology potential and spectral clustering. The Scientific World Journal, 2014,2014:329325.[doi:10.1155/2014/329325]
    [12] Jin H, Wang S, Li C. Community detection in complex networks by density-based clustering. Physica A:Statistical Mechanics and its Applications, 2013,392(19):4606-4618.
    [13] Lin W, Kong X, Yu PS, et al. Community detection in incomplete information networks. In:Proc. of the 21st Int'l Conf. on World Wide Web. ACM, 2012. 341-350.
    [14] Gennip Y, Hunter B, Ahn R, et al. Community detection using spectral clustering on sparse geosocial data. SIAM Journal on Applied Mathematics, 2013,73(1):67-83.
    [15] Kernighan BW, Lin S. An efficient heuristic procedure for partitioning graphs. Bell System Technical Journal, 1970,49(2):291-307.
    [16] Newman MEJ. Fast algorithm for detecting community structure in networks. Physical Review E, 2004,69(6):066133.
    [17] Pan Y, Li DH, Liu JG, et al. Detecting community structure in complex networks via node similarity. Physica A:Statistical Mechanics and its Applications, 2010,389(14):2849-2857.
    [18] Zanjani AAH, Darooneh AH. Finding communities in linear time by developing the seeds. Physical Review E, 2011,84(3):036109.
    [19] Wang XY, Zhao ZX. Partitioning community structure in complex networks based on node dependent degree. Acta Physica Sinica, 2014,63(17):178901(in Chinese with English abstract).
    [20] Clauset A. Finding local community structure in networks. Physical review E, 2005,72(2):026132.
    [21] Lancichinetti A, Fortunato S, Kertész J. Detecting the overlapping and hierarchical community structure in complex networks. New Journal of Physics, 2009,11(3):033015.
    [22] Lee C, Reid F, McDaid A, et al. Detecting highly overlapping community structure by greedy clique expansion. arXiv Preprint arXiv:1002.1827, 2010.
    [23] Xu X, Yuruk N, Feng Z, et al. SCAN:A structural clustering algorithm for networks. In:Proc. of the ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining. ACM, 2007. 824-833.
    [24] Huang J, Sun H, Han J, et al. SHRINK:A structural clustering algorithm for detecting hierarchical communities in networks. In:Proc. of the ACM Conf. on Information and Knowledge Management, CIKM 2010. Toronto:DBLP, 2010. 219-228.
    [25] Girvan M, Newman MEJ. Community structure in social and biological networks. Proc. of the National Academy of Sciences, 2002, 99(12):7821-7826.
    [26] Blondel VD, Guillaume JL, Lambiotte R, et al. Fast unfolding of communities in large networks. Journal of Statistical Mechanics:Theory and Experiment, 2008,(10):P10008.
    [27] Raghavan UN, Albert R, Kumara S. Near linear time algorithm to detect community structures in large-scale networks. Physical Review E, 2007,76(3):036106.
    [28] Sun H, Liu J, Huang J, et al. CenLP:A centrality-based label propagation algorithm for community detection in networks. Physica A:Statistical Mechanics & Its Applications, 2015,436:767-780.
    [29] Tsourakakis C, Gkantsidis C, Radunovic B, et al. Fennel:Streaming graph partitioning for massive scale graphs. In:Proc. of the 7th ACM Int'l Conf. on Web Search and Data Mining. ACM, 2014. 333-342.
    [30] Yuan C, Chai Y. Method for local community mining in the complex networks. Acta Automatica Sinica, 2014,40(5):921-934(in Chinese with English abstract).
    [31] Albert R, DasGupta B, Mobasheri N. Topological implications of negative curvature for biological and social networks. Physical Review E, 2014,89(3):032811.
    [32] Gan WY, Nan HE, Li DY, et al. Community discovery method in networks based on topological potential. Ruan Jian Xue Bao/Journal of Software, 2009,20(8):2241-2254(in Chinese with English abstract). http://www.jos.org.cn/1000-9825/3318.htm[doi:10.3724/SP.J.1001.2009.03318]
    [33] Wu F, Huberman BA. Finding communities in linear time:A physics approach. The European Physical Journal B-Condensed Matter and Complex Systems, 2004,38(2):331-338.
    [34] He D X, Xu Z, Zuo W, et al. Community mining in complex networks-Clustering combination based genetic algorithm. Acta Automatica Sinica, 2010,36(8):1160-1170(in Chinese with English abstract).
    [35] Okamoto H. Local detection of communities by attractor neural-network dynamics. In:Artificial Neural Networks. Springer Int'l Publishing, 2015. 115-125.
    [36] Zhang ZY. Community structure detection in social networks based on dictionary learning. Science China (Information Sciences), 2011,41(11):1343-1355(in Chinese with English abstract).
    [37] Tang L, Liu H. Scalable learning of collective behavior based on sparse social dimensions. In:Proc. of the 18th ACM Conf. on Information and Knowledge Management. ACM, 2009. 1107-1116.
    [38] Tang J, Qu M, Wang M, et al. Line:Large-scale information network embedding. In:Proc. of the 24th Int'l Conf. on World Wide Web. Int'l World Wide Web Conferences Steering Committee, 2015. 1067-1077.
    [39] Cao S, Lu W, Xu Q. GraRep:Learning graph representations with global structural information. In:Proc. of the 24th ACM Int'l on Conf. on Information and Knowledge Management. ACM, 2015. 891-900.
    [40] Wang DX, Cui P, Zhu WW. Structural Deep Network Embedding. In:Proc. of the 22nd ACM SIGKDD Int'l Conf. 2016. 1225-1234.[doi:10.1145/2939672.2939753]
    [41] Li J, Ritter A, Jurafsky D. Learning multi-faceted representations of individuals from heterogeneous evidence using neural networks. arXiv Preprint arXiv:1510.05198, 2015.
    [42] Zhou N, Zhao WX, Zhang X, et al. A general multi-context embedding model for mining human trajectory data. IEEE Trans. on Knowledge & Data Engineering, 2016,28(8):1945-1958.
    [43] Yang C, Liu Z, Zhao D, et al. Network representation learning with rich text information. In:Proc. of the 24th Int'l Joint Conf. on Artificial Intelligence. Buenos Aires, 2015. 2111-2117.
    [44] Chen J, Zhang Q, Huang X. Incorporate group information to enhance network embedding. In:Proc. of the ACM Int'l on Conf. on Information and Knowledge Management. ACM, 2016. 1901-1904.
    [45] Tang J, Qu M, Mei Q. PTE:Predictive text embedding through large-scale heterogeneous text networks. In:Proc. of the 21st ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining. ACM, 2015. 1165-1174.
    [46] Bengio Y, Ducharme R, Vincent P, et al. A neural probabilistic language model. The Journal of Machine Learning Research, 2003, 3:1137-1155.
    [47] Mikolov T, Chen K, Corrado G, et al. Efficient estimation of word representations in vector space. arXiv Preprint arXiv:1301.3781, 2013.
    [48] Mikolov T, Sutskever I, Chen K, et al. Distributed representations of words and phrases and their compositionality. In:Advances in Neural Information Processing Systems. 2013. 3111-3119.
    [49] Mnih A, Hinton GE. A scalable hierarchical distributed language model. In:Advances in Neural Information Processing Systems. 2009. 1081-1088.
    [50] Morin F, Bengio Y. Hierarchical probabilistic neural network language model. In:Proc. of the Int'l Workshop on Artificial Intelligence and Statistics. 2005. 246-252.
    [51] Bottou L. Stochastic gradient learning in neural networks. Proc. of Neuro-Nımes, 1991,91(8).
    [52] Chen Q, Wu TT, Fang M. Detecting local community structures in complex networks based on local degree central nodes. Physica A:Statistical Mechanics and Its Applications, 2013,392(3):529-537.
    [53] Newman MEJ. Finding community structure in networks using the eigenvectors of matrices. Physical Review E, 2006,74(3):036104.
    [54] Rosvall M, Axelsson D, Bergstrom CT. The map equation. The European Physical Journal Special Topics, 2010,178(1):13-23.
    [55] Reichardt J, Bornholdt S. Statistical mechanics of community detection. Physical Review E, 2006,74(1):016110.
    [56] Pons P, Latapy M. Computing communities in large networks using random walks. In:Computer and Information Sciences-ISCIS 2005. Berlin, Heidelberg:Springer-Verlag, 2005. 284-293.
    [57] Yang J, Leskovec J. Defining and evaluating network communities based on ground-truth. Knowledge & Information Systems, 2015,42(1):181-213.
    [58] Lancichinetti A, Fortunato S, Radicchi F. Benchmark graphs for testing community detection algorithms. Physical Review E, 2008, 78(4):046110.
    [59] Lancichinetti A, Fortunato S. Community detection algorithms:A comparative analysis. Physical Review E, 2009,80(5):056117.
    附中文参考文献:
    [2] 邵斐,蒋国平.基于社团结构的负载传输优化策略研究.物理学报,2011,60(7):078902.
    [8] 付立东,高琳,马小科.基于社团检测的复杂网络中心性方法.中国科学(信息科学),2012,42(5):550-560.
    [19] 王兴元,赵仲祥.基于节点间依赖度的社团结构划分方法.物理学报,2014,63(17):178901.
    [30] 袁超,柴毅.复杂网络的局部社团结构挖掘算法.自动化学报,2014,40(5):921-934.
    [32] 淦文燕,赫南,李德毅,王建民.一种基于拓扑势的网络社区发现方法.软件学报,2009,20(8):2241-2254. http://www.jos.org.cn/1000-9825/3318.htm[doi:10.3724/SP.J.1001.2009.03318]
    [34] 何东晓,周栩,王佐,等.复杂网络社区挖掘——基于聚类融合的遗传算法.自动化学报,2010,36(8):1160-1170.
    [36] 张忠元.基于字典学习的网络社团结构探测算法.中国科学(信息科学),2011,41(11):1343-1355.
    Related
    Cited by
Get Citation

韩忠明,刘雯,李梦琪,郑晨烨,谭旭升,段大高.基于节点向量表达的复杂网络社团划分算法.软件学报,2019,30(4):1045-1061

Copy
Share
Article Metrics
  • Abstract:
  • PDF:
  • HTML:
  • Cited by:
History
  • Received:October 09,2016
  • Revised:June 09,2017
  • Online: April 01,2019
You are the firstVisitors
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address:4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code:100190
Phone:010-62562563 Fax:010-62562533 Email:jos@iscas.ac.cn
Technical Support:Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063