Department of Computer Science and Technology, Nanjing University, Nanjing 210093, China; School of Computer Engineering, Huaihai Institute of Technology, Lianyungang 222005, China
Micro-blog cyberspace is a rapidly growing multi-mode network of numerous overlapping communities, covering a huge number of users and topics relating to nature, society, and everyday life. Based on an in-depth analysis of the entities in the network and the inherent relationships among them, this paper proposes a structural model for overlapping community representation and detection that is dominated by the user-topic relation and that also incorporates the follow relationship together with the blog-forward and blog-comment relationships. By introducing a virtual community alongside the actual communities of the network, the paper also puts forward an improved global belongingness matrix as a representation of user roles, which can properly describe a user's degree of participation and importance in the network. Experimental results on Sina's micro-blog dataset show that the new method is effective and efficient at finding meaningful communities in the micro-blog network. Furthermore, the proposed model and algorithms can be adapted in various ways to similar social network analysis tasks and are helpful for research on community evolution.