Overlapping Community Discovery and Global Representation on MicroBlog Network

doi:10.13328/j.cnki.jos.004721

2014, Vol. 25, No. 12, pp. 2824-2836


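Authors:
HU Yun (胡云): Department of Computer Science and Technology, Nanjing University, Nanjing 210093, China; School of Computer Engineering, Huaihai Institute of Technology, Lianyungang 222005, China
WANG Chong-Jun (王崇骏): Department of Computer Science and Technology, Nanjing University, Nanjing 210093, China
WU Jun (吴骏): Department of Computer Science and Technology, Nanjing University, Nanjing 210093, China
XIE Jun-Yuan (谢俊元): Department of Computer Science and Technology, Nanjing University, Nanjing 210093, China
LI Hui (李慧): School of Computer Engineering, Huaihai Institute of Technology, Lianyungang 222005, China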
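Funding: National Natural Science Foundation of China (61403156, 61375069, 61105069); China Postdoctoral Science Foundation (2011M500846); Natural Science Foundation of Jiangsu Province (11KJB520001, 13KJB520002); Science and Technology Support Program of Jiangsu Province (BE2012181)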


Abstract:

Microblog networks are an emerging class of multi-mode network: they cover massive numbers of users, span a wide range of topics, and have a complex overlapping community structure. Building on an in-depth study of the intrinsic relationships among the entities and attributes of a microblog network, this paper proposes an overlapping community representation model that uses the user-topic relation as its main partitioning principle, together with a corresponding community detection algorithm. The method considers not only the user-topic relation but also the composite relations specific to this kind of network, namely the follow relationships among users and the comment and forward relationships among posts. The paper also improves the traditional community belongingness (membership) matrix: by introducing a virtual community, the matrix reflects each individual's degree of membership in every community and also indicates the individual's core degree within each community. Experiments on a Sina Weibo dataset show that the model and method characterize the overlapping community structure of that dataset efficiently and reasonably, and that the results are readily interpretable. The proposed model and algorithms can be applied to community partitioning and community evolution analysis in similar multi-mode networks.

Keywords: microblog network; entity relationship model; overlapping community; belongingness matrix; virtual community
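
The abstract does not spell out how the virtual community enters the belongingness matrix, so the following is only a minimal sketch under stated assumptions, not the paper's formulation. It assumes each user already has non-negative raw affinity scores toward k real communities (how those scores are derived from user-topic, follow, comment, and forward relations is out of scope here), divides every row by one shared `scale`, and assigns the remaining mass to an extra "virtual" column. The names `global_membership`, `communities_of`, `scale`, and `threshold` are hypothetical.

```python
from typing import List, Optional


def global_membership(affinity: List[List[float]],
                      scale: Optional[float] = None) -> List[List[float]]:
    """Build a (k+1)-column membership matrix from raw affinity scores.

    Assumption (not from the paper): rows are divided by a shared scale,
    and the leftover mass goes to a final "virtual community" column.
    Rows sum to 1 as long as scale >= every row sum.
    """
    if not affinity:
        return []
    if scale is None:
        # Default: the most active user defines full participation.
        scale = max(sum(row) for row in affinity)
    if scale <= 0:
        raise ValueError("scale must be positive")
    matrix = []
    for row in affinity:
        real = [score / scale for score in row]
        # Weakly attached users keep most of their mass in the virtual column.
        virtual = max(0.0, 1.0 - sum(real))
        matrix.append(real + [virtual])
    return matrix


def communities_of(row: List[float], threshold: float = 0.3) -> List[int]:
    """Indices of real communities whose membership reaches the threshold."""
    return [j for j, m in enumerate(row[:-1]) if m >= threshold]


# Three users, two real communities (illustrative numbers only).
affinity = [[6.0, 4.0], [1.0, 0.0], [0.0, 0.0]]
membership = global_membership(affinity)
for user, row in enumerate(membership):
    print(user, [round(m, 2) for m in row], communities_of(row))
```

Because rows are scaled by a shared constant instead of being normalized one by one, a peripheral user does not end up with the same row as a core user: the first user above belongs to both real communities (an overlap), while the second keeps most of its mass in the virtual column.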

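Cite this article

Hu Y, Wang CJ, Wu J, Xie JY, Li H. Overlapping community discovery and global representation on microblog network. Journal of Software (软件学报), 2014, 25(12): 2824-2836.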

History

Received: 2014-04-10
Revised: 2014-08-21
Published online: 2014-12-04
