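Social network emergency detection method based on word correlation characteristics

Corresponding author:

Wang Yang (王扬), E-mail: 987505730@qq.com

Fund projects:

National Natural Science Foundation of China, General Program (61472136, 61772196); Natural Science Foundation of Hunan Province, General Program (2020JJ4249); Hunan Provincial Social Science Foundation, Key Project (2016ZDB006); Hunan Provincial Social Science Achievement Review Committee, Key Project (湘社评19ZD1005); Hunan Provincial Degree and Postgraduate Education Reform Research Project (2020JGYB234)
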
    Abstract:

    Detecting emergency events in social media data streams is an active research topic in natural language processing, but current methods for extracting such events suffer from low accuracy and low efficiency. To address these problems, this paper proposes an emergency event detection method based on word correlation features. The method can quickly detect emergency events in social network data streams, so that decision makers can respond promptly and effectively, the negative impact of an emergency can be kept as small as possible, and social stability can be maintained. First, noise filtering and sentiment filtering are applied to obtain microblog texts that carry negative sentiment. Then the microblog data are sliced into time windows according to their timestamps; for each word in each window, the word frequency, user influence, and word frequency growth rate features are computed, and a burstiness measure is used to extract burst words. Similar words are merged with a word2vec model, and a burst word relationship graph is built from the feature similarity of the burst words. Finally, a multi-membership spectral clustering algorithm optimally partitions the word relationship graph; as the time window slides, abnormal words are tracked, and emergency events are identified from the structural changes in the subgraphs that result from changes in word burstiness. Experimental results show that the method detects events well in real-time microblog data streams. Compared with existing methods, it meets the needs of emergency event detection: it detects the detailed information of sub-events and also accurately detects the related information of the events.
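The abstract names the features used for burst word extraction (word frequency, user influence, word frequency growth rate) but does not give their formulas. The following Python sketch is therefore only an illustration of the time-slicing and burst-word step under stated assumptions: the function names, the weighted-sum burst score, the `+1` smoothing in the growth rate, and the threshold are all hypothetical choices, not the authors' definitions.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta


def slice_into_windows(posts, window):
    """Group (timestamp, tokens, influence) posts into fixed-length time windows."""
    start = min(ts for ts, _, _ in posts)
    windows = defaultdict(list)
    for ts, tokens, influence in posts:
        index = int((ts - start) / window)  # timedelta / timedelta is a float
        windows[index].append((tokens, influence))
    # Return windows in time order, with empty lists for windows without posts.
    return [windows.get(i, []) for i in range(max(windows) + 1)]


def word_features(window_posts):
    """Per-word post frequency and summed user influence for one window."""
    freq, infl = Counter(), Counter()
    for tokens, influence in window_posts:
        for word in set(tokens):  # count each word once per post
            freq[word] += 1
            infl[word] += influence
    return freq, infl


def burst_words(prev_posts, curr_posts, threshold=1.0, weights=(0.4, 0.3, 0.3)):
    """Score words in the current window; keep those at or above the threshold.

    Assumed score (not from the paper): a weighted sum of normalized
    frequency, normalized user influence, and non-negative growth rate.
    """
    prev_freq, _ = word_features(prev_posts)
    curr_freq, curr_infl = word_features(curr_posts)
    if not curr_freq:
        return {}
    max_freq = max(curr_freq.values())
    max_infl = max(curr_infl.values()) or 1.0  # avoid division by zero
    w_freq, w_infl, w_growth = weights
    scores = {}
    for word, f in curr_freq.items():
        # Growth relative to the previous window, with +1 smoothing (assumed).
        growth = (f - prev_freq[word]) / (prev_freq[word] + 1)
        score = (w_freq * f / max_freq
                 + w_infl * curr_infl[word] / max_infl
                 + w_growth * max(growth, 0.0))
        if score >= threshold:
            scores[word] = score
    return scores


# Toy data: (timestamp, tokenized text, user influence); values are made up.
t0 = datetime(2021, 1, 1, 0, 0)
posts = [
    (t0, ["weather", "sunny"], 1.0),
    (t0 + timedelta(minutes=10), ["weather", "traffic"], 1.0),
    (t0 + timedelta(minutes=60), ["fire", "explosion", "weather"], 5.0),
    (t0 + timedelta(minutes=70), ["fire", "explosion"], 3.0),
    (t0 + timedelta(minutes=80), ["fire", "rescue"], 2.0),
]
windows = slice_into_windows(posts, timedelta(hours=1))
bursts = burst_words(windows[0], windows[1])
print(sorted(bursts))
```

In a fuller pipeline, the words returned by `burst_words` would become the nodes of the burst word relationship graph, after similar words are merged with word2vec, and the graph would then be partitioned by the clustering step. Those stages are omitted here because the abstract does not specify them in enough detail to sketch faithfully.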

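Cite this article

Jiang Weijin (蒋伟进), Wang Yang (王扬). Social network emergency detection method based on word correlation characteristics. Journal of Software (软件学报),,():0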

History
  • Published online: 2021-10-20