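Multi-view Microblog Topic Detection Based on Heterogeneous Social Context

Author biographies:

He Ruifang (1979-), female, Ph.D., professor, doctoral supervisor, CCF professional member; research interests: natural language processing, social media mining, and machine learning. Wang Haocheng (1997-), male, M.S.; research interest: social media topic detection. Liu Hongyu (1997-), female, M.S.; research interest: social media topic detection. Wang Bo (1979-), male, Ph.D., associate professor; research interests: natural language processing, personalized recommendation, and psychological computing.

Corresponding author:

Wang Bo, Bo_wang@tju.edu.cn

CLC number:

TP18

Funding:

National Natural Science Foundation of China (61976154); National Key Research and Development Program of China (2019YFC1521200)
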
Abstract:

Social media topic detection aims to mine latent topics from large collections of short posts. The task is challenging because posts are short and informally written, and because user interactions on social media are complex and diverse. Previous work either considers only the textual content of posts or models social context under a homogeneous setting, ignoring the heterogeneity of social networks. However, different types of user interaction, such as reposting and commenting, may indicate different behavior patterns and interest preferences, and thus reflect different degrees of attention to and understanding of a topic. In addition, different users influence the development and evolution of the same topic differently: authoritative users who lead a community matter more for topic inference than ordinary users. This study therefore proposes a novel multi-view topic model (MVTM) that infers more complete and coherent topics by encoding the heterogeneous social context in the microblog conversation network. First, an attributed multiplex heterogeneous conversation network is built from the interactions among users and decomposed into multiple views with different interaction semantics. Next, to account for the importance of different interaction types and different users, neighbor-level and interaction-level attention mechanisms are used to obtain view-specific embeddings. Finally, a multi-view-driven neural variational inference method is designed to capture the deep correlations among views and adaptively balance their consistency and independence, thereby producing more coherent topics. Experiments on a Sina Weibo dataset covering three months demonstrate the effectiveness of the proposed method.
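The two attention levels named in the abstract can be illustrated with a small sketch. This is not the MVTM formulation, which the abstract does not specify; it is a generic softmax-attention stand-in written in plain Python. The function names `neighbor_attention` and `interaction_attention`, the shared `query` vector, the view names, and the toy users are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def neighbor_attention(features, neighbors, query):
    """Neighbor-level step for ONE view (hypothetical scoring rule).

    features:  {node: embedding as a list of floats}
    neighbors: {node: list of neighbor nodes in this view}
    query:     assumed attention vector, same length as an embedding
    Each node aggregates itself and its neighbors, weighted by a softmax
    over per-neighbor scores, so "important" neighbors contribute more.
    """
    out = {}
    for node, vec in features.items():
        group = [node] + [n for n in neighbors.get(node, []) if n != node]
        weights = softmax([dot(query, features[n]) for n in group])
        out[node] = [sum(w * features[n][k] for w, n in zip(weights, group))
                     for k in range(len(vec))]
    return out

def interaction_attention(view_embeddings, query):
    """Interaction-level step: weight whole views (e.g. repost vs. comment).

    view_embeddings: {view name: {node: vector}}
    Returns (fused embedding per node, attention weight per view).
    """
    names = sorted(view_embeddings)
    nodes = list(view_embeddings[names[0]])
    dim = len(query)
    # Summarize each view by the mean of its node embeddings.
    summaries = []
    for name in names:
        vecs = list(view_embeddings[name].values())
        summaries.append([sum(v[k] for v in vecs) / len(vecs) for k in range(dim)])
    weights = softmax([dot(query, s) for s in summaries])
    fused = {
        node: [sum(w * view_embeddings[name][node][k]
                   for w, name in zip(weights, names))
               for k in range(dim)]
        for node in nodes
    }
    return fused, dict(zip(names, weights))

# Toy conversation network with two interaction views (made-up data).
features = {"u1": [1.0, 0.0], "u2": [0.0, 1.0], "u3": [1.0, 1.0]}
views = {
    "repost": {"u1": ["u2"], "u2": ["u1", "u3"], "u3": []},
    "comment": {"u1": ["u3"], "u2": [], "u3": ["u1"]},
}
query = [0.5, -0.5]
per_view = {name: neighbor_attention(features, nbrs, query)
            for name, nbrs in views.items()}
fused, view_weights = interaction_attention(per_view, query)
```

In a trained model the attention parameters would be learned and the view-specific embeddings would feed the variational topic inference step; here `query` is fixed by hand purely to show the data flow from per-view neighbor aggregation to cross-view weighting.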

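Cite this article

He Ruifang, Wang Haocheng, Liu Hongyu, Wang Bo. Multi-view Microblog Topic Detection Based on Heterogeneous Social Context. Journal of Software (软件学报), 2023, 34(11): 5162-5178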

History
  • Received: 2021-09-26
  • Last revised: 2022-04-13
  • Accepted:
  • Published online: 2023-05-18
  • Publication date: 2023-11-06