Denoising Graph Auto-encoder for Unsupervised Social Media Text Summarization

doi:10.13328/j.cnki.jos.007199

Journal of Software, 2025, Vol. 36, No. 5, pp. 2130-2150

DOI: 10.13328/j.cnki.jos.007199
CSTR: 32375.14.jos.007199

Authors:
HE Rui-Fang (贺瑞芳), ZHAO Tang-Long (赵堂龙), LIU Huan-Yu (刘焕宇)

Affiliation (all authors):
College of Intelligence and Computing, Tianjin University, Tianjin 300350, China; Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300350, China

CLC number: TP18
Fund project: National Natural Science Foundation of China (62376192, 62376188)

Abstract:

Social media text summarization aims to produce concise summaries for large collections of topic-specific social media short texts (referred to as posts). Because posts are short and informal, traditional methods face sparse features and insufficient information. Recent studies exploit the social relationships among posts to learn better post representations and remove redundant information, but they overlook the unreliable noisy relationships present in real social media settings, which mislead the model's judgments of post importance and diversity. This study therefore proposes an unsupervised model, DSNSum, which improves summarization performance by removing noisy relationships from the social network. First, the presence of noisy relationships in real social relationship networks is statistically verified. Second, two noise functions are designed based on sociological theories, and a denoising graph auto-encoder (DGAE) is constructed to reduce the influence of noisy relationships and to learn post representations that incorporate credible social relationships. Finally, a sparse reconstruction framework selects posts that preserve coverage, importance, and diversity to form a summary of a given length. Experimental results on a total of 22 topics from two real social media platforms (Twitter and Sina Weibo) demonstrate the effectiveness of the proposed model and offer new ideas for subsequent research in related fields.

Key words: social media text summarization; graph representation learning; graph neural network (GNN); denoising auto-encoder (DAE)
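
The pipeline outlined in the abstract (discount unreliable relations, build post representations from the remaining ones, then select posts for coverage and diversity) can be illustrated with a toy sketch. This is not DSNSum: the abstract does not specify the noise functions, the DGAE architecture, or the sparse reconstruction objective, so every function below is a hypothetical stand-in. A content-similarity threshold replaces the learned denoising, neighbour averaging replaces the graph encoder, and a greedy facility-location rule replaces sparse reconstruction. Post vectors are assumed to be non-negative (for example, bag-of-words counts).

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def prune_edges(vectors, edges, threshold):
    """Hypothetical stand-in for denoising: drop social edges whose two
    posts have content similarity below `threshold`."""
    return [(i, j) for i, j in edges if cosine(vectors[i], vectors[j]) >= threshold]


def smooth(vectors, edges):
    """Hypothetical stand-in for a graph encoder: average each post vector
    with the vectors of its neighbours in the (pruned) graph."""
    neighbours = {i: [i] for i in range(len(vectors))}
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)
    dim = len(vectors[0])
    return [
        [sum(vectors[n][d] for n in neighbours[i]) / len(neighbours[i]) for d in range(dim)]
        for i in range(len(vectors))
    ]


def greedy_select(vectors, k):
    """Greedy facility-location selection: repeatedly add the post that most
    increases the sum, over all posts, of the best similarity to the summary."""
    selected = []
    best = [0.0] * len(vectors)  # best similarity of each post to the summary so far
    for _ in range(min(k, len(vectors))):
        top_gain, top_idx = -1.0, None
        for c in range(len(vectors)):
            if c in selected:
                continue
            gain = sum(max(cosine(vectors[c], v) - b, 0.0) for v, b in zip(vectors, best))
            if gain > top_gain:
                top_gain, top_idx = gain, c
        selected.append(top_idx)
        best = [max(b, cosine(vectors[top_idx], v)) for v, b in zip(vectors, best)]
    return selected


# Toy data: five posts in a 3-dimensional feature space (made up for illustration).
posts = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.9, 0.1], [0.0, 0.0, 1.0]]
# Social edges; (0, 4) and (1, 2) connect dissimilar posts and act as noise here.
relations = [(0, 1), (2, 3), (0, 4), (1, 2)]

kept = prune_edges(posts, relations, threshold=0.5)
summary = greedy_select(smooth(posts, kept), k=3)
print(kept, summary)
```

On this toy input the two edges between dissimilar posts are dropped, and the three selected posts come from three different content clusters, which is the coverage and diversity behaviour the abstract attributes to the selection stage.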

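Cite this article

HE Rui-Fang, ZHAO Tang-Long, LIU Huan-Yu. Denoising graph auto-encoder for unsupervised social media text summarization. Journal of Software (软件学报), 2025, 36(5): 2130-2150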

History

Received: 2023-07-05
Last revised: 2023-11-22
Published online: 2024-06-20
