Trustworthy Government Q&A Technology Driven by Large Language Models
CLC number: TP391

Funding: National Natural Science Foundation of China (62072190)


Towards Trustworthy Government Q&A based on Large Language Model
    Abstract:

    Government Q&A systems handle consultations in real time, reducing the load on human staff while helping businesses and the public complete procedures more efficiently. Their service scenarios are diverse, and answers must be accurate and phrased in a standardized way. Existing methods either produce answers from a preset knowledge base or generate them with language models of limited scale; neither can effectively understand consultations and generate accurate, interpretable, and therefore trustworthy answers across multiple service scenarios. This study therefore proposes a government Q&A technique based on a large language model. The method uses a government-domain large language model as the core module for content understanding and generation, assisted by an analysis guidance module and a domain knowledge base module. When generating an answer, the model draws on the consultation analysis produced by the analysis guidance module and the relevant domain knowledge supplied by the domain knowledge base module, so that the wording of the answer is consistent with the facts. The information referenced during generation also serves as evidence for the answer, which improves its interpretability. To build and evaluate these modules, a comprehensive dataset of multi-level, multi-granularity public government information was collected and organized, comprising 1901 documents and 10503 question-answer pairs. Experiments verify that a prototype system built on the method generates accurate and interpretable answers to user inquiries across multiple service scenarios, and demonstrate the effectiveness of each module in the system.
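The data flow described in the abstract (analyze the consultation, look up domain knowledge, then let the language model answer while keeping what it consulted as evidence) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the paper: `analyze_query` uses a keyword rule as a stand-in for the analysis guidance module, `retrieve_knowledge` uses word overlap as a stand-in for the domain knowledge base module, and `llm` is any caller-supplied function from prompt to text rather than the government-domain model itself.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Answer:
    text: str
    # Information consulted during generation, kept as the answer's evidence.
    evidence: List[str] = field(default_factory=list)


def _words(text: str) -> set:
    """Lowercased word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


def analyze_query(query: str) -> str:
    """Hypothetical analysis step: label the service scenario.

    Assumption: a keyword rule stands in for the paper's analysis module.
    """
    if "license" in _words(query):
        return "scenario: business licensing"
    return "scenario: general inquiry"


def retrieve_knowledge(query: str, knowledge_base: List[str], top_k: int = 2) -> List[str]:
    """Hypothetical retrieval step: rank passages by word overlap with the query."""
    query_words = _words(query)
    scored = [(len(query_words & _words(p)), p) for p in knowledge_base]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    # Keep only passages that share at least one word with the query.
    return [passage for score, passage in ranked[:top_k] if score > 0]


def answer_query(query: str, knowledge_base: List[str], llm: Callable[[str], str]) -> Answer:
    """Build a prompt from the analysis and retrieved passages, then call the model."""
    analysis = analyze_query(query)
    passages = retrieve_knowledge(query, knowledge_base)
    prompt_lines = [f"Analysis: {analysis}", "Knowledge:"]
    prompt_lines += [f"- {p}" for p in passages]
    prompt_lines += [f"Question: {query}", "Answer:"]
    text = llm("\n".join(prompt_lines))
    # Return the consulted information alongside the answer text.
    return Answer(text=text, evidence=[analysis, *passages])


if __name__ == "__main__":
    kb = [
        "A business license application requires an identity document.",
        "Parking permits are issued by the district office.",
    ]
    # A placeholder model; a real deployment would call an actual LLM here.
    result = answer_query("How do I apply for a business license?", kb, lambda p: "stub answer")
    print(result.text)
    print(result.evidence)
```

Returning the evidence list together with the answer text mirrors the abstract's point that the referenced information doubles as the basis for interpretability; how the actual system scores, selects, or presents that evidence is not stated on this page.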

Cite this article

王骞玥, 胡晋武, 王宇丰, 胡宇, 高浩然, 邱舟强, 谭明奎. Trustworthy Government Q&A Technology Driven by Large Language Models. Journal of Software (软件学报),,():1-19

History
  • Received: 2024-07-06
  • Revised: 2024-11-16
  • Published online: 2025-10-29