[关键词]
[摘要]
自动关键词抽取是从文本或文本集合中自动抽取主题性或重要性的词或短语,是文本检索、文本摘要等许多文本挖掘任务的基础性和必要性的工作.探讨了关键词和自动关键词抽取的内涵,从语言学、认知科学、复杂性科学、心理学和社会科学等多个方面研究了自动关键词抽取的理论基础.从宏观、中观和微观角度,回顾和分析了自动关键词抽取的发展、技术和方法.针对目前广泛应用的自动关键词抽取方法,包括统计法、基于主题的方法、基于网络图的方法等,总结了其关键技术和研究进展.对自动关键词抽取的评价方式进行了分析,对自动关键词抽取面临的挑战和研究趋势进行了预测.
[Key word]
[Abstract]
Automatic keyword extraction is to extract topical and important words or phrases form document or document set. It is a basic and necessary work in text mining tasks such as text retrieval and text summarization. This paper discusses the connotation of keyword extraction and automatic keyword extraction. In the light of linguistics, cognitive science, complexity science, psychology and social science, this paper studies the theoretical basis of automatic keyword extraction. From macro, meso and micro perspectives, the development, techniques and methods of automatic keyword extraction are reviewed and analyzed. This paper summarizes the current key technologies and research progress of automatic keyword extraction methods, including statistical methods, topic based methods, and network based methods. The evaluation approach of automatic keyword extraction is analyzed, and the challenges and trends of automatic keyword extraction are also predicted.
[中图分类号]
[基金项目]
国家自然科学基金(61272260,61273320)