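[Keywords (from the Chinese record)]
[Abstract (from the Chinese record)]
Search engine users often submit queries whose intent is ambiguous, which causes searches to fail. To address this, a retrieval interaction mode called intelligent query suggestion is proposed. It automatically determines whether a query is semantically unambiguous and, for an ambiguous query, builds a category directory that reflects the query's different semantic concepts; this directory helps users quickly locate a suitable query. To realize intelligent query suggestion, a query semantics identification algorithm based on the small-world property of natural language is proposed: TECH (term concept hunting). TECH combines community detection knowledge from physics with information retrieval techniques from computer science and provides an extensible algorithmic framework. Experimental results show that users prefer intelligent query suggestion to traditional query suggestion; TECH effectively identifies the different semantic concepts of ambiguous queries and is statistically significantly better than three well-known comparison systems.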
[Keywords]
[Abstract]
Search engine queries are often too vague to retrieve relevant results. This paper presents an intelligent query suggestion approach that identifies vague queries and organizes the related queries of each vague query into a concept hierarchy. Through the concept hierarchy, users can quickly find queries that match their information needs. To realize this approach, the TECH (term concept hunting) algorithm is proposed, based on the small-world property of natural language. TECH combines community detection algorithms from physics with IR techniques from computer science in an extensible framework. Experimental results show that users prefer intelligent query suggestion to the traditional list-based query suggestion. TECH effectively distinguishes the different concepts of vague queries and outperforms three well-known hierarchy-building systems with statistical significance.
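[Chinese Library Classification number]
[Funding]
National Natural Science Foundation of China (60603094, 60776797); National Basic Research Program of China (973 Program) (2007CB311103); National High Technology Research and Development Program of China (863 Program) (2006AA010105); Beijing Natural Science Foundation (4082030)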