Predicting Query Performance for Text Retrieval

Authors: Lang Hao (郎皓), Wang Bin (王斌), Li Jintao (李锦涛), Ding Fan (丁凡)
Affiliation (all authors): Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
Funding: Supported by the National Natural Science Foundation of China under Grant No.60603094; the National Basic Research Program of China (973 Program) under Grant No.2004CB318109; the Beijing Science and Technology Planning Program of China under Grant No.D0106008040291



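Abstract (translated from the Chinese):

Query performance prediction (predicting query performance, PQP) is now regarded as one of the most important capabilities of a retrieval system. Research and experiments in recent years indicate that PQP techniques have broad prospects and room for further development in text retrieval. This paper surveys PQP for text retrieval, concentrating on its main methods and key techniques. It first introduces the commonly used experimental corpora and evaluation frameworks, then the factors that affect query performance. It then outlines the main current PQP methods under a pre-retrieval versus post-retrieval classification, briefly describes several applications of PQP, and finally discusses some of the challenges PQP faces.

Keywords: information retrieval; query performance prediction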

Abstract:

Predicting query performance (PQP) has recently been recognized by the information retrieval (IR) community as an important capability for IR systems. In recent years, work by many research groups has confirmed that PQP is an effective way to address the robustness problem of IR systems and is useful for giving feedback to users, search engines, and database creators. This paper surveys the basic PQP approaches for text retrieval. It introduces the experimental data and evaluation methods, presents the contributions of different factors to overall retrieval variability across queries, describes the main PQP approaches in pre-retrieval and post-retrieval categories, and presents some applications of PQP. Finally, it summarizes several primary challenges and open issues in PQP.

Key words: information retrieval; query performance prediction


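Cite this article

Lang Hao, Wang Bin, Li Jintao, Ding Fan. Predicting query performance for text retrieval. Journal of Software (软件学报), 2008, 19(2): 291-300.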


History

Received: 2007-06-20
Last revised: 2007-08-24
