Predicting Query Performance for Text Retrieval
Authors: Lang Hao, Wang Bin, Li Jintao, Ding Fan
Funding:

Supported by the National Natural Science Foundation of China under Grant No. 60603094; the National Basic Research Program of China (973 Program) under Grant No. 2004CB318109; and the Beijing Science and Technology Planning Program of China under Grant No. D0106008040291.

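    Abstract (translated from the Chinese):

    Predicting query performance (PQP) is now regarded as one of the most important capabilities of a retrieval system. Research and experiments in recent years indicate that PQP techniques have broad prospects and room for growth in text retrieval. This paper surveys PQP for text retrieval, focusing on its main methods and key techniques. It first introduces the commonly used experimental corpora and evaluation framework, then the factors that affect query performance. It next outlines the main current PQP methods under a pre-retrieval versus post-retrieval taxonomy, briefly describes applications of PQP in several areas, and finally discusses some of the challenges that PQP faces.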

    Abstract:

    Predicting query performance (PQP) has recently been recognized by the information retrieval (IR) community as an important capability for IR systems. Research carried out by many groups in recent years has confirmed that predicting query performance is an effective way to address the robustness problem of IR systems and is useful for giving feedback to users, search engines, and database creators. This paper surveys the basic approaches to predicting query performance for text retrieval. It introduces the experimental data and evaluation methods, presents the contributions of different factors to the overall variability of retrieval effectiveness across queries, describes the main PQP approaches in two groups (pre-retrieval and post-retrieval), and presents some applications of PQP. Finally, it summarizes several primary challenges and open issues in PQP.
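
The pre-retrieval idea and the evaluation step mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes the average inverse document frequency (IDF) of the query terms as a stand-in pre-retrieval predictor, and Kendall's tau as the measure of agreement between predicted scores and per-query average precision. The toy corpus, the average-precision values, and the function names `avg_idf` and `kendall_tau` are all hypothetical.

```python
import math
from itertools import combinations


def avg_idf(query_terms, docs):
    """Mean smoothed IDF of the query terms; needs only collection statistics."""
    if not query_terms:
        raise ValueError("query_terms must not be empty")
    n = len(docs)
    doc_sets = [set(d.split()) for d in docs]
    idfs = []
    for term in query_terms:
        df = sum(1 for s in doc_sets if term in s)
        # Add-one smoothing keeps the ratio finite for unseen terms.
        idfs.append(math.log((n + 1) / (df + 1)))
    return sum(idfs) / len(idfs)


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total pairs; ties count as neither."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / pairs


# Hypothetical toy collection and queries (illustration only).
docs = [
    "query performance prediction for text retrieval",
    "text retrieval systems rank documents",
    "robust retrieval track at trec",
    "language models for text",
]
queries = [["retrieval"], ["prediction", "performance"], ["text", "models"]]

# Hypothetical per-query average precision, as an evaluation run would supply.
observed_ap = [0.20, 0.65, 0.40]

scores = [avg_idf(q, docs) for q in queries]
tau = kendall_tau(scores, observed_ap)
print(scores, tau)
```

A predictor of this kind is judged by how well its ranking of queries agrees with the ranking by actual effectiveness. A post-retrieval predictor would fit the same evaluation harness, but would compute its score from the ranked result list instead of from collection statistics alone.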

Cite this article

Lang H, Wang B, Li JT, Ding F. Predicting query performance for text retrieval. Journal of Software (软件学报), 2008,19(2):291-300

History
  • Received: 2007-06-20
  • Last revised: 2007-08-24