Computing Term-Concept Association in Semantic-Based Query Expansion

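Authors:
TIAN Xuan
Key Laboratory of Data Engineering and Knowledge Engineering, Ministry of Education, Beijing 100872; School of Information, Renmin University of China, Beijing 100872; School of Information, Beijing Forestry University, Beijing 100083
DU Xiao-Yong
Key Laboratory of Data Engineering and Knowledge Engineering, Ministry of Education, Beijing 100872; School of Information, Renmin University of China, Beijing 100872
LI Hai-Hua
Key Laboratory of Data Engineering and Knowledge Engineering, Ministry of Education, Beijing 100872; School of Information, Renmin University of China, Beijing 100872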
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 60496325, 60573092

Abstract:

In semantic-based query expansion, computing term-concept association is a key step in finding the concepts that describe the semantics of the query need. This paper proposes a method called K2CM (keyword to concept method) for computing term-concept association. K2CM computes the association from two sources: the degree to which a term is attached to a concept through documents, and the degree to which the term and the concept co-occur. The attachment degree derives from the term-concept relationships in an annotated corpus, where a term appears in some documents and those documents are labeled with some concepts. The co-occurrence degree extends the plain co-occurrence of a term-concept pair by also considering the text distance between the term and the concept and the distribution of the pair across documents. Semantic retrieval experiments on three different types of corpora show that, compared with classical methods, semantic-based query expansion based on K2CM can improve retrieval effectiveness.

Key words: semantic-based query expansion; concept; ontology; term-concept association
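
The abstract names the two ingredients of K2CM but does not give their formulas, so the following is only an illustrative sketch under stated assumptions, not the authors' definitions. It assumes a corpus in which every document is a token sequence annotated with concept labels, and that a concept can be located in the text by its label token. The names `Doc`, `attachment`, `cooccurrence`, `association` and the mixing weight `alpha` are all hypothetical. Attachment is taken as the share of documents containing the term that are labeled with the concept. Co-occurrence weights each co-occurring document by the inverse of the smallest token distance and averages over the documents containing the term, so both closeness and spread across documents matter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    tokens: tuple         # document text as a sequence of tokens
    concepts: frozenset   # concept labels annotated on the document


def attachment(term, concept, docs):
    """Assumed form: share of documents containing `term` that are labeled `concept`."""
    with_term = [d for d in docs if term in d.tokens]
    if not with_term:
        return 0.0
    labeled = sum(1 for d in with_term if concept in d.concepts)
    return labeled / len(with_term)


def min_distance(tokens, a, b):
    """Smallest token distance between any occurrence of a and of b, or None."""
    pos_a = [i for i, tok in enumerate(tokens) if tok == a]
    pos_b = [i for i, tok in enumerate(tokens) if tok == b]
    if not pos_a or not pos_b:
        return None
    return min(abs(i - j) for i in pos_a for j in pos_b)


def cooccurrence(term, concept, docs):
    """Assumed form: inverse-distance weight per co-occurring document,
    averaged over all documents that contain `term`."""
    with_term = [d for d in docs if term in d.tokens]
    if not with_term:
        return 0.0
    total = 0.0
    for d in with_term:
        dist = min_distance(d.tokens, term, concept)
        if dist is not None:
            total += 1.0 / max(dist, 1)  # distance 0 (same token) counts as 1
    return total / len(with_term)


def association(term, concept, docs, alpha=0.5):
    """Hypothetical linear mix of the two components; alpha is an assumption."""
    return alpha * attachment(term, concept, docs) + \
        (1.0 - alpha) * cooccurrence(term, concept, docs)


# Toy annotated corpus, invented for illustration only.
docs = [
    Doc(("jaguar", "speed", "car", "engine"), frozenset({"car"})),
    Doc(("jaguar", "habitat", "forest"), frozenset({"animal"})),
    Doc(("engine", "of", "the", "car"), frozenset({"car"})),
]

print(association("engine", "car", docs))
print(association("jaguar", "car", docs))
```

In this toy corpus "engine" scores higher against the concept "car" than "jaguar" does, because every document containing "engine" is labeled "car" and the two tokens sit close together, whereas "jaguar" appears in only one "car" document. A query expansion step could then rank candidate concepts for each query term by this score and keep the top few.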

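Cite this article

TIAN Xuan, DU Xiao-Yong, LI Hai-Hua. Computing term-concept association in semantic-based query expansion. Journal of Software, 2008, 19(8): 2043-2053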

History

Received: 2007-02-14
Revised: 2007-08-24
