纯Peer to Peer环境下有效的Top-k查询
作者:
基金项目:

Supported bvthe National Natural Science Foundation of China under Grant No.604963205,60473069(国家自然科学基金);the National High-Tech Research and Development Plan of China under Grant No.2002AA4Z3130(国家高技术研究发展计划(863));the National Grand Fundamental Research 973 Program of China under Grant No.2001CCA03000(国家重点基础研究发展规划(973));the Key Science-Technology Project of Beijing of China under Grant No.H030130040011(北京市科技计划重大项目)


Efficient Top-k Query Processing in Pure Peer-to-Peer Network
  • 摘要
  • | |
  • 访问统计
  • |
  • 参考文献 [20]
  • |
  • 相似文献 [20]
  • |
  • 引证文献
  • | |
  • 文章评论
    摘要:

    目前大多数的Peer-to-Peer(P2P)系统只支持基于文件标识的搜索,用户不能根据文件的内容进行搜索.Top-k查询被广泛地应用于搜索引擎中,获得了巨大的成功.可是,由于P2P系统是一个动态的、分散的系统,在纯的P2P环境下进行top-k查询是具有挑战性的.提出了一种基于直方图的分层top-k查询算法.首先,采用层次化的方法实现分布式的top-k查询,将结果的合并和排序分散到P2P网络中的各个节点上,充分利用了网络中的资源.其次,根据节点返回的结果为节点构建直方图,利用直方图估计节点可能的分数上限,对节点进行选择,提高了查询效率.实验证明,top-k查询提高了查询效果,而直方图则提高了查询效率.

    Abstract:

    Most of the existing peer-to-peer (P2P) systems only support simple title-based search, and users cannot search the data based on their content. Top-k query is widely used in the search engine and gains great success. However, Processing top-k query in pure P2P network is very challenging because a P2P system is a dynamic and decentralized system. An efficient hierarchical top-k query processing algorithm based on histogram is proposed. First, a distributed query processing model for top-k query is proposed. It does top-k query in a hierarchical way. Ranking and merging of documents are distributed across the peers, which takes full advantage of the computing resource of the network. Next, a histogram is constructed for each peer according to the top k results returned by the peer, and used to estimate the possible upper bound of the score for the peer. By the histogram information, the most possible peers are selected to send the query, so as to greatly improve the search efficiency. Experimental results show that the top-k query improves the query effectiveness, and the histogram improves the query efficiency.

    参考文献
    [1]Napster Home Page. http://www.napster.com/
    [2]Gnutella Home Page. http://www.gnutella.com/
    [3]Stoica I, Morris R, Karger D, Kaashoek MF, Balakrishnan H. Chord: A scalable peer-to-peer lookup service for internet applications. ACM SIGCOMM Computer Communication Review, 2001,31 (4): 149-160.
    [4]Baeza-Yates R, Ribeiro-Neto B. Modem Information Retrieval. Boston: Addison Wesley, 1999.27-30.
    [5]Palmer CR, Steffan JG. Generating network topologies that obey power law. In: Proc. of the GLOBECOM. San Francisco: IEEE,2000. 434-438. http://citeseer. ist.psu.edu/palmer00generating.html
    [6]Ripeanu M. Peer-to-Peer architecture case study: Gnutella network. Technical Report, TR-2001-26, University of Chicago, 2001.
    [7]Buckley C. Implementation of the SMART information retrieval system. Technical Report, TR35-686, Cornell University, 1985.
    [8]Yu C, Philip G, Meng WY. Distributed top-n query processing with possibly uncooperative local systems. In: Freytag JC,Lockemann PC, Abiteboul S, Carey MJ, Selinger PG, Heuer A, eds. Proc. of the 29th Int'l Conf. on Very Large Data Bases. Berlin:Morgan Kaufmann Publishers, 2003.117-128.
    [9]Gravano L, Garcia-Molina H, Tomasic A. The effectiveness of GLOSS for the text database discovery problem. In: Snodgrass RT,Winslett M, eds. Proc. of the ACM SIGMOD Int'l Conf. on Management of Data. New York: ACM Press, 1994. 126-137.
    [10]Callan JP, Lu ZH, Croft WB. Searching distributed collections with inference networks. In: Proc. of the 18th Annual Int'l ACM SIGIR Conf. on Research and Development in Information Retrieval. Seattle: ACM Press, 1995.21-28.
    [11]张炜,李建中,高宏,潘海为.基于Peer-to-Peer的多媒体数据库K-NN查询处理.计算机科学,2002,29(8):190-193.
    [11]Zhang W, Li JZ, Gao H, Pan HW. K-NN query processing in peer-to-peer multimedia database. Computer Science,2002,29(8):190-193.
    [12]Crespo A, Garcia-Molina H. Routing indices for peer-to-peer systems. In: Proc. of the 22nd Int'l Conf. on Distributed Computing Systems. Vienna: IEEE Computer Society, 2002.23-34. http://dbpubs.stanford.edu:8090/pub/2001-48
    [13]Yang B, Garcia-Molina H. Efficient search in peer-to-peer networks. In: Proc. of the 22nd Int'l Conf. on Distributed Computing Systems. Vienna: IEEE Computer Society, 2002.5-14. http://dbpubs.stanford.edu:8090/pub/2001-47
    [14]Cuenca-Acuna FM, Nguyen TD. Text-Based content search and retrieval in ad hoc P2P communities. Technical Report,DCS-TR-483, Department of Computer Science, Rutgers University, 2002.
    [15]Tang CQ, Yu ZH, Mahalingam M. pSearch: Information retrieval in structured overlays. ACM SIGCOMM Computer Communication Review, 2003,33(1):89-94.
    [16]Ratnasamy S, Francis P, Handley M, Karp RM, Schenker S. A scalable content-addressable network. ACM SIGCOMM Computer Communication Review, 2001,31 (4): 161-172.
    [17]Ling B, Lu ZG, Ng W, Ooi BC, Tan KL, Zhou AY. A content-based resource location mechanism in PeerIS. In: Proc. of the 3rd Int'l Conf. on Web Information Systems Engineering. Singapore: IEEE, 2002. 279-289. http://citeseer. ist.psu.edu/ling02contentbased.html
    [18]黄维雄,黄铭钧,陈建利,王晓宇,凌波,周傲英.一种基于自配置策略的新型peer-to-peer平台系统.软件学报,2003,14(2):237-246.http://www.jos.org.cn/1000-9825/14/237.htm
    [18]Ng W, Ooi BC, Tan KL, Wang XY, Ling B, Zhou AY. A novel peer-to-peer system based on self-configuration. Journal of Software, 2003,14(2):237-246 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/14/237.htm
    网友评论
    网友评论
    分享到微博
    发 布
引用本文

何盈捷,王珊,杜小勇.纯Peer to Peer环境下有效的Top-k查询.软件学报,2005,16(4):540-552

复制
分享
文章指标
  • 点击次数:4493
  • 下载次数: 5811
  • HTML阅读次数: 0
  • 引用次数: 0
历史
  • 收稿日期:2003-12-02
  • 最后修改日期:2004-04-01
文章二维码
您是第19867314位访问者
版权所有:中国科学院软件研究所 京ICP备05046678号-3
地址:北京市海淀区中关村南四街4号,邮政编码:100190
电话:010-62562563 传真:010-62562533 Email:jos@iscas.ac.cn
技术支持:北京勤云科技发展有限公司

京公网安备 11040202500063号