Supported by the National Natural Science Foundation of China under Grant No.604963205,60473069; the National High-Tech Research and Development Plan of China (863) under Grant No.2002AA4Z3130; the National Grand Fundamental Research 973 Program of China under Grant No.2001CCA03000; the Key Science-Technology Project of Beijing of China under Grant No.H030130040011
Most existing peer-to-peer (P2P) systems support only simple title-based search, so users cannot search data by content. Top-k queries are widely used in search engines with great success. However, processing top-k queries in a pure P2P network is very challenging because a P2P system is dynamic and decentralized. This paper proposes an efficient, histogram-based hierarchical top-k query processing algorithm. First, a distributed query processing model for top-k queries is proposed, which evaluates a query hierarchically: ranking and merging of documents are distributed across the peers, taking full advantage of the computing resources of the network. Next, a histogram is constructed for each peer from the top-k results that the peer returns and is used to estimate an upper bound on the scores that peer can contribute. Using the histogram information, the query is sent only to the most promising peers, which greatly improves search efficiency. Experimental results show that top-k querying improves query effectiveness and that the histogram improves query efficiency.
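The two ideas in the abstract, hierarchical merging of ranked results and histogram-based peer selection, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the histogram layout or the bound estimator, so the equal-width buckets, the score range [0, max_score], the rule "upper bound = upper edge of the highest non-empty bucket", and all function names (`merge_top_k`, `build_histogram`, `upper_bound`, `select_peers`) are assumptions made for illustration.

```python
import heapq


def merge_top_k(result_lists, k):
    """Merge several lists of (score, doc_id) pairs and keep the k best.

    In a hierarchical scheme, each peer would apply this to its own local
    results plus the lists returned by the peers it forwarded the query to.
    """
    merged = (item for results in result_lists for item in results)
    return heapq.nlargest(k, merged, key=lambda item: item[0])


def build_histogram(scores, num_buckets=4, max_score=1.0):
    """Count scores per equal-width bucket over [0, max_score] (assumed layout)."""
    counts = [0] * num_buckets
    width = max_score / num_buckets
    for score in scores:
        # Clamp so that score == max_score falls into the last bucket.
        idx = min(int(score / width), num_buckets - 1)
        counts[idx] += 1
    return counts


def upper_bound(counts, max_score=1.0):
    """Assumed estimator: upper edge of the highest non-empty bucket."""
    width = max_score / len(counts)
    for idx in range(len(counts) - 1, -1, -1):
        if counts[idx] > 0:
            return (idx + 1) * width
    return 0.0  # empty histogram: nothing is known about this peer


def select_peers(histograms, fanout):
    """Return the `fanout` peer ids with the largest estimated upper bound."""
    ranked = sorted(histograms,
                    key=lambda peer: upper_bound(histograms[peer]),
                    reverse=True)
    return ranked[:fanout]
```

For example, a peer holding histograms for three neighbours could call `select_peers(histograms, 2)` to forward the query to the two neighbours most likely to return high-scoring documents, then combine their answers with its local results through `merge_top_k`.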
He Yingjie, Wang Shan, Du Xiaoyong. Effective top-k query in a pure peer-to-peer environment. Journal of Software (软件学报), 2005, 16(4): 540-552