Method for Top-K Query on Big Data in Cloud

doi:10.13328/j.cnki.jos.004564

Published in: Journal of Software (软件学报), 2014, 25(4): 813-825

Authors:

CI Xiang, MA You-Zhong, MENG Xiao-Feng
School of Information, Renmin University of China, Beijing 100872, China

Fund Project: National Natural Science Foundation of China (61379050, 91224008); National High Technology Research and Development Program of China (863 Program) (2013AA013204); Specialized Research Fund for the Doctoral Program of Higher Education of China (20130004130001)

Abstract:

Top-K queries are widely used in applications such as search engines and e-commerce. A Top-K query returns, from massive data, the K results that best match the user's needs; its main purpose is to mitigate the negative effects of information overload. Top-K queries on big data pose new challenges for data management and analysis. Drawing on the characteristics of MapReduce, this paper presents an in-depth study of Top-K queries on big data in the cloud from the perspectives of data partitioning and data filtering. Experimental results show that the proposed method achieves good performance and scalability.

Key words: Top-K query; cloud computing; MapReduce
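
The abstract names data partitioning and data filtering under MapReduce but does not spell out the algorithm. As a rough illustration only, the sketch below shows the generic partition / local-filter / merge pattern for a single-score Top-K query in plain Python. It is not the authors' method: the `(item id, score)` record layout, the round-robin partitioning, and all function names are assumptions made for this example, and no actual MapReduce framework is involved.

```python
import heapq
from typing import Iterable, List, Sequence, Tuple

# Assumed record layout for this sketch: (item id, score).
Record = Tuple[str, float]


def partition(records: Sequence[Record], num_parts: int) -> List[List[Record]]:
    """Split records round-robin into num_parts partitions (stand-in for data partitioning)."""
    parts: List[List[Record]] = [[] for _ in range(num_parts)]
    for i, rec in enumerate(records):
        parts[i % num_parts].append(rec)
    return parts


def map_local_top_k(part: Iterable[Record], k: int) -> List[Record]:
    """Map step: keep only the k highest-scoring records of one partition (local filtering)."""
    return heapq.nlargest(k, part, key=lambda rec: rec[1])


def reduce_global_top_k(candidates: Iterable[List[Record]], k: int) -> List[Record]:
    """Reduce step: merge the per-partition candidates and keep the global top k."""
    merged = (rec for local in candidates for rec in local)
    return heapq.nlargest(k, merged, key=lambda rec: rec[1])


def top_k(records: Sequence[Record], k: int, num_parts: int = 4) -> List[Record]:
    """Run the partition -> local filter -> merge pipeline on one machine."""
    parts = partition(records, num_parts)
    return reduce_global_top_k((map_local_top_k(p, k) for p in parts), k)


if __name__ == "__main__":
    data = [("a", 0.1), ("b", 0.9), ("c", 0.5), ("d", 0.7), ("e", 0.3), ("f", 0.8)]
    print(top_k(data, k=3, num_parts=2))
```

The pattern works because every member of the global top K must also be in the top K of its own partition, so each mapper can discard everything else before the shuffle. How partitions are chosen and how aggressively candidates can be filtered are exactly the points the paper studies, and this sketch makes no claim about the authors' choices there.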

Cite this article:

CI Xiang, MA You-Zhong, MENG Xiao-Feng. Method for Top-K Query on Big Data in Cloud. Journal of Software, 2014, 25(4): 813-825

History

Received: 2013-09-10
Last revised: 2013-12-18
Published online: 2014-03-28
