一种基于共享执行策略的间隔查询优化技术
作者:
基金项目:

国家自然科学基金(61432006);中国人民大学科学研究基金(中央高校基本科研业务费专项资金)(10XNI018)


Technique Based on Shared Execution Strategy for Optimizing Interval Query
Author:
Fund Project:

National Natural Science Foundation of China (61432006); Fundamental Research Funds for the Central Universities, and the Research Funds of Renmin University of China (10XNI018)

  • 摘要
  • | |
  • 访问统计
  • |
  • 参考文献 [26]
  • |
  • 相似文献 [20]
  • | | |
  • 文章评论
    摘要:

    间隔查询作为重要的查询类型,广泛应用在社交网络、信息检索和数据库领域.为了支持高效的间隔查询,涌现出多种优化技术.尽管已有方法能够快速响应单个间隔查询,然而当查询负载超过服务器的处理能力时,70%的查询均不能在期望时间内得到响应.针对这一问题,提出采用共享执行策略优化间隔查询的方法SESIQ(shared execution strategy for interval queries).SESIQ对间隔查询进行批处理,分析一组间隔查询间可共享的操作,减少重复数据的访问,从而降低磁盘I/O和网络传输代价,提高检索性能.理论分析并实验验证了SESIQ的可行性,基于两种真实数据集的大量实验结果表明,SESIQ是有效的,间隔查询的检索性能可提升数十倍.

    Abstract:

    As an important query type, interval query is widely used in social networks, information retrieval and database domain. Many kinds of optimization techniques have sprung up to support effective interval query. Although existing methods are efficient to handle single query, they all suffer from performance problem when the concurrent query loads exceed the processing capacity of the server such that more than 70% queries couldn't receive the results in the expected time. To solve this problem, this paper presents a method named SESIQ (shared execution strategy for interval queries). SESIQ batches interval queries, analyzes common operations among a group of interval queries and reduces duplicate data access to lower the cost of disk I/O and network transmission. The paper theoretically studies and analyzes SESIQ, and demonstrates the feasibility by large number of experiments based on two types of real datasets. Results show that SESIQ improves the performance of interval query by several ten folds.

    参考文献
    [1] Chang F, Dean J, Ghemawat S, Hsieh WC, Wallach DA, Burrows M, Chandra T, Fikes A, Gruber R. Bigtable:A distributed storage system for structured data. In:Proc. of the Operating Systems Design and Implementation (OSDI). Seattle:USENIX Association, 2006. 205-218.
    [2] Apache Hbase. http://hbase.apache.org/
    [3] Cooper BF, Ramakrishnan R, Srivastava U, Silberstein A, Bohannon P, Jacobsen HA, Puz N, Weaver D, Yerneni R. PNUTS:Yahoo!'s hosted data serving platform. In:Proc. of the VLDB Endowment. Auckland:ACM Press, 2008. 1277-1288.[doi:10. 14778/1454159.1454167]
    [4] Lakshman A, Malik P. Cassandra:A decentralized structured storage system. ACM Operating Systems Review (SIGOPS), 2010, 44(2):35-40.[doi:10.1145/1773912.1773922]
    [5] DeCandia G, Hastorun D, Jampani M, Kakulapati G, Lakshman A, Pilchin A, Sivasubramanian S, Vosshall P, Vogels W. Dynamo:Amazon's highly available key-value store. In:Proc. of the 21st ACM Symp. on Operating Systems Principles (SOSP). Stevenson:ACM Press, 2007. 205-220.[doi:10.1145/1294261.1294281]
    [6] Chen Y, Chuang K, Chen M. Spatial-Temporal query homogeneity for KNN object search on road networks. In:Proc. of the 22nd ACM Int'l Conf. on Information and Knowledge Management (CIKM). San Francisco:ACM Press, 2013. 1019-1028.[doi:10.1145/2505515.2505524]
    [7] Salzberg B, Tsotras VJ. Comparison of access methods for time evolving data. ACM Computing Surveys (CSUR), 1999,31(2):158-221.[doi:10.1145/319806.319816]
    [8] Elmasri R, Wuu GTJ, Kim YJ. The time index:An access structure for temporal data. In:Proc. of the 16th Int'l Conf. on Very Large Data Bases (VLDB). Brisbane:Morgan Kaufmann Publishers, 1990. 1-12.
    [9] Kouramajian V, Kamel I, Elmasri R, Waheed R. The time index+:An incremental access structure for temporal databases. In:Proc. of the 3rd Int'l Conf. on Information and Knowledge Management (CIKM). Gaithersburg:ACM Press, 1994. 296-303.
    [10] Ang C, Tan K. The interval B-tree. Information Processing Letters, 1994,53(2):85-89.[doi:10.1145/191246.191298]
    [11] Stantic B, Topor R, Terry J, Sattar A. Advanced indexing technique for temporal data. Computer Science and Information Systems (COMSIS), 2010,7(4):679-703.[doi:10.2298/CSIS101020035S]
    [12] Kolovson C, Stonebraker M. Segment indexes:Dynamic indexing techniques for multi-dimensional interval data. SIGMOD Record, 1991,20(2):138-147.[doi:10.1145/115790.115807]
    [13] Bliujute R, Jensen CS, Saltenis S, Slivinskas G. Light-Weight indexing of general bitemporal data. In:Proc. of the 12th Int'l Conf. on Scientific and Statistical Database Management (SSDBM). Berlin:IEEE Computer Society, 2000. 125-138.[doi:10.1109/SSDM.2000.869783]
    [14] Kumar A, Tsotras VJ, Faloutsos C. Designing access methods for bitemporal databases. IEEE Trans. on Knowledge and Data Engineering (TKDE), 1998,10(1):1-20.[doi:10.1109/69.667079]
    [15] Zheng C, Shen G, Li S, Shenker S. Distributed segment tree:Support of range query and cover query over DHT. In:Proc. of the 5th Int'l workshop on Peer-To-Peer Systems (IPTPS). Santa Barbara, 2006.
    [16] Wang J, Wu S, Gao H, Li J, Ooi B.C. Indexing multi-dimensional data in a cloud system. In:Proc. of the ACM SIGMOD Int'l Conf. on Management of Data (SIGMOD). Indianapolis:ACM Press, 2010. 591-602.[doi:10.1145/1807167.1807232]
    [17] Wu S, Wu KL. An indexing framework for efficient retrieval on the cloud. IEEE Data Engineering Bulletin, 2009,32(1):75-82.
    [18] Sfakianakis G, Patlakas I, Ntarmos N, Triantafillou P. Interval indexing and querying on key-value cloud stores. In:Proc. of the 29th IEEE Int'l Conf. on Data Engineering (ICDE). Brisban:ACM Press, 2013. 805-816.[doi:10.1109/ICDE.2013.6544876]
    [19] Nishimura S, Das S, Agrawal D, Abbadi AE. MD-HBase:A scalable multi-dimensional data infrastructure for location aware services. In:Proc. of the 12th IEEE Int'l Conf. on Mobile Data Management (MDM). Luleå:IEEE Computer Society, 2011. 7-16.[doi:10.1109/MDM.2011.41]
    [20] Hsu YT, Pan YC, Wei LY, Peng WC, Lee WC. Key formulation schemes for spatial index in cloud data managements. In:Proc. of the 13th IEEE Int'l Conf. on Mobile Data Management (MDM). Bengrluru:IEEE Computer Society, 2012. 21-26.[doi:10.1109/MDM.2012.67]
    [21] Chen K, Jin P, Yue L. Efficient buffer management for PCM-enhanced hybrid memory architecture. In:Proc. of the 17th Asia-Pacific Web Conf. (APWeb). Guangzhou:Springer-Verlag, 2015. 29-40.[doi:10.1007/978-3-319-25255-1_3]
    [22] Jin P, Ou Y, Härder T, Li Z. AD-LRU:An efficient buffer replacement algorithm for flash-based databases. Data & Knowledge Engineering (DKE), 2012,72:83-102.[doi:10.1016/j.datak.2011.09.007]
    [23] Ou Y, Jin P, Härder T. Flash-Aware buffer management for database systems. Int'l Journal of Knowledge-Based Organizations (IJKBO), 2013,3(4):22-39.[doi:10.4018/ijkbo.2013100102]
    [24] Nguyen T, He H, Chen Y. SeTPR*-tree:Efficient buffering for spatiotemporal indexes via shared execution. The Computer Journal (CJ), 2013,56(1):115-137.[doi:10.1093/comjnl/bxs020]
    [25] Giannikis G, Alonso G, Kossmann D. SharedDB:Killing one thousand queries with one stone. VLDB Endowment, 2012,5(6):526-537.[doi:10.14778/2168651.2168654]
    [26] Xue ZB, Zhou X, Wang S. TOF:A throughput oriented framework for spatial queries processing in multi-core environment. In:Proc. of the Int'l Conf. on Database Systems for Advanced Applications (DASFAA). Hanoi:Springer-Verlag, 2015. 241-256.[doi:10.1007/978-3-319-18123-3_15]
    引证文献
    网友评论
    网友评论
    分享到微博
    发 布
引用本文

周新,张孝,薛忠斌,王珊.一种基于共享执行策略的间隔查询优化技术.软件学报,2016,27(12):3067-3084

复制
分享
文章指标
  • 点击次数:1726
  • 下载次数: 3863
  • HTML阅读次数: 1160
  • 引用次数: 0
历史
  • 收稿日期:2015-09-21
  • 最后修改日期:2015-11-16
  • 在线发布日期: 2016-12-06
文章二维码
您是第19765693位访问者
版权所有:中国科学院软件研究所 京ICP备05046678号-3
地址:北京市海淀区中关村南四街4号,邮政编码:100190
电话:010-62562563 传真:010-62562533 Email:jos@iscas.ac.cn
技术支持:北京勤云科技发展有限公司

京公网安备 11040202500063号