一种支持Top-k空间关键词检索的高效压缩索引
作者:
基金项目:

国家自然科学基金(61070054);国家重点基础研究发展计划(973)(2014CB340403);国家高技术研究发展计划(863)(2012AA011001);中央高校基本科研业务费专项资金(10XNI018)


Efficient Compressed Index for Top-k Spatial Keyword Query
Author:
  • 摘要
  • | |
  • 访问统计
  • |
  • 参考文献 [17]
  • |
  • 相似文献 [20]
  • | | |
  • 文章评论
    摘要:

    基于位置的服务可以指引用户找到在特定位置或区域内能够提供所需要服务的对象(比如找某个高校附近(经纬度标识)的咖啡店).向这类服务提交一个查询位置和多个关键词,该类服务返回k个最相关的对象,对象和查询的相关性同时考虑空间相近性和文本相似性.为了支持高效的top-k空间关键词查询,出现了多种混合索引,然而现有的这些索引为了提供实时响应均耗费大量存储空间.提出一种基于压缩技术的索引CSTI,该索引显著减少了存储开销(至少减少80%甚至到两个数据量级),同时保持高效的查询性能.大量基于真实和仿真数据集的实验结果表明,CSTI在空间开销和响应时间上均优于已有方法.

    Abstract:

    Location-Based services guide a user to find the object which provides services located in a particular position or region (e.g., looking for a coffee shop near a university). Given a query location and multiple keywords, location-based services return the most relevant objects ranked according to location proximity and text relevancy. Various hybrid indexes have been proposed in recent years which combine R-tree and inverted index to improve query efficiency. Unfortunately, the state-of-the-art approaches require more space in order to reduce response time. Cache mechanism is inefficient due to huge storage overhead. In this paper, a novel index based on index compressed technology (CSTI) is proposed, to answer top-k SKQ. CSTI significantly reduces storage overhead (by at least 80%), while maintaining efficient query performance. Extensive experiments based on real dataset and simulated dataset confirm CSTI is effective and efficient.

    参考文献
    [1] Chen YY, Suel T, Markowetz A. Efficient query processing in geographic Web search engines. In: Chaudhuri S, Hristidis V, Polyzotis N, eds. Proc. of the ACM SIGMOD Int'l Conf. on Management of Data. Chicago: ACM, 2006. 277-288.
    [2] Hariharan R, Hore B, Li C, Mehrotra S. Processing spatial-keyword queries in geographic information retrieval systems. In: Proc. of the 19th Int'l Conf. on Scientific and Statistical Database Management (SSDBM 2007). Banff: IEEE Computer Society, 2007. 16.
    [3] Zhou Y, Xie X, Wang C, Gong Y, Ma WY. Hybrid index structures for location-based Web search. In: Herzog O, Schek HJ, Fuhr N, Chowdhury A, Teiken W, eds. Proc. of the 2005 ACM CIKM Int'l Conf. on Information and Knowledge Management. Bremen: ACM, 2005. 155-162.
    [4] Khodaei A, Shahabi A, Li C. Hybrid indexing and seamless ranking of spatial and textual features of Web documents. In: Bringas PG, Hameurlain A, Quirchmayr G, eds. Proc. of the 21st Int'l Conf. on Database and Expert Systems Applications. Bilbao: Springer-Verlag, 2010. 450-466.
    [5] Li Z, Lee KCK, Zheng B, Lee WC, Lee DL, Wang X. Ir-Tree: An efficient index for geographic document search. IEEE Trans. on Knowledge and Data Engineering, 2011,23(4):585-599.
    [6] Felipe ID, Hristidis V, Rishe N. Keyword search on spatial databases. In: Alonso GA, Blakeley JLP, Chen A, eds. Proc. of the 24th Int'l Conf. on Data Engineering (ICDE 2008). Cancún: IEEE, 2008. 656-665.
    [7] Cong G, Jensen CS, Wu D. Efficient retrieval of the top-k most relevant spatial Web objects. PVLDB, 2009,2(1):337-348.
    [8] Rocha-Junior JB, Gkorgkas O, Jonassen S, Nørvåg K. Efficient processing of top-k spatial keyword queries. In: Pfoser D, Tao YF, Mouratidis K, Nascimento MA, Mokbel MF, Shekhar SS, Huang Y, eds. Proc. of the 12th Int'l Symp. on Advances in Spatial and Temporal Databases (SSTD 2011). Minneapolis: Springer-Verlag, 2011. 205-222.
    [9] Papadias D, Kalnis P, Zhang J, Tao Y. Efficient OLAP operations in spatial data warehouses. In: Jensen CS, Schneider M, Seeger B, Tsotras VJ, eds. Proc. of the 7th Int'l Symp. on Advances in Spatial and Temporal Databases (SSTD 2001). Redondo Beach: Springer-Verlag, 2001. 443-459.
    [10] Chen L, Cong G, Jensen CS, Wu D. Spatial keyword query processing: An experimental evaluation. PVLDB, 2013,6(3):217-228.
    [11] Lu JH, Lu Y, Cong G. Reverse spatial and textual k nearest neighbor search. In: Sellis TK, Miller RJ, Kementsietsidis A, Velegrakis Y, eds. Proc. of the ACM SIGMOD Int'l Conf. on Management of Data (SIGMOD 2011). Athens: ACM, 2011. 349-360.
    [12] Li GL, Feng JH, Xu J. Desks: Direction-Aware spatial keyword search. In: Kementsietsidis A, Salles MAV, eds. Proc. of the IEEE 28th Int'l Conf. on Data Engineering (ICDE 2012). Washington: IEEE Computer Society, 2012. 474-485.
    [13] Hu J, Fan J, Li GL, Chen SS. Top-k fuzzy spatial keyword search. Chinese Journal of Computers, 2012,35(11):2237-2246 (in Chinese with English abstract).
    [14] Luo SQ, Luo YF, Zhou SG, Cong G, Guan JH: Distributed spatial keyword querying on road networks.In: Amer-Yahia S, Christophides V, Kementsietsidis A, Garofalakis MN, Idreos S, Leroy V, eds. Proc. of the 17th Int'l Conf. on Extending Database Technology (EDBT). Athens: OpenProceedings, 2014. 235-246.
    [15] Zhang CY, Zhang Y, Zhang WJ, Lin XM, Cheema MA, Wang XY. Diversified spatial keyword search on road networks. In: Amer-Yahia S, Christophides V, Kementsietsidis A, Garofalakis MN, Idreos S, Leroy V, eds. Proc. of the 17th Int'l Conf. on Extending Database Technology (EDBT). Athens: OpenProceedings, 2014. 367-378.
    [16] Yan H, Ding S, Suel T. Inverted index compression and query processing with optimized document ordering. In: Quemada J, León G, Maarek YS, Nejdl W, eds. Proc. of the 18th Int'l Conf. on World Wide Web (WWW 2009). Madrid: ACM, 2009. 401-410.
    [17] Zhou X, Zhang X, Wang YH, Li R, Wang S. Efficient distributed multi-dimensional index for big data management. In: Wang JY, Xiong H, Ishikawa Y, Xu JL, Zhou JF, eds. Proc. of the 14th Int'l Conf. on Web-Age Information Management (WAIM 2013). Beidaihe: Springer-Verlag, 2013. 130-141.
    引证文献
    网友评论
    网友评论
    分享到微博
    发 布
引用本文

周新,张孝,安润功,薛忠斌,王珊.一种支持Top-k空间关键词检索的高效压缩索引.软件学报,2014,25(S2):157-168

复制
分享
文章指标
  • 点击次数:2786
  • 下载次数: 4528
  • HTML阅读次数: 0
  • 引用次数: 0
历史
  • 收稿日期:2014-05-07
  • 最后修改日期:2014-08-19
  • 在线发布日期: 2015-01-29
文章二维码
您是第19754468位访问者
版权所有:中国科学院软件研究所 京ICP备05046678号-3
地址:北京市海淀区中关村南四街4号,邮政编码:100190
电话:010-62562563 传真:010-62562533 Email:jos@iscas.ac.cn
技术支持:北京勤云科技发展有限公司

京公网安备 11040202500063号