基于图压缩的k可达查询处理
作者:
基金项目:

国家重点基础研究发展计划(973)(2012CB316200);国家自然科学基金(61190115,61033015,61173023);中央高校基本科研业务费专项资金(HIT.NSRIF.201180)


K-Reach Query Processing Based on Graph Compression
Author:
  • 摘要
  • | |
  • 访问统计
  • |
  • 参考文献 [16]
  • |
  • 相似文献 [20]
  • | | |
  • 文章评论
    摘要:

    研究了基于图压缩的k可达查询处理,提出了一种支持k可达查询的图压缩算法k-RPC及无需解压缩的查询处理算法,k-RPC算法在所有基于等价类的支持k-reach查询的图压缩算法中是最优的.由于k-RPC算法是基于严格的等价关系,因此进一步又提出了线性时间的近似图压缩算法k-GRPC.k-GRPC算法允许从原始图中删除部分边,然后使用k-RPC获得更好的压缩比.提出了线性时间的无需解压缩的查询处理算法.真实数据上的实验结果表明,对于稀疏的原始图,两种压缩算法的压缩比分别可以达到45%,对于稠密的原始图,两种压缩算法的压缩比分别可以达到75%和67%;与在原始图上直接进行查询处理相比,两种基于压缩图的查询处理算法效率更好,在稀疏图上的查询效率可以提高2.5倍.

    Abstract:

    This paper focuses on k-reach query processing based on graph compression and proposes a k-reach query preserving graph compression algorithm k-RPC and a query processing algorithm which is able to query on the compressed graph without decompression. k-RPC algorithm is optimal among all the compression algorithms based on equivalent class which supports k-reach query. Considering k-RPC is based on a strict equivalent relation, this study further produces a linear approximate graph compression algorithm k-GRPC. k-GRPC first removes some edges from the input graph, then utilizes k-RPC to acquire better compression ratio. Novel linear query processing algorithms which are able to answer k-reach query on the compressed graph without decompression are introduced. Experiments on real datasets demonstrate that the compression ratios of these two compression algorithms can reach to 45% for sparse graphs and 75%, 67% for dense graphs. Comparing with the query processing on original graphs, the query performance on compressed graphs is better and for sparse graphs, it can be 2.5 times better.

    参考文献
    [1] Cheng J, Shang Z, Cheng H, Wang H, Yu JX. K-Reach: Who is in your small world? In: Proc. of the 2012 VLDB Endowment. Istanbul: ACM Press, 2012. 1292-1303. http://research.microsoft.com/pubs/166387/vldb2012.pdf
    [2] Jin R, Ruan R, Dey S, Yu JX. Scarab: Scaling reachability computation on large graphs. In: Proc. of the 2012 Int'l Conf. on Management of Data. Scottsdale: ACM Press, 2012. 169-180. [doi: 10.1145/2213836.2213856]
    [3] Jin RM, Xiang Y, Ruan N, Fuhry D. 3-Hop: A high-compression indexing scheme for reachability query. In: Proc. of the 35th SIGMOD Int'l Conf. on Management of Data. Providence: ACM Press, 2009. 813-826. [doi: 10.1145/1559845.1559930]
    [4] Jin RM, Xiang Y, Ruan N, Wang HX. Efficiently answering reachability queries on very large directed graphs. In: Proc. of the 2008 Int'l Conf. on Management of Data. Vancouver: ACM Press, 2008. 595-608. [doi: 10.1145/1376616.1376677]
    [5] Yu JX, Cheng J. Graph reachability queries: A survey. In: Managing and Mining Graph Data. New York: Springer-Verlag, 2010. 181-215.
    [6] Cohen E, Halperin E, Kaplan H, Zwick U. Reachability and distance queries via 2-hop labels. SIAM Journal on Computing, 2003, 32(5):1338-1355. [doi: 10.1137/S0097539702403098]
    [7] Wei F. TEDI: Efficient shortest path query answering on graphs. In: Proc. of the 2010 Int'l Conf. on Management of Data. Indianapolis: ACM Press, 2010. 99-110. [doi: 10.1145/1807167.1807181]
    [8] Cheng J, Yu JX. On-Line exact shortest distance query processing. In: Proc. of the 12th Int'l Conf. on Extending Database Technology. Saint Petersburg: ACM Press, 2009. 481-492. [doi: 10.1145/1516360.1516417]
    [9] Xiao YH, Wu WT, Pei J, Wang W, He ZY. Efficiently indexing shortest paths by exploiting symmetry in graphs. In: Proc. of the 12th Int'l Conf. on Extending Database Technology. Saint Petersburg: ACM Press, 2009. 493-504. [doi: 10.1145/1516360. 1516418]
    [10] Fan WF, Li JZ, Wang X, Wu YH. Query preserving graph compression. In: Proc. of the 2012 Int'l Conf. on Management of Data. Scottsdale: ACM Press, 2012. 157-168. [doi: 10.1145/2213836.2213855]
    [11] Aho AV, Garey MR, Ullman JD. The transitive reduction of a directed graph. SIAM Journal on Computing, 1972,1(2):131-137. [doi: 10.1137/0201008]
    [12] Simon K. An improved algorithm for transitive closure on acyclic digraphs. Theoretical Computer Science, 1988,58(1):325-346. [doi: 10.1016/0304-3975(88)90032-1]
    [13] Jagadish HV. A compression technique to materialize transitive closure. ACM Trans. on Database System, 1990,15(4):558-598. [doi: 10.1145/99935.99944]
    [14] Cheng J, Ke YP, Chu S, Cheng C. Efficient processing of distance queries in large graphs: a vertex cover approach. In: Proc. of the 2012 Int'l Conf. on Management of Data. Scottsdale: ACM Press, 2012. 457-468. [doi: 10.1145/2213836.2213888]
    [15] Cheng J, Zhu LH, Ke YP, Chu S. Fast algorithms for maximal clique enumeration with limited memory. In: Proc. of the 18th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining. Beijing: ACM Press, 2012. 1240-1248. [doi: 10.1145/2339530. 2339724]
    [16] Schank T, Wagner D. Finding, counting and listing all triangles in large graphs, an experimental study. In: Proc. of the 4th Int'l Workshop on Efficient and Experimental Algorithms. Santorini Island: Springer-Verlag, 2005. 606-609. [doi: 10.1007/11427186 _54]
    引证文献
    网友评论
    网友评论
    分享到微博
    发 布
引用本文

李鸣鹏,高宏,邹兆年.基于图压缩的k可达查询处理.软件学报,2014,25(4):797-812

复制
分享
文章指标
  • 点击次数:5523
  • 下载次数: 7732
  • HTML阅读次数: 2379
  • 引用次数: 0
历史
  • 收稿日期:2013-10-09
  • 最后修改日期:2013-12-18
  • 在线发布日期: 2014-03-28
文章二维码
您是第19728283位访问者
版权所有:中国科学院软件研究所 京ICP备05046678号-3
地址:北京市海淀区中关村南四街4号,邮政编码:100190
电话:010-62562563 传真:010-62562533 Email:jos@iscas.ac.cn
技术支持:北京勤云科技发展有限公司

京公网安备 11040202500063号