Cache-Conscious Computation of Closed Iceberg Cubes

微信服务号

微信订阅号

2025-5-13- 3

Home > Archive>Volume 21, Issue 4, 2010 >620-631

Cache-Conscious Computation of Closed Iceberg Cubes
DOI:
                        
                    
Author:
                        LUAN HuaLUAN Hua

Find this author on CNKI
Find this author on BaiDu
Search for this author on this site
DU Xiao-YongDU Xiao-Yong

Find this author on CNKI
Find this author on BaiDu
Search for this author on this site
WANG ShanWANG Shan

Find this author on CNKI
Find this author on BaiDu
Search for this author on this site

                    
Affiliation:
Clc Number:
Fund Project:

Article

Figures

Metrics

Reference [26]

Related [20]

Cited by [1]

Materials

Comments

Abstract:

The computation of data cubes usually produces huge outputs. There are two popular methods to solve this problem: Iceberg cube and closed cube, which can be combined together. Due to the importance and usability of closed iceberg cube, how to efficiently compute it becomes a key research issue. A cache-conscious computation method is proposed in this paper. The data are aggregated in a bottom-up manner. In the meantime, the closed cells covering the aggregate cells are discovered and output. Two pruning strategies are used to save unnecessary recursive calls. The Apriori pruning is utilized to support iceberg cube computation. To reduce the number of memory-related stalls and produce the aggregate results efficiently, multiple dimensions are pre-sorted and the software prefetching technology is introduced into data scans. A comprehensive and detailed performance study is conducted on both synthetic data and real data sets. The results show that the proposed closed iceberg cube computation method is efficient and effective.

Key words:OLAP (on-line analytical processing); closed iceberg cube; cache-conscious; memory-related stalls

Reference

[1] Gray J, Chaudhuri S, Bosworth A, Layman A, Reichart D, Venkatrao M, Pellow F, Pirahesh H. Data cube: A relational aggregation operator generalizing group-by, cross-tab, and sub-totals. Data Mining and Knowledge Discovery, 1997,1(1):29-53. [doi: 10.1023/A:1009726021843]

[2] Zhao Y, Deshpande PM, Naughton JF. An array-based algorithm for simultaneous multidimensional aggregates. In: Peckham J, ed. Proc. of the ACM SIGMOD Int’l Conf. on Management of Data. New York: ACM Press, 1997. 159-170.

[3] Beyer K, Ramakrishnan R. Bottom-Up computation of sparse and iceberg CUBEs. In: Delis A, Faloutsos C, Ghandeharizadeh S, eds. Proc. of the ACM SIGMOD Int’l Conf. on Management of Data. New York: ACM Press, 1999. 359-370.

[4] Xin D, Han JW, Li XL, Wah BW. Star-Cubing: Computing iceberg cubes by top-down and bottom-up integration. In: Freytag JC, Lockemann PC, Abiteboul S, Carey MJ, Selinger PG, Heuer A, eds. Proc. of the 29th Int’l Conf. on Very Large Data Bases. San Francisco: Morgan Kaufmann Publishers, 2003. 476-487.

[5] Shao Z, Han JW, Xin D. MM-Cubing: Computing iceberg cubes by factorizing the lattice space. In: Hatzopoulos M, Manolopoulos Y, eds. Proc. of the 16th Int’l Conf. on Scientific and Statistical Database Management. Washington: IEEE Computer Society, 2004. 213-222.

[6] Hurtado CA, Mendelzon AO, Vaisman AA. Maintaining data cubes under dimension updates. In: Kitsuregawa M, Maciaszak L, Papazoglou M, Pu C, eds. Proc. of the 15th Int’l Conf. on Data Engineering. Washington: IEEE Computer Society, 1999. 346-355.

[7] Lee KY, Kim MH. Efficient incremental maintenance of data cubes. In: Dayal U, Whang KY, Lomet DB, Alonso G, Lohman GM, Kersten ML, Cha SK, Kim YK, eds. Proc. of the 32nd Int’l Conf. on Very Large Data Bases. New York: ACM Press, 2006. 823-833.

[8] Barbara D, Sullivan M. Quasi-Cubes: Exploiting approximations in multidimensional databases. SIGMOD Record, 1997,26(3): 12-17. [doi: 10.1145/262762.262764]

[9] Shanmugasundaram J, Fayyad U, Bradley PS. Compressed data cubes for OLAP aggregate query approximation on continuous dimensions. In: Proc. of the ACM SIGKDD Int’l Conf. on Knowledge Discovery and Data Mining. New York: ACM Press, 1999. 223-232. http://www.informatik.uni-trier.de/～ley/db/conf/kdd/kdd99.html

[10] Wang W, Feng JL, Lu HJ, Yu JX. Condensed cube: An effective approach to reducing data cube size. In: Agrawal R, Dittrich K, Ngu AH, eds. Proc. of the 18th Int’l Conf. on Data Engineering. Washington: IEEE Computer Society, 2002. 155-165.

[11] Sismanis Y, Deligiannakis A, Roussopoulos N, Kotidis Y. Dwarf: Shrinking the PetaCube. In: Franklin MJ, Moon B, Ailamaki A, eds. Proc. of the ACM SIGMOD Int’l Conf. on Management of Data. New York: ACM Press, 2002. 464-475.

[12] Lakshmanan LVS, Pei J, Han JW. Quotient cube: How to summarize the semantics of a data cube. In: Bernstein PA, Ioannidis Y, Ramakrishnan R, eds. Proc. of the 28th Int’l Conf. on Very Large Data Bases. San Francisco: Morgan Kaufmann Publishers, 2002. 778-789.

[13] Lakshmanan LVS, Pei J, Zhao Y. QC-Trees: An efficient summary structure for semantic OLAP. In: Halevy AY, Ives ZG, Doan AH, eds. Proc. of the ACM SIGMOD Int’l Conf. on Management of Data. New York: ACM Press, 2003. 64-75.

[14] Xin D, Shao Z, Han JW, Liu HY. C-Cubing: Efficient computation of closed cubes by aggregation-based checking. In: Liu L, Reuter A, Whang KY, Zhang JJ, eds. Proc. of the 22nd Int’l Conf. on Data Engineering. Washington: IEEE Computer Society, 2006.

[15] Balmin A, Papadimitriou T, Papakonstantinou Y. Hypothetical queries in an OLAP environment. In: Abbadi AE, Brodie ML, Chakravarthy S, Dayal U, Kamel N, Schlageter G, Whang KY, eds. Proc. of the 26th Int’l Conf. on Very Large Data Bases. San Francisco: Morgan Kaufmann Publishers, 2000. 220-231.

[16] Lakshmanan LVS, Russakovshy A, Sashikanth V. What-If OLAP queries with changing dimensions. In: Proc. of the 24th Int’l Conf. on Data Engineering. 2008. 1334-1336. http://www.informatik.uni-trier.de/～ley/db/conf/icde/icde2008.html

[17] Hennessy JL, Patterson DA. Computer Architecture: A Quantitative Approach. 3rd ed., San Francisco: Morgan Kaufmann Publishers, 2002.

[18] Boncz PA, Manegold S, Kersten ML. Database architecture optimized for the new bottleneck: memory access. In: Atkinson MP, Orlowska ME, Valduriez P, Zdonik SB, Brodie ML, eds. Proc. of the 25th Int’l Conf. on Very Large Data Bases. San Francisco: Morgan Kaufmann Publishers, 1999. 54-65.

[19] Rao J, Ross KA. Making B+-Trees cache conscious in main memory. In: Chen WD, Naughton JF, Bernstein PA, eds. Proc. of the ACM SIGMOD Int’l Conf. on Management of Data. New York: ACM Press, 2000. 475-486.

[20] Chen S, Gibbons PB, Mowry TC. Improving index performance through prefetching. In: Aref WG, ed. Proc. of the ACM SIGMOD Int’l Conf. on Management of Data. New York: ACM Press, 2001. 235-246.

[21] Chen S, Ailamaki A, Gibbons P, Mowry T. Improving hash join performance through prefetching. In: Proc. of the 20th Int’l Conf. on Data Engineering. Washington: IEEE Computer Society, 2004. 116-127. http://www.informatik.uni-trier.de/～ley/db/conf/icde/ icde2004.html

[22] Ailamaki A, DeWitt DJ, Hill MD, Wood DA. DBMSs on a modern processor: Where does time go? In: Atkinson MP, Orlowska ME, Valduriez P, Zdonik SB, Brodie ML, eds. Proc. of the 25th Int’l Conf. on Very Large Data Bases. San Francisco: Morgan Kaufmann Publishers, 1999. 266-277.

[23] IA-32 Intel architecture optimization reference manual. 2005. http://srl.cs.jhu.edu/manuals/24896612.pdf

[24] Hahn C, Warren S. Extended edited synoptic cloud reports from ships and land stations over the globe. 1952～1996. 1999. http://cdiac.esd.ornl.gov/ftp/ndp026c/ndp026c.txt

[25] Liu DW, Luan H, Wang S, Qin B. Main memory database TPC-H workload characterization on modern processor. Journal of Software, 2008,19(10):2573-2584 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/19/2573.htm [doi: 10.3724/SP.J.1001.2008.02573]

附中文参考文献: [25] 刘大为,栾华,王珊,覃飙.内存数据库在TPC-H负载下的处理器性能.软件学报,2008,19(10):2573-2584. http://www.jos.org.cn/ 1000-9825/19/2573.htm [doi: 10.3724/SP.J.1001.2008.02573]

Get Citation

栾华,杜小勇,王珊.缓存敏感的封闭冰山立方体计算.软件学报,2010,21(4):620-631

Copy

Article Metrics

Abstract:5075
PDF: 11225
HTML: 0
Cited by: 0

History

Received:May 14,2008
Revised:October 09,2008
Adopted:
Online:
Published:

You are the first2044080Visitors
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address：4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code：100190
Phone：010-62562563 Fax：010-62562533 Email：jos@iscas.ac.cn
Technical Support：Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063

微信服务号

微信订阅号

Get Citation

Share

微信扫一扫：分享

Article Metrics

History