    Abstract:

    Mining maximum frequent itemsets is a key problem in data mining, with numerous important applications. Existing algorithms for mining maximum frequent itemsets are designed for local databases, and very little work has been done on distributed databases. Applying these existing algorithms, or the algorithms proposed for mining global frequent itemsets, to a distributed database generates a large number of candidate itemsets and incurs a large communication overhead. This paper therefore proposes FMGMFI, an algorithm for fast mining of global maximum frequent itemsets. By using frequent pattern trees (FP-trees), FMGMFI can conveniently obtain the global frequency of any itemset from the corresponding paths of every local FP-tree, and by combining bottom-up and top-down search strategies it requires far less communication overhead. Experimental results show that FMGMFI is effective and efficient.
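
    The idea of reading an itemset's global frequency off the paths of local FP-trees can be illustrated with a minimal Python sketch. It is not the FMGMFI algorithm itself: the bottom-up and top-down search and the communication scheme described in the paper are not reproduced here. The class and function names (`FPTree`, `support`, `global_support`), the sample transactions, and the assumption that every site inserts items in the same shared order are all hypothetical choices made for illustration.

```python
from collections import defaultdict


class Node:
    """One FP-tree node: an item, a count, and a link to its parent."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


class FPTree:
    """A local FP-tree built with an item order shared by all sites (assumption)."""

    def __init__(self, transactions, order):
        self.rank = {item: i for i, item in enumerate(order)}
        self.root = Node(None, None)
        self.header = defaultdict(list)  # item -> all nodes holding that item
        for t in transactions:
            items = [i for i in set(t) if i in self.rank]
            self._insert(sorted(items, key=self.rank.get))

    def _insert(self, items):
        node = self.root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                self.header[item].append(child)
            child.count += 1
            node = child

    def support(self, itemset):
        """Count local transactions containing every item of `itemset`."""
        items = set(itemset)
        if not items:
            raise ValueError("itemset must not be empty")
        if any(i not in self.rank for i in items):
            return 0
        # The item ranked last sits deepest on any path that contains the itemset.
        last = max(items, key=self.rank.get)
        needed = items - {last}
        total = 0
        for node in self.header.get(last, []):
            seen = set()
            p = node.parent
            while p is not None and p.item is not None:
                seen.add(p.item)
                p = p.parent
            if needed <= seen:  # the path to this node covers the whole itemset
                total += node.count
        return total


def global_support(trees, itemset):
    """Global frequency = sum of the local frequencies at each site."""
    return sum(tree.support(itemset) for tree in trees)


# Hypothetical data: two sites, one shared item order.
order = ["c", "a", "b"]
site1 = FPTree([["a", "b", "c"], ["a", "b"], ["b", "c"]], order)
site2 = FPTree([["a", "c"], ["a", "b", "c"], ["c"]], order)
sites = [site1, site2]
```

    With this data, `global_support(sites, {"a", "b"})` sums a local count of 2 at the first site and 1 at the second. In a real distributed setting only such counts, rather than the transactions themselves, would need to cross the network, which is the kind of saving the abstract refers to.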

Citation

Lu JP, Yang M, Sun ZH, Ju SG. Fast mining of global maximum frequent itemsets. Journal of Software, 2005,16(4):553-560 (in Chinese with English abstract).

History
  • Received: June 3, 2004
  • Revised: July 2, 2004