Fast Mining of Global Maximum Frequent Itemsets



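Authors:
Lu Jieping (陆介平), Department of Computer Science and Engineering, Southeast University, Nanjing, Jiangsu 210096
Yang Ming (杨明), Department of Computer Science and Engineering, Southeast University, Nanjing, Jiangsu 210096
Sun Zhihui (孙志挥), Department of Computer Science and Engineering, Southeast University, Nanjing, Jiangsu 210096
Ju Shiguang (鞠时光), School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013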

                    
Funding: Supported by the National Natural Science Foundation of China under Grant No. 70371015 and the Natural Science Foundation of Jiangsu Province under Grant No. BK2004058.

Fast Mining of Global Maximum Frequent Itemsets



Abstract:

Mining maximum frequent itemsets is a key problem in many data mining applications. Most existing algorithms for mining maximum frequent itemsets target a single-machine environment with a local database, and little work has addressed mining global maximum frequent itemsets in distributed databases. Applying a single-machine algorithm in a distributed environment, or using an algorithm designed for global frequent itemsets to mine the global maximum frequent itemsets, generates a large number of candidate itemsets and incurs high network communication overhead. This paper therefore proposes FMGMFI (fast mining global maximum frequent itemsets), an algorithm that stores data in FP-trees (frequent pattern trees), so that the global frequency of any itemset can be obtained conveniently from the corresponding paths of the local FP-trees. It also combines top-down and bottom-up search in a bidirectional strategy, which effectively reduces communication overhead. Experimental results show that FMGMFI is effective and efficient.

Key words: distributed database; data mining; frequent pattern tree; global maximum frequent itemset
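
The idea stated in the abstract, that an itemset's frequency can be read from the paths of each local FP-tree and then combined across sites, can be illustrated with a small sketch. This is not the FMGMFI algorithm itself. The names FPNode, FPTree, global_support and maximal_frequent, the item order shared by all sites, and the toy transactions are assumptions made for illustration. The brute-force enumeration at the end only shows what "maximum frequent itemset" means; it stands in for the paper's bidirectional search, whose details the abstract does not give.

```python
from itertools import combinations


class FPNode:
    """One node of an FP-tree: an item, a count, and links up and down."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


class FPTree:
    """Local FP-tree built with an item order shared by every site (assumed)."""

    def __init__(self, transactions, order):
        self.rank = {item: i for i, item in enumerate(order)}
        self.root = FPNode(None)
        self.links = {}  # item -> all nodes that carry this item
        for transaction in transactions:
            known = {i for i in transaction if i in self.rank}
            self._insert(sorted(known, key=self.rank.get))

    def _insert(self, items):
        node = self.root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                self.links.setdefault(item, []).append(child)
            child.count += 1
            node = child

    def support(self, itemset):
        """Number of local transactions that contain every item in itemset."""
        items = set(itemset)
        if not items:
            raise ValueError("itemset must be non-empty")
        if any(i not in self.rank for i in items):
            return 0
        last = max(items, key=self.rank.get)  # deepest item on any path
        rest = items - {last}
        total = 0
        for node in self.links.get(last, []):
            seen = set()
            ancestor = node.parent
            while ancestor.item is not None:  # walk the prefix path to the root
                seen.add(ancestor.item)
                ancestor = ancestor.parent
            if rest <= seen:
                total += node.count
        return total


def global_support(trees, itemset):
    """Global frequency: the sum of the counts reported by each local tree."""
    return sum(tree.support(itemset) for tree in trees)


def maximal_frequent(trees, items, min_count):
    """Brute-force reference: frequent itemsets with no frequent superset."""
    frequent = [
        frozenset(c)
        for size in range(1, len(items) + 1)
        for c in combinations(items, size)
        if global_support(trees, c) >= min_count
    ]
    return [s for s in frequent if not any(s < t for t in frequent)]


# Toy data: two hypothetical sites, each holding its own transactions.
order = ["a", "b", "c", "d"]
site_a = [["a", "b", "c"], ["a", "b"], ["a", "c"]]
site_b = [["a", "b", "c"], ["b", "c"], ["a", "b", "c", "d"]]
trees = [FPTree(site_a, order), FPTree(site_b, order)]

print(global_support(trees, {"a", "b"}))  # 4
print([sorted(s) for s in maximal_frequent(trees, order, 4)])
# [['a', 'b'], ['a', 'c'], ['b', 'c']]
```

In this sketch only per-itemset counts would need to travel between sites, never the transactions. The enumeration of all subsets is exponential and is exactly the candidate blow-up the abstract says a dedicated search strategy is meant to avoid.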

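Cite this article

Lu Jieping, Yang Ming, Sun Zhihui, Ju Shiguang. Fast mining of global maximum frequent itemsets. Journal of Software (软件学报), 2005, 16(4): 553-560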


History

Received: 2004-06-03
Last revised: 2004-07-02
