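Based on an in-depth study of categorical data, an efficient method for mining multiple-level association rules is proposed. First, a domain knowledge-based item correlation model (DICM) is built from the domain knowledge of the categorical data, and the items of the categorical data are hierarchically clustered using this model. Then, the transaction database is reduced and partitioned according to the item clustering results. Finally, the reduced and partitioned transaction database is mapped onto a compressed AFOPT tree structure, and frequent itemsets are mined by traversing the AFOPT tree in place of the original transaction database. Because the size of the transaction database is reduced and the compressed AFOPT structure is used, the proposed method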
This paper proposes an approach for mining multiple-level and generalized association rules. First, an item correlation model is built from domain knowledge, and the items are clustered according to their correlation. Second, the transaction database is reduced and partitioned based on the item clusters, which makes each database to be mined smaller. Finally, the partitioned transaction databases are projected onto a compact structure called the AFOPT-tree, and the frequent itemsets are mined from the AFOPT-tree. Based on this idea, the paper presents a top-down algorithm, TD-CBP-MLARM, and a bottom-up algorithm, BU-CBP-MLARM, for mining multiple-level association rules. The idea is also extended to generalized association rule mining, giving a new efficient algorithm, CBP-GARM. Experiments show that the proposed algorithms not only produce correct and complete mining results but also outperform well-known existing algorithms in mining effectiveness.
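The three steps described above (cluster items by correlation, partition the transaction database by cluster, mine each partition) can be illustrated with a minimal sketch. It is not the TD-CBP-MLARM, BU-CBP-MLARM, or CBP-GARM algorithm. Several parts are assumptions made only for illustration: the correlation scores are supplied directly as a dictionary rather than derived from a domain knowledge model, clustering is a simple threshold-based single-linkage grouping rather than the hierarchical clustering of the paper, and frequent itemsets are counted by brute-force enumeration rather than with an AFOPT-tree. All function and parameter names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cluster_items(items, correlation, threshold):
    """Group items linked by a correlation score >= threshold.

    Assumption: `correlation` maps (item_a, item_b) pairs to scores and
    stands in for the domain knowledge-based model of the paper.
    """
    parent = {item: item for item in items}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in correlation.items():
        if a in parent and b in parent and score >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for item in items:
        clusters.setdefault(find(item), set()).add(item)
    return list(clusters.values())


def partition_transactions(transactions, clusters):
    """Project every transaction onto each cluster and drop empty projections."""
    partitions = []
    for cluster in clusters:
        projected = [t & cluster for t in transactions]
        partitions.append([t for t in projected if t])
    return partitions


def frequent_itemsets(transactions, min_count):
    """Brute-force support counting (a stand-in for AFOPT-tree traversal)."""
    counts = Counter()
    for t in transactions:
        for size in range(1, len(t) + 1):
            for combo in combinations(sorted(t), size):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_count}


def mine(transactions, correlation, threshold, min_count):
    """Run the cluster -> partition -> mine pipeline over a list of item sets."""
    items = set().union(*transactions)
    clusters = cluster_items(items, correlation, threshold)
    result = {}
    for part in partition_transactions(transactions, clusters):
        result.update(frequent_itemsets(part, min_count))
    return result
```

In this sketch, itemsets that span two clusters are never generated, so the partitioning only preserves results under the assumption that weakly correlated items need not be mined together. For itemsets that lie within one cluster, the support counted on the projected partition equals the support in the original database.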