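Zhang Dongyue, Zhou Lihua, Wu Xiangyun, Zhao Lihong. Data stream clustering based on grid coupling. Journal of Software, 2019, 30(3): 667-683
Data Stream Clustering Based on Grid Coupling
Received: 2018-07-20 Revised: 2018-09-20
DOI: 10.13328/j.cnki.jos.005693
Keywords: data stream; clustering analysis; grid coupling; grid structure; clustering quality
Funding: National Natural Science Foundation of China (61762090, 61262069, 61472346, 61662086); Natural Science Foundation of Yunnan Province (2016FA026, 2015FB114); Yunnan Provincial Innovative Research Team Program (2018HC019); Yunnan Provincial University Science and Technology Innovation Team Program (IRTSTYN)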
Abstract:
As more and more applications generate data streams, data stream clustering analysis has received extensive attention. Grid-based clustering maps a data stream onto a grid structure to form a data summary and then clusters the summary. This method is usually efficient, but each grid is processed independently and the interaction between grids is ignored, so clustering quality leaves room for improvement. This study no longer processes grids independently during clustering; instead, it considers the coupling relationships between grids and proposes a data stream clustering algorithm based on grid coupling. Grid coupling captures the correlation among the data more accurately and thereby improves clustering quality. Experiments on synthetic and real data streams show that the proposed algorithm achieves high clustering quality and efficiency, and illustrate its superiority over state-of-the-art approaches.
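For readers unfamiliar with the grid-based approach the abstract starts from, the following is a minimal Python sketch of the conventional baseline only: points are mapped to grid cells to form a summary, and dense cells are then grouped purely by adjacency, with each cell treated independently. It does not implement the grid-coupling measure proposed in the paper, which the abstract does not define. The function names `grid_summary` and `cluster_dense_cells` and the parameters `cell_size` and `min_count` are illustrative assumptions, not taken from the paper.

```python
from collections import Counter, deque
from itertools import product


def grid_summary(points, cell_size):
    """Map each point to its integer grid-cell coordinates and count points per cell."""
    return Counter(tuple(int(x // cell_size) for x in p) for p in points)


def cluster_dense_cells(summary, min_count):
    """Label dense cells so that touching cells (diagonals included) share a cluster.

    Assumption: a cell is "dense" when it holds at least min_count points.
    Each cell is judged on its own count, which is the independence the
    abstract identifies as a limitation of grid-based clustering.
    """
    dense = {cell for cell, n in summary.items() if n >= min_count}
    labels = {}
    next_label = 0
    for start in sorted(dense):
        if start in labels:
            continue
        labels[start] = next_label
        queue = deque([start])
        while queue:  # breadth-first search over neighbouring dense cells
            cell = queue.popleft()
            for offset in product((-1, 0, 1), repeat=len(cell)):
                neighbour = tuple(c + o for c, o in zip(cell, offset))
                if neighbour in dense and neighbour not in labels:
                    labels[neighbour] = next_label
                    queue.append(neighbour)
        next_label += 1
    return labels


points = [(0.1, 0.2), (0.3, 0.4), (1.1, 0.2), (1.5, 0.5), (5.2, 5.1), (5.3, 5.4)]
summary = grid_summary(points, cell_size=1.0)
labels = cluster_dense_cells(summary, min_count=2)
```

In this toy input, the two adjacent cells near the origin end up in one cluster and the distant cell forms its own. A coupling-aware method would replace the simple per-cell density test with a measure that also accounts for neighbouring cells.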