Survey on Density Peaks Clustering Algorithm

DOI: 10.13328/j.cnki.jos.006122

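Authors and affiliations:
Xu Xiao (徐晓): School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116
Ding Shifei (丁世飞): School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116; Mine Digitization Engineering Research Center of Ministry of Education, Xuzhou 221116. E-mail: dingsf@cumt.edu.cn
Ding Ling (丁玲): School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116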

The density peaks clustering (DPC) algorithm is an emerging density-based method in clustering analysis. It uses the local density and relative distance of each point to draw a decision graph, from which cluster centers are identified quickly and the clustering is completed. DPC has a single input parameter, requires no prior knowledge, and needs no iteration. Since it was proposed in 2014, DPC has attracted great interest and developed rapidly. This survey first presents the basic theory of DPC and analyzes its characteristics by comparison with classical clustering algorithms. Second, it analyzes the shortcomings of DPC and the corresponding optimization methods from the perspectives of clustering accuracy and computational complexity, covering local density optimization, allocation strategy optimization, multi-density peaks optimization, and computational complexity optimization, and it introduces the main representative algorithms in each category. Finally, it reviews applications of DPC in different fields. The survey offers a comprehensive theoretical analysis of the advantages and disadvantages of DPC and a comprehensive account of its improvements and applications. It also attempts to identify further challenges to promote research on DPC.
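To make the roles of local density, relative distance, and the decision graph concrete, the following is a minimal sketch of the basic DPC procedure. It is not taken from this survey. It assumes a Gaussian-kernel density, with the cutoff distance `d_c` playing the part of the single input parameter. It also assumes that centers are picked as the `n_centers` points with the largest product of density and relative distance, which stands in for the visual selection normally made on the decision graph. The function name `density_peaks` and all variable names are hypothetical.

```python
import math


def density_peaks(points, d_c, n_centers):
    """Illustrative DPC sketch: returns (labels, centers, rho, delta)."""
    n = len(points)
    # Pairwise Euclidean distances, O(n^2) time and memory.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density rho_i (assumed Gaussian kernel with cutoff distance d_c).
    rho = [
        sum(math.exp(-((dist[i][j] / d_c) ** 2)) for j in range(n) if j != i)
        for i in range(n)
    ]

    # Point indices sorted by decreasing density.
    order = sorted(range(n), key=lambda i: -rho[i])

    # Relative distance delta_i: distance to the nearest denser point.
    # The densest point takes its largest distance to any other point.
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # A decision graph plots (rho_i, delta_i); here the centers are taken as
    # the points with the largest rho * delta (an assumption of this sketch).
    gamma = [rho[i] * delta[i] for i in range(n)]
    gamma[order[0]] = math.inf  # the global density maximum is always a center
    centers = sorted(range(n), key=lambda i: -gamma[i])[:n_centers]

    # Single pass, no iteration: in decreasing density order, each remaining
    # point inherits the label of its nearest denser neighbor.
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers, rho, delta
```

The full distance matrix makes this sketch quadratic in the number of points, and the one-pass assignment lets a single misassigned dense point propagate its label to everything below it. These are the kinds of complexity and allocation-strategy issues the abstract lists as optimization targets.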