[Keywords]
[Abstract]
Spectral clustering is one of the most representative methods in cluster analysis and has attracted wide attention because it makes few assumptions about the structure of the data. However, traditional spectral clustering algorithms are limited by the scalability and generalization of the spectral embedding, i.e., they cannot cope with large-scale settings or complex data distributions. To overcome these shortcomings, this study introduces a deep learning framework to improve the generalization and scalability of spectral clustering, combines it with multi-view learning to mine the diverse features of data samples, and proposes a knowledge-transfer-based deep consensus network for multi-view spectral clustering (CMvSC). First, considering the local invariance of each single view, CMvSC adopts a local learning layer to learn the view-specific embedding of each view independently. Second, because multiple views are globally consistent, CMvSC introduces a global learning layer that performs parameter sharing and feature transfer to learn the embedding shared across views. Meanwhile, given the strong influence of the affinity matrix on spectral clustering performance, CMvSC learns the neighborhood relationship between paired samples by training a Siamese network with a contrastive loss, which replaces the distance metric used in traditional spectral clustering. Finally, experimental results on four datasets demonstrate the effectiveness of CMvSC for multi-view spectral clustering.
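The idea of learning pairwise affinities with a Siamese network and a contrastive loss can be illustrated with a minimal sketch. It assumes the commonly used margin-based form of the contrastive loss; the abstract does not state the exact loss used by CMvSC, so the function names, the `margin` and `sigma` parameters, and the Gaussian affinity step are all illustrative assumptions rather than the authors' implementation.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(u, v, is_neighbor, margin=1.0):
    """Margin-based contrastive loss for one pair of Siamese embeddings.

    Assumed form (not confirmed by the source):
      neighbor pair:     d^2                  (pull embeddings together)
      non-neighbor pair: max(0, margin - d)^2 (push apart up to the margin)
    """
    d = euclidean(u, v)
    if is_neighbor:
        return d ** 2
    return max(0.0, margin - d) ** 2


def learned_affinity(u, v, sigma=1.0):
    """Hypothetical affinity entry: a Gaussian kernel on the learned distance."""
    d = euclidean(u, v)
    return math.exp(-(d ** 2) / (2 * sigma ** 2))
```

In such a setup, the affinity matrix fed to spectral clustering would be filled with `learned_affinity` values computed on the Siamese embeddings instead of distances measured on the raw features.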
[CLC number]
[Funding]
National Natural Science Foundation of China (61976247)