Approximate Weighted Kernel k-means for Large-Scale Spectral Clustering
Author: Jia HJ, Ding SF, Shi ZZ

Abstract:

    Spectral clustering is grounded in algebraic graph theory: it recasts clustering as a graph partitioning problem. To optimize the graph-cut objective, the properties of the Rayleigh quotient are typically exploited to map the original data points into a lower-dimensional eigenspace, obtained by computing the eigenvectors of the Laplacian matrix; clustering is then performed in this new space. However, the space complexity of storing the similarity matrix is O(n²), and the time complexity of the eigen-decomposition of the Laplacian matrix is usually O(n³). Such complexity is unacceptable for large-scale data sets. It can be proved that both normalized-cut graph clustering and weighted kernel k-means are equivalent to a matrix trace maximization problem, which implies that the weighted kernel k-means algorithm can optimize the normalized-cut objective without eigen-decomposing the Laplacian matrix. Nonetheless, weighted kernel k-means still needs to compute the kernel matrix, so its space complexity remains O(n²). To address this challenge, this study proposes an approximate weighted kernel k-means algorithm in which only part of the kernel matrix is used, in order to solve large-scale spectral clustering problems. Theoretical analysis and experimental comparison show that approximate weighted kernel k-means achieves clustering performance similar to that of weighted kernel k-means, while its time and space complexity is greatly reduced.
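    The key mechanics behind the abstract can be made concrete. Weighted kernel k-means reassigns each point to the cluster whose weighted mean is nearest in feature space, and those squared distances can be evaluated from kernel entries alone (the kernel trick); as Dhillon, Guan, and Kulis showed, with a suitably chosen kernel and degree-based weights this iteration optimizes the normalized-cut objective. The sketch below is a minimal numpy illustration, not the authors' implementation: the linear kernel, the uniform choice of landmarks, and all variable names are assumptions made for the demo. It contrasts the exact distance computation, which needs the full n×n kernel matrix, with an approximate one that touches only an n×m block C and an m×m landmark block W (so K ≈ C W⁺ Cᵀ, in the spirit of Nyström-style approximation):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def kernel_kmeans_distances(K, labels, w, k):
        """Squared feature-space distances ||phi(x_i) - m_c||^2 from every point
        to every weighted cluster mean, using kernel entries only:
          K_ii - 2 * sum_j w_j K_ij / s_c + sum_{j,l} w_j w_l K_jl / s_c^2,
        where s_c is the total weight of cluster c.  Needs the full n x n K."""
        n = K.shape[0]
        d = np.empty((n, k))
        for c in range(k):
            idx = labels == c
            wc = w[idx]
            s = wc.sum()
            cross = K[:, idx] @ wc / s
            within = wc @ K[np.ix_(idx, idx)] @ wc / s**2
            d[:, c] = np.diag(K) - 2.0 * cross + within
        return d

    def approx_distances(C, Winv, labels, w, k):
        """The same distances under K ~= C W^+ C^T, using only the n x m block C
        and the pseudo-inverse of the m x m landmark block W; the full n x n
        kernel matrix is never formed."""
        n = C.shape[0]
        diag = np.einsum('ij,jk,ik->i', C, Winv, C)   # diagonal of C W^+ C^T
        d = np.empty((n, k))
        for c in range(k):
            idx = labels == c
            wc = w[idx]
            s = wc.sum()
            u = C[idx].T @ wc        # m-vector summary of cluster c
            v = Winv @ u / s
            d[:, c] = diag - 2.0 * (C @ v) + (u @ v) / s
        return d

    # Toy setup: low-dimensional points with a linear kernel, so once the m
    # landmarks span the data, C W^+ C^T reproduces K exactly.
    n, dim, m, k = 60, 3, 6, 2
    X = rng.normal(size=(n, dim))
    w = rng.uniform(0.5, 2.0, size=n)       # positive point weights
    labels = rng.integers(0, k, size=n)     # an arbitrary current assignment
    K = X @ X.T                             # full kernel: O(n^2) storage
    Z = X[:m]                               # m uniformly sampled landmarks
    C = X @ Z.T                             # n x m block of K: O(nm) storage
    Winv = np.linalg.pinv(Z @ Z.T)

    d_exact = kernel_kmeans_distances(K, labels, w, k)
    d_approx = approx_distances(C, Winv, labels, w, k)
    max_err = np.abs(d_exact - d_approx).max()
    ```

    In this low-rank toy case the approximate distances coincide with the exact ones; in general the gap depends on how well the sampled block captures the kernel's spectrum, while storage drops from O(n²) to O(nm).
    
    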

Citation: Jia HJ, Ding SF, Shi ZZ. Approximate weighted kernel k-means for large-scale spectral clustering. Ruan Jian Xue Bao/Journal of Software, 2015,26(11):2836-2846.
History
  • Received: February 15, 2015
  • Revised: August 26, 2015
  • Online: November 4, 2015
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4