Most existing semi-supervised clustering algorithms with pairwise constraints neither resolve violations of pairwise constraints effectively nor handle high-dimensional data. This paper presents a discriminative semi-supervised clustering analysis algorithm with pairwise constraints, called DSCA, which effectively exploits supervised information to integrate dimensionality reduction and clustering. The proposed algorithm projects the data onto a low-dimensional manifold, where a pairwise-constraint-based K-means algorithm simultaneously clusters the data. This K-means variant reduces the computational complexity of constraint-based semi-supervised algorithms and resolves the constraint-violation problem found in existing semi-supervised clustering algorithms. Experimental results on real-world datasets demonstrate that the proposed algorithm handles high-dimensional data effectively and delivers appealing clustering performance compared with state-of-the-art semi-supervised algorithms.
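The abstract does not give pseudocode for the pairwise-constraint-based K-means step, so the following is only a rough illustration of the general idea it alludes to: a COP-KMeans-style assignment that places each point in the nearest centroid's cluster unless doing so would break a must-link or cannot-link constraint. All function names, the fallback rule for infeasible points, and the fixed iteration count are assumptions for this sketch, not details of DSCA itself.

```python
import math
import random

def violates(i, c, labels, must_link, cannot_link):
    """Check whether assigning point i to cluster c breaks a constraint:
    a must-link pair must share a cluster; a cannot-link pair must not."""
    for a, b in must_link:
        j = b if a == i else a if b == i else None
        if j is not None and labels[j] not in (-1, c):
            return True
    for a, b in cannot_link:
        j = b if a == i else a if b == i else None
        if j is not None and labels[j] == c:
            return True
    return False

def constrained_kmeans(X, k, must_link, cannot_link, n_iter=20, seed=0):
    """COP-KMeans-style sketch: K-means whose assignment step tries
    centroids from nearest to farthest and skips constraint violations."""
    rng = random.Random(seed)
    centroids = [list(X[i]) for i in rng.sample(range(len(X)), k)]
    labels = [-1] * len(X)
    for _ in range(n_iter):
        labels = [-1] * len(X)  # reassign every point each pass
        for i, x in enumerate(X):
            order = sorted(range(k), key=lambda c: math.dist(x, centroids[c]))
            for c in order:
                if not violates(i, c, labels, must_link, cannot_link):
                    labels[i] = c
                    break
            if labels[i] == -1:
                labels[i] = order[0]  # fallback; COP-KMeans reports failure here
        for c in range(k):  # standard centroid update over assigned points
            pts = [X[i] for i in range(len(X)) if labels[i] == c]
            if pts:
                centroids[c] = [sum(col) / len(pts) for col in zip(*pts)]
    return labels, centroids
```

In DSCA this clustering step would run on the low-dimensional projection of the data rather than on the raw features, which is what couples the dimensionality reduction with the constrained clustering.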