The ability to handle mixed numeric and categorical data is important for clustering algorithms, because real databases usually contain a mixture of numeric and categorical attributes. The use of fuzzy techniques makes clustering algorithms robust against noise and missing values in the databases. In this paper, a fuzzy k-prototypes algorithm that integrates the k-means and k-modes algorithms is presented and applied to mixed databases. Experiments on several real databases demonstrate that the fuzzy algorithm yields better results than the corresponding hard algorithm. Some properties of the fuzzy k-prototypes algorithm are also discussed.
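The abstract does not give the formulas, so the following is only a minimal sketch of how the two ingredients might be combined, not the paper's exact formulation. It assumes the commonly used k-prototypes dissimilarity (squared Euclidean distance on numeric attributes plus a weight `gamma` times the number of categorical mismatches) and a fuzzy c-means style membership update with fuzziness exponent `alpha` greater than 1. The function names, the parameters `gamma` and `alpha`, the even split among coinciding prototypes, and the toy data are all illustrative assumptions.

```python
def mixed_dissimilarity(x_num, x_cat, proto_num, proto_cat, gamma=1.0):
    """Dissimilarity between one object and one prototype (assumed form).

    Numeric part: squared Euclidean distance (as in k-means).
    Categorical part: count of mismatched attributes (as in k-modes),
    weighted by gamma.
    """
    numeric = sum((a - b) ** 2 for a, b in zip(x_num, proto_num))
    categorical = sum(a != b for a, b in zip(x_cat, proto_cat))
    return numeric + gamma * categorical


def fuzzy_memberships(distances, alpha=2.0):
    """Membership of one object in each cluster, fuzzy c-means style.

    distances: dissimilarities from the object to every prototype.
    alpha: fuzziness exponent, must be > 1.
    """
    # Assumption: an object that coincides with one or more prototypes
    # is shared evenly among them and has zero membership elsewhere.
    zeros = [i for i, d in enumerate(distances) if d == 0]
    if zeros:
        share = 1.0 / len(zeros)
        return [share if i in zeros else 0.0 for i in range(len(distances))]

    exponent = 1.0 / (alpha - 1.0)
    inverse = [(1.0 / d) ** exponent for d in distances]
    total = sum(inverse)
    return [v / total for v in inverse]


# Toy example: two prototypes, each with two numeric and one categorical attribute.
prototypes = [((0.0, 0.0), ("a",)), ((4.0, 0.0), ("b",))]
x_num, x_cat = (1.0, 0.0), ("a",)

d = [mixed_dissimilarity(x_num, x_cat, pn, pc, gamma=1.0) for pn, pc in prototypes]
u = fuzzy_memberships(d, alpha=2.0)
print(d, u)
```

In this toy case the object is closer to the first prototype on both the numeric and the categorical attributes, so it receives the larger membership there while keeping a small, nonzero membership in the second cluster. A hard algorithm would instead assign it entirely to the first cluster.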