Abstract: Many classical clustering algorithms, such as Average-link, K-means, K-medoids, CLARA, and CLARANS, represent each cluster by a single center and are therefore only suited to discovering convex clusters. Other methods, e.g., CURE and DBSCAN, use more than one point to represent a cluster and can find some well-separated clusters of arbitrary shape. However, they consider only the original scale of the input data and thus cannot separate overlapping or noisy clusters. To address this, this paper proposes a multilevel core-set based agglomerative clustering algorithm (MulCA). The idea of MulCA is to describe the clustering structure by a multilevel core set. Clustering proceeds through a procedure in which the top-level core set automatically becomes the underlying data set. In addition, by introducing a random sampling based ε-core set (RBC), MulCA can be applied to large-scale data sets. Extensive numerical experiments validate MulCA.