Abstract: This paper proposes MCluStream, a density-based algorithm for mining cluster structure, to address two problems in evolving data streams: input parameter selection and overlapping cluster identification. First, a tree-topology index named CR-Tree is designed, which maps each pair of directly core-reachable data points to a parent-child node relationship. By recording these relationships among points, the CR-Tree represents the cluster structure under a series of subEps settings. Second, MCluStream updates the cluster structure on the CR-Tree online in a sliding-window environment, which effectively maintains clusters over massive evolving data streams. Third, a fast method is implemented for extracting the cluster structure from the CR-Tree, so that users can easily select reasonable clustering results from the visualized cluster structure. Finally, experimental evaluations on massive-scale real and synthetic data demonstrate that the proposed algorithm produces effective mining results and achieves better performance than state-of-the-art methods. MCluStream is well suited to self-adaptive density-based clustering over high-volume data streams.
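
As a rough illustration of the idea of recording reachability between points as parent-child links, the following minimal sketch may help. It is not the CR-Tree construction itself, which the abstract does not specify. It assumes the standard DBSCAN definitions (a core point has at least `min_pts` points within distance `eps`, counting itself) and reads "directly core-reachable" as lying within `eps` of a core point. The function name `build_parent_map` and the rule of choosing the earliest-arriving core point as the parent are hypothetical choices made only for this example.

```python
from math import dist


def build_parent_map(points, eps, min_pts):
    """Return {child_index: parent_index} linking points to core points.

    Illustrative assumption only: a point's parent is the earliest-arriving
    core point within distance eps of it. Points without such a core point
    have no entry and act as roots.
    """
    n = len(points)
    # Eps-neighborhood of every point; a point counts as its own neighbor.
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    # Standard DBSCAN core-point test.
    is_core = [len(nb) >= min_pts for nb in neighbors]

    parent = {}
    for child in range(n):
        # Neighbors are listed in arrival order, so the first match is
        # the earliest-arriving core point that reaches this child.
        for cand in neighbors[child]:
            if cand < child and is_core[cand]:
                parent[child] = cand
                break
    return parent


# Three points on a line spaced 0.5 apart, plus one distant outlier.
points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 5.0)]
parent = build_parent_map(points, eps=0.6, min_pts=2)
print(parent)  # {1: 0, 2: 1}
```

In this toy input the first three points form a chain 0 → 1 → 2, and the outlier stays unattached. The sketch recomputes all neighborhoods from scratch, so it does not reflect the online sliding-window update or the subEps settings described in the abstract.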