Abstract: Clustering a data stream requires both fast processing and high-quality clustering results. This paper presents several novel approaches to this task using graphics processing units (GPUs): a K-means-based method, a stream clustering method, and an evolving data stream analysis method. All of these methods exploit the strong computational and pipelining capabilities of GPUs. Unlike previous clustering methods, each of which has its own framework, the proposed methods share a single multi-function framework, which provides a uniform platform for stream clustering. The core operations in stream clustering are distance computation and comparison, both of which can be implemented using the GPU's fragment vector processing capabilities. Extensive experiments are conducted on a PC with a 3.4 GHz Pentium IV CPU and an NVIDIA GeForce 6800 GT graphics card. A comprehensive performance study demonstrates the efficiency of the proposed algorithms, showing that they are about 7 times faster than previous CPU-based algorithms. They are therefore well suited to applications involving high-speed data streams.
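To make the two core operations concrete, the following is a minimal CPU-side sketch in Python of distance computation and comparison as they appear in one K-means-style assignment and update step. It is an illustration only and not the GPU implementation described in the paper: the function names (`squared_distance`, `assign_to_nearest`, `update_centroids`), the use of squared Euclidean distance, and the toy data are all assumptions made for this example. On a GPU, the per-point inner loop would be the part evaluated in parallel.

```python
import math


def squared_distance(p, c):
    """Distance computation: squared Euclidean distance between two points."""
    return sum((pi - ci) ** 2 for pi, ci in zip(p, c))


def assign_to_nearest(points, centroids):
    """Comparison: give each point the index of its closest centroid."""
    labels = []
    for p in points:
        best_index, best_dist = -1, math.inf
        for k, c in enumerate(centroids):
            d = squared_distance(p, c)
            if d < best_dist:  # keep the smallest distance seen so far
                best_index, best_dist = k, d
        labels.append(best_index)
    return labels


def update_centroids(points, labels, centroids):
    """Recompute each centroid as the mean of the points assigned to it.

    A centroid with no assigned points is left unchanged (an assumption
    of this sketch, not a rule taken from the paper).
    """
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for p, k in zip(points, labels):
        counts[k] += 1
        for j in range(dim):
            sums[k][j] += p[j]
    return [
        tuple(s / counts[k] for s in sums[k]) if counts[k] else tuple(centroids[k])
        for k in range(len(centroids))
    ]


# Hypothetical toy data: two well-separated groups in 2-D.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids = [(0.0, 0.0), (10.0, 10.0)]

labels = assign_to_nearest(points, centroids)
new_centroids = update_centroids(points, labels, centroids)
```

Each point's assignment depends only on that point and the current centroids, so the assignments are independent of one another, which is the property that makes this step a natural fit for data-parallel hardware.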