Abstract: This paper studies an efficient and scalable classification algorithm that tightly integrates classification technology with relational database system technology. An approach based on grouping and counting is proposed to build the classifier, using the SQL (Structured Query Language) facilities provided by the relational database to implement the major computation tasks. To improve performance, several optimization strategies, a strategy for pruning redundant rules, and a feature selection method integrated with the process of finding classification rules are also proposed. With these methods and strategies, the classification algorithm can quickly find a compact set of classification rules from a large volume of data. In addition to achieving the same classification accuracy as currently popular classification algorithms and a high training speed, the algorithm's distinctive features include linear scalability with respect to the number of training samples and the number of attributes, and simplicity of implementation.
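
To make the grouping-and-counting idea concrete, the following is a minimal sketch of how class counts per attribute value might be obtained with a single SQL GROUP BY query and turned into candidate rules. It is not the algorithm proposed in the paper: the table name `train`, the columns `outlook`, `windy`, and `play`, the toy rows, and the helper functions `count_by_attribute` and `majority_rules` are all hypothetical, and the paper's optimization, pruning, and feature selection strategies are not shown. It uses Python's standard `sqlite3` module with an in-memory database.

```python
import sqlite3


def count_by_attribute(conn, table, attribute, label):
    """Return (attribute_value, class_label, count) rows via GROUP BY.

    Identifiers are interpolated directly into the query, so this
    hypothetical helper assumes trusted table and column names.
    """
    query = (
        f"SELECT {attribute}, {label}, COUNT(*) "
        f"FROM {table} GROUP BY {attribute}, {label}"
    )
    return conn.execute(query).fetchall()


def majority_rules(counts):
    """Map each attribute value to (majority class, confidence).

    Confidence is the majority class count divided by the total
    number of rows with that attribute value.
    """
    totals = {}
    best = {}
    for value, cls, n in counts:
        totals[value] = totals.get(value, 0) + n
        if value not in best or n > best[value][1]:
            best[value] = (cls, n)
    return {v: (cls, n / totals[v]) for v, (cls, n) in best.items()}


# Toy training data (illustrative only, not from the paper).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE train (outlook TEXT, windy TEXT, play TEXT)")
conn.executemany(
    "INSERT INTO train VALUES (?, ?, ?)",
    [
        ("sunny", "no", "no"),
        ("sunny", "yes", "no"),
        ("sunny", "no", "yes"),
        ("rainy", "no", "yes"),
        ("rainy", "yes", "yes"),
        ("overcast", "no", "yes"),
    ],
)

# Candidate one-attribute rules of the form: outlook = value -> class.
rules = majority_rules(count_by_attribute(conn, "train", "outlook", "play"))
for value, (cls, confidence) in sorted(rules.items()):
    print(f"outlook = {value} -> play = {cls} (confidence {confidence:.2f})")
```

In this sketch the database performs the scan and aggregation, so the client only processes one row per distinct (attribute value, class) pair rather than one row per training sample.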