Abstract: Binarization is the most popular discretization method in decision tree generation, but for domains with many continuous attributes it tends to produce a large, hard-to-interpret tree that cannot readily be expressed as knowledge. To obtain a more intelligible decision tree, this paper presents RCAT, a new discretization algorithm for continuous attributes in the generation of binary classification trees. RCAT maps each continuous attribute to a probability attribute derived from statistical information, so that simple binarization of the mapped attribute solves the multi-way splitting problem. Two pruning methods are introduced to simplify the constructed tree. Empirical results on several domains show that, for two-class problems with a preponderance of continuous attributes, RCAT efficiently generates a much smaller and more intelligible decision tree than binarization while retaining predictive accuracy.
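The idea of mapping a continuous attribute to a probability attribute can be illustrated with a minimal sketch. This is not the RCAT procedure itself, which the abstract does not specify. It assumes equal-width bins, an empirical estimate of the positive-class probability per bin, and a fixed threshold of 0.5. The function names `fit_probability_map`, `to_probability`, and `binary_split` are hypothetical.

```python
from bisect import bisect_right


def fit_probability_map(values, labels, n_bins=4):
    """Estimate P(class = 1 | bin) over equal-width bins (an assumed scheme)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    edges = [lo + width * i for i in range(1, n_bins)]  # interior bin edges
    pos = [0] * n_bins
    tot = [0] * n_bins
    for v, y in zip(values, labels):
        b = bisect_right(edges, v)  # index of the bin containing v
        tot[b] += 1
        pos[b] += y
    prior = sum(labels) / len(labels)  # used for empty bins
    probs = [pos[b] / tot[b] if tot[b] else prior for b in range(n_bins)]
    return edges, probs


def to_probability(value, edges, probs):
    """Map a raw attribute value to its estimated class probability."""
    return probs[bisect_right(edges, value)]


def binary_split(value, edges, probs, threshold=0.5):
    """A single binary test on the mapped (probability) attribute."""
    return to_probability(value, edges, probs) >= threshold


# Toy two-class data: the positive class sits at both ends of the range.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
labels = [1, 1, 0, 0, 0, 0, 1, 1]
edges, probs = fit_probability_map(values, labels, n_bins=4)
```

In this toy data the positive class occupies both ends of the attribute range, so a single threshold on the raw value cannot separate the classes. One binary test on the mapped probability does, which is the sense in which binarization of the mapped attribute can stand in for a multi-way split.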