Abstract: Based on the antibody clonal selection theory, an immune clonal data reduction algorithm is proposed for the instance selection problem in data reduction. Markov chain theory is used to prove that the new algorithm converges with probability 1. Experiments on seven standard data sets from the UCI repository show that the proposed algorithm is effective. The best range of the weight parameter λ is determined by analyzing its effect on the algorithm's performance. Furthermore, an encoding method based on a stratified strategy is introduced to accelerate convergence on large-scale data reduction problems. Experiments on seven large-scale data sets show that the improved method is superior to the original one. Finally, the best range of the number of strata t is determined by analyzing its effect on the algorithm's performance on the Letter and DNA data sets.