Abstract: This paper studies a popular statistics-based training and tagging method for Chinese texts and analyzes the nonlinear relation between the training set and tagging accuracy in terms of the structure and numerical values of the transition probability matrix and the symbol probability matrix. To make fuller use of the training corpus and achieve higher tagging accuracy, the method is improved in two respects: exploiting other grammatical attributes of words, and strengthening the processing of unknown words. With the improved method, the open test and the closed test yield overall accuracies of about 96.5% and 96%, respectively.
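
To make the two matrices concrete, the following is a minimal sketch of how transition and symbol probabilities might be estimated from a tagged corpus and used for tagging. It assumes a first-order hidden Markov model decoded with the Viterbi algorithm and add-one smoothing, none of which the abstract states explicitly. The toy corpus, the tag names (`r`, `v`, `ns`), and the function names `train` and `viterbi` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

START = "<s>"  # hypothetical sentence-start pseudo-tag


def train(tagged_sentences, smoothing=1.0):
    """Count tag-to-tag and tag-to-word events; return smoothed log-probabilities."""
    tags = sorted({t for sent in tagged_sentences for _, t in sent})
    vocab = {w for sent in tagged_sentences for w, _ in sent}
    trans = defaultdict(Counter)  # trans[prev_tag][tag]: transition counts
    emit = defaultdict(Counter)   # emit[tag][word]: symbol (emission) counts
    for sent in tagged_sentences:
        prev = START
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            prev = tag

    def log_trans(prev, tag):
        total = sum(trans[prev].values())
        return math.log((trans[prev][tag] + smoothing) / (total + smoothing * len(tags)))

    def log_emit(tag, word):
        total = sum(emit[tag].values())
        # The extra +1 in the denominator reserves probability mass for unknown words.
        return math.log((emit[tag][word] + smoothing) / (total + smoothing * (len(vocab) + 1)))

    return tags, log_trans, log_emit


def viterbi(words, tags, log_trans, log_emit):
    """Return the most probable tag sequence for words under the model."""
    if not words:
        return []
    score = {t: log_trans(START, t) + log_emit(t, words[0]) for t in tags}
    back = []  # one dict of back-pointers per position after the first
    for word in words[1:]:
        new_score, pointers = {}, {}
        for t in tags:
            best_prev = max(tags, key=lambda p: score[p] + log_trans(p, t))
            new_score[t] = score[best_prev] + log_trans(best_prev, t) + log_emit(t, word)
            pointers[t] = best_prev
        back.append(pointers)
        score = new_score
    path = [max(tags, key=lambda t: score[t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy corpus in romanized form (illustrative only, not data from the paper).
corpus = [
    [("wo", "r"), ("ai", "v"), ("Beijing", "ns")],
    [("ta", "r"), ("ai", "v"), ("Shanghai", "ns")],
    [("wo", "r"), ("qu", "v"), ("Beijing", "ns")],
]
tags, log_trans, log_emit = train(corpus)
print(viterbi(["ta", "qu", "Shanghai"], tags, log_trans, log_emit))
```

In this sketch an unknown word receives the same smoothed symbol probability under every tag, so its tag is decided by the transition probabilities alone. That is only the simplest possible baseline and should not be read as the improved unknown-word processing the abstract refers to.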