Abstract:In the case of the imbalanced protocol flows, the changes of flow distribution have a huge impact on the accuracy and stability of traffic classifiers that use machine learning algorithms. It is very important to select a suitable machine learning algorithm to classify the imbalanced protocol flows on line. By means of single-factor experiment design, this paper verifies that it is possible for C4.5 decision tree, Na?ve Bayes with kernel density estimation (NBK) and support vector machine (SVM) to classify traffic with the first four packets of the TCP connection. After comparing the performances of the three classifiers abovementioned, the study finds that the testing time of C4.5 decision tree is the shortest and SVM is the most stable. Finally, Bagging algorithm is applied to classify traffic. The experimental results show that, the stability of Bagging is similar to SVM and the testing time and modeling time of Bagging is close to C4.5 decision tree. Therefore, Bagging classifier is the most suitable to classify traffic on line.