Abstract:Traffic classification is the basis and key for optimizing network quality of service. Machine learning algorithms apply flow statistics in traffic classification, which are significant for identifying both encrypted and private traffic. However, the discriminator bias problem and the class imbalance problem are two main challenges in traffic classification. The discriminator bias problem denotes that some flow statistics can improve the accuracies for some applications but reduce the accuracies for other applications. The class imbalance problem denotes that machine learning based traffic classifier identifies the minority application with a low accuracy. To address the above two issues, traffic classification framework based on ensemble clustering (TCFEC) is proposed in this paper. TCFEC is composed of several base classifiers trained by clustering in different feature subspaces and an optimal decision component. It is able to improve accuracy in traffic classification. Specifically, compared with the traffic classifier based on traditional machine learning algorithms, TCFEC improves average flow accuracy by 5% as well as average byte accuracy by 6%.