Traffic Classification Framework Based on Ensemble Clustering

Fund Project:

National Natural Science Foundation of China (61303061, 61402485); Open Fund of the State Key Laboratory of High Performance Computing (HPCL) (201513-01)

    Abstract:

    Traffic classification is fundamental to optimizing network quality of service. Machine learning algorithms classify traffic using flow statistical features, which makes them valuable for identifying encrypted and proprietary-protocol traffic. However, machine learning based traffic classification faces two major challenges: discriminator bias and class imbalance. Discriminator bias means that some flow statistical features raise the identification accuracy for some applications while lowering it for others. Class imbalance means that a machine learning based traffic classifier identifies applications with few samples at low accuracy. To address these two problems, this paper proposes a traffic classification framework based on ensemble clustering (TCFEC). TCFEC consists of several base classifiers, each trained by clustering in a different feature subspace, together with an optimal decision component, and it improves the accuracy of traffic classification. Specifically, compared with traditional machine learning based traffic classifiers, TCFEC improves average flow accuracy by up to 5% and average byte accuracy by up to 6%.
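    The structure described in the abstract (several clustering-based base classifiers, each working in its own feature subspace, combined by a decision component) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the k-means clustering with farthest-first initialization, the majority-label mapping from clusters to applications, the toy flow features, and in particular the plain majority vote, which only stands in for the paper's optimal decision component. Flow accuracy is computed as the fraction of correctly classified flows and byte accuracy as the fraction of correctly classified bytes.

```python
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(points, k):
    """Deterministic initialization: start at the first point, then keep
    adding the point farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids

def kmeans(points, k, iters=20):
    """Plain k-means (an assumed clustering algorithm for this sketch)."""
    centroids = farthest_first(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            groups[idx].append(p)
        for c, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

class SubspaceClusterClassifier:
    """Hypothetical base classifier: cluster in a feature subspace, then
    label each cluster with the majority application of its training flows."""

    def __init__(self, dims, k):
        self.dims, self.k = dims, k

    def _project(self, flow):
        return [flow[d] for d in self.dims]

    def _nearest(self, point):
        return min(range(len(self.centroids)),
                   key=lambda c: sq_dist(point, self.centroids[c]))

    def fit(self, flows, labels):
        pts = [self._project(f) for f in flows]
        self.centroids = kmeans(pts, self.k)
        votes = [Counter() for _ in self.centroids]
        for p, y in zip(pts, labels):
            votes[self._nearest(p)][y] += 1
        fallback = Counter(labels).most_common(1)[0][0]
        self.cluster_label = [v.most_common(1)[0][0] if v else fallback
                              for v in votes]
        return self

    def predict(self, flow):
        return self.cluster_label[self._nearest(self._project(flow))]

def ensemble_predict(classifiers, flow):
    """Majority vote: an assumed stand-in for the optimal decision component."""
    return Counter(c.predict(flow) for c in classifiers).most_common(1)[0][0]

def flow_accuracy(pred, truth):
    """Fraction of flows classified correctly."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

def byte_accuracy(pred, truth, sizes):
    """Fraction of bytes that belong to correctly classified flows."""
    return sum(s for p, t, s in zip(pred, truth, sizes) if p == t) / sum(sizes)

# Toy flows: [mean packet size, mean inter-arrival time, duration] (made up).
flows = [[100, 1.0, 5], [110, 1.2, 6], [90, 0.9, 4],
         [1000, 9.0, 50], [1100, 9.5, 55], [950, 8.8, 48]]
labels = ["web", "web", "web", "bulk", "bulk", "bulk"]

# One base classifier per feature subspace.
ensemble = [SubspaceClusterClassifier(dims, k=2).fit(flows, labels)
            for dims in ([0, 1], [1, 2], [0, 2])]
print(ensemble_predict(ensemble, [105, 1.1, 5]))
print(ensemble_predict(ensemble, [1050, 9.2, 52]))
```

    Giving each base classifier a different subspace means that a feature that misleads the classifier for one application only influences the base classifiers whose subspace contains it; how the votes are weighted is exactly where a more careful decision component than this majority vote would come in.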

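Cite this article

Lu Gang, Yu Xiangzhan, Zhang Hongli, Guo Ronghua. Traffic classification framework based on ensemble clustering. Journal of Software (软件学报), 2016, 27(11): 2870-2883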

History
  • Received: 2015-03-16
  • Last revised: 2015-04-07
  • Published online: 2016-11-02