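Noise-Tolerant and Optimal Segmentation of Message Formats for Unknown Application-Layer Protocols (抗噪的未知应用层协议报文格式最佳分段方法)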

doi:10.3724/SP.J.1001.2013.04243

Journal of Software (软件学报), 2013, Vol. 24, No. 3, pp. 604-617

Authors:
LI Min (黎敏), Department of Electronics and Communication Engineering, Sun Yat-Sen University, Guangzhou 510006, China
YU Shun-Zheng (余顺争), Department of Electronics and Communication Engineering, Sun Yat-Sen University, Guangzhou 510006, China

Fund Project: National Natural Science Foundation of China (60970146); National High Technology Research and Development Program of China (863 Program) (2007AA01Z449); National Natural Science Foundation of China-Guangdong Joint Fund (U0735002)

Abstract:

To parse the message formats of unknown application-layer protocols automatically, this paper proposes an optimal segmentation method that requires no a priori knowledge of the protocol. A hidden semi-Markov model (HSMM) is first established for the segmentation, and its parameters are estimated from a sample set of message sequences transmitted during the protocol's network sessions. HSMM-based maximum-likelihood segmentation then divides each message into its fields and, at the same time, extracts keywords that carry the semantics of those fields. The training set need not be absolutely pure: based on the likelihood distribution of the observation sequences, the method detects data from other protocols mixed into the training set (noise) and filters it out. Experimental results show that the method can parse the message formats of both text and binary protocols, that protocol signatures built from the extracted keywords identify the protocols with high accuracy, and that the noise is detected effectively.

Key words: application-layer protocol; message format; segmentation; hidden semi-Markov model
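
The abstract describes HSMM-based maximum-likelihood segmentation only at a high level. As a rough illustration of the idea, and not the authors' model, the following is a minimal sketch of explicit-duration Viterbi decoding over a toy two-state HSMM. Everything named here is an assumption made for illustration: the function `hsmm_viterbi`, the states `word` and `sep`, the uniform duration distribution, and the hand-set emission probabilities. The paper estimates its parameters from captured traffic, whereas this sketch fixes them by hand.

```python
import math

NEG_INF = float("-inf")

def hsmm_viterbi(obs, states, log_init, log_trans, log_dur, log_emit, max_dur):
    """Return (best_log_prob, segments) for the most likely segmentation of obs.

    segments is a list of (state, start, end); obs[start:end] is one field.
    Assumes at least two states and no self-transitions (hypothetical toy setup).
    """
    n = len(obs)
    # best[t][s]: highest log-probability of obs[:t] when a segment of state s ends at t
    best = [dict.fromkeys(states, NEG_INF) for _ in range(n + 1)]
    back = [dict.fromkeys(states) for _ in range(n + 1)]  # (segment start, previous state)
    for t in range(1, n + 1):
        for s in states:
            emit = 0.0
            for d in range(1, min(max_dur, t) + 1):
                start = t - d
                emit += log_emit(s, obs[start])  # grow the segment leftwards by one symbol
                seg = log_dur(s, d) + emit
                if start == 0:
                    score, prev = log_init[s] + seg, None
                else:
                    # a new segment must follow a segment of a different state
                    prev = max((p for p in states if p != s),
                               key=lambda p: best[start][p] + log_trans[p][s])
                    score = best[start][prev] + log_trans[prev][s] + seg
                if score > best[t][s]:
                    best[t][s], back[t][s] = score, (start, prev)
    # backtrack from the best state whose segment ends at the last symbol
    state = max(states, key=lambda s: best[n][s])
    total, segments, t = best[n][state], [], n
    while t > 0:
        start, prev = back[t][state]
        segments.append((state, start, t))
        t, state = start, prev
    return total, segments[::-1]

# Hand-set toy parameters (assumptions, not estimated from data).
STATES = ["word", "sep"]
MAX_DUR = 8
log_init = {s: math.log(0.5) for s in STATES}
log_trans = {"word": {"sep": 0.0}, "sep": {"word": 0.0}}  # the two states strictly alternate

def log_dur(state, d):
    return math.log(1.0 / MAX_DUR)  # uniform over durations 1..MAX_DUR

def log_emit(state, ch):
    # "word" prefers letters, "sep" prefers everything else
    match = ch.isalpha() == (state == "word")
    return math.log(0.999 if match else 0.001)

msg = "USER bob\r\n"
score, segs = hsmm_viterbi(msg, STATES, log_init, log_trans, log_dur, log_emit, MAX_DUR)
fields = [(s, msg[a:b]) for s, a, b in segs]
print(fields)
```

With these toy parameters the message is split into the fields `USER`, a single space, `bob`, and the trailing CRLF. The returned log-probability is the kind of quantity that could be compared against a threshold to flag sequences that fit the model poorly; how the paper actually scores and filters noise is not specified in the abstract.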

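Cite this article

LI Min, YU Shun-Zheng. Noise-tolerant and optimal segmentation of message formats for unknown application-layer protocols. Journal of Software (软件学报), 2013, 24(3): 604-617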

History

Received: 2011-08-11
Last revised: 2012-04-09
Published online: 2013-03-01
