R-gram: Inferring message formats of service protocols with relative positional n-grams
Journal of Network and Computer Applications (IF 8.7), Pub Date: 2021-10-16, DOI: 10.1016/j.jnca.2021.103247
Jiaojiao Jiang 1 , Jean-Guy Schneider 2 , Steve Versteeg 3 , Jun Han 3 , MD Arafat Hossain 3 , Chengfei Liu 3

Automatically discovering message formats of unknown service or system protocols from network traces has become important for a variety of applications, such as emulating the behavior of an unknown protocol in service virtualization, or enabling deep packet inspection in network security. Among existing schemes, the keyword extraction based approaches have been shown to be effective. Inspired by the template structure of protocol messages, recent works leverage the positions of keywords to extract message keywords more accurately. However, these methods are deficient for messages with large variations in length. To address this problem, we propose R-gram, which exploits the relative positions of keywords in messages, allowing the keywords to be robustly detected in variable length messages. It first extracts the common template of the messages in a given message trace with a fast sampling technique, and segments each message into blocks according to the relative positions of the common keywords in the template. It then identifies message keywords in each block by using a new concept and technique — relative positional n-gram (r-gram in short). Finally, the message keywords are used to separate all the messages into type-specific clusters and consequently derive the message format for each cluster. We have implemented and evaluated R-gram on real-world service traces containing either textual or binary protocol messages. Our experimental results show that R-gram is more accurate and robust than existing state-of-the-art tools in protocol message format extraction. Furthermore, R-gram is efficient for processing large-scale message traces.
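
To make the idea of relative position concrete, here is a minimal sketch under stated assumptions. The abstract does not define the r-gram precisely, so this is one plausible reading rather than the authors' algorithm: each n-gram is paired with the bucket of the message it falls in (its offset scaled by message length), and pairs that recur in a large share of messages are treated as keyword candidates. The function names, the bucket count, and the `min_support` threshold are all hypothetical, and the template extraction and block segmentation steps described above are omitted.

```python
from collections import Counter


def relative_positional_ngrams(message: bytes, n: int = 3, buckets: int = 10) -> set:
    """Return the set of (bucket, n-gram) pairs for one message.

    The bucket is the n-gram's offset scaled by the number of possible
    offsets, so it reflects relative rather than absolute position.
    This scaling is an assumption made for illustration.
    """
    span = len(message) - n + 1  # number of n-gram start offsets
    if span <= 0:
        return set()
    pairs = set()
    for offset in range(span):
        bucket = offset * buckets // span  # always in [0, buckets - 1]
        pairs.add((bucket, message[offset:offset + n]))
    return pairs


def frequent_rgrams(messages, n: int = 3, buckets: int = 10, min_support: float = 0.5) -> set:
    """Return (bucket, n-gram) pairs present in at least min_support of the messages."""
    counts = Counter()
    for msg in messages:
        # Each pair is counted at most once per message.
        counts.update(relative_positional_ngrams(msg, n, buckets))
    threshold = min_support * len(messages)
    return {pair for pair, count in counts.items() if count >= threshold}
```

With messages such as `b"GET /a HTTP/1.1"` and `b"GET /index.html HTTP/1.1"`, the trailing `1.1` sits at different absolute offsets but lands in the same final bucket, which is the kind of robustness to length variation the abstract describes.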




Updated: 2021-10-25