A De-Compositional Approach to Regular Expression Matching for Network Security



IEEE/ACM Transactions on Networking (IF 3.0), Pub Date: 2019-10-28, DOI: 10.1109/tnet.2019.2941920
Alex X. Liu, Eric Norige

Regular Expression (RegEx) matching is the industry standard for Deep Packet Inspection (DPI) because RegExes are significantly more expressive than strings. To achieve high matching speed, we need to convert the RegExes to Deterministic Finite State Automata (DFAs). However, DFAs suffer from the state explosion problem: the number of DFA states and transitions can be exponential in the number of RegExes. Much work has addressed the DFA state explosion problem; however, none has met all the requirements of fast and automated construction, a small memory image, and high matching speed. In this paper, we propose a decompositional approach to DFA state explosion that offers fast and automated construction, a small memory image, and high matching speed. The first key idea is to decompose a complex RegEx that causes exponential state increases into a set of simpler RegExes that do not, where any character string that matches the complex RegEx also matches all the RegExes in the set of simpler RegExes; that is, the set of strings that match the complex RegEx is a subset of the set of strings that match every simpler RegEx. The second key idea is to use a stateful post-processing engine to filter the reported matches, keeping only those that are true matches of the complex RegEx. Given an input string, instead of matching with the large DFA constructed from the original complex RegEx, we first match with the small DFA constructed from the set of simpler RegExes, and then, if the small DFA reports a match, we use the post-processing engine to determine whether it is a true match of the original complex RegEx. Because the pre-processing is simple, automaton construction can be automated and fast, and because most online processing is done by a DFA, the matching speed is close to that of a DFA alone.
Our experimental results show that, compared with state-of-the-art software-based RegEx matching algorithms, our decompositional approach achieves orders-of-magnitude faster DFA construction (seconds instead of minutes), a memory image 30 times smaller, and 43% faster matching.
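
The two key ideas can be illustrated with a small sketch. This is not the authors' construction, which the abstract does not specify; it is a minimal example under stated assumptions. The complex RegEx `abc.*xyz`, its split into the fragments `abc` and `xyz`, and the names `FRAGMENTS` and `decomposed_search` are all hypothetical. Python's `re` module stands in for the small DFA, and a single boolean stands in for the stateful post-processing engine.

```python
import re

# Hypothetical complex RegEx; ".*" patterns like this are a common cause of
# DFA state explosion when many rules are compiled together.
COMPLEX = re.compile(r"abc.*xyz", re.DOTALL)

# Assumed decomposition: two simple fragments matched in a single pass.
# This combined pattern is only a stand-in for the small DFA.
FRAGMENTS = re.compile(r"(?P<head>abc)|(?P<tail>xyz)")

def decomposed_search(data: str) -> bool:
    """Return True if data contains a match of abc.*xyz.

    The fragment matcher reports a superset of candidate events; the
    post-processing state (one bit here) filters out false positives.
    """
    seen_head = False  # post-processing state: has "abc" appeared yet?
    for m in FRAGMENTS.finditer(data):
        if m.lastgroup == "head":
            seen_head = True
        elif seen_head:
            # A "tail" fragment after a "head" fragment is a true match.
            return True
    return False

samples = ["abc--xyz", "abcxyz", "xyz then abc", "ab xyz", "abc only"]
results = {s: decomposed_search(s) for s in samples}
```

In this sketch, `"xyz then abc"` contains both fragments, so a stateless filter would report it, but the post-processing bit rejects it because `xyz` appears before `abc`. The fragments were chosen so that they cannot overlap each other, which keeps the one-bit state sufficient; a real decomposition would need more careful bookkeeping.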


Updated: 2020-01-04
