Online parameterized dictionary matching with one gap
Theoretical Computer Science (IF 0.9), Pub Date: 2020-09-14, DOI: 10.1016/j.tcs.2020.09.016
Avivit Levy , B. Riva Shalom

We study the online Parameterized Dictionary Matching with One Gap (PDMOG) problem, defined as follows. Preprocess a dictionary D of d patterns, each containing a special gap symbol that can match any string, so that given a text T arriving online, one character at a time, all patterns from D that parameterized match a suffix of the text received so far can be reported before the next character arrives. Two equal-length strings parameterized match if there exists a bijection on the alphabets under which one string matches the other. Each gap symbol is associated with bounds that determine the possible lengths of the strings it can match. Online Dictionary Matching with One Gap (DMOG) captures the difficulty of a bottleneck procedure in cyber-security, as many digital signatures of viruses manifest themselves as patterns with a single gap. Parameterized matching captures possible encryption of the patterns. We also define the strict PDMOG problem, in which the subpatterns of the same dictionary pattern must be parameterized matched via the same bijection. This captures situations where the subpatterns of a dictionary pattern are encoded simultaneously. We study this problem for a special case called alphabet-saturated dictionaries, where every subpattern contains all characters of the dictionary alphabet Σ. We use the following parameters to describe our results: D is the total size of the dictionary (not including the gaps), plsc is the longest parameterized suffix chain of subpatterns in D, op is the number of parameterized pattern occurrences in T, α and β are the minimum left and maximum right gap borders in the non-uniformly bounded dictionary case, and δ(G_D) is the degeneracy of the graph G_D representing dictionary D. This graph is classified as sparse or dense according to the values of the δ(G_D) and plsc parameters. We obtain:

An algorithm for online PDMOG with sparse graph dictionaries, with Õ(D) preprocessing time/space and Õ(δ(G_D)·plsc + plsc·max{|Σ|, M} + op) query time per text character.

An algorithm for online PDMOG with dense graph dictionaries, with Õ(D + d(β − α)) preprocessing time/space and Õ(plsc·d(β − α) + plsc·max{|Σ|, M} + op) query time per text character.

An algorithm for strict PDMOG with alphabet-saturated dictionaries, with Õ(D) preprocessing time/space and Õ(δ(G_D)·plsc + op) query time per text character.

These results parallel those obtained for the Dictionary Matching with One Gap (DMOG) problem, which almost match the lower bounds for that problem [7]. While the parameter δ(G_D) can be as large as d, and much larger if the dictionary has non-uniform gap boundaries, and the parameter plsc could theoretically be as large as d, in many practical situations these parameters are actually small. The strength of our work is in achieving results that explore and exploit small values of these parameters, thus supplying algorithms that are suitable for some practical cyber-security needs.
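To make the problem statement concrete, the following is a minimal brute-force sketch in Python of what an online PDMOG query has to report. It is not the authors' algorithm and comes nowhere near the bounds listed above; it only illustrates the definitions. The function names (`prev_encode`, `p_match`, `report`) and the tuple layout `(p1, p2, lo, hi)` for a one-gap pattern are assumptions made for this illustration. Parameterized matching is tested with the standard previous-occurrence encoding: two equal-length strings are related by a bijection exactly when their encodings coincide. The `strict` flag models strict PDMOG by requiring a single bijection for both subpatterns, which is the same as matching their concatenations.

```python
def prev_encode(s):
    """Replace each symbol by the distance to its previous occurrence (0 if none)."""
    last = {}
    out = []
    for i, ch in enumerate(s):
        out.append(i - last[ch] if ch in last else 0)
        last[ch] = i
    return out


def p_match(a, b):
    """True if some bijection on the alphabets maps string a onto string b."""
    return len(a) == len(b) and prev_encode(a) == prev_encode(b)


def report(dictionary, text_so_far, strict=False):
    """Return indices of one-gap patterns that match a suffix of text_so_far.

    Each pattern is a hypothetical tuple (p1, p2, lo, hi): two subpatterns
    separated by a gap whose length must lie in [lo, hi].
    """
    hits = []
    n = len(text_so_far)
    for idx, (p1, p2, lo, hi) in enumerate(dictionary):
        y_start = n - len(p2)
        if y_start < 0:
            continue
        y = text_so_far[y_start:]            # candidate match for p2
        for gap in range(lo, hi + 1):
            x_end = y_start - gap
            x_start = x_end - len(p1)
            if x_start < 0:
                break                        # longer gaps cannot fit either
            x = text_so_far[x_start:x_end]   # candidate match for p1
            if strict:
                ok = p_match(p1 + p2, x + y)  # one bijection for both parts
            else:
                ok = p_match(p1, x) and p_match(p2, y)
            if ok:
                hits.append(idx)
                break
    return hits


if __name__ == "__main__":
    patterns = [("ab", "ab", 1, 1)]
    text = ""
    for ch in "xy#yx":                       # characters arrive one at a time
        text += ch
        print(repr(text), report(patterns, text), report(patterns, text, strict=True))
```

In this example the pattern `ab`, gap of length 1, `ab` is reported for the text `xy#yx` in the non-strict setting, because `xy` and `yx` each match `ab` under their own bijection, but it is not reported in the strict setting, because no single bijection maps `abab` onto `xyyx`. The sketch rescans the text for every pattern and every admissible gap length on each arriving character, which is exactly the cost the parameterized bounds above are designed to avoid.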




Updated: 2020-09-14