DNA Encoding and STR Extraction for Anomaly Intrusion Detection Systems
IEEE Access (IF 3.4), Pub Date: 2021-01-28, DOI: 10.1109/access.2021.3055431
Omar Fitian Rashid, Zulaiha Ali Othman, Suhaila Zainudin, Noor Azah Samsudin

Deoxyribonucleic acid (DNA) can be used to discover the presence of diseases in the human body. Similarly, its functionality can be leveraged in an intrusion detection system (IDS) to detect attacks against computer systems and network traffic. Various approaches have been proposed for using DNA sequences in IDSs. The most popular is the DNA sequence matching method, which is also used in biology. A previously proposed technique uses the DNA sequence in an IDS to generate a normal signature sequence with an alignment threshold value, but its detection rate is very low. This paper therefore considers the two main factors that affect detection accuracy with DNA sequences: the DNA encoding and the short tandem repeat (STR), i.e., the DNA keys and their positions. It proposes two DNA encoding methods, DEM3sel and DEMdif, which differ in the length of the DNA sequence and in how network traffic is represented. DEM3sel uses three characters to represent all 41 network attributes but uses a single fixed character to distinguish between nominal and numerical attributes. DEMdif uses different characters to represent all the network attributes based on the attribute values and uses a single fixed character to distinguish between nominal and numerical attributes. In both methods, the Teiresias algorithm extracts the STR, which includes both the keys and their positions in the network traffic, while the Knuth-Morris-Pratt algorithm performs the matching that determines whether the network traffic is normal or an attack. The proposed methods are evaluated on two standard datasets, KDDCup 99 and NSL-KDD, with the experiment run 30 times for each DNA encoding method. The results show that DEM3sel outperforms DEMdif, with a detection rate, false alarm rate, and accuracy of 99.58%, 35.53%, and 92.74%, respectively. The results also show that using more keys and their positions improves the false alarm rate and the accuracy of DEM3sel by up to 26.48% and 1.75%, respectively. Moreover, the performance of DEM3sel is comparable to or better than that of state-of-the-art algorithms. Thus, it can be concluded that the proposed DNA sequence method is suitable for use in an IDS.
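
The matching step named in the abstract can be illustrated with a minimal Python sketch of Knuth-Morris-Pratt search over an encoded sequence. The abstract does not give the encoding tables, the extracted keys, or the decision rule, so the example sequence, the key strings, and the helper `matches_signature` (which requires every key to occur at its recorded position) are hypothetical and are not the authors' implementation.

```python
from typing import Dict, List


def build_failure_table(pattern: str) -> List[int]:
    """For each prefix of pattern, the length of its longest proper
    prefix that is also a suffix (the standard KMP failure table)."""
    table = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_find_all(text: str, pattern: str) -> List[int]:
    """Return the start index of every occurrence of pattern in text,
    including overlapping occurrences."""
    if not pattern:
        return []
    table = build_failure_table(pattern)
    positions = []
    k = 0
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = table[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            positions.append(i - k + 1)
            k = table[k - 1]  # keep scanning for overlapping matches
    return positions


def matches_signature(sequence: str, keys_with_positions: Dict[str, int]) -> bool:
    """Hypothetical decision rule (an assumption, not taken from the paper):
    the record matches when every key occurs at its recorded position."""
    return all(
        position in kmp_find_all(sequence, key)
        for key, position in keys_with_positions.items()
    )


# Toy usage with a made-up encoded record and made-up keys.
encoded_record = "ACGTACGTAC"
print(kmp_find_all(encoded_record, "ACGT"))
print(matches_signature(encoded_record, {"ACGT": 4, "TAC": 3}))
```

KMP runs in time linear in the combined length of the sequence and the key, which matters when many keys are checked against every encoded traffic record.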

Updated: 2021-01-28