PF-TL: Payload Feature-Based Transfer Learning for Dealing with the Lack of Training Data

Electronics (IF 2.6), Pub Date: 2021-05-12, DOI: 10.3390/electronics10101148
Ilok Jung, Jongin Lim, Huy Kang Kim

The number of studies applying machine learning to cyber security has increased over the past few years. These studies, however, struggle to become usable in the real world, mainly because of the lack of training data and the limited reusability of trained models. Although transfer learning appears to be a solution to these problems, few studies have applied it to intrusion detection. This study therefore proposes payload feature-based transfer learning (PF-TL), which addresses the lack of training data in machine learning-based intrusion detection by using knowledge from an already known domain. First, it expands the range of information extraction from the header to the payload, using an effective hybrid feature extraction method to convey the information accurately. Second, it provides an improved optimization method for the extracted features to create a labeled dataset for a target domain. The proposal was validated on publicly available datasets in three distinct scenarios, and the results confirmed its practical usability: the accuracy of the training data created through transfer learning increased by 30% compared with the non-transfer learning method. In addition, we show that this approach can help identify previously unknown attacks and reuse models from different domains.
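To make the idea of carrying knowledge from a labeled source domain to an unlabeled target domain more concrete, here is a minimal Python sketch. It is not the authors' method: the abstract does not specify the hybrid feature extraction or the optimization step, so the keyword list, the two header fields, the packet dictionaries, and the nearest-centroid pseudo-labeling below are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical payload keywords; the paper's real payload features are not
# described in the abstract.
KEYWORDS = ["select", "union", "script", "passwd", "cmd"]

def extract_features(packet):
    """Hybrid feature vector: two scaled header fields plus payload keyword counts."""
    payload = packet["payload"].lower()
    tokens = Counter(re.findall(r"[a-z]+", payload))
    header = [packet["dst_port"] / 65535.0, min(len(payload), 1500) / 1500.0]
    return header + [float(tokens[k]) for k in KEYWORDS]

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def fit_centroids(packets, labels):
    """Learn one centroid per class from the labeled source domain."""
    by_label = {}
    for p, y in zip(packets, labels):
        by_label.setdefault(y, []).append(extract_features(p))
    return {y: centroid(v) for y, v in by_label.items()}

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pseudo_label(packets, centroids):
    """Assign each unlabeled target packet the class of its nearest source centroid."""
    labels = []
    for p in packets:
        f = extract_features(p)
        labels.append(min(centroids, key=lambda y: sq_dist(f, centroids[y])))
    return labels

# Toy source domain (labeled) and target domain (unlabeled); all data is made up.
source = [
    {"dst_port": 80, "payload": "GET /index.html HTTP/1.1"},
    {"dst_port": 80, "payload": "GET /item?id=1 UNION SELECT passwd FROM users"},
    {"dst_port": 80, "payload": "GET /about.html HTTP/1.1"},
    {"dst_port": 80, "payload": "GET /q?x=<script>alert(1)</script>"},
]
source_labels = ["benign", "attack", "benign", "attack"]

target = [
    {"dst_port": 8080, "payload": "POST /login user=a' UNION SELECT passwd"},
    {"dst_port": 8080, "payload": "GET /home HTTP/1.1"},
]

centroids = fit_centroids(source, source_labels)
target_labels = pseudo_label(target, centroids)
print(target_labels)
```

In this toy setup the payload keyword counts dominate the distance, so the target packets receive labels even though their header values (port 8080) differ from the source domain. That is the intuition behind extending features from the header to the payload, shown here only in a heavily simplified form.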

Updated: 2021-05-12
