Malware-Detection Method with a Convolutional Recurrent Neural Network Using Opcode Sequences, Information Sciences



Malware-Detection Method with a Convolutional Recurrent Neural Network Using Opcode Sequences
Information Sciences, Pub Date: 2020-05-21, DOI: 10.1016/j.ins.2020.05.026
Seungho Jeon, Jongsub Moon

This paper presents a novel malware-detection model based on a convolutional recurrent neural network that takes opcode sequences as input. Statistically, an executable file is treated as a set of consecutive machine codes. First, the theoretical foundation for using opcode sequences to detect malware is discussed. Next, an algorithm for extracting opcode sequences from executables is presented, together with a deep learning-based malware-detection method that uses those sequences as input. The proposed model has two parts: at the front end, an opcode-level convolutional autoencoder that compresses a long opcode sequence into a relatively short sequence of codes; at the rear end, a dynamic recurrent neural network classifier that makes the prediction from the codes the autoencoder generates. In experiments, the proposed model achieved a malware-detection accuracy of 96%, an area under the receiver operating characteristic curve (ROC-AUC) of 0.99, and a true positive rate (TPR) of 95%. The highest accuracy and TPR achieved by existing malware-detection methods using opcode sequences were 97% and 82%, respectively. Compared with that baseline, the proposed model delivers a slightly lower accuracy (96%) but a considerably higher TPR (95%), which means it misses fewer malware samples. Therefore, the proposed model is capable of more reliable malware detection.
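The comparison above rests on the difference between accuracy and TPR. A minimal sketch of the two standard definitions follows, to show how one detector can have slightly lower accuracy yet a much higher TPR. The `Confusion` class, the `confusion_from_labels` helper, and both sets of counts are hypothetical: they are invented for illustration and are not taken from the paper or its test set.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts for a binary malware detector (1 = malware)."""
    tp: int  # malware correctly flagged
    fn: int  # malware missed
    fp: int  # benign files wrongly flagged
    tn: int  # benign files correctly passed

    def accuracy(self) -> float:
        # Share of all files that were classified correctly.
        total = self.tp + self.fn + self.fp + self.tn
        if total == 0:
            raise ValueError("no samples")
        return (self.tp + self.tn) / total

    def tpr(self) -> float:
        # True positive rate (recall): share of malware that was caught.
        positives = self.tp + self.fn
        if positives == 0:
            raise ValueError("no malware samples")
        return self.tp / positives


def confusion_from_labels(y_true: Sequence[int], y_pred: Sequence[int]) -> Confusion:
    """Count outcomes from parallel lists of true and predicted labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists differ in length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return Confusion(tp=tp, fn=fn, fp=fp, tn=tn)


# Hypothetical counts on an imaginary test set of 100 malware and 900 benign files.
detector_a = Confusion(tp=80, fn=20, fp=10, tn=890)  # conservative detector
detector_b = Confusion(tp=95, fn=5, fp=40, tn=860)   # more sensitive detector

for name, c in (("A", detector_a), ("B", detector_b)):
    print(f"detector {name}: accuracy={c.accuracy():.3f}, TPR={c.tpr():.3f}")
```

In this made-up setting, detector B flags more benign files than detector A, so its accuracy is a little lower, but it misses far fewer malware samples. That is the kind of trade-off the abstract appeals to when it calls the higher-TPR model more reliable.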


Updated: 2020-05-21
