Deep learning based Sequential model for malware analysis using Windows exe API Calls
PeerJ Computer Science (IF 3.5), Pub Date: 2020-07-27, DOI: 10.7717/peerj-cs.285
Ferhat Ozgur Catak 1, 2 , Ahmet Faruk Yazı 3 , Ogerta Elezaj 1 , Javed Ahmed 1

Malware has grown increasingly diverse in architecture and features. This growth in capability poses a severe threat and opens new research directions in malware detection. This study focuses on metamorphic malware, which is the most advanced member of the malware family. Anti-virus applications that rely on traditional signature-based methods find it nearly impossible to detect metamorphic malware, which in turn makes this type of malware difficult to classify. Recent literature on malware detection and classification addresses this problem by examining malware behavior. The main goal of this paper is to develop a method that classifies malware by type based on its behavior. We began by building a new dataset of API calls made on the Windows operating system, which represent the behavior of malicious software. The malware types included in the dataset are Adware, Backdoor, Downloader, Dropper, Spyware, Trojan, Virus, and Worm. The classification method used in this study is LSTM (Long Short-Term Memory), which is widely used for classifying sequential data. The classifier achieves accuracy of up to 95% with an $F_1$-score of 0.83, which is quite satisfactory. We also ran experiments on binary and multi-class malware datasets to show the classification performance of the LSTM model. Another significant contribution of this paper is the new dataset itself, which is based on API calls on Windows operating systems. To the best of our knowledge, no such dataset was available before our research. The dataset is available on GitHub so that the malware detection research community can use it and contribute further to this domain.
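
An LSTM classifier of the kind described above consumes fixed-length integer sequences, so each recorded API-call trace typically has to be mapped to integer ids and padded before training. The following is a minimal sketch of that preprocessing step only, not the authors' pipeline: the API names, the reserved ids, the helper names `build_vocab` and `encode`, and the `max_len` value are all assumptions chosen for illustration.

```python
from typing import Dict, List

PAD_ID = 0  # assumed id reserved for padding
UNK_ID = 1  # assumed id for API names missing from the vocabulary


def build_vocab(traces: List[List[str]]) -> Dict[str, int]:
    """Assign an integer id to each distinct API name, in first-seen order."""
    vocab: Dict[str, int] = {}
    for trace in traces:
        for name in trace:
            if name not in vocab:
                vocab[name] = len(vocab) + 2  # ids 0 and 1 are reserved
    return vocab


def encode(trace: List[str], vocab: Dict[str, int], max_len: int) -> List[int]:
    """Map API names to ids, truncate to max_len, then right-pad with PAD_ID."""
    ids = [vocab.get(name, UNK_ID) for name in trace[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


# Hypothetical traces; the API names are illustrative, not taken from the dataset.
traces = [
    ["LdrLoadDll", "NtCreateFile", "NtWriteFile", "NtClose"],
    ["LdrLoadDll", "RegOpenKeyExW", "NtClose"],
]
vocab = build_vocab(traces)
encoded = [encode(t, vocab, max_len=5) for t in traces]
```

Each row of `encoded` has the same length, so the rows can be stacked into a batch and fed to an embedding layer followed by an LSTM. Reserving separate ids for padding and unknown calls keeps traces with unseen API names usable at inference time.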

Updated: 2020-08-20