Feature analysis for data-driven APT-related malware discrimination
Computers & Security (IF 5.6), Pub Date: 2021-01-16, DOI: 10.1016/j.cose.2021.102202
Luis Francisco Martín Liras, Adolfo Rodríguez de Soto, Miguel A. Prada

Advanced Persistent Threats (APTs) have become a major concern for IT security professionals around the world. These attacks are characterized by the use of highly sophisticated, evasive, and cautious human and technical resources. Long APT campaigns commonly combine different malware, which makes the malware used in these campaigns worth investigating. Different approaches have been proposed to find discriminatory features for detecting APT malware, and features from static, dynamic, and network-related analyses have each been proposed separately for that purpose. The approach considered in this study aims to identify the most discriminatory features for distinguishing malware executables that belong to APT campaigns from non-APT malware executables. It proposes identifying discriminatory features from all three groups of analyses rather than from only one, using domain knowledge and with interpretability as a goal. As a result, a set containing the most discriminatory features of each type is provided. Well-known machine learning techniques have been used to obtain this set. One of the most important limitations on the use of these learning techniques is the availability of a sufficient amount of relevant data. In this paper, a large dataset of 19,457 malware samples is made publicly available, including both malware known to be related to APTs and generic non-APT malware samples. To analyze the discrimination ability of the features, the proposed approach follows several steps. First, an exploratory analysis is conducted to gain knowledge about the data structure. Next, feature selection is performed using different discriminatory techniques. The resulting selection of features is then assessed by means of four well-known binary classification techniques. The high accuracy of the results shows that the proposed features are discriminative enough for the stated purpose. Finally, these results are interpreted and the findings are discussed from the perspective of prior knowledge and assumptions about APT-related malware.
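
The select-then-assess workflow described in the abstract can be illustrated with a minimal sketch. The abstract does not name the selection or classification techniques used, so everything here is an assumption made for illustration: a Fisher score stands in for the feature selection step, a nearest-centroid rule stands in for one binary classifier, and the data is a synthetic toy set, not the paper's 19,457-sample dataset. All function names are hypothetical.

```python
import random
import statistics

def fisher_score(values, labels):
    """Between-class separation divided by within-class spread for one feature."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    spread = statistics.pvariance(pos) + statistics.pvariance(neg)
    return (statistics.mean(pos) - statistics.mean(neg)) ** 2 / (spread + 1e-12)

def select_top_k(rows, labels, k):
    """Return the indices of the k features with the highest Fisher score."""
    n_features = len(rows[0])
    scores = [fisher_score([r[j] for r in rows], labels) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]

def nearest_centroid_accuracy(train, y_train, test, y_test, features):
    """Fit class centroids on the selected features, then score held-out accuracy."""
    def centroid(label):
        members = [r for r, y in zip(train, y_train) if y == label]
        return [statistics.mean(r[j] for r in members) for j in features]

    c0, c1 = centroid(0), centroid(1)

    def dist(row, c):
        return sum((row[j] - cj) ** 2 for j, cj in zip(features, c))

    hits = sum((1 if dist(r, c1) < dist(r, c0) else 0) == y
               for r, y in zip(test, y_test))
    return hits / len(test)

def make_toy_data(n, seed):
    """Synthetic stand-in data (assumption): features 0 and 1 depend on the
    label (1 = "APT", 0 = "non-APT"), features 2 and 3 are pure noise."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        rows.append([rng.gauss(3.0 * y, 1.0), rng.gauss(-2.0 * y, 1.0),
                     rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)])
        labels.append(y)
    return rows, labels

rows, labels = make_toy_data(400, seed=0)
train, y_train = rows[:300], labels[:300]   # used for selection and fitting
test, y_test = rows[300:], labels[300:]     # held out for assessment
selected = select_top_k(train, y_train, k=2)
accuracy = nearest_centroid_accuracy(train, y_train, test, y_test, selected)
print(sorted(selected), round(accuracy, 3))
```

Selection is run on the training split only, so the held-out accuracy is not inflated by features chosen with knowledge of the test labels. A study like the one described would repeat the assessment with several classifiers and compare the results.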




Updated: 2021-02-05