Research on unsupervised feature learning for Android malware detection based on Restricted Boltzmann Machines
Future Generation Computer Systems (IF 7.5), Pub Date: 2021-02-25, DOI: 10.1016/j.future.2021.02.015
Zhen Liu, Ruoyu Wang, Nathalie Japkowicz, Deyu Tang, Wenbin Zhang, Jie Zhao

Android malware detection has attracted much attention in recent years. Existing methods mainly focus on extracting static or dynamic features from mobile apps and building malware detection models with machine learning algorithms. The number of extracted static or dynamic features may be very large, so the data suffers from high dimensionality. In addition, because malware is designed to evade detection, malware data is varied and hard to obtain in the first place. To detect zero-day malware, unsupervised malware detection methods have been applied. In such cases, unsupervised feature reduction is a viable way to reduce the data dimensionality. In this paper, we propose an unsupervised feature learning algorithm called Subspace based Restricted Boltzmann Machines (SRBM) for reducing data dimensionality in malware detection. Multiple subspaces in the original data are first searched, and an RBM is then built on each subspace. The outputs of the hidden layers of all trained RBMs are combined to represent the data in a lower dimension. Experimental results on the OmniDroid, CIC2019 and CIC2020 datasets show that the features learned by SRBM perform better than those learned by other feature reduction methods when evaluated by the clustering metrics NMI, ACC and F-score.
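
The pipeline described in the abstract (find subspaces, train one RBM per subspace, concatenate the hidden-layer outputs) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not say how subspaces are searched, so the sketch substitutes a random disjoint partition of the feature indices as a placeholder. The RBM is a plain Bernoulli RBM trained with one-step contrastive divergence on the full batch. All names (`TinyRBM`, `random_subspaces`, `subspace_rbm_features`) and all hyperparameters are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TinyRBM:
    """Bernoulli RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, lr=0.1, n_epochs=20, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)
        self.lr = lr
        self.n_epochs = n_epochs

    def hidden_probs(self, v):
        # P(h = 1 | v) for every hidden unit
        return sigmoid(v @ self.W + self.b_h)

    def fit(self, X):
        n = X.shape[0]
        for _ in range(self.n_epochs):
            # Positive phase: hidden activations driven by the data
            h0 = self.hidden_probs(X)
            h_sample = (self.rng.random(h0.shape) < h0).astype(float)
            # Negative phase: one Gibbs step back to visible, then hidden
            v1 = sigmoid(h_sample @ self.W.T + self.b_v)
            h1 = self.hidden_probs(v1)
            # CD-1 parameter updates, averaged over the batch
            self.W += self.lr * (X.T @ h0 - v1.T @ h1) / n
            self.b_v += self.lr * (X - v1).mean(axis=0)
            self.b_h += self.lr * (h0 - h1).mean(axis=0)
        return self


def random_subspaces(n_features, n_subspaces, seed=0):
    # ASSUMPTION: random disjoint partition of feature indices; this is a
    # placeholder for the paper's subspace search, which is not specified here.
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_features), n_subspaces)


def subspace_rbm_features(X, n_subspaces=4, n_hidden=3, seed=0):
    """Train one RBM per subspace and concatenate the hidden outputs."""
    blocks = []
    for i, idx in enumerate(random_subspaces(X.shape[1], n_subspaces, seed)):
        rbm = TinyRBM(len(idx), n_hidden, seed=seed + i).fit(X[:, idx])
        blocks.append(rbm.hidden_probs(X[:, idx]))
    return np.hstack(blocks)


if __name__ == "__main__":
    # Synthetic binary data standing in for app feature vectors
    rng = np.random.default_rng(42)
    X = (rng.random((200, 40)) < 0.3).astype(float)
    Z = subspace_rbm_features(X)
    print(X.shape, "->", Z.shape)  # (200, 40) -> (200, 12)
```

The reduced matrix `Z` has `n_subspaces * n_hidden` columns. In an evaluation like the one the abstract describes, it would be passed to a clustering algorithm and scored against ground-truth labels with metrics such as NMI.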




Updated: 2021-03-08