Can Machine Learning Model with Static Features be Fooled: an Adversarial Machine Learning Approach
arXiv - CS - Computational Complexity. Pub Date: 2019-04-20, arXiv:1904.09433
Rahim Taheri, Reza Javidan, Mohammad Shojafar, Vinod P and Mauro Conti

The widespread adoption of smartphones dramatically increases the risk of attacks and the spread of mobile malware, especially on the Android platform. Machine learning-based solutions have already been used as a tool to supersede signature-based anti-malware systems. However, malware authors leverage features from malicious and legitimate samples to estimate statistical differences in order to create adversarial examples. Hence, to evaluate the vulnerability of machine learning algorithms in malware detection, we propose five different attack scenarios to perturb malicious applications (apps). By doing this, the classification algorithm inappropriately fits the discriminant function on the set of data points, eventually yielding a higher misclassification rate. Further, to distinguish adversarial examples from benign samples, we propose two defense mechanisms to counter the attacks. To validate our attacks and solutions, we test our model on three different benchmark datasets. We also test our methods with various classifier algorithms and compare them against the state-of-the-art data poisoning method based on the Jacobian matrix. Promising results show that the generated adversarial samples can evade detection with very high probability. Additionally, using the evasive variants generated by our attack models to harden the developed anti-malware system improves the detection rate by up to 50% with the Generative Adversarial Network (GAN) method.
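The core idea of the evasion attacks described above can be illustrated with a minimal sketch: perturb the static feature vector of a malicious app, under the constraint that features may only be added (removing functionality could break the app), until a trained classifier misclassifies it. This is not the paper's actual five attack scenarios; the feature names, weights, and greedy strategy below are all hypothetical stand-ins for a trained linear detector.

```python
# Illustrative sketch only: a toy linear malware classifier over binary
# static features, evaded by greedily ADDING benign-looking features.
# All feature names and weights are hypothetical, not from the paper.

# Hypothetical static features extracted from an Android app.
FEATURES = ["SEND_SMS", "READ_CONTACTS", "INTERNET",
            "targets_latest_sdk", "signed_release_build", "uses_support_library"]

# Hand-set weights standing in for a trained linear model:
# positive weight => evidence of malware, negative => evidence of benignness.
WEIGHTS = [2.0, 1.5, 0.1, -0.9, -0.8, -1.2]
BIAS = -1.0

def score(x):
    """Decision value; > 0 means the model predicts 'malware'."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS

def evade(x, max_flips=3):
    """Greedily flip absent features (0 -> 1) that most lower the score.

    Only additions are allowed, a common constraint in Android
    adversarial ML: the malicious payload must keep working.
    Returns the perturbed vector and the number of flips used.
    """
    x = list(x)
    flips = 0
    while score(x) > 0 and flips < max_flips:
        # Candidate flips: absent features whose weight reduces the score.
        candidates = [i for i in range(len(x))
                      if x[i] == 0 and WEIGHTS[i] < 0]
        if not candidates:
            break                      # no flip can help; evasion fails
        best = min(candidates, key=lambda i: WEIGHTS[i])
        x[best] = 1
        flips += 1
    return x, flips

malicious = [1, 1, 1, 0, 0, 0]         # detected: score = 2.6 > 0
adversarial, n = evade(malicious)      # after 3 flips: score = -0.3 <= 0
```

A real detector would use thousands of features and a nonlinear model, and the paper's attacks are more sophisticated (including a Jacobian-based baseline and GAN-generated variants), but the same add-only perturbation principle applies; conversely, retraining on such evasive variants is what hardens the detector.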

Updated: 2020-03-03