Towards interpreting ML-based automated malware detection models: a survey,arXiv - CS - Cryptography and Security

arXiv - CS - Cryptography and Security, Pub Date: 2021-01-15, DOI: arxiv-2101.06232
Yuzhou Lin, Xiaolin Chang

Malware is increasingly threatening, and detectors based on traditional signature-based analysis are no longer adequate for current malware detection. Recently, machine learning (ML) models have been developed to predict unknown malware variants and reduce manual effort. However, most existing ML models are black boxes, which makes their prediction results hard to trust; they therefore need further interpretation before they can be deployed effectively in the wild. This paper examines and categorizes existing research on the interpretability of ML-based malware detectors. We first introduce the principles, attributes, evaluation indicators, and taxonomy of general ML interpretability, and then give a detailed, grouped comparison of previous work on general ML model interpretability. Next, we investigate interpretation methods for malware detection by addressing the importance of interpreting malware detectors, the challenges this field faces, solutions for mitigating these challenges, and a new taxonomy for classifying recent state-of-the-art work on malware detection interpretability. The highlight of our survey is a new taxonomy of malware detection interpretation methods, built on the general taxonomy summarized by previous research on ML interpretability. In addition, we are the first to evaluate state-of-the-art approaches by interpretation method attributes and to generate a final score, so as to give insight into quantifying interpretability. By summarizing the results of recent research, we hope our work can provide suggestions for researchers interested in the interpretability of ML-based malware detection models.

Updated: 2021-01-18
