


Lessons Learnt on Reproducibility in Machine Learning Based Android Malware Detection
Empirical Software Engineering (IF 3.5), Pub Date: 2021-05-24, DOI: 10.1007/s10664-021-09955-7
Nadia Daoudi, Kevin Allix, Tegawendé F. Bissyandé, Jacques Klein

A well-known curse of computer security research is that it often produces systems that, while technically sound, fail operationally. To overcome this curse, the community generally seeks to assess proposed systems under a variety of settings in order to make every potential bias explicit. In this respect, research achievements on machine learning based malware detection are now being considered for thorough evaluation by the community. Such comprehensive evaluation presupposes, first and foremost, that an independent reproduction study can be performed in order to sharpen the evaluations presented by the approaches’ authors. The question “Can published approaches actually be reproduced?” thus becomes paramount, despite the little interest such mundane and practical aspects seem to attract in the malware detection field. In this paper, we attempt a complete reproduction of five Android malware detectors from the literature and discuss to what extent they are “reproducible”. Notably, we provide insights into the implications of the guesswork that may be required to finalise a working implementation. Finally, we discuss how barriers to reproduction could be lifted, and how the malware detection field would benefit from stronger reproducibility standards, as many other fields already have.


Updated: 2021-05-24
