Machine learning-based dynamic analysis of Android apps with improved code coverage, EURASIP Journal on Information Security


EURASIP Journal on Information Security (IF 2.5), Pub Date: 2019-04-29, DOI: 10.1186/s13635-019-0087-1
Suleiman Y. Yerima, Mohammed K. Alzaylaee, Sakir Sezer

This paper investigates the impact of code coverage on machine learning-based dynamic analysis of Android malware. To maximize code coverage, dynamic analysis on Android typically requires generating events that trigger the user interface and expose as many run-time behavioral features as possible. The most common event generation approach in existing Android dynamic analysis systems is the random-based approach implemented with the Monkey tool that ships with the Android SDK. Monkey is used in popular dynamic analysis platforms such as AASandbox, vetDroid, MobileSandbox, TraceDroid, Andrubis, ANANAS, DynaLog, and HADM. In this paper, we propose and investigate approaches based on stateful event generation and compare their code coverage capabilities with the state-of-the-practice random-based Monkey approach. The two proposed approaches are a state-based method (implemented with DroidBot) and a hybrid approach that combines the state-based and random-based methods. We compare the three input generation methods on real devices in terms of their ability to log dynamic behavior features and their impact on various machine learning algorithms that use those features for malware detection. Experiments on 17,444 applications show that, overall, the proposed methods provide much better code coverage, which in turn leads to more accurate machine learning-based malware detection than the state-of-the-art approach.
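The three input generation methods compared in the abstract can be illustrated with a small sketch that only builds the command lines for each method and runs nothing. It is not the authors' tooling. The function names, the default event budget, and the output directory are hypothetical, and the abstract does not say how the hybrid method orders or splits its two phases, so the half-and-half split with the state-based phase first is purely an assumption.

```python
from typing import List


def monkey_cmd(package: str, events: int, seed: int = 0) -> List[str]:
    """Random-based input generation with the Android SDK's Monkey tool."""
    # -p restricts events to one package, -s fixes the pseudo-random seed,
    # -v raises verbosity; the final argument is the number of events.
    return ["adb", "shell", "monkey", "-p", package,
            "-s", str(seed), "-v", str(events)]


def droidbot_cmd(apk_path: str, out_dir: str, events: int) -> List[str]:
    """State-based input generation with DroidBot."""
    # -a is the APK under test, -o the output directory,
    # -count the number of events to generate.
    return ["droidbot", "-a", apk_path, "-o", out_dir, "-count", str(events)]


def plan(method: str, apk_path: str, package: str,
         events: int = 1000, out_dir: str = "droidbot_out") -> List[List[str]]:
    """Return the commands to run, in order, for one input generation method."""
    if method == "random":
        return [monkey_cmd(package, events)]
    if method == "state":
        return [droidbot_cmd(apk_path, out_dir, events)]
    if method == "hybrid":
        # ASSUMPTION: state-based phase first, then a random phase,
        # with the event budget split roughly in half.
        half = events // 2
        return [droidbot_cmd(apk_path, out_dir, half),
                monkey_cmd(package, events - half)]
    raise ValueError(f"unknown method: {method}")


if __name__ == "__main__":
    for m in ("random", "state", "hybrid"):
        for cmd in plan(m, "sample.apk", "com.example.app", events=200):
            print(m, " ".join(cmd))
```

Returning argument lists rather than executing them keeps the sketch independent of any attached device; a real harness would pass each list to a process runner and collect the behavior logs afterwards.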


Updated: 2020-04-16
