Detecting Malware with Information Complexity
Entropy (IF 2.1), Pub Date: 2020-05-20, DOI: 10.3390/e22050575
Nadia Alshahwan, Earl T Barr, David Clark, George Danezis, Héctor D Menéndez

Malware concealment is the predominant strategy for malware propagation. Black hats create variants of malware based on polymorphism and metamorphism. Malware variants, by definition, share some information. Although the concealment strategy alters this information, patterns remain in the software. Given a zoo of labelled malware and benign-ware, we ask whether a suspect program is more similar to our malware or to our benign-ware. Normalized Compression Distance (NCD) is a generic metric that measures the shared information content of two strings. This measure opens a new front in the malware arms race, one where the countermeasures promise to be more costly for malware writers, who must now obfuscate patterns as strings qua strings, without reference to execution, in their variants. Our approach classifies disk-resident malware with 97.4% accuracy and a false positive rate of 3%. We demonstrate that its accuracy can be improved by combining NCD with the compressibility rates of executables using decision forests, paving the way for future improvements. We demonstrate that malware reported within a narrow time frame of a few days is more homogeneous than malware reported over two years, but that our method still classifies the latter with 95.2% accuracy and a 5% false positive rate. Because it relies on compression, the time and computation costs of our method are nontrivial. We show that simple approximation techniques can improve its running time by up to 63%. We compare our results with those obtained by applying the 59 anti-malware programs used on the VirusTotal website to our malware. Our approach outperforms each one used alone and matches the performance of all of them used collectively.
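
To make the NCD idea in the abstract concrete, here is a minimal sketch using the standard definition NCD(x, y) = (C(xy) − min(C(x), C(y))) / max(C(x), C(y)), where C(·) is the compressed length. The compressor (`zlib`), the function names, and the average-distance decision rule in `classify` are assumptions made for illustration only; the abstract does not name a compressor, and it describes decision forests rather than this simple rule.

```python
from __future__ import annotations

import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (assumed stand-in compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance: near 0 for similar inputs, near 1 for unrelated ones."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def classify(suspect: bytes, malware: list[bytes], benign: list[bytes]) -> str:
    """Hypothetical rule: label the suspect by whichever zoo is closer on average."""
    to_malware = sum(ncd(suspect, m) for m in malware) / len(malware)
    to_benign = sum(ncd(suspect, b) for b in benign) / len(benign)
    return "malware" if to_malware < to_benign else "benign"
```

A call such as `classify(suspect_bytes, malware_zoo, benign_zoo)` returns `"malware"` or `"benign"`. Note that zlib's DEFLATE format uses a 32 KB sliding window, so for larger executables this stand-in compressor would miss shared content; a compressor with a larger window would likely be needed in practice.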

Updated: 2020-05-20