Abstract
Attributing a piece of malware to its creator typically requires threat intelligence. Attribution from binaries is harder still, because it relies mostly on disassembling the binaries to obtain authorship-related features. We perform a systematic analysis of work on malware authorship attribution. We identify key findings and shortcomings of current approaches, and explore open research challenges. To mitigate the lack of ground-truth datasets in this domain, we publish alongside this survey the largest and most diverse meta-information dataset of 17,513 malware samples labeled with 275 threat actor groups.
Index Terms
- Identifying Authorship in Malicious Binaries: Features, Challenges & Datasets