research-article

Analyzing Dynamic Code: A Sound Abstract Interpreter for Evil Eval

Published: 21 January 2021

Abstract

Dynamic languages, such as JavaScript, employ string-to-code primitives to turn dynamically generated text into executable code at run-time. These features make standard static analysis extremely hard, if not impossible, because its essential data structures, i.e., the control-flow graph and the system of recursive equations associated with the program under analysis, are themselves dynamically mutating objects. Nevertheless, assembling code at run-time by manipulating strings, for example with eval in JavaScript, has always been strongly discouraged, since it is widely held that “eval is evil,” and static analyzers consequently either skip such statements or ignore their effects. Unfortunately, the lack of formal approaches to analyzing string-to-code statements provides a perfect habitat for malicious code, which is surely evil and does not respect good-practice rules: it can hide its malicious intent in strings that are later converted to code, leaving static analyses blind to the real aim of the code. Hence the need to handle string-to-code statements by approximating what they can execute, so that the analysis can continue with an acceptable degree of precision even in the presence of dynamically generated program statements, should be clear. To reach this goal, we propose a static analysis that collects string values and soundly over-approximates and analyzes the code potentially executed by a string-to-code statement.
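As a rough illustration of what "collecting string values" and "over-approximating the code potentially executed" can mean, the following Python sketch tracks a bounded set of possible strings for each variable and reads off the candidate arguments of an eval call. It is not the analysis proposed in the article: the abstract does not specify the string abstraction, so the bounded-set domain, the `TOP` marker, the `MAX_SET` bound, and the analyzed snippet are all assumptions made for this example.

```python
from itertools import product

TOP = None    # hypothetical marker meaning "any string" (all precision lost)
MAX_SET = 16  # hypothetical bound that keeps every abstract value finite


def join(a, b):
    """Least upper bound of two abstract strings: set union, or TOP."""
    if a is TOP or b is TOP:
        return TOP
    merged = a | b
    return merged if len(merged) <= MAX_SET else TOP


def concat(a, b):
    """Abstract concatenation: pair every possible left and right value."""
    if a is TOP or b is TOP:
        return TOP
    result = frozenset(x + y for x, y in product(a, b))
    return result if len(result) <= MAX_SET else TOP


# Abstract run of a hypothetical JavaScript snippet:
#   s = cond ? "x = 1" : "y = 2";
#   s = s + ";";
#   eval(s);
s = join(frozenset({"x = 1"}), frozenset({"y = 2"}))  # merge both branches
s = concat(s, frozenset({";"}))                        # s + ";"
targets = s                                            # strings eval may run

print(sorted(targets))
```

Every concrete run of the snippet passes one of the strings in `targets` to eval, so analyzing each of them covers all possible behaviors. When a value degrades to `TOP`, a sound analyzer must instead assume that arbitrary code may run, which is the loss of precision a more expressive string domain tries to avoid.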


      • Published in

        ACM Transactions on Privacy and Security, Volume 24, Issue 2
        May 2021
        242 pages
        ISSN:2471-2566
        EISSN:2471-2574
        DOI:10.1145/3446639

        Copyright © 2021 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 21 January 2021
        • Accepted: 1 September 2020
        • Revised: 1 June 2020
        • Received: 1 February 2020

        Qualifiers

        • research-article
        • Research
        • Refereed
