Analyzing Dynamic Code: A Sound Abstract Interpreter for Evil Eval

Authors:
Vincenzo Arceri, University of Verona, Verona, Italy (ORCID 0000-0002-5150-0393)
Isabella Mastroeni, University of Verona, Verona, Italy (ORCID 0000-0003-1213-536X)

ACM Transactions on Privacy and Security, Volume 24, Issue 2, Article No. 10, pp. 1–38. https://doi.org/10.1145/3426470

Published: 21 January 2021

Abstract

Dynamic languages, such as JavaScript, employ string-to-code primitives to turn dynamically generated text into executable code at run-time. These features make standard static analysis extremely hard, if not impossible, because its essential data structures, i.e., the control-flow graph and the system of recursive equations associated with the program to analyze, are themselves dynamically mutating objects. Assembling code at run-time by manipulating strings, for example with eval in JavaScript, has always been strongly discouraged, since it is widely held that "eval is evil," and static analyzers have consequently either skipped such statements or ignored their effects. Unfortunately, the lack of formal approaches to analyzing string-to-code statements creates a perfect habitat for malicious code, which is certainly evil and does not respect good-practice rules: it can hide its intent in strings that are later converted to code, leaving static analyses blind to its real aim. Hence the need, which should be clear, to handle string-to-code statements by approximating what they can execute, so that the analysis can continue with an acceptable degree of precision even in the presence of dynamically generated program statements. To reach this goal, we propose a static analysis that collects string values and soundly over-approximates and analyzes the code potentially executed by a string-to-code statement.
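The idea in the last sentence of the abstract can be illustrated with a minimal Python sketch. It is not the analysis proposed in the article: it assumes a deliberately simple abstract domain (a bounded finite set of strings with a TOP element meaning "any string"), and the names `AbsStr`, `MAX_STRINGS`, and `eval_targets` are hypothetical. It only shows what "collecting string values" and "over-approximating the code an eval may run" can mean on a two-branch program.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

MAX_STRINGS = 8  # hypothetical bound; larger sets are widened to TOP


@dataclass(frozen=True)
class AbsStr:
    """Abstract string: a finite set of concrete strings, or TOP (any string)."""
    values: Optional[FrozenSet[str]]  # None encodes TOP

    @staticmethod
    def of(*strings: str) -> "AbsStr":
        return AbsStr(frozenset(strings))._bounded()

    def _bounded(self) -> "AbsStr":
        # Give up precision, never soundness, when the set grows too large.
        if self.values is not None and len(self.values) > MAX_STRINGS:
            return TOP
        return self

    def join(self, other: "AbsStr") -> "AbsStr":
        """Merge the values reaching a program point from two branches."""
        if self.values is None or other.values is None:
            return TOP
        return AbsStr(self.values | other.values)._bounded()

    def concat(self, other: "AbsStr") -> "AbsStr":
        """Abstract counterpart of string concatenation."""
        if self.values is None or other.values is None:
            return TOP
        pairs = frozenset(a + b for a in self.values for b in other.values)
        return AbsStr(pairs)._bounded()


TOP = AbsStr(None)


def eval_targets(arg: AbsStr) -> Optional[FrozenSet[str]]:
    """Code strings an eval(arg) may execute; None means 'unknown, assume anything'."""
    return arg.values


# Abstractly running:  var s = cond ? "x = 1;" : "y = 2;";  eval(s + "z = 3;");
s = AbsStr.of("x = 1;").join(AbsStr.of("y = 2;"))
targets = eval_targets(s.concat(AbsStr.of("z = 3;")))
```

Here `targets` contains both `"x = 1;z = 3;"` and `"y = 2;z = 3;"`, so an analyzer could parse and analyze each candidate instead of ignoring the eval. A finite-set domain like this one loses all information as soon as it reaches TOP, which is one reason a real analysis would need a richer string abstraction.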

Index Terms

1. Security and privacy
  1. Formal methods and theory of security
    1. Logic and verification
  2. Software and application security
    1. Software security engineering


Published in
ACM Transactions on Privacy and Security Volume 24, Issue 2
May 2021
242 pages
ISSN:2471-2566
EISSN:2471-2574
DOI:10.1145/3446639
Editor:
Ninghui Li
Purdue University, USA
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 21 January 2021
- Accepted: 1 September 2020
- Revised: 1 June 2020
- Received: 1 February 2020
Author Tags
Abstract interpretation
dynamic languages
static analysis
Qualifiers
- research-article
- Research
- Refereed