Abstract
JavaScript is a popular programming language with several implementations competing for market dominance. Although a specification document and a conformance test suite exist to guide engine development, bugs occur and have important practical consequences. Implementing correct engines is challenging because the spec is intentionally incomplete and evolves frequently. This paper investigates the use of test transplantation and differential testing for revealing functional bugs in JavaScript engines. The former technique runs the regression test suite of a given engine on another engine. The latter technique fuzzes existing inputs and then compares the output produced by different engines with a differential oracle. We conducted experiments with engines from five major players (Apple, Facebook, Google, Microsoft, and Mozilla) to assess the effectiveness of test transplantation and differential testing. Our results indicate that both techniques revealed several bugs, many of which were confirmed by developers. We reported 35 bugs with test transplantation (23 confirmed and 19 fixed) and 24 bugs with differential testing (17 confirmed and 10 fixed). Most of these bugs affected two engines: Apple’s JSC and Microsoft’s ChakraCore (24 and 26 bugs, respectively). In summary, our results show that test transplantation and differential testing are easy to apply and very effective in finding bugs in complex software, such as JavaScript engines.
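To illustrate the idea of a differential oracle, the following is a minimal sketch in Python. It assumes a simple majority rule: engines whose output differs from the largest group of agreeing engines are flagged as suspects. This rule, the function names, and the sample data are assumptions made for illustration; they are not necessarily the oracle used in the study.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_output(outputs: Dict[str, str]) -> List[List[str]]:
    """Cluster engines that produced identical output, largest cluster first.

    `outputs` maps an engine label to the text that engine printed for one
    input file. Clusters of equal size are ordered alphabetically, so the
    choice of reference cluster is arbitrary when there is a tie.
    """
    clusters = defaultdict(list)
    for engine, out in outputs.items():
        clusters[out].append(engine)
    return sorted((sorted(c) for c in clusters.values()),
                  key=lambda c: (-len(c), c))


def suspects(outputs: Dict[str, str]) -> List[str]:
    """Return engines outside the largest cluster (empty if all agree)."""
    clusters = group_by_output(outputs)
    return [engine for cluster in clusters[1:] for engine in cluster]


# Made-up outputs for one fuzzed input; the labels are illustrative only.
sample = {"jsc": "1.0", "v8": "1", "spidermonkey": "1",
          "chakra": "1", "hermes": "1"}
print(suspects(sample))  # ['jsc']
```

A flagged engine is only a candidate: a discrepancy may also stem from behavior the specification leaves undefined, so each report still needs manual inspection.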
Data availability
The scripts to run the experiments for this study will be available upon request. The data is publicly available at https://github.com/STAR-RG/entente.
Notes
These files are created with the grammar-based fuzzer jsfunfuzz (Mozilla 2007a). Look for option “compare_jit” from funfuzz.
Microsoft announced in December 2018 that the Edge browser would be based on Chromium and that ChakraCore development would be discontinued (Joe 2018).
The name JavaScript still prevails today, certainly for historical reasons.
“ome” stands for out-of-memory error.
https://github.com/facebook/hermes/issues/<id>, with id 265, 266, 267.
We interpreted this as a violation of an undocumented precondition.
References
American fuzzy lop. (2020). https://web.archive.org/web/20200530065331/http://lcamtuf.coredump.cx/afl/.
Apple. (2018). Severity levels WebKit bugs (JavaScriptCore). https://web.archive.org/web/20200530065331/https://webkit.org/bug-prioritization/.
Argyros, G., Stais, I., Jana, S., Keromytis, A. D., & Kiayias, A. (2016). Sfadiff: Automated evasion attacks and fingerprinting using black-box differential automata learning. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, CCS ’16 (pp. 1690–1701). New York: ACM. https://doi.org/10.1145/2976749.2978383.
Babel. (2020). Babel Project. https://web.archive.org/web/20200530065331/https://github.com/babel/babel.
BlogEngine.Net. (2020). BlogEngine.Net Project. https://web.archive.org/web/20200530065331/https://github.com/rxtur/BlogEngine.NET.
Brian, T. (2020). eshost-cli. https://web.archive.org/web/20200530065331/https://github.com/bterlson/eshost-cli.
Brumley, D., Caballero, J., Liang, Z., Newsome, J., & Song, D. (2007). Towards automatic discovery of deviations in binary implementations with applications to error detection and fingerprint generation. In Proceedings of 16th USENIX Security Symposium on USENIX Security Symposium, SS’07 (pp. 15:1–15:16). Berkeley: USENIX Association. http://dl.acm.org/citation.cfm?id=1362903.1362918.
Camilo Bruni–V8 engineer. (2018). for-in undefined behavior. https://web.archive.org/web/20200530065331/https://v8project.blogspot.com/2017/03/fast-for-in-in-v8.html.
Chen, Y., & Su, Z. (2015). Guided differential testing of certificate validation in ssl/tls implementations. In Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering, ESEC/FSE 2015 (pp. 793–804). New York: ACM. https://doi.org/10.1145/2786805.2786835.
Chen, Y., Su, T., Sun, C., Su, Z., & Zhao, J. (2016). Coverage-directed differential testing of jvm implementations. In Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’16 (pp. 85–99). New York: ACM. https://doi.org/10.1145/2908080.2908095.
Chen, J., Bai, Y., Hao, D., Xiong, Y., Zhang, H., & Xie, B. (2017). Learning to prioritize test programs for compiler testing. In Proceedings of the 39th International Conference on Software Engineering, ICSE ’17 (pp. 700–711). Piscataway: IEEE Press. https://doi.org/10.1109/ICSE.2017.70.
Chen, C., Tian, C., Duan, Z., & Zhao, L. (2018). RFC-directed differential testing of certificate validation in ssl/tls implementations. In Proceedings of the 40th International Conference on Software Engineering, ICSE ’18 (pp. 859–870). New York: ACM. https://doi.org/10.1145/3180155.3180226.
Chromium. (2016). Issue 4247. https://web.archive.org/web/20200530065331/https://bit.ly/2O08uW2.
Chromium. (2020). V8 JavaScript engine. https://web.archive.org/web/20200530065331/https://chromium.googlesource.com/v8/v8.git.
de Ciencias de la Información y de Sistemas, C.I.F.A. (2017). Quickfuzz. https://web.archive.org/web/20200530065331/http://quickfuzz.org/.
Daniel, B., Dig, D., Garcia, K., & Marinov, D. (2007). Automated testing of refactoring engines. In Proceedings of the 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering, ESEC-FSE ’07 (pp. 185–194). New York: ACM. https://doi.org/10.1145/1287624.1287651.
Donaldson, A. F., Evrard, H., Lascu, A., & Thomson, P. (2017). Automated testing of graphics shader compilers. Proc. ACM Program. Lang., 1(OOPSLA), 93:1–93:29. https://doi.org/10.1145/3133917.
Duktape. (2020). Duktape. https://web.archive.org/web/20200530065331/https://github.com/svaarala/duktape.
Ecma International. (1961). Ecma International. https://www.ecma-international.org.
Ecma International. (2019a). Number.toPrecision specification. https://www.ecma-international.org/ecma-262/8.0/index.html#sec-number.prototype.toprecision.
Ecma International. (2019b). Static semantics: early errors. https://www.ecma-international.org/ecma-262/8.0/index.html#sec-break-statement-static-semantics-early-errors.
Ecma International. (2020). Changes to EvalDeclarationInstantiation. https://www.ecma-international.org/ecma-262/8.0/index.html#sec-web-compat-evaldeclarationinstantiation.
ES. (2014). Array new attributed caused bugs. https://web.archive.org/web/20200530065331/https://esdiscuss.org/topic/array-prototype-values-is-not-web-compat-even-with-unscopables.
Eshkevari, L., Antoniol, G., Cordy, J. R., & Di Penta, M. (2014). Identifying and locating interference issues in php applications: The case of wordpress. In Proceedings of the 22nd International Conference on Program Comprehension, ICPC 2014 (pp. 157–167). New York: ACM. https://doi.org/10.1145/2597008.2597153.
Facebook. (2015). React Native. https://web.archive.org/web/20200530065331/https://reactnative.dev/docs/hermes.
Getting started with libfuzzer at chromium. (2019). https://web.archive.org/web/20200530065331/https://chromium.googlesource.com/chromium/src/+/master/testing/libfuzzer/getting_started.md.
Google. (2016). libFuzzer Tutorial. https://github.com/google/fuzzer-test-suite/blob/master/tutorial/libFuzzerTutorial.md.
Google Chrome Lab. (2017). JSVU–JavaScript (engine) Version Updater. https://web.archive.org/web/20200530065331/https://github.com/GoogleChromeLabs/jsvu.
Grieco, G., Ceresa, M., & Buiras, P. (2016). Quickfuzz: an automatic random fuzzer for common file formats. In Proceedings of the 9th International Symposium on Haskell (pp. 13–20). ACM.
Helin, A. (2020). Radamsa fuzzer. https://web.archive.org/web/20200530065331/https://github.com/aoh/radamsa.
Hermes. (2020). Hermes Project. https://web.archive.org/web/20200530065331/https://github.com/facebook/hermes/.
Hodovan, R. (2018). Grammarinator. https://web.archive.org/web/20200530065331/https://github.com/renatahodovan/grammarinator.
Holler, C., Herzig, K., & Zeller, A. (2012). Fuzzing with code fragments. In Proceedings of the 21st USENIX Conference on Security Symposium, Security’12 (pp. 38–38). Berkeley: USENIX Association. http://dl.acm.org/citation.cfm?id=2362793.2362831.
JerryScript. (2015). JerryScript. https://web.archive.org/web/20200530065331/https://github.com/technosaurus/jsish.
JerryScript. (2018). JerryScript Project. https://web.archive.org/web/20200530065331/https://github.com/jerryscript-project/jerryscript.
Joe, B. (2018). Microsoft Edge: Making the web better through more open source collaboration. https://blogs.windows.com/windowsexperience/2018/12/06/microsoft-edge-making-the-web-better-through-more-open-source-collaboration/#IIycUFupVTBcbAY5.97.
Resig, J. (2018). JavaScript in Chrome. https://web.archive.org/web/20200530065331/https://johnresig.com/blog/javascript-in-chrome/.
Kangax. (2015). Ecmascript6 compatibility. https://web.archive.org/web/20200530065331/http://kangax.github.io/compat-table/es6/.
Kapus, T., & Cadar, C. (2017). Automatic testing of symbolic execution engines via program generation and differential testing. In Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering, ASE 2017 (pp. 590–600). Piscataway: IEEE Press. http://dl.acm.org/citation.cfm?id=3155562.3155636.
Kim, S., Faerevaag, M., Jung, M., Jung, S., Oh, D., Lee, J., & Cha, S.K. (2017). Testing intermediate representations for binary analysis. In Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering, ASE 2017 (pp. 353–364). Piscataway: IEEE Press. http://dl.acm.org/citation.cfm?id=3155562.3155609.
Kusner, M., Sun, Y., Kolkin, N., & Weinberger, K. (2015). From word embeddings to document distances. In International conference on machine learning (pp. 957–966).
Lämmel, R., & Schulte, W. (2006). Controllable combinatorial coverage in grammar-based testing. In Uyar, M. Ü., Duale, A. Y., & Fecko, M. A. (Eds.) Testing of communicating systems (pp. 19–38). Berlin: Springer.
Le, V., Afshari, M., & Su, Z. (2014). Compiler validation via equivalence modulo inputs. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’14 (pp. 216–226). New York: ACM. https://doi.org/10.1145/2594291.2594334.
Lehmann, D., & Pradel, M. (2018). Feedback-directed differential testing of interactive debuggers. In Proceedings of the 2018 ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/SIGSOFT FSE 2018 (pp. 610–620). Lake Buena Vista. https://doi.org/10.1145/3236024.3236037.
Libfuzzer. (2020a). https://web.archive.org/web/20200530065331/https://llvm.org/docs/LibFuzzer.html.
libfuzzer tutorial. (2020b). https://web.archive.org/web/20200530065331/https://github.com/google/fuzzer-test-suite/blob/master/tutorial/libFuzzerTutorial.md.
Lidbury, C., Lascu, A., Chong, N., & Donaldson, A. F. (2015). Many-core compiler fuzzing. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’15 (pp. 65–76). New York: ACM. https://doi.org/10.1145/2737924.2737986.
Lithium. (2020). https://web.archive.org/web/20200530065331/https://github.com/MozillaSecurity/lithium.
LLVM. (2020). clang documentation. https://web.archive.org/web/20200530065331/http://clang.llvm.org/docs/.
Manès, V.J.M., Han, H., Han, C., Cha, S.K., Egele, M., Schwartz, E.J., & Woo, M. (2019). The art, science, and engineering of fuzzing: a survey. IEEE Transactions on Software Engineering, 1–1.
Microsoft. (2018a). ChakraCore. https://web.archive.org/web/20200530065331/https://github.com/Microsoft/ChakraCore.
Microsoft. (2018b). Severity levels chakra bugs. https://web.archive.org/web/20200530065331/https://github.com/Microsoft/ChakraCore/wiki/Label-Glossary.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems 26: 27th annual conference on neural information processing systems 2013 (pp. 3111–3119). Lake Tahoe: Curran Associates Inc.
Miller, B.P. (2020). Fuzz testing. https://web.archive.org/web/20200530065331/http://pages.cs.wisc.edu/bart/fuzz/.
Miranda, B., Lima, I., Legunsen, O., & d’Amorim, M. (2020). Prioritizing runtime verification violations. In Proceedings of the 13th IEEE International Conference on Software Testing, Verification and Validation (ICST). To appear. https://doi.org/10.1109/ICST46399.2020.00038.
Mozilla. (2007a). jsfunfuzz. https://web.archive.org/web/20200530065331/https://github.com/MozillaSecurity/funfuzz/tree/master/src/funfuzz/js/jsfunfuzz.
Mozilla. (2007b). jsfunfuzz at Mozilla. https://web.archive.org/web/20200530065331/https://mzl.la/2LsctZL.
Mozilla. (2018a). SpiderMonkey Project. https://web.archive.org/web/20200530065331/https://github.com/mozilla/gecko-dev.
Mozilla. (2018b). Triage process for firefox components in mozilla-central and bugzilla. https://web.archive.org/web/20200530065331/https://github.com/mozilla/bug-handling/blob/master/policy/triage-bugzilla.md.
Nguyen, H. V., Kästner, C., & Nguyen, T. N. (2014). Exploring variability-aware execution for testing plugin-based web applications. In ICSE (pp. 907–918).
Object.valueof documentation. (2020). https://web.archive.org/web/20200530065331/https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object/ValueOf.
Paleari, R., Martignoni, L., Fresi Roglia, G., & Bruschi, D. (2010). N-version disassembly: Differential testing of x86 disassemblers. In Proceedings of the 19th International Symposium on Software Testing and Analysis, ISSTA ’10 (pp. 265–274). New York: ACM. https://doi.org/10.1145/1831708.1831741.
Patra, J., & Pradel, M. (2016). Learning to fuzz: Application-independent fuzz testing with probabilistic, generative models of input data. Technical report. TU Darmstadt, Department of Computer Science.
Patra, J., Dixit, P. N., & Pradel, M. (2018). Conflictjs: Finding and understanding conflicts between javascript libraries. In Proceedings of the 40th International Conference on Software Engineering, ICSE ’18 (pp. 741–751). New York: ACM. https://doi.org/10.1145/3180155.3180184.
Petsios, T., Tang, A., Stolfo, S., Keromytis, A. D., & Jana, S. (2017). Nezha: Efficient domain-independent differential testing. In 2017 IEEE Symposium on security and privacy (SP) (pp. 615–632). https://doi.org/10.1109/SP.2017.27.
Purdom, P. (1972). A sentence generator for testing parsers. BIT Numerical Mathematics, 12(3), 366–375. https://doi.org/10.1007/BF01932308.
RedMonk. (2018). The RedMonk Programming Language Rankings: June 2018. https://web.archive.org/web/20200530065331/https://redmonk.com/sogrady/2018/08/10/language-rankings-6-18/.
Rumelhart, D.E., Hinton, G.E., & Williams, R.J. (1986). Learning internal representations by error propagation. In Parallel distributed processing: Explorations in the microstructure of cognition (Vol. 1, pp. 318–362). Cambridge: MIT Press. http://dl.acm.org/citation.cfm?id=104279.104293.
Simply Technologies. (2018). Why is JavaScript So Popular? https://web.archive.org/web/20200530065331/https://www.simplytechnologies.net/blog/2018/4/11/why-is-javascript-so-popular.
Sivakorn, S., Argyros, G., Pei, K., Keromytis, A. D., & Jana, S. (2017). Hvlearn: Automated black-box analysis of hostname verification in SSL/TLS implementations. In 2017 IEEE Symposium on security and privacy, SP 2017 (pp. 521–538). San Jose. https://doi.org/10.1109/SP.2017.46.
Stackify. (2008). Most popular and influential programming languages of 2018. https://web.archive.org/web/20200530065331/https://stackify.com/popular-programming-languages-2018/.
StackOverflow community. (2018). Elements order in a “for (... in ...)” loop. https://web.archive.org/web/20200530065331/https://stackoverflow.com/questions/280713/elements-order-in-a-for-in-loop.
TC39. (2018a). Official ECMA262 Conformance Test Suite. https://web.archive.org/web/20200530065331/https://github.com/tc39/test262.
TC39. (2018b). TC39 GitHub repo. https://web.archive.org/web/20200530065331/http://tc39.github.io/.
TC39. (2018c). ECMA262 repository. https://web.archive.org/web/20200530065331/https://tc39.github.io/ecma262/.
TC39. (2018d). ECMA262 Spec. https://web.archive.org/web/20200530065331/https://www.ecma-international.org/ecma-262/8.0/.
TC39. (2018e). TypeConversion, ToIndex function. https://web.archive.org/web/20200530065331/https://tc39.github.io/ecma262/#sec-toindex.
TC39. (2018). Array sort. https://web.archive.org/web/20200530065331/https://tc39.github.io/ecma262/#sec-array.prototype.sort.
The Chromium Project. (2018). Chromium bug labels. https://web.archive.org/web/20200530065331/https://www.chromium.org/for-testers/bug-reporting-guidelines/chromium-bug-labels.
The Node.js Foundation. (2009). Node.js. https://web.archive.org/web/20200530065331/https://nodejs.org.
Tiny-js. (2020). Tiny-js. https://web.archive.org/web/20200530065331/https://github.com/gfwilliams/tiny-js.
Unknown. (2019). Mozilla. https://web.archive.org/web/20200530065331/https://github.com/mozilla/gecko-dev/tree/master/js/src/tests/non262.
Unary plus - es6 specifications. (2020). https://web.archive.org/web/20200530065331/https://www.ecma-international.org/ecma-262/8.0/index.html#sec-unary-plus-operator.
WebKit. (2018). WebKit Project. https://web.archive.org/web/20200530065331/https://github.com/WebKit/webkit/tree/master/Source/JavaScriptCore.
Yang, X., Chen, Y., Eide, E., & Regehr, J. (2011a). Finding and understanding bugs in c compilers. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’11 (pp. 283–294). New York: ACM. https://doi.org/10.1145/1993498.1993532.
Yang, X., Chen, Y., Eide, E., & Regehr, J. (2011b). Finding and understanding bugs in c compilers. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’11 (pp. 283–294). New York: ACM. https://doi.org/10.1145/1993498.1993532.
Zhang, T., & Kim, M. (2017). Automated transplantation and differential testing for clones. In Proceedings of the 39th International Conference on Software Engineering, ICSE ’17 (pp. 665–676). Piscataway: IEEE Press. https://doi.org/10.1109/ICSE.2017.67.
Funding
Igor is supported by the FACEPE fellowship IBPG-0123-1.03/17. This research was partially funded by INES 2.0, FACEPE grants PRONEX APQ 0388-1.03/14 and APQ-0399-1.03/17, and CNPq grant 465614/2014-0.
About this article
Cite this article
Lima, I., Silva, J., Miranda, B. et al. Exposing bugs in JavaScript engines through test transplantation and differential testing. Software Qual J 29, 129–158 (2021). https://doi.org/10.1007/s11219-020-09537-8
DOI: https://doi.org/10.1007/s11219-020-09537-8