Abstract
Managed language virtual machines (VMs) rely on dynamic, or just-in-time (JIT), compilation to generate optimized native code at run-time and deliver high execution performance. Many VMs and JIT compilers collect profile data at run-time to enable profile-guided optimizations (PGOs) that customize the generated native code to different program inputs. PGOs are generally considered integral to a VM's ability to produce high-quality, performant native code.
In this work, we study and quantify the performance benefits of PGOs, examine how the quantity and the quality (accuracy) of profiling data affect how well it guides PGOs, and assess the impact of individual PGOs on VM performance. The insights obtained from this work can be used to understand the current state of PGOs, to develop strategies that balance the cost of PGOs against their potential more efficiently, and to explore the implications of, and challenges for, the alternative ahead-of-time (AOT) compilation model used by VMs.