research-article
Open Access

Reliability-aware Garbage Collection for Hybrid HBM-DRAM Memories

Published: 20 January 2021

Abstract

Emerging workloads in cloud and data center infrastructures demand high main memory bandwidth and capacity. Unfortunately, DRAM alone is unable to satisfy contemporary main memory demands. High-bandwidth memory (HBM) uses 3D die-stacking to deliver 4–8× higher bandwidth. HBM has two drawbacks: (1) capacity is low, and (2) soft error rate is high. Hybrid memory combines DRAM and HBM to promise low fault rates, high bandwidth, and high capacity. Prior OS approaches manage HBM by mapping pages to HBM versus DRAM based on hotness (access frequency) and risk (susceptibility to soft errors). Unfortunately, these approaches operate at a coarse-grained page granularity, and frequent page migrations hurt performance.

This article proposes a new class of reliability-aware garbage collectors for hybrid HBM-DRAM systems that place hot and low-risk objects in HBM and the rest in DRAM. Our analysis of nine real-world Java workloads shows that: (1) newly allocated objects in the nursery are frequently written, making them both hot and low-risk, (2) a small fraction of the mature objects are hot and low-risk, and (3) allocation site is a good predictor for hotness and risk. We propose RiskRelief, a novel reliability-aware garbage collector that uses allocation site prediction to place hot and low-risk objects in HBM. Allocation sites are profiled offline, and RiskRelief uses heuristics to classify each allocation site as either DRAM or HBM. The proposed heuristics expose Pareto-optimal trade-offs between soft error rate (SER) and execution time. RiskRelief improves SER by 9× compared to an HBM-Only system while at the same time improving performance by 29% compared to a DRAM-Only system. Compared to a state-of-the-art OS approach for reliability-aware data placement, RiskRelief eliminates all page migration overheads, which substantially improves performance while delivering similar SER. Reliability-aware garbage collection opens up a new opportunity to manage emerging HBM-DRAM memories at fine granularity while requiring no extra hardware support and leaving the programming model unchanged.
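As a rough illustration of allocation-site-based placement, the sketch below classifies offline-profiled allocation sites as HBM or DRAM and then looks up the placement when an object is allocated. It is not the RiskRelief implementation: the abstract does not specify the heuristics, so the profile fields (`accesses_per_byte`, `risk`), the two thresholds, and the site names are all hypothetical stand-ins for "hot" and "low-risk".

```python
from dataclasses import dataclass
from enum import Enum


class Memory(Enum):
    HBM = "HBM"
    DRAM = "DRAM"


@dataclass(frozen=True)
class SiteProfile:
    """Offline profile of one allocation site (all fields are hypothetical)."""
    site_id: str
    accesses_per_byte: float  # assumed proxy for hotness
    risk: float               # assumed proxy for soft-error susceptibility, in [0, 1]


def classify_sites(profiles, hot_threshold, risk_threshold):
    """Map a site to HBM only if it is both hot and low-risk; otherwise DRAM."""
    placement = {}
    for p in profiles:
        hot = p.accesses_per_byte >= hot_threshold
        low_risk = p.risk <= risk_threshold
        placement[p.site_id] = Memory.HBM if (hot and low_risk) else Memory.DRAM
    return placement


def place_object(site_id, placement):
    """Choose the memory for a new object; unprofiled sites default to DRAM."""
    return placement.get(site_id, Memory.DRAM)


# Made-up example profiles, purely for illustration.
profiles = [
    SiteProfile("Parser.java:42", accesses_per_byte=9.0, risk=0.1),  # hot, low-risk
    SiteProfile("Cache.java:17", accesses_per_byte=8.0, risk=0.9),   # hot, high-risk
    SiteProfile("Config.java:5", accesses_per_byte=0.2, risk=0.1),   # cold, low-risk
]
placement = classify_sites(profiles, hot_threshold=1.0, risk_threshold=0.5)
```

In this sketch, loosening `risk_threshold` sends more sites to HBM and tightening it sends more to DRAM, a simple way to picture how a classification heuristic could trade SER against execution time.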

        • Published in

          ACM Transactions on Architecture and Code Optimization, Volume 18, Issue 1
          March 2021
          402 pages
          ISSN: 1544-3566
          EISSN: 1544-3973
          DOI: 10.1145/3446348

          Copyright © 2021 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 20 January 2021
          • Revised: 1 October 2020
          • Accepted: 1 October 2020
          • Received: 1 May 2020

          Qualifiers

          • research-article
          • Research
          • Refereed
