A Memory Reliability Enhancement Technique for Multi Bit Upsets

Journal of Signal Processing Systems

Abstract

Technological advances allow the production of increasingly complex electronic systems. However, technology and voltage scaling have dramatically increased the susceptibility of new devices not only to Single Bit Upsets (SBU) but also to Multiple Bit Upsets (MBU). Safety-critical applications require fault-tolerant systems that provide high reliability while meeting application requirements. The reliability problem is particularly pronounced in memory, which represents more than 80 % of systems on chip. To tackle this problem, we propose a new memory reliability technique referred to as DPSR: Double Parity Single Redundancy. DPSR is designed to enhance the resilience of computing systems to SBU and MBU. Based on thorough fault injection experiments, DPSR shows promising results: it detects and corrects more than 99.6 % of encountered MBU and has an average time overhead of less than 3 %.
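
As background for why multiple bit upsets are harder to handle than single bit upsets, the sketch below injects bit flips into a word protected by one even-parity bit. It is not the DPSR scheme, whose design is not described in this abstract; the helper names (`parity`, `flip_bits`, `detected`) and the 8-bit example word are illustrative assumptions.

```python
def parity(word: int) -> int:
    """Even-parity bit of a word: 1 if the number of set bits is odd, else 0."""
    return bin(word).count("1") % 2


def flip_bits(word: int, positions) -> int:
    """Inject a fault by flipping each listed (distinct) bit position."""
    for p in positions:
        word ^= 1 << p
    return word


def detected(original: int, positions) -> bool:
    """True if a single stored parity bit reveals the injected upset."""
    stored = parity(original)                 # parity computed at write time
    corrupted = flip_bits(original, positions)
    return parity(corrupted) != stored        # parity re-checked at read time


word = 0b10110010  # hypothetical 8-bit memory word

sbu_detected = detected(word, [3])          # single bit upset
mbu2_detected = detected(word, [3, 4])      # two adjacent bits flipped
mbu3_detected = detected(word, [3, 4, 5])   # three adjacent bits flipped

print(sbu_detected, mbu2_detected, mbu3_detected)  # True False True
```

A single parity bit catches any odd number of flipped bits but misses every even-sized upset, and it cannot locate or correct the error in either case, which is the general gap that MBU-oriented protection schemes aim to close.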

Author information

Corresponding author

Correspondence to Alexandre Chabot.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chabot, A., Alouani, I., Nouacer, R. et al. A Memory Reliability Enhancement Technique for Multi Bit Upsets. J Sign Process Syst 93, 439–459 (2021). https://doi.org/10.1007/s11265-020-01603-5

  • DOI: https://doi.org/10.1007/s11265-020-01603-5
