
Potential analysis of a superscalar core employing a reconfigurable array for improving instruction-level parallelism

Design Automation for Embedded Systems

Abstract

As technology scaling slows and energy efficiency becomes an important design constraint, superscalar processor designs are reaching their performance limits due to area and power restrictions. As a result, new microarchitectural paradigms need to be developed. This work proposes a new organization for x86 processors, based on a traditional superscalar design coupled to a reconfigurable array. The system exploits the fact that a few basic blocks account for most of the instructions executed by the processor, and transforms these basic blocks into configurations for the reconfigurable array. Each configuration encodes the semantics and dependencies of all instructions in the block, so that blocks already mapped can execute while bypassing the fetch, decode and dependency-check stages, improving instruction throughput. Our study of the architecture's potential shows that performance gains of up to 2.5× over a traditional superscalar can be achieved.
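The two ideas in the abstract, that a few basic blocks dominate execution and that a mapped block can skip the front end, can be illustrated with a toy model. The sketch below is not the authors' simulator or mapping algorithm. The function names, the trace format (a list of basic-block start addresses), the `block_sizes` table and the LRU replacement policy are all assumptions made for illustration.

```python
from collections import Counter, OrderedDict


def hot_block_coverage(trace, block_sizes, k):
    """Fraction of dynamic instructions that come from the k hottest blocks.

    trace: sequence of basic-block start addresses, in execution order.
    block_sizes: dict mapping start address -> instructions in that block.
    """
    dynamic = Counter()
    for pc in trace:
        dynamic[pc] += block_sizes[pc]
    total = sum(dynamic.values())
    top = sum(count for _, count in dynamic.most_common(k))
    return top / total if total else 0.0


def bypassed_fraction(trace, block_sizes, capacity):
    """Fraction of dynamic instructions served from a configuration cache.

    Assumed toy policy: the first execution of a block goes through the
    normal pipeline and stores a configuration; later executions that hit
    the cache are counted as bypassing fetch, decode and dependency checks.
    The cache holds at most `capacity` configurations, evicted in LRU order.
    """
    cache = OrderedDict()
    bypassed = total = 0
    for pc in trace:
        size = block_sizes[pc]
        total += size
        if pc in cache:
            cache.move_to_end(pc)          # mark as most recently used
            bypassed += size
        else:
            cache[pc] = True               # configuration built on the miss
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return bypassed / total if total else 0.0


# Hypothetical trace: one 10-instruction loop body dominates execution.
sizes = {0x100: 10, 0x200: 2, 0x300: 1}
trace = [0x200] + [0x100] * 9 + [0x300]
print(hot_block_coverage(trace, sizes, k=1))
print(bypassed_fraction(trace, sizes, capacity=1))
```

The model only counts instructions. It says nothing about array latency, configuration size or speedup, which the paper evaluates separately.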



Author information

Corresponding author

Correspondence to Marcelo Brandalero.

About this article

Cite this article

Brandalero, M., Beck, A.C.S. Potential analysis of a superscalar core employing a reconfigurable array for improving instruction-level parallelism. Des Autom Embed Syst 20, 155–169 (2016). https://doi.org/10.1007/s10617-016-9174-4

  • DOI: https://doi.org/10.1007/s10617-016-9174-4
