Data clustering for efficient approximate computing

Jordan, Michael G.; Brandalero, Marcelo; Malfatti, Guilherme M.; Oliveira, Geraldo F.; Lorenzon, Arthur F.; da Silva, Bruno C.; Carro, Luigi; Rutzig, Mateus B.; Beck, Antonio Carlos S.

doi:10.1007/s10617-019-09228-z

Published: 09 November 2019

Volume 24, pages 3–22, (2020)

Design Automation for Embedded Systems

Michael G. Jordan ORCID: orcid.org/0000-0002-5776-2626³,
Marcelo Brandalero³,
Guilherme M. Malfatti³,
Geraldo F. Oliveira³,
Arthur F. Lorenzon¹,
Bruno C. da Silva³,
Luigi Carro³,
Mateus B. Rutzig² &
Antonio Carlos S. Beck³

Abstract

Given the saturation of single-threaded performance improvements in general-purpose processors, novel architectural techniques are required to meet emerging demands. In this paper, we propose a generic acceleration framework for approximate algorithms that replaces function execution with look-up accesses to dedicated memories. A strategy based on the K-Means clustering algorithm learns, at compile time, mappings from arbitrary function inputs to frequently occurring outputs. At run time, these learned values are fetched from dedicated look-up tables, and the best result is selected by a Nearest-Centroid Classifier implemented in hardware. Compared with the state-of-the-art neural acceleration solution, the proposed approach delivers nearly \(3\times\) better performance, energy reductions ranging from \(18.72\%\) to \(90.99\%\), and \(17\%\) area savings at similar levels of quality, thus opening new opportunities for performance harvesting in approximate accelerators.
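The two phases described in the abstract can be illustrated with a minimal software sketch. It is not the authors' implementation: the hardware classifier, the table organization and the way table outputs are chosen are not detailed on this page. All names below (`build_lookup_table`, `approximate_call`, `target`) are hypothetical, and the sketch assumes that each table entry is simply the exact function evaluated at the cluster centroid.

```python
import math
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(centroids, point):
    """Index of the centroid closest to `point` (the classifier step)."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(centroids[i], point))


def build_lookup_table(func, samples, k, iterations=20, seed=0):
    """Offline phase: cluster sample inputs with K-Means (Lloyd's algorithm)
    and store one output per cluster."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(samples, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in samples:
            clusters[nearest_centroid(centroids, p)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the previous centroid if a cluster is empty
                dim = len(members[0])
                centroids[i] = [sum(m[d] for m in members) / len(members)
                                for d in range(dim)]
    # Assumption: the stored value is the exact function at the centroid.
    table = [func(c) for c in centroids]
    return centroids, table


def approximate_call(centroids, table, point):
    """Run-time phase: replace the function call with a table read."""
    return table[nearest_centroid(centroids, point)]


# Hypothetical target function: Euclidean norm of a 2-D input.
def target(p):
    return math.sqrt(p[0] ** 2 + p[1] ** 2)


# Training inputs: a 20 x 20 grid over the unit square.
samples = [(i / 20, j / 20) for i in range(20) for j in range(20)]
centroids, table = build_lookup_table(target, samples, k=16)

query = (0.3, 0.7)
approx = approximate_call(centroids, table, query)
print(f"exact={target(query):.4f} approx={approx:.4f}")
```

In this sketch the number of clusters `k` is the knob that trades table size against output quality: a larger `k` gives finer-grained entries at the cost of more storage and more distance comparisons per lookup.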

Notes

Presenting the details of this algorithm or ANN training is beyond the scope of this paper. We present here an overview with only enough details to allow a comparison with the approximation approach we developed.

Author information

Authors and Affiliations

Campus Alegrete, Universidade Federal do Pampa (UNIPAMPA), Bagé, Brazil
Arthur F. Lorenzon
Universidade Federal de Santa Maria (UFSM), Santa Maria, Brazil
Mateus B. Rutzig
Institute of Informatics, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, Brazil
Michael G. Jordan, Marcelo Brandalero, Guilherme M. Malfatti, Geraldo F. Oliveira, Bruno C. da Silva, Luigi Carro & Antonio Carlos S. Beck

Corresponding author

Correspondence to Michael G. Jordan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001. The authors would also like to thank CNPq and FAPERGS for partial support.

About this article

Cite this article

Jordan, M.G., Brandalero, M., Malfatti, G.M. et al. Data clustering for efficient approximate computing. Des Autom Embed Syst 24, 3–22 (2020). https://doi.org/10.1007/s10617-019-09228-z

Received: 23 May 2019
Accepted: 01 November 2019
Published: 09 November 2019
Issue Date: March 2020
DOI: https://doi.org/10.1007/s10617-019-09228-z
