Area and energy efficient shift and accumulator unit for object detection in IoT applications

https://doi.org/10.1016/j.aej.2021.04.099
Under a Creative Commons license
open access

Abstract

Convolutional Neural Networks (CNNs) have significantly improved performance in machine learning tasks across applications such as surveillance, intelligent transportation, smart grids and healthcare systems. With the proliferation of physical things that are connected to the internet and equipped with sensory capabilities to form an Internet of Things (IoT) network, it is increasingly important to run CNN inference, a computationally intensive application, on resource-constrained IoT devices. Object detection is a fundamental computer vision problem that provides information for image understanding in several artificial intelligence (AI) applications in smart cities. Among the various object detection algorithms, CNN-based approaches have emerged as a new paradigm for improving overall performance. The multiply-accumulate (MAC) operations, which are used repeatedly in the convolution layers of a CNN, are computationally expensive. Hence, the overall computational workload of CNN applications, and the energy it consumes, is on the rise. To overcome these escalating challenges, approximate computing has played a vital role in reducing the power and area of computation-intensive CNN applications. In this paper, we design an approximate MAC architecture, termed the Shift and Accumulator Unit (SAC), for error-resilient CNN-based object detection targeting embedded platforms. The proposed computing unit deliberately trades accuracy for reduced design complexity and power consumption, which suits resource-constrained IoT devices. The pipelined architecture of the SAC unit needs approximately 1.8× fewer clock cycles than the non-pipelined SAC architecture. The performance evaluation shows that the proposed computing unit has better energy efficiency and resource utilization than the accurate multiplier and state-of-the-art approximate multipliers, without noticeable deterioration in overall performance.
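
To make the trade-off between an exact MAC and a shift-based approximation concrete, the following is a minimal Python sketch. It is not the SAC architecture described in the paper, whose internals the abstract does not give. It assumes integer weights and activations, and it assumes one common approximation scheme: each weight is rounded to the nearest power of two so that the multiplication becomes a bit shift. The function names `mac_dot`, `nearest_power_of_two_shift` and `sac_dot` are hypothetical and chosen only for illustration.

```python
def mac_dot(weights, activations):
    """Exact multiply-accumulate: one multiplication and one addition per term."""
    acc = 0
    for w, x in zip(weights, activations):
        acc += w * x
    return acc


def nearest_power_of_two_shift(w):
    """Return (sign, shift) so that sign * 2**shift approximates the integer w.

    Assumed rounding rule for this sketch: pick the closer of the two
    neighbouring powers of two, with ties rounded down. A zero weight
    returns sign 0.
    """
    if w == 0:
        return 0, 0
    sign = 1 if w > 0 else -1
    mag = abs(w)
    low = mag.bit_length() - 1  # floor(log2(mag))
    if (1 << (low + 1)) - mag < mag - (1 << low):
        return sign, low + 1
    return sign, low


def sac_dot(weights, activations):
    """Approximate dot product using only shifts, additions and subtractions."""
    acc = 0
    for w, x in zip(weights, activations):
        sign, shift = nearest_power_of_two_shift(w)
        if sign > 0:
            acc += x << shift
        elif sign < 0:
            acc -= x << shift
        # sign == 0: the term contributes nothing
    return acc


if __name__ == "__main__":
    weights = [3, -5, 8, 0]
    activations = [10, 2, -1, 7]
    print("exact MAC:      ", mac_dot(weights, activations))
    print("shift-and-add:  ", sac_dot(weights, activations))
```

In this sketch the result is exact whenever every weight is already a power of two, and the error grows with the distance between each weight and its rounded value. That is the kind of accuracy-for-complexity trade that an error-resilient workload such as CNN inference can tolerate.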

Keywords

Convolution operation
Object detection
MAC unit
Approximate computing
Embedded platform

Peer review under responsibility of Faculty of Engineering, Alexandria University.