-
NAVIDRO, a CARES architectural style for configuring drone co-simulation ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-03-17 Loic Salmon, Pierre-Yves Pillain, Goulven Guillou, Jean-Philippe Babau
One primary objective of drone simulation is to evaluate diverse drone configurations and contexts aligned with specific user objectives. The initial challenge for simulator designers involves managing the heterogeneity of drone components, encompassing both software and hardware systems, as well as the drone’s behavior. To facilitate the integration of these diverse models, the Functional Mock-Up
-
REC: REtime Convolutional layers to fully exploit harvested energy for ReRAM-based CNN accelerators ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-03-15 Kunyu Zhou, Keni Qiu
As the Internet of Things (IoT) increasingly combines AI technology, there is a trend to deploy neural network algorithms at the edge and make IoT devices more intelligent than ever. Moreover, energy-harvesting IoT devices have shown the advantages of a green and low-carbon economy, convenient maintenance, and theoretically infinite lifetime. However, the harvested energy is often unstable
-
Implementing Privacy Homomorphism with Random Encoding and Computation Controlled by a Remote Secure Server ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-03-08 Kevin Hutto, Vincent Mooney
Remote IoT devices face significant security risks due to their inherent physical vulnerability. An adversarial actor with sufficient capability can monitor the devices or exfiltrate data to access sensitive information. Remotely deployed devices such as sensors need enhanced resilience against memory leakage if performing privileged tasks. To increase the security and trust of these devices we present
-
Toward Energy Efficient STT-MRAM-based Near Memory Computing Architecture for Embedded Systems ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-03-07 Yueting Li, Xueyan Wang, He Zhang, Biao Pan, Keni Qiu, Wang Kang, Jun Wang, Weisheng Zhao
Convolutional Neural Networks (CNNs) have significantly impacted embedded system applications across various domains. However, this exacerbates the real-time processing and hardware resource-constrained challenges of embedded systems. To tackle these issues, we propose spin-transfer torque magnetic random-access memory (STT-MRAM)-based near memory computing (NMC) design for embedded systems. We optimize
-
Energy Management for Fault-Tolerant (m,k)-Constrained Real-Time Systems that Use Standby-Sparing ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-02-21 Linwei Niu, Danda B. Rawat, Dakai Zhu, Jonathan Musselwhite, Zonghua Gu, Qingxu Deng
Fault tolerance, energy management, and quality of service (QoS) are essential aspects for the design of real-time embedded systems. In this work, we focus on exploring methods that can simultaneously address the above three critical issues under standby-sparing. The standby-sparing mechanism adopts a dual-processor architecture in which each processor plays the role of the backup for the other one
-
Elements of Timed Pattern Matching ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-02-10 Dogan Ulus, Thomas Ferrère, Eugene Asarin, Dejan Nickovic, Oded Maler
The rise of machine learning and cloud technologies has led to a remarkable influx of data within modern cyber-physical systems. However, extracting meaningful information from this data has become a significant challenge due to its volume and complexity. Timed pattern matching has emerged as a powerful specification-based runtime verification and temporal data analysis technique to address this challenge
-
SPIMulator: A Spintronic Processing-In-Memory Simulator for Racetracks ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-02-08 Pavia Bera, Stephen Cahoon, Sanjukta Bhanja, Alex Jones
In-memory processing is becoming a popular method to alleviate the memory bottleneck of the von Neumann computing model. With the goal of improving both latency and energy cost associated with such in-memory processing, emerging non-volatile memory technologies, such as Spintronic magnetic memory, are of particular interest as they can provide a near-SRAM read/write performance and eliminate nearly
-
STDF: Spatio-Temporal Deformable Fusion for Video Quality Enhancement on Embedded Platforms ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-02-08 Jianing Deng, Shunjie Dong, Lvcheng Chen, Jingtong Hu, Cheng Zhuo
With the development of embedded systems and deep learning, it is feasible to combine them to offer various convenient human-centered services based on high-quality (HQ) videos. However, due to the limit of video traffic load and unavoidable noise, the visual quality of an image from an edge camera may degrade significantly, influencing the overall video and service quality. To maintain
-
A Space-Grained Cleaning Method to Reduce Long-Tail Latency of DM-SMR Disks ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-02-05 Chin-Hsien Wu, Cheng-Tze Lee, Yi-Ren Tsai, Cheng-Yen Wu
DM-SMR (device-managed shingled magnetic recording) disks allocate a portion of disk space as the persistent cache (PC) to address the issue of overlapping tracks during data updates. When the PC space becomes insufficient, a space cleaning is triggered to reclaim its invalid space. However, the space cleaning is time-consuming and contributes to the long-tail latency of DM-SMR disks. In the paper
-
Compact Instruction Set Extensions for Dilithium ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-02-02 Lu Li, Qi Tian, Guofeng Qin, Shuaiyu Chen, Weijia Wang
Post-quantum cryptography is considered to provide security against both traditional and quantum computer attacks. Dilithium is a digital signature algorithm that derives its security from the challenge of finding short vectors in lattices. It has been selected for standardization in the NIST post-quantum cryptography project. Hardware-software co-design is a commonly adopted implementation
-
Flexible Updating of Internet of Things Computing Functions through Optimizing Dynamic Partial Reconfiguration ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-02-01 George Kornaros, Svoronos Leivadaros, Filippos Kolimbianakis
With applications becoming increasingly compute- and data-intensive and requiring more processing power, many internet-of-things (IoT) platforms in robots, drones, and autonomous vehicles that implement neural network inference, cryptographic functions, or signal processing (e.g., multimedia, communication) employ field programmable gate arrays (FPGAs). At the same time, dynamic partial reconfiguration
-
Customized FPGA Implementation of Authenticated Lightweight Cipher Fountain for IoT Systems ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-26 Zhengyuan Shi, Cheng Chen, Gangqiang Yang, Hongchao Zhou, Hailiang Xiong, Zhiguo Wan
Authenticated Encryption with Associated-Data (AEAD) can ensure both confidentiality and integrity of information in encrypted communication. Distinctive variants are customized from AEAD to satisfy various requirements. In this paper, we take a 128-bit lightweight AEAD stream cipher Fountain as an example. We provide a general cryptographic solution with three Fountain variants. These three variants
-
Intelligent Caching for Vehicular Dew Computing in Poor Network Connectivity Environments ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-25 Liang Zhao, Hongxuan Li, Enchao Zhang, Ammar Hawbani, Mingwei Lin, Shaohua Wan, Mohsen Guizani
In vehicular networks, some edge servers may not function properly due to the time-varying load condition and the uneven computing resource distribution, resulting in a low quality of caching services. To overcome this challenge, we develop a Vehicular dew computing (VDC) architecture for the first time by combining dew computing with vehicular networks, which can achieve wireless communication between
-
PolyARBerNN: A Neural Network Guided Solver and Optimizer for Bounded Polynomial Inequalities ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-24 Wael Fatnassi, Yasser Shoukry
Constraints solvers play a significant role in the analysis, synthesis, and formal verification of complex cyber-physical systems. In this paper, we study the problem of designing a scalable constraints solver for an important class of constraints named polynomial constraint inequalities (also known as nonlinear real arithmetic theory). In this paper, we introduce a solver named PolyARBerNN that uses
-
Adversarial Transferability in Embedded Sensor Systems: An Activity Recognition Perspective ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-22 Ramesh Kumar Sah, Hassan Ghasemzadeh
Machine learning algorithms are increasingly used for inference and decision-making in embedded systems. Data from sensors are used to train machine learning models for various smart functions of embedded and cyber-physical systems, with applications in healthcare, autonomous vehicles, and national security. However, recent studies have shown that machine learning models can be fooled by adding
-
Stash: Flexible Energy Storage for Intermittent Sensors ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-19 Arwa Alsubhi, Simeon Babatunde, Nicole Tobias, Jacob Sorber
Batteryless sensors promise a sustainable future for sensing, but they face significant challenges when storing and using environmental energy. Incoming energy can fluctuate unpredictably between periods of scarcity and abundance, and device performance depends on both incoming energy and how much a device can store. Existing batteryless devices have used fixed or run-time selectable front-end capacitor
-
Multi-Compression Scale DNN Inference Acceleration based on Cloud-Edge-End Collaboration ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-19 Huamei Qi, Fang Ren, Leilei Wang, Ping Jiang, Shaohua Wan, Xiaoheng Deng
Edge intelligence has emerged as a promising paradigm to accelerate DNN inference by model partitioning, which is particularly useful for intelligent scenarios that demand high accuracy and low latency. However, the dynamic nature of the edge environment and the diversity of end devices pose a significant challenge for DNN model partitioning strategies. Meanwhile, limited resources of the edge server
-
Robust Embedded Autonomous Driving Positioning System Fusing LiDAR and Inertial Sensors ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-19 Zhijian He, Bohuan Xue, Xiangcheng Hu, Zhaoyan Shen, Xiangyue Zeng, Ming Liu
Autonomous driving emphasizes precise multi-sensor fusion positioning on resource-limited embedded systems. LiDAR-centered sensor fusion systems serve as a mainstream navigation approach due to their insensitivity to illumination and viewpoint change. However, these types of systems struggle to handle large-scale sequential LiDAR data using limited resources on board, leading LiDAR-centralized sensor fusion
-
LL-GNN: Low Latency Graph Neural Networks on FPGAs for High Energy Physics ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-15 Zhiqiang Que, Hongxiang Fan, Marcus Loo, He Li, Michaela Blott, Maurizio Pierini, Alexander Tapper, Wayne Luk
This work presents a novel reconfigurable architecture for Low Latency Graph Neural Network (LL-GNN) designs for particle detectors, delivering unprecedented low latency performance. Incorporating FPGA-based GNNs into particle detectors presents a unique challenge since it requires sub-microsecond latency to deploy the networks for online event selection with a data rate of hundreds of terabytes per
-
COBRRA: COntention-aware cache Bypass with Request-Response Arbitration ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-10 Aritra Bagchi, Dinesh Joshi, Preeti Ranjan Panda
In modern multi-processor systems-on-chip (MPSoCs), requests from different processor cores, accelerators, and their responses from the lower-level memory contend for the shared cache bandwidth, making it a critical performance bottleneck. Prior research on shared cache management has considered requests from cores but has ignored crucial contributions from their responses. Prior cache bypass techniques
-
A Hierarchical Classification Method for High-accuracy Instruction Disassembly with Near-field EM Measurements ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-10 Vishnuvardhan V. Iyer, Aditya Thimmaiah, Michael Orshansky, Andreas Gerstlauer, Ali E. Yilmaz
Electromagnetic (EM) fields have been extensively studied as potent side-channel tools for testing the security of hardware implementations. In this work, a low-cost side-channel disassembler that uses fine-grained EM signals to predict a program's execution trace with high accuracy is proposed. Unlike conventional side-channel disassemblers, the proposed disassembler does not require extensive randomized
-
PArtNNer: Platform-Agnostic Adaptive Edge-Cloud DNN Partitioning for Minimizing End-to-End Latency ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-10 Soumendu Kumar Ghosh, Arnab Raha, Vijay Raghunathan, Anand Raghunathan
The last decade has seen the emergence of Deep Neural Networks (DNNs) as the de facto algorithm for various computer vision applications. In intelligent edge devices, sensor data streams acquired by the device are processed by a DNN application running on either the edge device itself or in the cloud. However, “edge-only” and “cloud-only” execution of State-of-the-Art DNNs may not meet an application’s
-
Minimal-Overlap Centrality for Multi-Gateway Designation in Real-Time TSCH Networks ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-10 Miguel Gutiérrez Gaitán, Luís Almeida, Pedro M. D’orey, Pedro M. Santos, Thomas Watteyne
This article presents a novel centrality-driven gateway designation framework for the improved real-time performance of low-power wireless sensor networks (WSNs) at system design time. We target time-synchronized channel hopping (TSCH) WSNs with centralized network management and multiple gateways with the objective of enhancing traffic schedulability by design. To this aim, we propose a novel network
-
Introduction to the Special Issue on Real-Time Computing in the IoT-to-Edge-to-Cloud Continuum ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-10 Daniel Casini, Dakshina Dasari, Matthias Becker, Giorgio Buttazzo
No abstract available.
-
Distributed Task Offloading and Resource Purchasing in NOMA-Enabled Mobile Edge Computing: Hierarchical Game Theoretical Approaches ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-10 Ying Chen, Jie Zhao, Jintao Hu, Shaohua Wan, Jiwei Huang
As the computing resources and the battery capacity of mobile devices are usually limited, it is a feasible solution to offload the computation-intensive tasks generated by mobile devices to edge servers (ESs) in mobile edge computing (MEC). In this article, we study the multi-user multi-server task offloading problem in MEC systems, where all users compete for the limited communication resources and
-
Multi-criteria Optimization of Real-time DAGs on Heterogeneous Platforms under P-EDF ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-10 Tommaso Cucinotta, Alexandre Amory, Gabriele Ara, Francesco Paladino, Marco Di Natale
This article tackles the problem of optimal placement of complex real-time embedded applications on heterogeneous platforms. Applications are composed of directed acyclic graphs of tasks, with each directed-acyclic-graph (DAG) having a minimum inter-arrival period for its activation requests and an end-to-end deadline within which all of the computations need to terminate since each activation. The
-
Hierarchical Resource Orchestration Framework for Real-time Containers ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-10 Václav Struhár, Silviu S. Craciunas, Mohammad Ashjaei, Moris Behnam, Alessandro V. Papadopoulos
Container-based virtualization is a promising deployment model in fog and edge computing applications, because it allows a seamless co-existence of virtualized applications in a heterogeneous environment without introducing significant overhead. Certain application domains (e.g., industrial automation, automotive, or aerospace) mandate that applications exhibit a certain degree of temporal predictability
-
Criticality-aware Monitoring and Orchestration for Containerized Industry 4.0 Environments ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-10 Marco Barletta, Marcello Cinque, Luigi De Simone, Raffaele Della Corte
The evolution of industrial environments makes the reconfigurability and flexibility key requirements to rapidly adapt to changeable market needs. Computing paradigms like Edge/Fog computing are able to provide the required flexibility and scalability while guaranteeing low latencies and response times. Orchestration systems play a key role in these environments, enforcing automatic management of resources
-
Secure and Lightweight Blockchain-based Truthful Data Trading for Real-Time Vehicular Crowdsensing ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-10 Haitao Xu, Saiyu Qi, Yong Qi, Wei Wei, Naixue Xiong
As the number of smart cars grows rapidly, vehicular crowdsensing (VCS) is gradually becoming popular. In a VCS infrastructure, sensing devices and computing units on smart cars, together with cloud servers, form an IoT-edge-cloud continuum to perform real-time sensing tasks. In order to encourage the smart cars to participate in the real-time VCS process, blockchain technology can be combined with
-
Deadline-Aware Task Offloading for Vehicular Edge Computing Networks Using Traffic Light Data ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-10 Pratham Oza, Nathaniel Hudson, Thidapat Chantem, Hana Khamfroush
As vehicles have become increasingly automated, novel vehicular applications have emerged to enhance the safety and security of the vehicles and improve user experience. This brings ever-increasing data and resource requirements for timely computation by the vehicle’s on-board computing systems. To meet these demands, prior work proposes deploying vehicular edge computing (VEC) resources in road-side
-
Energy-Aware Adaptive Mixed-Criticality Scheduling with Semi-Clairvoyance and Graceful Degradation ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-10 Yi-Wen Zhang, Hui Zheng, Zonghua Gu
The classic Mixed-Criticality System (MCS) task model is a non-clairvoyance model in which the change of the system behavior is based on the completion of high-criticality tasks while dropping low-criticality tasks in high-criticality mode. In this paper, we simultaneously consider graceful degradation and semi-clairvoyance in MCS. We first propose the analysis for adaptive mixed-criticality with semi-clairvoyance
-
Virtual Environment Model Generation for CPS Goal Verification using Imitation Learning ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-10 Yong-Jun Shin, Donghwan Shin, Doo-Hwan Bae
Cyber-Physical Systems (CPS) continuously interact with their physical environments through embedded software controllers that observe the environments and determine actions. Field Operational Tests (FOT) are essential to verify to what extent the CPS under analysis can achieve certain CPS goals, such as satisfying the safety and performance requirements, while interacting with the real operational
-
Modeling and Analysis of ETC Control System with Colored Petri Net and Dynamic Slicing ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-10 Wangyang Yu, Jinming Kong, Zhijun Ding, Xiaojun Zhai, Zhiqiang Li, Qi Guo
Nowadays, Electronic Toll Collection (ETC) control systems have been widely adopted to smoothen traffic flow on highways. However, as it is a complex business interaction system, there are inevitably flaws in its control logic process, such as the problem of vehicle fee evasion. We find that there is more than one way for vehicles to evade fees. This shows that it is difficult to ensure the completeness
-
Securing Pacemakers using Runtime Monitors over Physiological Signals ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2024-01-06 Abhinandan Panda, Srinivas Pinisetty, Partha Roop
Wearable and implantable medical devices (IMDs) are increasingly deployed to diagnose, monitor, and provide therapy for critical medical conditions. Such medical devices are safety-critical cyber-physical systems (CPSs). These systems support wireless features introducing potential security vulnerabilities. Although these devices undergo rigorous safety certification processes, runtime security attacks
-
A Design Flow for Scheduling Spiking Deep Convolutional Neural Networks on Heterogeneous Neuromorphic System-on-Chip ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-12-02 Anup Das
Neuromorphic systems-on-chip (NSoCs) integrate CPU cores and neuromorphic hardware accelerators on the same chip. These platforms can execute spiking deep convolutional neural networks (SDCNNs) with a low energy footprint. Modern NSoCs are heterogeneous in terms of their computing, communication, and storage resources. This makes scheduling SDCNN operations a combinatorial problem of exploring an exponentially-large
-
IoV-Fog-Assisted Framework for Accident Detection and Classification ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-11-24 Navin Kumar, Sandeep Kumar Sood, Munish Saini
The evolution of vehicular research into an effectuating area like the Internet of Vehicles (IoV) was verified by technical developments in hardware. The integration of the Internet of Things (IoT) and Vehicular Ad-hoc Networks (VANET) has significantly impacted addressing various problems, from dangerous situations to finding practical solutions. During a catastrophic collision, the vehicle experiences
-
Accelerating Attention Mechanism on FPGAs based on Efficient Reconfigurable Systolic Array ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-11-09 Wenhua Ye, Xu Zhou, Joey Zhou, Cen Chen, Kenli Li
Transformer model architectures have recently received great interest in natural language, machine translation, and computer vision, where attention mechanisms are their building blocks. However, the attention mechanism is expensive because of its intensive matrix computations and complicated data flow. The existing hardware architecture has some disadvantages for the computing structure of attention
-
High-performance Reconfigurable DNN Accelerator on a Bandwidth-limited Embedded System ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-11-09 Xianghong Hu, Hongmin Huang, Xueming Li, Xin Zheng, Qinyuan Ren, Jingyu He, Xiaoming Xiong
Deep convolutional neural networks (DNNs) have been widely used in many applications, particularly in machine vision. It is challenging to accelerate DNNs on embedded systems because real-world machine vision applications should reserve a lot of external memory bandwidth for other tasks, such as video capture and display, while leaving little bandwidth for accelerating DNNs. In order to solve this
-
Scheduling Dynamic Software Updates in Mobile Robots ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-11-09 Ahmed El Yaacoub, Luca Mottola, Thiemo Voigt, Philipp Rümmer
We present NeRTA (Next Release Time Analysis), a technique to enable dynamic software updates for low-level control software of mobile robots. Dynamic software updates enable software correction and evolution during system operation. In mobile robotics, they are crucial to resolve software defects without interrupting system operation or to enable on-the-fly extensions. Low-level control software for
-
SG-Float: Achieving Memory Access and Computing Power Reduction Using Self-Gating Float in CNNs ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-11-09 Jun-Shen Wu, Tsen-Wei Hsu, Ren-Shuo Liu
Convolutional neural networks (CNNs) are essential for advancing the field of artificial intelligence. However, since these networks are highly demanding in terms of memory and computation, implementing CNNs can be challenging. To make CNNs more accessible to energy-constrained devices, researchers are exploring new algorithmic techniques and hardware designs that can reduce memory and computation
-
FD-CNN: A Frequency-Domain FPGA Acceleration Scheme for CNN-Based Image-Processing Applications ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-11-09 Xiaoyang Wang, Zhe Zhou, Zhihang Yuan, Jingchen Zhu, Yulong Cao, Yao Zhang, Kangrui Sun, Guangyu Sun
In emerging edge-computing scenarios, FPGAs have been widely adopted to accelerate convolutional neural network (CNN)–based image-processing applications, such as image classification, object detection, and image segmentation. A standard image-processing pipeline first decodes the collected compressed images from Internet of Things (IoT) devices to RGB data, then feeds them into CNN engines
-
An Intermediate-Centric Dataflow for Transposed Convolution Acceleration on FPGA ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-11-09 Zhengzheng Ma, Tuo Dai, Xuechao Wei, Guojie Luo
Transposed convolution has been prevailing in convolutional neural networks (CNNs), playing an important role in multiple scenarios such as image segmentation and back-propagation process of training CNNs. This mainly benefits from the ability to up-sample the input feature maps by interpolating new information from the input feature pixels. However, the backward-stencil computation constrains its
-
On the RTL Implementation of FINN Matrix Vector Unit ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-11-09 Syed Asad Alam, David Gregg, Giulio Gambardella, Thomas Preusser, Michaela Blott
Field-programmable gate array (FPGA)–based accelerators are becoming increasingly popular for deep neural network (DNN) inference due to their ability to scale performance with increasing degrees of specialization with dataflow architectures or custom data type precision. In order to reduce the barrier for software engineers and data scientists to adopt FPGAs, C++- and OpenCL-based design entries with
-
ACDSE: A Design Space Exploration Method for CNN Accelerator based on Adaptive Compression Mechanism ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-11-09 Kaijie Feng, Xiaoya Fan, Jianfeng An, Chuxi Li, Kaiyue Di, Jiangfei Li
Customized accelerators for Convolutional Neural Network (CNN) can achieve better energy efficiency than general computing platforms. However, the design of a high-performance accelerator should take into account a variety of parameters and physical constraints. The increasing parameters and tighter constraints gradually complicate the design space, which poses new challenges to the capacity and efficiency
-
TH-iSSD: Design and Implementation of a Generic and Reconfigurable Near-Data Processing Framework ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-11-09 Jiwu Shu, Kedong Fang, Youmin Chen, Shuo Wang
We present the design and implementation of TH-iSSD, a near-data processing framework to address the data movement problem. TH-iSSD does not pose any restriction to the hardware selection and is highly reconfigurable—its core components, such as the on-device compute unit (e.g., FPGA, embedded CPUs) and data collectors (e.g., camera, sensors), can be easily replaced to adapt to different use cases
-
RegKey: A Register-based Implementation of ECC Signature Algorithms Against One-shot Memory Disclosure ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-11-09 Yu Fu, Jingqiang Lin, Dengguo Feng, Wei Wang, Mingyu Wang, Wenjie Wang
To ensure the security of cryptographic algorithm implementations, several cryptographic key protection schemes have been proposed to prevent various memory disclosure attacks. Among them, the register-based solutions do not rely on special hardware features and offer better applicability. However, due to the size limitation of register resources, the performance of register-based solutions is much
-
SensiX++: Bringing MLOps and Multi-tenant Model Serving to Sensory Edge Devices ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-11-09 Chulhong Min, Akhil Mathur, Utku Günay Acer, Alessandro Montanari, Fahim Kawsar
We present SensiX++, a multi-tenant runtime for adaptive model execution with integrated MLOps on edge devices, e.g., a camera, a microphone, or IoT sensors. SensiX++ operates on two fundamental principles: highly modular componentisation to externalise data operations with clear abstractions and document-centric manifestation for system-wide orchestration. First, a data coordinator manages the lifecycle
-
Online Distributed Schedule Randomization to Mitigate Timing Attacks in Industrial Control Systems ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-11-09 Ankita Samaddar, Arvind Easwaran
Industrial control systems (ICSs) consist of a large number of control applications that are associated with periodic real-time flows with hard deadlines. To facilitate large-scale integration, remote control, and co-ordination, wireless sensor and actuator networks form the main communication framework in most ICSs. Among the existing wireless sensor and actuator network protocols, WirelessHART is
-
Energy-Efficient Communications for Improving Timely Progress of Intermittent-Powered BLE Devices ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-11-09 Chen-Tui Hung, Kai Xuan Lee, Yi-Zheng Liu, Ya-Shu Chen, Zhong-Han Chan
Battery-less devices offer potential solutions for maintaining sustainable Internet of Things (IoT) networks. However, limited energy harvesting capacity can lead to power failures, limiting the system’s quality of service (QoS). To improve timely task progress, we present ETIME, a scheduling framework that enables energy-efficient communication for intermittent-powered IoT devices. To maximize energy
-
A Comprehensive Model for Efficient Design Space Exploration of Imprecise Computational Blocks ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-11-09 Mohammad Haji Seyed Javadi, Mohsen Faryabi, Hamid Reza Mahdiani
After almost a decade of research, the development of more efficient imprecise computational blocks is still a major concern in the imprecise computing domain. There are many instances of introduced imprecise components of different types, whose main difference is that they offer different precision-cost-performance trade-offs. In this paper, a novel comprehensive model for the imprecise components
-
Dynamic Thermal Management of 3D Memory through Rotating Low Power States and Partial Channel Closure ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-11-09 Lokesh Siddhu, Aritra Bagchi, Rajesh Kedia, Isaar Ahmad, Shailja Pandey, Preeti Ranjan Panda
Modern high-performance and high-bandwidth three-dimensional (3D) memories are characterized by frequent heating. Prior art suggests turning off hot channels and migrating data to the background DDR memory, incurring significant performance and energy overheads. We propose three Dynamic Thermal Management (DTM) approaches for 3D memories, reducing these overheads. The first approach, Rotating-channel
-
Enabling Binary Neural Network Training on the Edge ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-11-09 Erwei Wang, James J. Davis, Daniele Moro, Piotr Zielinski, Jia Jie Lim, Claudionor Coelho, Satrajit Chatterjee, Peter Y. K. Cheung, George A. Constantinides
The ever-growing computational demands of increasingly complex machine learning models frequently necessitate the use of powerful cloud-based infrastructure for their training. Binary neural networks are known to be promising candidates for on-device inference due to their extreme compute and memory savings over higher-precision alternatives. However, their existing training methods require the concurrent
-
Design and Analysis of High Performance Heterogeneous Block-based Approximate Adders ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-11-09 Ebrahim Farahmand, Ali Mahani, Muhammad Abdullah Hanif, Muhammad Shafique
Approximate computing is an emerging paradigm to improve the power and performance efficiency of error-resilient applications. As adders are one of the key components in almost all processing systems, a significant amount of research has been carried out toward designing approximate adders that can offer better efficiency than conventional designs, albeit at the cost of some accuracy loss. In this
-
Reachability Analysis of Sigmoidal Neural Networks ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-10-17 Sung Woo Choi, Michael Ivashchenko, Luan V. Nguyen, Hoang-Dung Tran
This paper extends the star set reachability approach to verify the robustness of feed-forward neural networks (FNNs) with sigmoidal activation functions such as Sigmoid and TanH. The main drawbacks of the star set approach in Sigmoid/TanH FNN verification are scalability, feasibility, and optimality issues in some cases due to the linear programming solver usage. We overcome this challenge by proposing
-
Deterministic Coordination Across Multiple Timelines ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-10-16 Marten Lohstroh, Soroush Bateni, Christian Menard, Alexander Schulz-Rosengarten, Jeronimo Castrillon, Edward A. Lee
We discuss a novel approach for constructing deterministic reactive systems that revolves around a temporal model that incorporates a multiplicity of timelines. This model is central to Lingua Franca (LF), a polyglot coordination language and compiler toolchain we are developing for the definition and composition of concurrent components called reactors, which are objects that react to and emit discrete
-
Synchronised Shared Memory and Model Checking ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-10-02 Joaquín Aguado, Alejandra Duenas
In this paper, a formal generic framework for defining and reasoning about deterministic concurrency in synchronous systems is implemented in the Spin model checker. Concretely, the paper implements the clock-synchronised shared memory (csm) theory, which extends synchronous programming with more and higher level csm data types. These csm data types are equipped with a synchronisation policy prescribing
-
Evolution Function Based Reach-Avoid Verification for Time-varying Systems with Disturbances ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-09-28 Ruiqi Hu, Kairong Liu, Zhikun She
In this work, we investigate the reach-avoid problem of a class of time-varying analytic systems with disturbances described by uncertain parameters. Firstly, by proposing the concepts of maximal and minimal reachable sets, we connect the avoidability and reachability with maximal and minimal reachable sets respectively. Then, for a given disturbance parameter, we introduce the evolution function for
-
An Asynchronous Compaction Acceleration Scheme for Near-Data Processing-enabled LSM-Tree-based KV Stores ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-09-29 Hui Sun, Bendong Lou, Chao Zhao, Deyan Kong, Chaowei Zhang, Jianzhong Huang, Yinliang Yue, Xiao Qin
LSM-tree-based key-value stores (KV stores) convert random-write requests to sequential-write ones to achieve high I/O performance. Meanwhile, compaction operations in KV stores update SSTables by reorganizing low-level data components into high-level ones, thereby guaranteeing an orderly data layout in each component. Repeated writes caused by compaction (a.k.a. write amplification) impact I/O
-
AMULET: a Mutation Language Enabling Automatic Enrichment of SysML Models ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-09-16 Bastien Sultan, Léon Frénot, Ludovic Apvrille, Philippe Jaillon, Sophie Coudert
SysML models are widely used for designing and analyzing complex systems. Model-based design methods often require successive modifications of the models, whether for incrementally refining the design (e.g. in agile development methods) or for testing different design options. Such modifications, or mutations, are also used in mutation-based testing approaches. However, the definition of mutation operators
-
A Robust and Energy Efficient Hyperdimensional Computing System for Voltage-scaled Circuits ACM Trans. Embed. Comput. Syst. (IF 2.0) Pub Date : 2023-09-11 Dehua Liang, Hiromitsu Awano, Noriyuki Miura, Jun Shiomi
Voltage scaling is one of the most promising approaches for improving energy efficiency, but it also makes stable operation harder to guarantee in modern VLSI. To tackle such issues, we extend DependableHD to a second version, DependableHDv2, a HyperDimensional Computing (HDC) system that can tolerate bit-level memory failure in the low voltage region with high robustness.