Current journal: IEEE Transactions on Computers
  • Guest Editorial: IEEE TC Special Issue on Domain-Specific Architectures for Emerging Applications
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-07-08
    Lisa Wu Wills; Karthik Swaminathan

    The papers in this special section examine domain-specific architectures for emerging applications. Presents innovative research in domain-specific architectures across a broad range of emerging applications.

    Updated: 2020-07-10
  • Neuromorphic System for Spatial and Temporal Information Processing
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-06-05
    Abdullah M. Zyarah; Kevin Gomez; Dhireesha Kudithipudi

    Neuromorphic systems that learn and predict from streaming inputs hold significant promise in pervasive edge computing and its applications. In this article, a neuromorphic system that processes spatio-temporal information on the edge is proposed. Algorithmically, the system is based on hierarchical temporal memory that inherently offers online learning, resiliency, and fault tolerance. Architecturally

    Updated: 2020-07-10
  • Accelerating Deep Neural Network In-Situ Training With Non-Volatile and Volatile Memory Based Hybrid Precision Synapses
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-06-05
    Yandong Luo; Shimeng Yu

    Compute-in-memory (CIM) with emerging non-volatile memories (eNVMs) is time and energy efficient for deep neural network (DNN) inference. However, challenges still remain for DNN in-situ training with eNVMs due to the asymmetric weight update behavior, high programming latency and energy consumption. To overcome these challenges, a hybrid precision synapse combining eNVMs with capacitor has been proposed

    Updated: 2020-07-10
  • PANTHER: A Programmable Architecture for Neural Network Training Harnessing Energy-Efficient ReRAM
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-05-29
    Aayush Ankit; Izzat El Hajj; Sai Rahul Chalamalasetti; Sapan Agarwal; Matthew Marinella; Martin Foltin; John Paul Strachan; Dejan Milojicic; Wen-Mei Hwu; Kaushik Roy

    The wide adoption of deep neural networks has been accompanied by ever-increasing energy and performance demands due to the expensive nature of training them. Numerous special-purpose architectures have been proposed to accelerate training: both digital and hybrid digital-analog using resistive RAM (ReRAM) crossbars. ReRAM-based accelerators have demonstrated the effectiveness of ReRAM crossbars at

    Updated: 2020-07-10
  • FPDeep: Scalable Acceleration of CNN Training on Deeply-Pipelined FPGA Clusters
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-06-08
    Tianqi Wang; Tong Geng; Ang Li; Xi Jin; Martin Herbordt

    Deep convolutional Neural Networks (CNNs) have revolutionized numerous applications, but the demand for ever more performance remains unabated. Scaling CNN computations to larger clusters is generally done by distributing tasks in batch mode using methods such as distributed synchronous SGD. Among the issues with this approach is that, to make the distributed cluster work with high utilization, the

    Updated: 2020-07-10
  • Accelerating Hyperdimensional Computing on FPGAs by Exploiting Computational Reuse
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-05-06
    Sahand Salamat; Mohsen Imani; Tajana Rosing

    Brain-inspired hyperdimensional (HD) computing emulates cognition by computing with long-size vectors. HD computing consists of two main modules: encoder and associative search. The encoder module maps inputs into high dimensional vectors, called hypervectors. The associative search finds the closest match between the trained model (set of hypervectors) and a query hypervector by calculating a similarity

    Updated: 2020-07-10
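The two HD computing modules named in the abstract above (encoder and associative search) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the dimensionality `D`, the bipolar random hypervectors, the position-binding encoder, the toy four-letter inputs, and cosine similarity as the similarity metric.

```python
import math
import random

D = 2048                # hypervector dimensionality (assumed value)
rng = random.Random(0)  # fixed seed so the sketch is reproducible


def random_hv():
    """Random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(D)]


# Hypothetical item memory: one hypervector per symbol and per position.
SYMBOLS = {s: random_hv() for s in "abcd"}
POSITIONS = [random_hv() for _ in range(4)]


def encode(text):
    """Encoder: bind each symbol to its position (element-wise product),
    then bundle the bound vectors (element-wise sum)."""
    hv = [0] * D
    for pos, sym in zip(POSITIONS, text):
        s = SYMBOLS[sym]
        for k in range(D):
            hv[k] += pos[k] * s[k]
    return hv


def train(samples):
    """Bundle the encodings of each class into one class hypervector."""
    model = {}
    for text, label in samples:
        acc = model.setdefault(label, [0] * D)
        for k, v in enumerate(encode(text)):
            acc[k] += v
    return model


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def associative_search(model, text):
    """Return the class whose hypervector is most similar to the query."""
    query = encode(text)
    return max(model, key=lambda label: cosine(model[label], query))


MODEL = train([("aaab", "A"), ("aaba", "A"), ("ddcd", "B"), ("dcdd", "B")])
print(associative_search(MODEL, "aaaa"), associative_search(MODEL, "dddd"))
```

The search cost grows with both the number of classes and `D`, which is why the similarity computation is a natural target for hardware acceleration.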
  • Accelerating Generative Neural Networks on Unmodified Deep Learning Processors—A Software Approach
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-06-09
    Dawen Xu; Cheng Liu; Ying Wang; Kaijie Tu; Bingsheng He; Lei Zhang

    Generative neural network is a new category of neural networks and it has been widely utilized in many applications such as content generation, unsupervised learning, segmentation, and pose estimation. It typically involves massive computing-intensive deconvolution operations that cannot be fitted to conventional neural network processors directly. However, prior works mainly investigated specialized

    Updated: 2020-07-10
  • PaRTAA: A Real-Time Multiprocessor for Mixed-Criticality Airborne Systems
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-06-16
    Shibarchi Majumder; Jens Frederik Dalsgaard Nielsen; Thomas Bak

    Mixed-criticality systems, where multiple systems with varying criticality-levels share a single hardware platform, require isolation between tasks with different criticality-levels. Isolation can be achieved with software-based solutions or can be enforced by a hardware level partitioning. An asymmetric multiprocessor architecture offers hardware-based isolation at the cost of underutilized hardware

    Updated: 2020-07-10
  • Collaborative Accelerators for Streamlining MapReduce on Scale-up Machines With Incremental Data Aggregation
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-06-22
    Abraham Addisie; Valeria Bertacco

    The MapReduce programming paradigm has been increasingly adopted to implement data-intensive applications processing both small and large scale datasets. As most jobs in data centers have a data footprint in the order of gigabytes, emerging high-end scale-up machines are capable of running most data center processing tasks, thus significantly improving power and server density. However, this approach

    Updated: 2020-07-10
  • Distributed Training of Support Vector Machine on a Multiple-FPGA System
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-05-11
    Jyotikrishna Dass; Yashwardhan Narawane; Rabi N. Mahapatra; Vivek Sarin

    Support Vector Machine (SVM) is a supervised machine learning model for classification tasks. Training SVM on a large number of data samples is challenging due to the high computational cost and memory requirement. Hence, model training is supported on a high-performance server which typically runs a sequential training algorithm on centralized data. However, as we move towards massive workloads, it

    Updated: 2020-05-11
  • LAWS: Locality-AWare Scheme for Automatic Speech Recognition
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-04-28
    Reza Yazdani; Jose-Maria Arnau; Antonio González

    Automatic Speech Recognition (ASR) systems are changing the way people interact with different applications on mobile devices. Fulfilling such user-interactivity requires not only a highly accurate, large-vocabulary recognition system, but also a real-time, energy-efficient solution. However, these ASR systems need high memory bandwidth and power budget, which may be impractical for most small form-factor

    Updated: 2020-04-28
  • Tetris: Using Software/Hardware Co-Design to Enable Handheld, Physics-Limited 3D Plane-Wave Ultrasound Imaging
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-04-23
    Brendan L. West; Jian Zhou; Ronald G. Dreslinski; Oliver D. Kripfgans; J. Brian Fowlkes; Chaitali Chakrabarti; Thomas F. Wenisch

    High volume acquisition rates are imperative for certain medical ultrasound imaging applications, such as 3D elastography and 3D vector flow imaging. As ultrasound imaging transitions from 2D to 3D, the massive data bandwidth and billions of trigonometric operations required to reconstruct each volume leave conventional computer architectures falling short. Despite recent algorithmic improvements

    Updated: 2020-04-23
  • Adaptive Model-Based Scheduling in Software Transactional Memory
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2019-11-19
    Pierangelo Di Sanzo; Alessandro Pellegrini; Marco Sannicandro; Bruno Ciciani; Francesco Quaglia

    Software Transactional Memory (STM) stands as a powerful concurrent programming paradigm, enabling atomicity and isolation while accessing shared data. On the downside, STM may suffer from performance degradation due to excessive conflicts among concurrent transactions, which waste CPU cycles and energy because of transaction aborts. An approach to cope with this issue consists of putting in

    Updated: 2020-04-22
  • Branch Prediction Attack on Blinded Scalar Multiplication
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2019-12-09
    Sarani Bhattacharya; Clémentine Maurice; Shivam Bhasin; Debdeep Mukhopadhyay

    In recent years, performance counters have been used as a side channel source to monitor branch mispredictions, in order to attack cryptographic algorithms. However, the literature considers blinding techniques as effective countermeasures against such attacks. In this article, we present the first template attack on the branch predictor. We target blinded scalar multiplications with a side-channel

    Updated: 2020-04-22
  • A Modeling Framework for Reliability of Erasure Codes in SSD Arrays
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2019-12-27
    Mostafa Kishani; Saba Ahmadian; Hossein Asadi

    The emergence of Solid-State Drives (SSDs) has transformed the data storage industry, where they are rapidly replacing Hard Disk Drives (HDDs) due to their superiority in performance and power. Meanwhile, SSDs have reliability issues due to bit errors, bad blocks, and bad chips. To improve reliability, Redundant Array of Independent Disks (RAID) configurations, originally proposed to increase both performance

    Updated: 2020-04-22
  • CryptSQLite: SQLite With High Data Security
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2019-12-31
    Yongzhi Wang; Yulong Shen; Cuicui Su; Jiawen Ma; Lingtong Liu; Xuewen Dong

    SQLite, one of the most popular lightweight database systems, has been widely used in various systems. However, the compact design of SQLite did not give enough consideration to user data security. Specifically, anyone who has obtained access to the database file will be able to read or tamper with the data. Existing encryption-based solutions can only protect data on storage, while still exposing

    Updated: 2020-04-22
  • Incremental Throughput Allocation of Heterogeneous Storage With No Disruptions in Dynamic Setting
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2019-12-31
    ZhiSheng Huo; Limin Xiao; Minyi Guo; Xiaoling Rong

    Solid-state drives (SSDs) have been added to storage systems to improve their performance, which brings heterogeneity into the storage medium. Throughput is one of the essential resources in heterogeneous storage systems, and how it is allocated plays a crucial role in user performance. There are many studies on the throughput allocation of heterogeneous storage

    Updated: 2020-04-22
  • Fast Encoding Algorithms for Reed–Solomon Codes With Between Four and Seven Parity Symbols
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-01-03
    Leilei Yu; Zhichang Lin; Sian-Jheng Lin; Yunghsiang S. Han; Nenghai Yu

    This article describes a fast Reed–Solomon encoding algorithm with between four and seven parity symbols. First, we show that the syndrome of Reed–Solomon codes can be computed via the Reed–Muller transform. Based on this result, the fast encoding algorithm is then derived. Analysis shows that the proposed approach asymptotically requires 3 XORs per data bit, representing an improvement over previous

    Updated: 2020-04-22
  • All-Digital Control-Theoretic Scheme to Optimize Energy Budget and Allocation in Multi-Cores
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-01-03
    Davide Zoni; Luca Cremona; William Fornaciari

    The Internet-of-Things (IoT) revolution fueled new challenges and opportunities to achieve computational efficiency goals. Embedded devices are required to execute multiple applications for which a suitable distribution of the computing power must be adapted at run-time. Such complex hardware platforms have to sustain the continuous acquisition and processing of data under severe energy budget constraints

    Updated: 2020-04-22
  • Joint Management of CPU and NVDIMM for Breaking Down the Great Memory Wall
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-01-06
    Chun-Feng Wu; Yuan-Hao Chang; Ming-Chang Yang; Tei-Wei Kuo

    To provide larger memory space with lower costs, NVDIMM is a production-ready device. However, directly placing NVDIMM as the main memory would seriously degrade the system performance because of the “great memory wall” caused by the fact that in NVDIMM, the slow memory (e.g., flash memory) is several orders of magnitude slower than the fast memory (e.g., DRAM). In this article, we present a joint

    Updated: 2020-04-22
  • Crossbar-Constrained Technology Mapping for ReRAM Based In-Memory Computing
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-01-07
    Debjyoti Bhattacharjee; Yaswanth Tavva; Arvind Easwaran; Anupam Chattopadhyay

    In-memory computing has gained significant attention due to the potential for dramatic improvement in speed and energy. Redox-based resistive RAMs (ReRAMs), capable of non-volatile storage and logic operations simultaneously have been used for logic-in-memory computing approaches. To this effect, we propose ReRAM based VLIW Architecture for in-Memory comPuting (ReVAMP), supported by a detailed

    Updated: 2020-04-22
  • Automated Performance Modeling of HPC Applications Using Machine Learning
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-01-10
    Jingwei Sun; Guangzhong Sun; Shiyan Zhan; Jiepeng Zhang; Yong Chen

    Automated performance modeling and performance prediction of parallel programs are highly valuable in many use cases, such as in guiding task management and job scheduling, offering insights of application behaviors, and assisting resource requirement estimation. The performance of parallel programs is affected by numerous factors, including but not limited to hardware, applications, algorithms, and

    Updated: 2020-04-22
  • A Neural Network Based Fault Management Scheme for Reliable Image Processing
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-01-10
    Matteo Biasielli; Cristiana Bolchini; Luca Cassano; Erdem Koyuncu; Antonio Miele

    Traditional reliability approaches introduce relevant costs to achieve unconditional correctness during data processing. However, many application environments are inherently tolerant to a certain degree of inexactness or inaccuracy. In this article, we focus on the practical scenario of image processing in space, a domain where faults are a threat, while the applications are inherently tolerant to

    Updated: 2020-04-22
  • HEAWS: An Accelerator for Homomorphic Encryption on the Amazon AWS FPGA
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-04-20
    Furkan Turan; Sujoy Sinha Roy; Ingrid Verbauwhede

    Homomorphic Encryption makes privacy-preserving computing possible in a third-party-owned cloud by enabling computation on the encrypted data of users. However, software implementations of homomorphic encryption are very slow on general-purpose processors. With the emergence of ‘FPGAs as a service’, hardware acceleration of computationally heavy workloads in the cloud is becoming popular. In this article

    Updated: 2020-04-20
  • DS3: A System-Level Domain-Specific System-on-Chip Simulation Framework
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-04-20
    Samet E. Arda; Anish Krishnakumar; A. Alper Goksoy; Nirmal Kumbhare; Joshua Mack; Anderson L. Sartor; Ali Akoglu; Radu Marculescu; Umit Y. Ogras

    Heterogeneous systems-on-chip (SoCs) are highly favorable computing platforms due to their superior performance and energy efficiency potential compared to homogeneous architectures. They can be further tailored to a specific domain of applications by incorporating processing elements (PEs) that accelerate frequently used kernels in these applications. However, this potential is contingent upon optimizing

    Updated: 2020-04-20
  • WooKong: A Ubiquitous Accelerator for Recommendation Algorithms With Custom Instruction Sets on FPGA
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-04-20
    Chao Wang; Lei Gong; Xiang Ma; Xi Li; Xuehai Zhou

    Recommendation algorithms, such as Neighborhood-based Collaborative-Filtering (CF), have been widely applied in various emerging machine learning applications. However, the explosive growth of big data poses significant challenges to CF recommendation algorithms, which are becoming quite time- and energy-consuming. They have to be optimized and accelerated by powerful engines to process

    Updated: 2020-04-20
  • MViD: Sparse Matrix-Vector Multiplication in Mobile DRAM for Accelerating Recurrent Neural Networks
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-04-02
    Byeongho Kim; Jongwook Chung; Eojin Lee; Wonkyung Jung; Sunjung Lee; Jaewan Choi; Jaehyun Park; Minbok Wi; Sukhan Lee; Jung Ho Ahn

    Recurrent Neural Networks (RNNs) spend most of their execution time performing matrix-vector multiplication (MV-mul). Because the matrices in RNNs have poor reusability and the ever-increasing size of the matrices becomes too large to fit in the on-chip storage of mobile/IoT devices, the performance and energy efficiency of MV-mul is determined by those of main-memory DRAM. Therefore, computing MV-mul

    Updated: 2020-04-02
  • π-BA: Bundle Adjustment Hardware Accelerator Based on Distribution of 3D-Point Observations
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-04-02
    Qiang Liu; Shuzhen Qin; Bo Yu; Jie Tang; Shaoshan Liu

    Bundle adjustment (BA) is a fundamental optimization technique used in many crucial applications, including 3D scene reconstruction, robotic localization, camera calibration, autonomous driving, street view map generation, and even space exploration. Essentially, BA is a joint non-linear optimization problem, one which can consume a significant amount of time and power, especially for large

    Updated: 2020-04-02
  • Machine Learning Computers With Fractal von Neumann Architecture
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-03-20
    Yongwei Zhao; Zhe Fan; Zidong Du; Tian Zhi; Ling Li; Qi Guo; Shaoli Liu; Zhiwei Xu; Tianshi Chen; Yunji Chen

    Machine learning techniques are pervasive tools for emerging commercial applications and many dedicated machine learning computers on different scales have been deployed in embedded devices, servers, and data centers. Currently, most machine learning computer architectures still focus on optimizing performance and energy efficiency instead of programming productivity. However, with the fast development

    Updated: 2020-03-20
  • Crane: Mitigating Accelerator Under-utilization Caused by Sparsity Irregularities in CNNs
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-03-18
    Yijin Guan; Guangyu Sun; Zhihang Yuan; Xingchen Li; Ningyi Xu; Shu Chen; Jason Cong; Yuan Xie

    Convolutional neural networks (CNNs) have achieved great success in numerous AI applications. To improve inference efficiency of CNNs, researchers have proposed various pruning techniques to reduce both computation intensity and storage overhead. These pruning techniques result in multi-level sparsity irregularities in CNNs. Together with that in activation matrices, which is induced by employment

    Updated: 2020-03-18
  • State of the Journal
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-03-11
    Ahmed Louri

    Presents the introductory editorial for this issue of the publication.

    Updated: 2020-03-16
  • Approximate Restoring Dividers Using Inexact Cells and Estimation From Partial Remainders
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2019-11-15
    Elizabeth Adams; Suganthi Venkatachalam; Seok-Bum Ko

    Approximate computing can be used in error-resilient applications to reduce power consumption and increase overall circuit performance. This article introduces two approximate dividers with restoring array-based architecture that achieve substantial hardware savings while maintaining high accuracy when compared to existing approximate designs. The first design replaces exact restoring divider cells

    Updated: 2020-03-16
  • Exploiting Asymmetric Errors for LDPC Decoding Optimization on 3D NAND Flash Memory
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2019-12-18
    Qiao Li; Liang Shi; Yufei Cui; Chun Jason Xue

    By stacking layers vertically, the adoption of 3D NAND has significantly increased the capacity for storage systems. The complex structure of 3D NAND introduces more errors than planar flash. To address the reliability issue, low-density parity-check (LDPC) code with a strong error correction capability is now widely applied on 3D NAND flash memory. However, LDPC has long decoding latency when the

    Updated: 2020-03-16
  • Arithmetic Approaches for Rigorous Design of Reliable Fixed-Point LTI Filters
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2019-10-31
    Anastasia Volkova; Thibault Hilaire; Christoph Lauter

    In this paper we target the Fixed-Point (FxP) implementation of Linear Time-Invariant (LTI) filters evaluated with state-space equations. We assume that wordlengths are fixed and that our goal is to determine binary point positions that guarantee the absence of overflows while maximizing accuracy. We provide a model for the worst-case error analysis of FxP filters that gives tight bounds on the output

    Updated: 2020-03-16
  • Graph Similarity and its Applications to Hardware Security
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2019-11-26
    Marc Fyrbiak; Sebastian Wallat; Sascha Reinhard; Nicolai Bissantz; Christof Paar

    Hardware reverse engineering is a powerful and universal tool for both security engineers and adversaries. From a defensive perspective, it allows for detection of intellectual property infringements and hardware Trojans, while it simultaneously can be used for product piracy and malicious circuit manipulations. From a designer's perspective, it is crucial to have an estimate of the costs associated

    Updated: 2020-03-16
  • NTTU: An Area-Efficient Low-Power NTT-Uncoupled Architecture for NTT-Based Multiplication
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2019-12-09
    Neng Zhang; Qiao Qin; Hang Yuan; Chenggao Zhou; Shouyi Yin; ShaoJun Wei; Leibo Liu

    Large integer multiplication, or large degree polynomial multiplication, is the most time-consuming operation in fully homomorphic encryption (FHE). Low area and power consumption are difficult to maintain while achieving high performance for a large size multiplier. To address this issue, an area-efficient low-power architecture for multiplication, named NTTU, is proposed in this article. First, a

    Updated: 2020-03-16
  • High Throughput/Gate AES Hardware Architectures Based on Datapath Compression
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2019-12-04
    Rei Ueno; Sumio Morioka; Noriyuki Miura; Kohei Matsuda; Makoto Nagata; Shivam Bhasin; Yves Mathieu; Tarik Graba; Jean-Luc Danger; Naofumi Homma

    This article proposes highly efficient Advanced Encryption Standard (AES) hardware architectures that support encryption and both encryption and decryption. New operation-reordering and register-retiming techniques presented in this article allow us to unify the inversion circuits in SubBytes and InvSubBytes without any delay overhead. In addition, a new optimization technique for minimizing linear

    Updated: 2020-03-16
  • A Management Scheme of Multi-Level Retention-Time Queues for Improving the Endurance of Flash-Memory Storage Devices
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2019-11-20
    David Kuang-Hui Yu; Jen-Wei Hsieh

    As flash memory technology has been scaled down to 1x nm and more bits can be stored in a cell, the storage density of flash memory has been significantly improved. However, these technical trends also severely hurt the programming speed and endurance of flash memory. The internal data retention time is the duration for which a flash cell can correctly hold data. By relaxing internal data retention

    Updated: 2020-03-16
  • Performance Analysis for Heterogeneous Cloud Servers Using Queueing Theory
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2019-11-28
    Shuang Wang; Xiaoping Li; Rubén Ruiz

    In this article, we consider the problem of selecting appropriate heterogeneous servers in cloud centers for stochastically arriving requests in order to obtain an optimal tradeoff between the expected response time and power consumption. Heterogeneous servers with uncertain setup times are far more common than homogenous ones. The heterogeneity of servers and stochastic requests pose great challenges

    Updated: 2020-03-16
  • Bufferless Network-on-Chips With Bridged Multiple Subnetworks for Deflection Reduction and Energy Savings
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2019-12-18
    Xiyue Xiang; Purushottam Sigdel; Nian-Feng Tzeng

    A bufferless network-on-chip (NoC) can deliver high energy efficiency, but such a NoC is subject to growing deflection when its traffic load rises. This article proposes Deflection Containment (DeC) for the bufferless NoC to address its notorious shortcomings of excessive deflection for performance improvement and energy savings. With multiple subnetworks bridged by an added link between two corresponding

    Updated: 2020-03-16
  • PRS: A Pattern-Directed Replication Scheme for Heterogeneous Object-Based Storage
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2019-11-19
    Jiang Zhou; Yong Chen; Wei Xie; Dong Dai; Shuibing He; Weiping Wang

    Data replication is a key technique to achieve high data availability, reliability, and optimized performance in distributed storage systems. In recent years, with the emergence of new storage devices, heterogeneous object-based storage systems, such as a storage system with a mix of hard disk drives, solid state drives, and other non-volatile memory devices, have become increasingly attractive since they combine

    Updated: 2020-03-16
  • Mangrove: An Inference-Based Dynamic Invariant Mining for GPU Architectures
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2019-11-18
    Nicola Bombieri; Federico Busato; Alessandro Danese; Luca Piccolboni; Graziano Pravadelli

    Likely invariants model properties that hold in operating conditions of a computing system. Dynamic mining of invariants aims at extracting logic formulas representing such properties from the system execution traces, and it is widely used for verification of intellectual property (IP) blocks. Although the extracted formulas represent likely invariants that hold in the considered traces, there is no

    Updated: 2020-03-16
  • CIMAT: A Compute-In-Memory Architecture for On-chip Training Based on Transpose SRAM Arrays
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-03-13
    Hongwu Jiang; Xiaochen Peng; Shanshi Huang; Shimeng Yu

    Rapid development in deep neural networks (DNNs) is enabling many intelligent applications. However, on-chip training of DNNs is challenging due to the extensive computation and memory bandwidth requirements. To solve the bottleneck of the memory wall problem, the compute-in-memory (CIM) approach exploits analog computation along the bit line of the memory array, thus significantly speeding up the vector-matrix

    Updated: 2020-03-13
  • A Power- and Performance-Aware Software Framework for Control System Applications
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-03-05
    Michael Giardino; Eric Klawitter; Bonnie Ferri; Aldo Ferri

    This article describes the development of a software architectural framework for implementing compute-aware control systems, where the term “compute-aware” describes controllers that can modify existing low-level computing platform power managers in response to the needs of the physical system controller. This level of interaction means that high-level decisions can be made as to when to operate the

    Updated: 2020-03-05
  • Addressing Irregularity in Sparse Neural Networks Through a Cooperative Software/Hardware Approach
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-03-05
    Xi Zeng; Tian Zhi; Xuda Zhou; Zidong Du; Qi Guo; Shaoli Liu; Bingrui Wang; Yuanbo Wen; Chao Wang; Xuehai Zhou; Ling Li; Tianshi Chen; Ninghui Sun; Yunji Chen

    Neural networks have become the dominant algorithms rapidly as they achieve state-of-the-art performance in a broad range of applications such as image recognition, speech recognition, and natural language processing. However, neural networks keep moving toward deeper and larger architectures, posing a great challenge to hardware systems due to the huge amount of data and computations. Although sparsity

    Updated: 2020-03-05
  • Algorithmics of Cost-Driven Computation Offloading in the Edge-Cloud Environment
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-02-28
    Mingzhe Du; Yang Wang; Kejiang Ye; Chengzhong Xu

    Computation offloading between the edge and the cloud is an effective way for a deployed service to fully utilize the resources at both sides for its QoS improvement and overall cost reduction. Although the offloading problem has been intensively studied in the context of mobile computing, existing algorithms in most cases cannot be effectively migrated to the edge-cloud environment because their inter-partition

    Updated: 2020-02-28
  • Pipelined Hardware Implementation of COPA, ELmD, and COLM
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-02-28
    Lilian Bossuet; Cuauhtemoc Mancillas-López; Brisbane Ovilla-Martínez

    Authenticated encryption algorithms offer privacy, authentication, and data integrity, as well. In recent years, they have received special attention after the call for submissions of Competition for Authenticated Encryption: Security, Applicability, and Robustness (CAESAR) was published. The CAESAR goal is to generate a portfolio with recommendations of authenticated encryption algorithms for three

    Updated: 2020-02-28
  • PARMA: Parallelization-Aware Run-Time Management for Energy-Efficient Many-Core Systems
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-02-24
    Mohammed A. Noaman Al-hayanni; Ashur Rafiev; Fei Xia; Rishad Shafik; Alexander Romanovsky; Alex Yakovlev

    Performance and energy efficiency considerations have shifted computing paradigms from single-core to many-core architectures. At the same time, traditional speedup models such as Amdahl's Law face challenges in the run-time reasoning for system performance and energy efficiency, because these models typically assume limited variations of the parallel fraction. Moreover, the parallel fraction, which

    Updated: 2020-02-24
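For reference, the classical Amdahl's Law model mentioned in the PARMA abstract above gives the speedup of a fixed workload as S(n) = 1 / ((1 - p) + p / n), where p is the parallel fraction and n the number of cores. The short sketch below only evaluates that textbook formula; the function name and the sample values are illustrative and do not come from the paper.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's Law: speedup of a fixed workload on `cores` identical cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Tabulate the speedup for a few illustrative parallel fractions.
for p in (0.5, 0.9, 0.99):
    print(p, [round(amdahl_speedup(p, n), 2) for n in (1, 4, 16, 64)])
```

Because the serial fraction bounds the speedup at 1 / (1 - p), small changes in p at run time shift the attainable speedup considerably, which is the sensitivity the abstract alludes to.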
  • Enabling Efficient Fast Convolution Algorithms on GPUs via MegaKernels
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-02-21
    Liancheng Jia; Yun Liang; Xiuhong Li; Liqiang Lu; Shengen Yan

    Modern Convolutional Neural Networks (CNNs) require a massive amount of convolution operations. To address the overwhelming computation problem, Winograd and FFT fast algorithms have been used as effective approaches to reduce the number of multiplications. Inputs and filters are transformed into special domains and then multiplied element-wise, which can be transformed into a batched GEMM operation

    Updated: 2020-02-21
  • LUT Input Reordering to Reduce Aging Impact on FPGA LUTs
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-02-18
    Mohammad Ebrahimi; Rezgar Sadeghi; Zainalabedin Navabi

    In this article, we propose a fine-grained FPGA aging mitigation method. Our method focuses on Look Up Tables (LUTs) on which Boolean functions are mapped. Based on our observations, for any configuration, even if it is carefully selected, a number of LUT transistors experience severe stress rates. Therefore, an algorithm is presented to select several alternative configurations for each LUT. Alternative

    Updated: 2020-02-18
  • Separable Binary Convolutional Neural Network on Embedded Systems
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-02-17
    Renping Liu; Xianzhang Chen; Duo Liu; Yingjian Ling; Weilue Wang; Yujuan Tan; Chunhua Xiao; Chaoshu Yang; Runyu Zhang; Liang Liang

    We have witnessed the tremendous success of deep neural networks. However, this success comes with considerable memory and computational costs, which make it difficult to deploy these networks directly on resource-constrained embedded systems. To address this problem, we propose TaijiNet, a separable binary network, to reduce the storage and computational overhead while maintaining a comparable

    Updated: 2020-02-17
  • Schedulability Analysis of Global Scheduling for Multicore Systems With Shared Caches
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-02-17
    Jun Xiao; Sebastian Altmeyer; Andy D. Pimentel

    Shared caches in multicore processors introduce serious difficulties in providing guarantees on the real-time properties of embedded software due to the interaction and the resulting contention in the shared caches. To address this problem, we develop a new schedulability analysis for real-time multicore systems with shared caches, globally scheduled by Earliest Deadline First (EDF) and Fixed Priority

    Updated: 2020-02-17
  • A Neural Network-Based On-Device Learning Anomaly Detector for Edge Devices
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-02-17
    Mineto Tsukada; Masaaki Kondo; Hiroki Matsutani

    Semi-supervised anomaly detection is an approach to identify anomalies by learning the distribution of normal data. Approaches based on backpropagation neural networks (BP-NNs) have recently drawn attention because of their good generalization capability. In a typical situation, BP-NN-based models are iteratively optimized on server machines with input data gathered from the edge devices. However

    Updated: 2020-02-17
  • Enabling Energy-Efficient and Reliable Neural Network via Neuron-Level Voltage Scaling
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-02-14
    Jing Wang; Xin Fu; Xu Wang; Shubo Liu; Lan Gao; Weigong Zhang

    As the platforms running deep neural networks (DNNs) move from large-scale data centers to handheld devices, power emerges as one of the most significant obstacles. Voltage scaling is a promising technique that enables power saving. Nevertheless, it raises reliability and performance concerns that may undesirably deteriorate NN accuracy and performance. Consequently, an energy-efficient and reliable

    Updated: 2020-02-14
  • Object-Level Memory Allocation and Migration in Hybrid Memory Systems
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2020-02-12
    Haikun Liu; Renshan Liu; Xiaofei Liao; Hai Jin; Bingsheng He; Yu Zhang

    Hybrid memory systems composed of emerging non-volatile memory (NVM) and DRAM have drawn increasing attention in recent years. To fully exploit the advantages of both NVM and DRAM, a primary goal is to properly place application data on the hybrid memories. Previous studies have focused on page migration schemes to achieve higher performance and energy efficiency. However, those schemes all rely on

    Updated: 2020-02-12
  • REMOTE: Robust External Malware Detection Framework by Using Electromagnetic Signals
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2019-10-07
    Nader Sehatbakhsh; Alireza Nazari; Monjur Alam; Frank Werner; Yuanda Zhu; Alenka Zajic; Milos Prvulovic

    Cyber-physical systems (CPS) are controlling many critical and sensitive aspects of our physical world while being continuously exposed to potential cyber-attacks. These systems typically have limited performance, memory, and energy reserves, which limits their ability to run existing advanced malware protection, and that, in turn, makes securing them very challenging. To tackle these problems, this

    Updated: 2020-02-11
  • Lightweight Key Encapsulation Using LDPC Codes on FPGAs
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2019-10-21
    Jingwei Hu; Marco Baldi; Paolo Santini; Neng Zeng; San Ling; Huaxiong Wang

    In this paper, we present a lightweight hardware design for a recently proposed quantum-safe key encapsulation mechanism based on QC-LDPC codes called LEDAkem, which has been admitted as a round-2 candidate to the NIST post-quantum standardization project. Existing implementations focus on high speed while few of them take into account area or power efficiency, which are particularly decisive for low-cost

    Updated: 2020-02-11
  • Towards the Integration of Reverse Converters into the RNS Channels
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2019-10-21
    Leonel Sousa; Rogério Paludo; Paulo Martins; Hector Pettenghi

    The conversion from a Residue Number System (RNS) to a weighted representation is a costly inter-modulo operation that introduces delay and area overhead to RNS processors, while also increasing power consumption. This paper proposes a new approach to decompose the reverse conversion into operations that can be processed by the arithmetic units already present in the RNS independent channels. This

    Updated: 2020-02-11
  • ApGAN: Approximate GAN for Robust Low Energy Learning From Imprecise Components
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2019-10-23
    Arman Roohi; Shadi Sheikhfaal; Shaahin Angizi; Deliang Fan; Ronald F DeMara

    A Generative Adversarial Network (GAN) is an adversarial learning approach which empowers conventional deep learning methods by alleviating the demands of massive labeled datasets. However, GAN training can be computationally intensive, limiting its feasibility on resource-limited edge devices. In this paper, we propose an approximate GAN (ApGAN) for accelerating GANs from both algorithm and hardware

    Updated: 2020-02-11
  • Impeccable Circuits
    IEEE Trans. Comput. (IF 2.711) Pub Date : 2019-10-23
    Anita Aghaie; Amir Moradi; Shahram Rasoolzadeh; Aein Rezaei Shahmirzadi; Falk Schellenberg; Tobias Schneider

    By injecting faults, active physical attacks pose serious threats to cryptographic hardware where Concurrent Error Detection (CED) schemes are promising countermeasures. They are usually based on an Error-Detecting Code (EDC) which enables detecting certain injected faults depending on the specification of the underlying code. Here, we propose a methodology to enable correct, practical, and robust

    Updated: 2020-02-11
Contents have been reproduced by permission of the publishers.