- Author: ( "naifeng jing" OR "nai-feng jing" )
-
IEEE Geoscience and Remote Sensing Letters (IF 5.343) Pub Date: 2022-07-06, DOI: 10.1109/lgrs.2022.3188850
Guochao Deng, Qin Wang, Jianfei Jiang, Qirun Hong, Naifeng Jing, Weiguang Sheng, Zhigang Mao
In recent years, many ship detection algorithms based on convolutional neural networks (CNNs) have been proposed to improve the performance of ship detection. However, with the increase in model complexity and size, it is challenging to deploy these models to resource-constrained edge platforms. In this letter, a low coupling algorithm that belongs to anchor-free methods is proposed for ship detection ...
-
ACM Transactions on Design Automation of Electronic Systems (IF 1.447) Pub Date: 2022-06-17, DOI: 10.1145/3543852
Zhuoran Song, Naifeng Jing*, Xiaoyao Liang
High-resolution video object recognition (VOR) evolves so fast but is very compute-intensive. This is because VOR leverages compute-intensive deep neural network (DNN) for better accuracy. Although many works have been proposed for speedup, they mostly focus on DNN algorithm and hardware acceleration on the edge side. We observe that most video streams need to be losslessly compressed before going ...
-
ACM Transactions on Design Automation of Electronic Systems (IF 1.447) Pub Date: 2022-05-23, DOI: 10.1145/3510819
Taozhong Li, Naifeng Jing, Jianfei Jiang, Qin Wang, Zhigang Mao, Yiran Chen
Resistive-RAM-based (ReRAM-based) computing shows great potential for accelerating DNN inference through its highly parallel structure. Regrettably, computing accuracy in practice is much lower than expected due to the non-ideal ReRAM device. A conventional computing flow with a fixed wordline activation scheme can effectively protect computing accuracy, but at the cost of significant performance and energy ...
-
arXiv - CS - Machine Learning Pub Date: 2022-03-11, DOI: arxiv-2203.05705
Zhuoran Song, Yihong Xu, Han Li, Naifeng Jing, Xiaoyao Liang, Li Jiang
The training phases of deep neural network (DNN) consume enormous processing time and energy. Compression techniques utilizing the sparsity of DNNs can effectively accelerate the inference phase of DNNs. However, it is hardly used in the training phase because the training phase involves dense matrix-multiplication using General-Purpose Computation on Graphics Processors (GPGPU), which endorse the ...
-
arXiv - CS - Hardware Architecture Pub Date: 2022-03-11, DOI: arxiv-2203.05705
Zhuoran Song, Yihong Xu, Han Li, Naifeng Jing, Xiaoyao Liang, Li Jiang
The training phases of deep neural network (DNN) consume enormous processing time and energy. Compression techniques utilizing the sparsity of DNNs can effectively accelerate the inference phase of DNNs. However, it is hardly used in the training phase because the training phase involves dense matrix-multiplication using General-Purpose Computation on Graphics Processors (GPGPU), which endorse the ...
-
arXiv - CS - Computer Vision and Pattern Recognition Pub Date: 2022-03-09, DOI: arxiv-2203.04570
Zhuoran Song, Yihong Xu, Zhezhi He, Li Jiang, Naifeng Jing, Xiaoyao Liang
Vision transformer (ViT) has achieved competitive accuracy on a variety of computer vision applications, but its computational cost impedes the deployment on resource-limited mobile devices. We explore the sparsity in ViT and observe that informative patches and heads are sufficient for accurate image recognition. In this paper, we propose a cascade pruning framework named CP-ViT by predicting sparsity ...
-
arXiv - CS - Artificial Intelligence Pub Date: 2022-03-09, DOI: arxiv-2203.04570
Zhuoran Song, Yihong Xu, Zhezhi He, Li Jiang, Naifeng Jing, Xiaoyao Liang
Vision transformer (ViT) has achieved competitive accuracy on a variety of computer vision applications, but its computational cost impedes the deployment on resource-limited mobile devices. We explore the sparsity in ViT and observe that informative patches and heads are sufficient for accurate image recognition. In this paper, we propose a cascade pruning framework named CP-ViT by predicting sparsity ...
-
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (IF 2.565) Pub Date: 2021-08-24, DOI: 10.1109/tcad.2021.3107252
Zihan Zhang, Jianfei Jiang, Yongxin Zhu, Qin Wang, Zhigang Mao, Naifeng Jing
Resistive-RAM (RRAM)-based deep neural network (DNN) accelerator has shown a great potential as it is good at the matrix–vector multiplication (MVM) operator. However, it does not benefit non-MVM operators, such as transcendental activation or elementwise operations, which often require customized CMOS circuits in conventional DNN accelerator designs. In this article, we propose a new RRAM-based DNN ...
-
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (IF 2.565) Pub Date: 2021-07-15, DOI: 10.1109/tcad.2021.3097288
Taozhong Li, Naifeng Jing, Zhigang Mao, Yiran Chen
Row-column-NVM (RC-NVM) is a new architecture for emerging nonvolatile memory (NVM), such as ReRAM, PCM, and STT-RAM. It leverages the symmetry of crossbar structure and supports both row and column memory accesses. The new architecture is well fit for the applications with different access patterns which suffer from low efficiency and high-energy consumption in traditional memory architecture. However ...
-
arXiv - CS - Machine Learning Pub Date: 2021-03-02, DOI: arxiv-2103.01705
Fangxin Liu, Wenbo Zhao, Yilong Zhao, Zongwu Wang, Tao Yang, Zhezhi He, Naifeng Jing, Xiaoyao Liang, Li Jiang
Resistive Random-Access-Memory (ReRAM) crossbar is a promising technique for deep neural network (DNN) accelerators, thanks to its in-memory and in-situ analog computing abilities for Vector-Matrix Multiplication-and-Accumulations (VMMs). However, it is challenging for crossbar architecture to exploit the sparsity in the DNN. It inevitably causes complex and costly control to exploit fine-grained sparsity ...
-
arXiv - CS - Hardware Architecture Pub Date: 2021-03-02, DOI: arxiv-2103.01705
Fangxin Liu, Wenbo Zhao, Yilong Zhao, Zongwu Wang, Tao Yang, Zhezhi He, Naifeng Jing, Xiaoyao Liang, Li Jiang
Resistive Random-Access-Memory (ReRAM) crossbar is a promising technique for deep neural network (DNN) accelerators, thanks to its in-memory and in-situ analog computing abilities for Vector-Matrix Multiplication-and-Accumulations (VMMs). However, it is challenging for crossbar architecture to exploit the sparsity in the DNN. It inevitably causes complex and costly control to exploit fine-grained sparsity ...
-
arXiv - CS - Computer Vision and Pattern Recognition Pub Date: 2021-03-02, DOI: arxiv-2103.01705
Fangxin Liu, Wenbo Zhao, Yilong Zhao, Zongwu Wang, Tao Yang, Zhezhi He, Naifeng Jing, Xiaoyao Liang, Li Jiang
Resistive Random-Access-Memory (ReRAM) crossbar is a promising technique for deep neural network (DNN) accelerators, thanks to its in-memory and in-situ analog computing abilities for Vector-Matrix Multiplication-and-Accumulations (VMMs). However, it is challenging for crossbar architecture to exploit the sparsity in the DNN. It inevitably causes complex and costly control to exploit fine-grained sparsity ...
-
Laboratory Investigation (IF 5.502) Pub Date: 2021-02-01, DOI: 10.1038/s41374-021-00537-1
Jing Ke, Yiqing Shen, Yizhou Lu, Junwei Deng, Jason D. Wright, Yan Zhang, Qin Huang, Dadong Wang, Naifeng Jing, Xiaoyao Liang, Fusong Jiang
Cervical cancer is one of the most frequent cancers in women worldwide, yet the early detection and treatment of lesions via regular cervical screening have led to a drastic reduction in the mortality rate. However, the routine examination of screening as a regular health checkup of women is characterized as time-consuming and labor-intensive, while there is lack of characteristic phenotypic profile ...
-
Analog Integrated Circuits and Signal Processing (IF 1.321) Pub Date: 2021-01-03, DOI: 10.1007/s10470-020-01739-1
Jinhao Li, Jianfei Jiang, Qin Wang, Naifeng Jing, Weiguang Sheng, Guanghui He
Time difference amplifier (TDA) is often used in time domain interconnection, computing and measurement. Gain and linearity control are two main design issues. To reduce the nonlinear distortion, a novel self-adaptive pulse shrink circuit is proposed for the SR-latch based time difference amplifier. The multi-stage self-adaptive pulse shrink unit can compensate for the gain error caused by the high-order ...
-
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (IF 2.565) Pub Date: 2021-01-01, DOI: 10.1109/tcad.2020.2989373
Zhuoran Song, Yanan Sun, Lerong Chen, Tianjian Li, Naifeng Jing, Xiaoyao Liang, Li Jiang
Deep neural networks (DNNs) have gained a strong momentum among various applications. The enormous matrix-multiplication exhibited in the above DNNs is computation and memory intensive. Resistive random-access memory crossbar (RRAM-crossbar) consisting of memristor cells can naturally carry out the matrix-vector multiplication. RRAM-crossbar-based accelerator, therefore, has two orders of magnitude ...
-
IEEE Transactions on Very Large Scale Integration (VLSI) Systems (IF 2.775) Pub Date: 2020-11-23, DOI: 10.1109/tvlsi.2020.3036822
Zhuojun Liang, Dongxu Lv, Chao Cui, Hai-Bao Chen, Weifeng He, Weiguang Sheng, Naifeng Jing, Zhigang Mao, Guanghui He
This article presents an $8\times 8$ lattice-reduction-aided (LRA) soft-output multiple-input multiple-output (MIMO) detector for Chinese enhanced ultrahigh throughput (EUHT) wireless local area network (LAN) standard. The preprocessing algorithm combining simplified-sorting Cholesky decomposition and low-complexity decoupled lattice reduction (LDLR) is proposed to reduce computational complexity and ...
-
International Journal of Remote Sensing (IF 3.531) Pub Date: 2020-11-18, DOI: 10.1080/01431161.2020.1811422
Qin Wang, Fengyi Shen, Lifu Cheng, Jianfei Jiang, Guanghui He, Weiguang Sheng, Naifeng Jing, Zhigang Mao
Automatic ship detection in optical remote-sensing (ORS) images has wide applications in civil and military fields. Research on ship detection in ORS images started late compared to synthetic aperture radar (SAR) images, and it is difficult for traditional image-processing algorithms to achieve high accuracy. Therefore, we propose a ship-detection method based on a deep convolutional neural ...
-
IEEE Transactions on Very Large Scale Integration (VLSI) Systems (IF 2.775) Pub Date: 2020-10-01, DOI: 10.1109/tvlsi.2020.3010647
Guanghui He, Sijie Zheng, Naifeng Jing
The SRAM-based field-programmable gate array (FPGA) is extremely susceptible to single event upsets (SEUs) on configuration memory which can lead to soft error and malfunction of the circuit. Facing the ever-growing number of configuration bits in modern FPGAs, traditional scrubbing is getting harder to find errors in time, resulting in mismatching between the SEU sensitivity and scrubbing performance ...
-
Remote Sensing (IF 5.349) Pub Date: 2020-04-08, DOI: 10.3390/rs12071196
Yijia Zhang, Weiguang Sheng, Jianfei Jiang, Naifeng Jing, Qin Wang, Zhigang Mao
Much attention is being paid to using high-performance convolutional neural networks (CNNs) in the area of ship detection in optical remote sensing (ORS) images. However, the problem of false negatives (FNs) caused by side-by-side ships cannot be solved, and the number of false positives (FPs) remains high. This paper uses a DLA-34 network with deformable convolution layers as the backbone. The network ...
-
ACM Transactions on Design Automation of Electronic Systems (IF 1.447) Pub Date: 2019-08-16, DOI: 10.1145/3342239
Li Jiang, Zhuoran Song, Haiyue Song, Chengwen Xu, Qiang Xu, Naifeng Jing, Weifeng Zhang, Xiaoyao Liang
Approximate computing is a promising design paradigm that introduces a new dimension, error, into the original design space. By allowing the inexact computation in error-tolerance applications, approximate computing can gain both performance and energy efficiency. A neural network (NN) is a universal approximator in theory and possesses a high level of parallelism. The emerging deep neural network accelerators ...
-
IEEE Geoscience and Remote Sensing Letters (IF 5.343) Pub Date: 2019-06-01, DOI: 10.1109/lgrs.2018.2888887
Shuo Zhang, Guanghui He, Hai-Bao Chen, Naifeng Jing, Qin Wang
Object detection in aerial images is widely applied in many applications. In recent years, faster region convolutional neural network shows a great improvement on object detecting in natural images. Considering the size and distribution characteristic of object in remote sensing images, the region proposal network (RPN) should be changed before being adopted. In this letter, a scale adaptive proposal ...
-
IEEE Transactions on Circuits and Systems II: Express Briefs (IF 3.691) Pub Date: 2019-05-01, DOI: 10.1109/tcsii.2019.2908243
Yanan Sun, Jiawei Gu, Weifeng He, Qin Wang, Naifeng Jing, Zhigang Mao, Weikang Qian, Li Jiang
A new nonvolatile static random access memory (nvSRAM) design based on the multi-level cell (MLC) characteristics of resistive RAMs (RRAMs) is presented in this brief to reduce the store energy of frequent-off and instant-on applications. The data store circuitry is designed to enable the energy-efficient multi-bit data backup of every two SRAM cells into a single four-level MLC RRAM of the proposed ...
-
ACM Transactions on Design Automation of Electronic Systems (IF 1.447) Pub Date: 2019-03-22, DOI: 10.1145/3306495
Taozhong Li, Qin Wang, Yongxin Zhu, Jianfei Jiang, Guanghui He, Jing Jin, Zhigang Mao, Naifeng Jing
The coming era of big data revives the Processing-in-memory (PIM) architecture to relieve the memory wall problem that embarrasses the modern computing system. However, most existing PIM designs just put computing units closer to memory, rather than a complete integration of them due to their incompatibility in CMOS manufacturing. Fortunately, the emerging Resistive-RAM (ReRAM) offers new hope to this ...
-
IEEE Transactions on Very Large Scale Integration (VLSI) Systems (IF 2.775) Pub Date: 2019-02-01, DOI: 10.1109/tvlsi.2018.2876906
Qin Wang, Zechen Liu, Jianfei Jiang, Naifeng Jing, Weiguang Sheng
Due to the winding level of the thinned wafers and the surface roughness of silicon dies, the quality of through-silicon vias (TSVs) varies during the fabrication and bonding process, which greatly reduces the yield of 3-D-ICs. The basic method to repair faulty TSVs (FTSVs) is to transfer the signals on FTSVs through regular TSVs. Many redundant TSV (RTSV) structures have been proposed to repair uniformly ...
-
arXiv - CS - Machine Learning Pub Date: 2018-10-19, DOI: arxiv-1810.08379
Haiyue Song, Chengwen Xu, Qiang Xu, Zhuoran Song, Naifeng Jing, Xiaoyao Liang, Li Jiang
Neural approximate computing gains enormous energy-efficiency at the cost of tolerable quality-loss. A neural approximator can map the input data to output while a classifier determines whether the input data are safe to approximate with quality guarantee. However, existing works cannot maximize the invocation of the approximator, resulting in limited speedup and energy saving. By exploring the mapping ...
-
arXiv - CS - Machine Learning Pub Date: 2018-07-27, DOI: arxiv-1807.10458
Zhenghao Peng, Xuyang Chen, Chengwen Xu, Naifeng Jing, Xiaoyao Liang, Cewu Lu, Li Jiang
Neural network based approximate computing is a universal architecture promising to gain tremendous energy-efficiency for many error resilient applications. To guarantee the approximation quality, existing works deploy two neural networks (NNs), e.g., an approximator and a predictor. The approximator provides the approximate results, while the predictor predicts whether the input data is safe to approximate ...
-
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (IF 2.565) Pub Date: 2018-07-01, DOI: 10.1109/tcad.2017.2695899
Li Jiang, Tianjian Li, Naifeng Jing, Nam Sung Kim, Minyi Guo, Xiaoyao Liang
Carbon nanotube field effect transistor (CNFET), using the carbon nanotubes (CNTs) as the material for conducting, is a promising alternative of CMOS technology to overcome the “power wall” issue. Recently, a microprocessor solely based on CNFETs was fabricated and demonstrated, which is a big step forward to the industrial practice. However, CNFETs are inherently subject to much larger process variation ...
-
IEEE Transactions on Parallel and Distributed Systems (IF 3.757) Pub Date: 2018-03-01, DOI: 10.1109/tpds.2017.2773516
Jianfei Wang, Qin Wang, Li Jiang, Chao Li, Xiaoyao Liang, Naifeng Jing
GPGPU accelerated computing has revolutionized a broad range of applications. To serve between the ever-growing computing capability and external memory, the on-chip memory is becoming increasingly important to GPGPU performance for general-purpose computing. Inherited from the traditional CPUs, however, the contemporary GPGPU on-chip memory design is suboptimal to the SIMT (single instruction, multiple ...
-
Concurrency and Computation: Practice and Experience (IF 1.831) Pub Date: 2017-03-03, DOI: 10.1002/cpe.4104
Jianfei Wang, Fengfeng Fan, Li Jiang, Xiaoyao Liang, Naifeng Jing
Shanghai Jiao Tong University; Shanghai 200240 China
Contemporary general-purpose graphic processing units (GPGPUs) successfully parallelize an application into thousands of concurrent threads with remarkably improved performance. Such massive threads will compete for the small-sized first-level data (L1D) cache, leading to an exaggerated cache-thrashing problem, which may degrade the overall performance significantly. In this paper, we propose a selective ...
