Current journal: arXiv - CS - Computer Vision and Pattern Recognition
  • Learn to Dance with AIST++: Music Conditioned 3D Dance Generation
    arXiv.cs.CV Pub Date : 2021-01-21
    Ruilong Li; Shan Yang; David A. Ross; Angjoo Kanazawa

    In this paper, we present a transformer-based learning framework for 3D dance generation conditioned on music. We carefully design our network architecture and empirically study the keys to obtaining qualitatively pleasing results. The critical components include a deep cross-modal transformer, which learns the correlation between the music and dance motion well; and the full-attention with future-N

    Updated: 2021-01-22
  • A two-stage data association approach for 3D Multi-object Tracking
    arXiv.cs.CV Pub Date : 2021-01-21
    Minh-Quan Dao; Vincent Frémont

    Multi-object tracking (MOT) is an integral part of any autonomous driving pipeline because it produces the trajectories taken by other moving objects in the scene and helps predict their future motion. Thanks to the recent advances in 3D object detection enabled by deep learning, track-by-detection has become the dominant paradigm in 3D MOT. In this paradigm, a MOT system is essentially made

    Updated: 2021-01-22
  • DAF:re: A Challenging, Crowd-Sourced, Large-Scale, Long-Tailed Dataset For Anime Character Recognition
    arXiv.cs.CV Pub Date : 2021-01-21
    Edwin Arkel Rios; Wen-Huang Cheng; Bo-Cheng Lai

    In this work we tackle the challenging problem of anime character recognition. Anime refers to animation produced within Japan and to work derived from or inspired by it. For this purpose we present DAF:re (DanbooruAnimeFaces:revamped), a large-scale, crowd-sourced, long-tailed dataset with almost 500K images spread across more than 3000 classes. Additionally, we conduct experiments on DAF:re and similar

    Updated: 2021-01-22
  • Regularization via deep generative models: an analysis point of view
    arXiv.cs.CV Pub Date : 2021-01-21
    Thomas Oberlin; Mathieu Verm

    This paper proposes a new way of regularizing an inverse problem in imaging (e.g., deblurring or inpainting) by means of a deep generative neural network. Compared to end-to-end models, such approaches seem particularly interesting since the same network can be used for many different problems and experimental conditions, as soon as the generative model is suited to the data. Previous works proposed

    Updated: 2021-01-22
  • Image-to-Image Translation: Methods and Applications
    arXiv.cs.CV Pub Date : 2021-01-21
    Yingxue Pang; Jianxin Lin; Tao Qin; Zhibo Chen

    Image-to-image translation (I2I) aims to transfer images from a source domain to a target domain while preserving the content representations. I2I has drawn increasing attention and made tremendous progress in recent years because of its wide range of applications in many computer vision and image processing problems, such as image synthesis, segmentation, style transfer, restoration, and pose estimation

    Updated: 2021-01-22
  • MPASNET: Motion Prior-Aware Siamese Network for Unsupervised Deep Crowd Segmentation in Video Scenes
    arXiv.cs.CV Pub Date : 2021-01-21
    Jinhai Yang; Hua Yang

    Crowd segmentation is a fundamental task serving as the basis of crowded scene analysis, and it is highly desirable to obtain refined pixel-level segmentation maps. However, it remains a challenging problem, as existing approaches either require dense pixel-level annotations to train deep learning models or merely produce rough segmentation maps from optical or particle flows with physical models.

    Updated: 2021-01-22
  • Hierarchical Graph-RNNs for Action Detection of Multiple Activities
    arXiv.cs.CV Pub Date : 2021-01-21
    Sovan Biswas; Yaser Souri; Juergen Gall

    In this paper, we propose an approach that spatially localizes the activities in a video frame where each person can perform multiple activities at the same time. Our approach takes the temporal scene context as well as the relations of the actions of detected persons into account. While the temporal context is modeled by a temporal recurrent neural network (RNN), the relations of the actions are modeled

    Updated: 2021-01-22
  • Discovering Multi-Label Actor-Action Association in a Weakly Supervised Setting
    arXiv.cs.CV Pub Date : 2021-01-21
    Sovan Biswas; Juergen Gall

    Since collecting and annotating data for spatio-temporal action detection is very expensive, there is a need to learn approaches with less supervision. Weakly supervised approaches do not require any bounding box annotations and can be trained only from labels that indicate whether an action occurs in a video clip. Current approaches, however, cannot handle the case when there are multiple persons

    Updated: 2021-01-22
  • Activity Graph Transformer for Temporal Action Localization
    arXiv.cs.CV Pub Date : 2021-01-21
    Megha Nawhal; Greg Mori

    We introduce Activity Graph Transformer, an end-to-end learnable model for temporal action localization, that receives a video as input and directly predicts a set of action instances that appear in the video. Detecting and localizing action instances in untrimmed videos requires reasoning over multiple action instances in a video. The dominant paradigms in the literature process videos temporally

    Updated: 2021-01-22
  • An Effective Data Augmentation for Person Re-identification
    arXiv.cs.CV Pub Date : 2021-01-21
    Yunpeng Gong; Zhiyong Zeng

    In order to make full use of structural information of grayscale images and reduce adverse impact of illumination variation for person re-identification (ReID), an effective data augmentation method is proposed in this paper, which includes Random Grayscale Transformation, Random Grayscale Patch Replacement and their combination. It is discovered that structural information has a significant effect

    Updated: 2021-01-22
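
The abstract above names a Random Grayscale Transformation but the listing truncates the details. The following is only a minimal sketch of the general idea, assuming the transformation converts a whole RGB image to grayscale with some probability `p`. The function names, the default `p`, and the nested-list image layout are hypothetical, and the luma weights are the standard ITU-R BT.601 coefficients rather than anything stated by the authors.

```python
import random


def to_grayscale(pixel):
    """Map an (r, g, b) pixel to a gray pixel with three equal channels.

    Uses the ITU-R BT.601 luma weights (an assumption, not from the paper).
    """
    r, g, b = pixel
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return (y, y, y)


def random_grayscale(image, p=0.5, rng=random):
    """With probability p, return a grayscale copy of the image.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    Otherwise the original image is returned unchanged.
    """
    if rng.random() < p:
        return [[to_grayscale(px) for px in row] for row in image]
    return image


if __name__ == "__main__":
    img = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]]
    print(random_grayscale(img, p=1.0))
```

Keeping three equal channels, instead of a single channel, lets a network that expects RGB input consume the augmented image without any architectural change.
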
  • Progressive Co-Attention Network for Fine-grained Visual Classification
    arXiv.cs.CV Pub Date : 2021-01-21
    Tian Zhang; Dongliang Chang; Zhanyu Ma; Jun Guo

    Fine-grained visual classification aims to recognize images belonging to multiple sub-categories within the same category. It is a challenging task due to the inherently subtle variations among highly-confused categories. Most existing methods only take an individual image as input, which may limit the ability of models to recognize contrastive clues from different images. In this paper, we propose an effective

    Updated: 2021-01-22
  • Fast and Robust Certifiable Estimation of the Relative Pose Between Two Calibrated Cameras
    arXiv.cs.CV Pub Date : 2021-01-21
    Mercedes Garcia-Salguero; Javier Gonzalez-Jimenez

    The Relative Pose problem (RPp) for cameras aims to estimate the relative orientation and translation (pose) given a set of pair-wise feature correspondences between two central and calibrated cameras. The RPp is stated as an optimization problem where the squared, normalized epipolar error is minimized over the set of normalized essential matrices. In this work, we contribute an efficient and complete

    Updated: 2021-01-22
  • Pre-training without Natural Images
    arXiv.cs.CV Pub Date : 2021-01-21
    Hirokatsu Kataoka; Kazushige Okayasu; Asato Matsumoto; Eisuke Yamagata; Ryosuke Yamada; Nakamasa Inoue; Akio Nakamura; Yutaka Satoh

    Is it possible to use convolutional neural networks pre-trained without any natural images to assist natural image understanding? The paper proposes a novel concept, Formula-driven Supervised Learning. We automatically generate image patterns and their category labels by assigning fractals, which are based on a natural law existing in the background knowledge of the real world. Theoretically, the use

    Updated: 2021-01-22
  • CM-NAS: Rethinking Cross-Modality Neural Architectures for Visible-Infrared Person Re-Identification
    arXiv.cs.CV Pub Date : 2021-01-21
    Chaoyou Fu; Yibo Hu; Xiang Wu; Hailin Shi; Tao Mei; Ran He

    Visible-Infrared person re-identification (VI-ReID) aims at matching cross-modality pedestrian images, breaking through the limitation of single-modality person ReID in dark environments. In order to mitigate the impact of large modality discrepancy, existing works manually design various two-stream architectures to separately learn modality-specific and modality-sharable representations. Such a manual

    Updated: 2021-01-22
  • Anti-UAV: A Large Multi-Modal Benchmark for UAV Tracking
    arXiv.cs.CV Pub Date : 2021-01-21
    Nan Jiang; Kuiran Wang; Xiaoke Peng; Xuehui Yu; Qiang Wang; Junliang Xing; Guorong Li; Guodong Guo; Jian Zhao; Zhenjun Han

    Unmanned Aerial Vehicles (UAVs) offer many applications in both commerce and recreation. With this, monitoring the operation status of UAVs is crucially important. In this work, we consider the task of tracking UAVs, providing rich information such as location and trajectory. To facilitate research on this topic, we propose a dataset, Anti-UAV, with more than 300 video pairs containing over 580k

    Updated: 2021-01-22
  • FWB-Net: Front White Balance Network for Color Shift Correction in Single Image Dehazing via Atmospheric Light Estimation
    arXiv.cs.CV Pub Date : 2021-01-21
    Cong Wang; Yan Huang; Yuexian Zou; Yong Xu

    In recent years, single image dehazing deep models based on the Atmospheric Scattering Model (ASM) have achieved remarkable results. But the dehazing outputs of those models suffer from color shift. Analyzing the ASM shows that the atmospheric light factor (ALF) is set as a scalar, which indicates that the ALF is constant for the whole image. However, for images taken in the real world, the illumination is not uniformly

    Updated: 2021-01-22
  • COLLIDE-PRED: Prediction of On-Road Collision From Surveillance Videos
    arXiv.cs.CV Pub Date : 2021-01-21
    Deesha Chavan; Dev Saad; Debarati B. Chakraborty

    Predicting on-road abnormalities such as road accidents or traffic violations is a challenging task in traffic surveillance. If such predictions can be made in advance, much of the damage can be controlled. In our work, we formulate a solution for automated collision prediction in traffic surveillance videos with computer vision and deep networks. It involves object detection, tracking, trajectory

    Updated: 2021-01-22
  • Segmenting Transparent Object in the Wild with Transformer
    arXiv.cs.CV Pub Date : 2021-01-21
    Enze Xie; Wenjia Wang; Wenhai Wang; Peize Sun; Hang Xu; Ding Liang; Ping Luo

    This work presents a new fine-grained transparent object segmentation dataset, termed Trans10K-v2, extending Trans10K-v1, the first large-scale transparent object segmentation dataset. Unlike Trans10K-v1 that only has two limited categories, our new dataset has several appealing benefits. (1) It has 11 fine-grained categories of transparent objects, commonly occurring in the human domestic environment

    Updated: 2021-01-22
  • Fire Threat Detection From Videos with Q-Rough Sets
    arXiv.cs.CV Pub Date : 2021-01-21
    Debarati B. Chakraborty; Vinay Detani; Shah Parshv Jigneshkumar

    This article defines new methods for unsupervised fire region segmentation and fire threat detection from a video stream. Controlled fire serves a number of purposes for human civilization, but it can simultaneously be a threat once its spread becomes uncontrolled. There exist many methods for fire region segmentation and fire/non-fire classification. But the approaches to determine the threat associated

    Updated: 2021-01-22
  • All-Day Object Tracking for Unmanned Aerial Vehicle
    arXiv.cs.CV Pub Date : 2021-01-21
    Bowen Li; Changhon Fu; Fangqiang Ding; Junjie Ye; Fuling Lin

    Visual object tracking, a major area of interest in the image processing field, has facilitated numerous real-world applications. Among them, equipping unmanned aerial vehicles (UAVs) with real-time robust visual trackers for all-day aerial maneuvers is currently attracting increasing attention and has remarkably broadened the scope of applications of object tracking. However, prior tracking

    Updated: 2021-01-22
  • Video Summarization: Study of various techniques
    arXiv.cs.CV Pub Date : 2021-01-21
    Ravi Raj; Varad Bhatnagar; Aman Kumar Singh; Sneha Mane; Nilima Walde

    A comparative study of various techniques that can be used for summarization of videos, i.e., video-to-video conversion, is presented along with the respective architectures, results, strengths and shortcomings. In all approaches, a lengthy video is converted into a shorter video which aims to capture all important events that are present in the original video. The definition of 'important event' may vary

    Updated: 2021-01-22
  • Generative Zero-shot Network Quantization
    arXiv.cs.CV Pub Date : 2021-01-21
    Xiangyu He; Qinghao Hu; Peisong Wang; Jian Cheng

    Convolutional neural networks are able to learn realistic image priors from numerous training samples in low-level image generation and restoration. We show that, for high-level image recognition tasks, we can further reconstruct "realistic" images of each category by leveraging intrinsic Batch Normalization (BN) statistics without any training data. Inspired by the popular VAE/GAN methods, we regard

    Updated: 2021-01-22
  • Rethinking Semantic Segmentation Evaluation for Explainability and Model Selection
    arXiv.cs.CV Pub Date : 2021-01-21
    Yuxiang Zhang; Sachin Mehta; Anat Caspi

    Semantic segmentation aims to robustly predict coherent class labels for entire regions of an image. It is a scene understanding task that powers real-world applications (e.g., autonomous navigation). One important application, the use of imagery for automated semantic understanding of pedestrian environments, provides remote mapping of accessibility features in street environments. This application

    Updated: 2021-01-22
  • Finger Vein Recognition by Generating Code
    arXiv.cs.CV Pub Date : 2021-01-21
    Zhongxia Zhang; Mingwen Wang

    Finger vein recognition has drawn increasing attention as one of the most popular and promising biometrics due to its high distinguishing ability, security and non-invasive procedure. The main idea of traditional schemes is to directly extract features from finger vein images or patterns and then compare features to find the best match. However, the features extracted from images contain much redundant

    Updated: 2021-01-22
  • MoG-QSM: Model-based Generative Adversarial Deep Learning Network for Quantitative Susceptibility Mapping
    arXiv.cs.CV Pub Date : 2021-01-21
    Ruimin Feng; Jiayi Zhao; He Wang; Baofeng Yang; Jie Feng; Yuting Shi; Ming Zhang; Chunlei Liu; Yuyao Zhang; Jie Zhuang; Hongjiang Wei

    Quantitative susceptibility mapping (QSM) estimates the underlying tissue magnetic susceptibility from the MRI gradient-echo phase signal and has demonstrated great potential in quantifying tissue susceptibility in various brain diseases. However, the intrinsic ill-posed inverse problem relating the tissue phase to the underlying susceptibility distribution affects the accuracy for quantifying tissue

    Updated: 2021-01-22
  • TDA-Net: Fusion of Persistent Homology and Deep Learning Features for COVID-19 Detection in Chest X-Ray Images
    arXiv.cs.CV Pub Date : 2021-01-21
    Mustafa Hajij; Ghada Zamzmi; Fawwaz Batayneh

    Topological Data Analysis (TDA) has emerged recently as a robust tool to extract and compare the structure of datasets. TDA identifies features in data such as connected components and holes and assigns a quantitative measure to these features. Several studies reported that topological features extracted by TDA tools provide unique information about the data, discover new insights, and determine which

    Updated: 2021-01-22
  • Nonparametric clustering for image segmentation
    arXiv.cs.CV Pub Date : 2021-01-20
    Giovanna Menardi

    Image segmentation aims at identifying regions of interest within an image, by grouping pixels according to their properties. This task resembles the statistical one of clustering, yet many standard clustering methods fail to meet the basic requirements of image segmentation: segment shapes are often biased toward predetermined shapes and their number is rarely determined automatically. Nonparametric

    Updated: 2021-01-22
  • Aesthetics, Personalization and Recommendation: A survey on Deep Learning in Fashion
    arXiv.cs.CV Pub Date : 2021-01-20
    Wei Gong; Laila Khalid

    Machine learning is completely changing the trends in the fashion industry. Every brand, big or small, is using machine learning techniques in order to improve its revenue, increase customers and stay ahead of the trend. People are into fashion and they want to know what looks best and how they can improve their style and elevate their personality. Using deep learning technology and infusing it

    Updated: 2021-01-22
  • Text Line Segmentation for Challenging Handwritten Document Images Using Fully Convolutional Network
    arXiv.cs.CV Pub Date : 2021-01-20
    Berat Barakat; Ahmad Droby; Majeed Kassis; Jihad El-Sana

    This paper presents a method for text line segmentation of challenging historical manuscript images. These manuscript images contain narrow interline spaces with touching components, interpenetrating vowel signs and inconsistent font types and sizes. In addition, they contain curved, multi-skewed and multi-directed side note lines within a complex page layout. Therefore, bounding polygon labeling would

    Updated: 2021-01-22
  • Expectation-Maximization Regularized Deep Learning for Weakly Supervised Tumor Segmentation for Glioblastoma
    arXiv.cs.CV Pub Date : 2021-01-21
    Chao Li; Wenjian Huang; Xi Chen; Yiran Wei; Stephen J. Price; Carola-Bibiane Schönlieb

    We present an Expectation-Maximization (EM) Regularized Deep Learning (EMReDL) model for weakly supervised tumor segmentation. The proposed framework was tailored to glioblastoma, a type of malignant tumor characterized by its diffuse infiltration into the surrounding brain tissue, which poses a significant challenge to treatment target and tumor burden estimation based on conventional structural

    Updated: 2021-01-22
  • Self-Adaptive Training: Bridging the Supervised and Self-Supervised Learning
    arXiv.cs.CV Pub Date : 2021-01-21
    Lang Huang; Chao Zhang; Hongyang Zhang

    We propose self-adaptive training -- a unified training algorithm that dynamically calibrates and enhances the training process by model predictions without incurring extra computational cost -- to advance both supervised and self-supervised learning of deep neural networks. We analyze the training dynamics of deep networks on training data that are corrupted by, e.g., random noise and adversarial examples

    Updated: 2021-01-22
  • Copycat CNN: Are Random Non-Labeled Data Enough to Steal Knowledge from Black-box Models?
    arXiv.cs.CV Pub Date : 2021-01-21
    Jacson Rodrigues Correia-Silva; Rodrigo F. Berriel; Claudine Badue; Alberto F. De Souza; Thiago Oliveira-Santos

    Convolutional neural networks have been successful lately enabling companies to develop neural-based products, which demand an expensive process, involving data acquisition and annotation; and model generation, usually requiring experts. With all these costs, companies are concerned about the security of their models against copies and deliver them as black-boxes accessed by APIs. Nonetheless, we argue

    Updated: 2021-01-22
  • Cain: Automatic Code Generation for Simultaneous Convolutional Kernels on Focal-plane Sensor-processors
    arXiv.cs.CV Pub Date : 2021-01-21
    Edward Stow; Riku Murai; Sajad Saeedi; Paul H. J. Kelly

    Focal-plane Sensor-processors (FPSPs) are a camera technology that enables low power, high frame rate computation, making them suitable for edge computation. Unfortunately, these devices' limited instruction sets and registers make developing complex algorithms difficult. In this work, we present Cain - a compiler that targets SCAMP-5, a general-purpose FPSP - which generates code from multiple convolutional

    Updated: 2021-01-22
  • Characterizing signal propagation to close the performance gap in unnormalized ResNets
    arXiv.cs.CV Pub Date : 2021-01-21
    Andrew Brock; Soham De; Samuel L. Smith

    Batch Normalization is a key component in almost all state-of-the-art image classifiers, but it also introduces practical challenges: it breaks the independence between training examples within a batch, can incur compute and memory overhead, and often results in unexpected bugs. Building on recent theoretical analyses of deep ResNets at initialization, we propose a simple set of analysis tools to characterize

    Updated: 2021-01-22
  • ItNet: iterative neural networks with tiny graphs for accurate and efficient anytime prediction
    arXiv.cs.CV Pub Date : 2021-01-21
    Thomas Pfeil

    Deep neural networks usually have to be compressed and accelerated for use in low-power devices, e.g., mobile devices. Recently, massively-parallel hardware accelerators were developed that offer high throughput and low latency at low power by utilizing in-memory computation. However, to exploit these benefits the computational graph of a neural network has to fit into the in-computation memory of

    Updated: 2021-01-22
  • Geometric Moment Invariants to Motion Blur
    arXiv.cs.CV Pub Date : 2021-01-21
    Hongxiang Hao; Hanlin Mo; Hua Li

    In this paper, we focus on removing the interference of motion blur by deriving motion blur invariants. Unlike earlier work, we don't restore any blurred image. Based on geometric moments and the mathematical model of motion blur, we prove that the geometric moments of the blurred image and the original image are linearly related. Depending on this property, we can analyse whether an existing moment-based feature

    Updated: 2021-01-22
  • GhostSR: Learning Ghost Features for Efficient Image Super-Resolution
    arXiv.cs.CV Pub Date : 2021-01-21
    Ying Nie; Kai Han; Zhenhua Liu; An Xiao; Yiping Deng; Chunjing Xu; Yunhe Wang

    Modern single image super-resolution (SISR) systems based on convolutional neural networks (CNNs) achieve impressive performance while requiring huge computational costs. The problem of feature redundancy is well studied in visual recognition tasks, but rarely discussed in SISR. Based on the observation that many features in SISR models are also similar to each other, we propose to use shift operation to

    Updated: 2021-01-22
  • Weighted Fuzzy-Based PSNR for Watermarking
    arXiv.cs.CV Pub Date : 2021-01-21
    Maedeh Jamali; Nader Karimi; Shadrokh Samavi

    One of the problems of conventional visual quality evaluation criteria such as PSNR and MSE is the lack of appropriate standards based on the human visual system (HVS). They are calculated based on the difference of the corresponding pixels in the original and manipulated image. Hence, they practically do not provide a correct understanding of the image quality. Watermarking is an image processing

    Updated: 2021-01-22
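
The abstract above notes that MSE and PSNR are computed from the differences of corresponding pixels. A minimal sketch of those two conventional metrics follows (not the weighted fuzzy-based variant the paper proposes). It assumes images flattened to equal-length lists of intensities and an 8-bit peak value of 255; the function names are illustrative.

```python
import math


def mse(original, modified):
    """Mean squared error between two equal-length pixel sequences."""
    if len(original) != len(modified) or not original:
        raise ValueError("images must be non-empty and the same size")
    return sum((a - b) ** 2 for a, b in zip(original, modified)) / len(original)


def psnr(original, modified, max_value=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    err = mse(original, modified)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)


if __name__ == "__main__":
    a = [10, 20, 30, 40]
    b = [10, 20, 30, 42]
    print(mse(a, b), psnr(a, b))
```

Because every pixel difference contributes equally, two distortions with the same MSE receive the same PSNR regardless of where in the image they occur, which is the limitation with respect to the human visual system that the abstract points to.
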
  • Exponential Moving Average Normalization for Self-supervised and Semi-supervised Learning
    arXiv.cs.CV Pub Date : 2021-01-21
    Zhaowei Cai; Avinash Ravichandran; Subhransu Maji; Charless Fowlkes; Zhuowen Tu; Stefano Soatto

    We present a plug-in replacement for batch normalization (BN) called exponential moving average normalization (EMAN), which improves the performance of existing student-teacher based self- and semi-supervised learning techniques. Unlike the standard BN, where the statistics are computed within each batch, EMAN, used in the teacher, updates its statistics by exponential moving average from the BN statistics

    Updated: 2021-01-22
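
The abstract above describes normalization statistics that are updated by an exponential moving average instead of being recomputed per batch. The sketch below shows only the generic EMA update rule. The source of the incoming statistics (here a hypothetical `student_stats` list), the momentum value, and the function name are assumptions, since the listing truncates the abstract before the details.

```python
def ema_update(teacher_stats, student_stats, momentum=0.99):
    """Return teacher statistics moved toward the student's by an EMA step.

    new_teacher = momentum * teacher + (1 - momentum) * student
    Both arguments are equal-length lists of floats (e.g. per-channel means).
    """
    if len(teacher_stats) != len(student_stats):
        raise ValueError("statistics must have the same length")
    return [
        momentum * t + (1.0 - momentum) * s
        for t, s in zip(teacher_stats, student_stats)
    ]


if __name__ == "__main__":
    running_mean = [0.0, 0.0]
    for batch_mean in ([1.0, 2.0], [1.0, 2.0], [1.0, 2.0]):
        running_mean = ema_update(running_mean, batch_mean, momentum=0.5)
    print(running_mean)
```

A momentum close to 1 makes the averaged statistics change slowly, so they depend far less on the composition of any single batch.
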
  • Analysis of Information Flow Through U-Nets
    arXiv.cs.CV Pub Date : 2021-01-21
    Suemin Lee; Ivan V. Bajić

    Deep Neural Networks (DNNs) have become ubiquitous in medical image processing and analysis. Among them, U-Nets are very popular in various image segmentation tasks. Yet, little is known about how information flows through these networks and whether they are indeed properly designed for the tasks they are being proposed for. In this paper, we employ information-theoretic tools in order to gain insight

    Updated: 2021-01-22
  • Learning Ultrasound Rendering from Cross-Sectional Model Slices for Simulated Training
    arXiv.cs.CV Pub Date : 2021-01-20
    Lin Zhang; Tiziano Portenier; Orcun Goksel

    Purpose. Given the high level of expertise required for navigation and interpretation of ultrasound images, computational simulations can facilitate the training of such skills in virtual reality. With ray-tracing based simulations, realistic ultrasound images can be generated. However, due to computational constraints for interactivity, image quality typically needs to be compromised. Methods. We

    Updated: 2021-01-22
  • Chest X-ray lung and heart segmentation based on minimal training sets
    arXiv.cs.CV Pub Date : 2021-01-20
    Balázs Maga

    As the COVID-19 pandemic aggravated the excessive workload of doctors globally, the demand for computer aided methods in medical imaging analysis increased even further. Such tools can result in more robust diagnostic pipelines which are less prone to human errors. In our paper, we present a deep neural network, which we refer to as Attention BCDU-Net, and apply it to the task of lung and heart segmentation

    Updated: 2021-01-22
  • Can stable and accurate neural networks be computed? -- On the barriers of deep learning and Smale's 18th problem
    arXiv.cs.CV Pub Date : 2021-01-20
    Vegard Antun; Matthew J. Colbrook; Anders C. Hansen

    Deep learning (DL) has had unprecedented success and is now entering scientific computing with full force. However, DL suffers from a universal phenomenon: instability, despite universal approximating properties that often guarantee the existence of stable neural networks (NNs). We show the following paradox. There are basic well-conditioned problems in scientific computing where one can prove the

    Updated: 2021-01-22
  • AXM-Net: Cross-Modal Context Sharing Attention Network for Person Re-ID
    arXiv.cs.CV Pub Date : 2021-01-19
    Ammarah Farooq; Muhammad Awais; Josef Kittler; Syed Safwan Khalid

    Cross-modal person re-identification (Re-ID) is critical for modern video surveillance systems. The key challenge is to align inter-modality representations according to semantic information present for a person and ignore background information. In this work, we present AXM-Net, a novel CNN based architecture designed for learning semantically aligned visual and textual representations. The underlying

    Updated: 2021-01-21
  • SAR and Optical data fusion based on Anisotropic Diffusion with PCA and Classification using Patch-based with LBP
    arXiv.cs.CV Pub Date : 2021-01-20
    Achala Shakya; Mantosh Biswas; Mahesh Pal

    SAR (VV and VH polarization) and optical data are widely used in image fusion to exploit their complementary information and to obtain a better-quality image (in terms of spatial and spectral features) for improved classification results. This paper uses anisotropic diffusion with PCA for the fusion of SAR and optical data and patch-based SVM Classification with LBP (LBP-PSVM). Fusion

    Updated: 2021-01-21
  • On The Consistency Training for Open-Set Semi-Supervised Learning
    arXiv.cs.CV Pub Date : 2021-01-19
    Huixiang Luo; Hao Cheng; Yuting Gao; Ke Li; Mengdan Zhang; Fanxu Meng; Xiaowei Guo; Feiyue Huang; Xing Sun

    Conventional semi-supervised learning (SSL) methods, e.g., MixMatch, achieve great performance when both the labeled and unlabeled datasets are drawn from the same distribution. However, these methods often suffer severe performance degradation in a more realistic setting, where the unlabeled dataset contains out-of-distribution (OOD) samples. Recent approaches mitigate the negative influence of OOD samples

    Updated: 2021-01-21
  • Video Relation Detection with Trajectory-aware Multi-modal Features
    arXiv.cs.CV Pub Date : 2021-01-20
    Wentao Xie; Guanghui Ren; Si Liu

    The video relation detection problem refers to the detection of relationships between different objects in videos, such as spatial relationships and action relationships. In this paper, we present video relation detection with trajectory-aware multi-modal features to solve this task. Considering the complexity of doing visual relation detection in videos, we decompose this task into three sub-tasks: object

    Updated: 2021-01-21
  • Focal and Efficient IOU Loss for Accurate Bounding Box Regression
    arXiv.cs.CV Pub Date : 2021-01-20
    Yi-Fan Zhang; Weiqiang Ren; Zhang Zhang; Zhen Jia; Liang Wang; Tieniu Tan

    In object detection, bounding box regression (BBR) is a crucial step that determines the object localization performance. However, we find that most previous loss functions for BBR have two main drawbacks: (i) Both $\ell_n$-norm and IOU-based loss functions are inefficient to depict the objective of BBR, which leads to slow convergence and inaccurate regression results. (ii) Most of the loss functions

    Updated: 2021-01-21
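
For orientation, the IOU-based losses mentioned in the abstract above build on the intersection-over-union of two boxes. The sketch below shows the plain IOU and the basic `1 - IOU` loss only; it is not the focal or "efficient" variant the paper proposes. Boxes are assumed to be axis-aligned `(x1, y1, x2, y2)` tuples with `x1 <= x2` and `y1 <= y2`, and the function names are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred_box, target_box):
    """Basic IOU loss: 0 for a perfect match, 1 for no overlap."""
    return 1.0 - iou(pred_box, target_box)


if __name__ == "__main__":
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))       # overlap of area 1, union 7
    print(iou_loss((0, 0, 1, 1), (2, 2, 3, 3)))  # disjoint boxes
```

Note that the basic loss is constant at 1 for all disjoint boxes, so it carries no information about how far apart they are.
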
  • Fooling thermal infrared pedestrian detectors in real world using small bulbs
    arXiv.cs.CV Pub Date : 2021-01-20
    Xiaopei Zhu; Xiao Li; Jianmin Li; Zheyao Wang; Xiaolin Hu

    Thermal infrared detection systems play an important role in many areas such as night security, autonomous driving, and body temperature detection. They have the unique advantages of passive imaging, temperature sensitivity and penetration. But the security of these systems themselves has not been fully explored, which poses risks in applying these systems. We propose a physical attack method with

    Updated: 2021-01-21
  • Self-supervised pre-training enhances change detection in Sentinel-2 imagery
    arXiv.cs.CV Pub Date : 2021-01-20
    Marrit Leenstra; Diego Marcos; Francesca Bovolo; Devis Tuia

    While annotated images for change detection using satellite imagery are scarce and costly to obtain, there is a wealth of unlabeled images being generated every day. In order to leverage these data to learn an image representation more adequate for change detection, we explore methods that exploit the temporal consistency of Sentinel-2 time series to obtain a usable self-supervised learning signal

    Updated: 2021-01-21
  • Few-shot Action Recognition with Prototype-centered Attentive Learning
    arXiv.cs.CV Pub Date : 2021-01-20
    Xiatian Zhu; Antoine Toisoul; Juan-Manuel Pérez-Rúa; Li Zhang; Brais Martinez; Tao Xiang

    Few-shot action recognition aims to recognize action classes with few training samples. Most existing methods adopt a meta-learning approach with episodic training. In each episode, the few samples in a meta-training task are split into support and query sets. The former is used to build a classifier, which is then evaluated on the latter using a query-centered loss for model updating. There are however

    Updated: 2021-01-21
  • 1st Place Solution to ECCV-TAO-2020: Detect and Represent Any Object for Tracking
    arXiv.cs.CV Pub Date : 2021-01-20
    Fei Du; Bo Xu; Jiasheng Tang; Yuqi Zhang; Fan Wang; Hao Li

    We extend the classical tracking-by-detection paradigm to this tracking-any-object task. Solid detection results are first extracted from TAO dataset. Some state-of-the-art techniques like BAlanced-Group Softmax (BAGS) and DetectoRS are integrated during detection. Then we learned appearance features to represent any

    Updated: 2021-01-21
  • Scalable Deep Compressive Sensing
    arXiv.cs.CV Pub Date : 2021-01-20
    Zhonghao Zhang; Yipeng Liu; Xingyu Cao; Fei Wen; Ce Zhu

    Deep learning has been applied to image compressive sensing (CS) for enhanced reconstruction performance. However, most existing deep learning methods train different models for different subsampling ratios, which brings additional hardware burden. In this paper, we develop a general framework named scalable deep compressive sensing (SDCS) for the scalable sampling and reconstruction (SSR) of all existing

    Updated: 2021-01-21
  • Macroscopic Control of Text Generation for Image Captioning
    arXiv.cs.CV Pub Date : 2021-01-20
    Zhangzi Zhu; Tianlei Wang; Hong Qu

    Although image captioning models can generate impressive descriptions for a given image, challenges remain: (1) the controllability and diversity of existing models are still far from satisfactory; (2) models may sometimes produce extremely poor-quality captions. In this paper, two novel methods are introduced to address these two problems, respectively. Specifically, for the former

    Updated: 2021-01-21
  • FedNS: Improving Federated Learning for collaborative image classification on mobile clients
    arXiv.cs.CV Pub Date : 2021-01-20
    Yaoxin Zhuo; Baoxin Li

    Federated Learning (FL) is a paradigm that aims to support loosely connected clients in learning a global model collaboratively with the help of a centralized server. The most popular FL algorithm is Federated Averaging (FedAvg), which takes a weighted average of the client models, with the weights determined largely by the dataset sizes at the clients. In this paper, we propose a new

    Updated: 2021-01-21
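The FedAvg aggregation step mentioned in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `federated_average` is hypothetical, each client model is represented as a dictionary of flat parameter lists, and each weight is exactly proportional to the client's dataset size. It does not implement the FedNS method the paper proposes.

```python
from typing import Dict, List


def federated_average(
    client_weights: List[Dict[str, List[float]]],
    client_sizes: List[int],
) -> Dict[str, List[float]]:
    """Average client parameters, weighting each client by its dataset size."""
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("client_sizes must sum to a positive number")
    averaged: Dict[str, List[float]] = {}
    # All clients are assumed to share the same parameter names and shapes.
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n / total for w, n in zip(client_weights, client_sizes))
            for i in range(length)
        ]
    return averaged
```

For example, two clients holding 1 and 3 samples contribute with weights 0.25 and 0.75, so the larger client dominates the global model.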
  • Semi-supervised Keypoint Localization
    arXiv.cs.CV Pub Date : 2021-01-20
    Olga Moskvyak; Frederic Maire; Feras Dayoub; Mahsa Baktashmotlagh

    Knowledge about the locations of keypoints of an object in an image can assist in fine-grained classification and identification tasks, particularly for the case of objects that exhibit large variations in poses that greatly influence their visual appearance, such as wild animals. However, supervised training of a keypoint detection network requires annotating a large image dataset for each animal

    Updated: 2021-01-21
  • Non-Parametric Adaptive Network Pruning
    arXiv.cs.CV Pub Date : 2021-01-20
    Lin Mingbao; Ji Rongrong; Li Shaojie; Wang Yan; Wu Yongjian; Huang Feiyue; Ye Qixiang

    Popular network pruning algorithms reduce redundant information by optimizing hand-crafted parametric models, which may cause suboptimal performance and long filter-selection times. We introduce non-parametric modeling to simplify the algorithm design, resulting in an automatic and efficient pruning approach called EPruner. Inspired by the face recognition community, we use a message

    Updated: 2021-01-21
  • Semantic Disentangling Generalized Zero-Shot Learning
    arXiv.cs.CV Pub Date : 2021-01-20
    Zhi Chen; Ruihong Qiu; Sen Wang; Zi Huang; Jingjing Li; Zheng Zhang

    Generalized Zero-Shot Learning (GZSL) aims to recognize images from both seen and unseen categories. Most GZSL methods learn to synthesize CNN visual features for the unseen classes by leveraging entire semantic information, e.g., tags and attributes, and the visual features of the seen classes. Within the visual features, we define two types of features, semantic-consistent and semantic-unrelated

    Updated: 2021-01-21
  • TCLR: Temporal Contrastive Learning for Video Representation
    arXiv.cs.CV Pub Date : 2021-01-20
    Ishan Dave; Rohit Gupta; Mamshad Nayeem Rizve; Mubarak Shah

    Contrastive learning has nearly closed the gap between supervised and self-supervised learning of image representations. Existing extensions of contrastive learning to the domain of video data, however, rely on naive transposition of ideas from image-based methods and do not fully utilize the temporal dimension present in video. We develop a new temporal contrastive learning framework consisting of

    Updated: 2021-01-21
  • Class balanced underwater object detection dataset generated by class-wise style augmentation
    arXiv.cs.CV Pub Date : 2021-01-20
    Long Chen; Junyu Dong; Huiyu Zhou

    Underwater object detection is of great significance for various applications in underwater scenes. However, class imbalance is still an unsolved bottleneck for current underwater object detection algorithms. It leads to large precision discrepancies among classes: the dominant classes with more training data achieve higher detection precision while the minority classes

    Updated: 2021-01-21
Contents have been reproduced by permission of the publishers.