Current journal: arXiv - CS - Computer Vision and Pattern Recognition
  • Multi-Frame to Single-Frame: Knowledge Distillation for 3D Object Detection
    arXiv.cs.CV Pub Date : 2020-09-24
    Yue Wang; Alireza Fathi; Jiajun Wu; Thomas Funkhouser; Justin Solomon

    A common dilemma in 3D object detection for autonomous driving is that high-quality, dense point clouds are only available during training, but not testing. We use knowledge distillation to bridge the gap between a model trained on high-quality inputs at training time and another tested on low-quality inputs at inference time. In particular, we design a two-stage training pipeline for point cloud object

    Updated: 2020-09-25
  • Attribute Propagation Network for Graph Zero-shot Learning
    arXiv.cs.CV Pub Date : 2020-09-24
    Lu Liu; Tianyi Zhou; Guodong Long; Jing Jiang; Chengqi Zhang

    The goal of zero-shot learning (ZSL) is to train a model to classify samples of classes that were not seen during training. To address this challenging task, most ZSL methods relate unseen test classes to seen (training) classes via a pre-defined set of attributes that can describe all classes in the same semantic space, so the knowledge learned on the training classes can be adapted to unseen classes

    Updated: 2020-09-25
  • Heuristics based Mosaic of Social-Sensor Services for Scene Reconstruction
    arXiv.cs.CV Pub Date : 2020-09-21
    Tooba Aamir; Hai Dong; Athman Bouguettaya

    We propose a heuristics-based social-sensor cloud service selection and composition model to reconstruct mosaic scenes. The proposed approach leverages crowdsourced social media images to create an image mosaic to reconstruct a scene at a designated location and an interval of time. The novel approach relies on the set of features defined on the bases of the image metadata to determine the relevance

    Updated: 2020-09-25
  • Cloud Cover Nowcasting with Deep Learning
    arXiv.cs.CV Pub Date : 2020-09-24
    Léa Berthomier; Bruno Pradel; Lior Perez

    Nowcasting is a field of meteorology which aims at forecasting weather on a short term of up to a few hours. In the meteorology landscape, this field is rather specific as it requires particular techniques, such as data extrapolation, where conventional meteorology is generally based on physical modeling. In this paper, we focus on cloud cover nowcasting, which has various application areas such as

    Updated: 2020-09-25
  • Local Context Attention for Salient Object Segmentation
    arXiv.cs.CV Pub Date : 2020-09-24
    Jing Tan; Pengfei Xiong; Yuwen He; Kuntao Xiao; Zhengyi Lv

    Salient object segmentation aims at distinguishing various salient objects from backgrounds. Despite the lack of semantic consistency, salient objects often have obvious texture and location characteristics in local area. Based on this priori, we propose a novel Local Context Attention Network (LCANet) to generate locally reinforcement feature maps in a uniform representational architecture. The proposed

    Updated: 2020-09-25
  • Multi-View Brain HyperConnectome AutoEncoder For Brain State Classification
    arXiv.cs.CV Pub Date : 2020-09-24
    Alin Banka; Inis Buzi; Islem Rekik

    Graph embedding is a powerful method to represent graph neurological data (e.g., brain connectomes) in a low dimensional space for brain connectivity mapping, prediction and classification. However, existing embedding algorithms have two major limitations. First, they primarily focus on preserving one-to-one topological relationships between nodes (i.e., regions of interest (ROIs) in a connectome)

    Updated: 2020-09-25
  • Multi-Scale Profiling of Brain Multigraphs by Eigen-based Cross-Diffusion and Heat Tracing for Brain State Profiling
    arXiv.cs.CV Pub Date : 2020-09-24
    Mustafa Saglam; Islem Rekik

    The individual brain can be viewed as a highly-complex multigraph (i.e. a set of graphs also called connectomes), where each graph represents a unique connectional view of pairwise brain region (node) relationships such as function or morphology. Due to its multifold complexity, understanding how brain disorders alter not only a single view of the brain graph, but its multigraph representation at the

    Updated: 2020-09-25
  • MimicDet: Bridging the Gap Between One-Stage and Two-Stage Object Detection
    arXiv.cs.CV Pub Date : 2020-09-24
    Xin Lu; Quanquan Li; Buyu Li; Junjie Yan

    Modern object detection methods can be divided into one-stage approaches and two-stage ones. One-stage detectors are more efficient owing to straightforward architectures, but the two-stage detectors still take the lead in accuracy. Although recent work tries to improve the one-stage detectors by imitating the structural design of the two-stage ones, the accuracy gap is still significant. In this paper

    Updated: 2020-09-25
  • Understanding Fairness of Gender Classification Algorithms Across Gender-Race Groups
    arXiv.cs.CV Pub Date : 2020-09-24
    Anoop Krishnan; Ali Almadan; Ajita Rattani

    Automated gender classification has important applications in many domains, such as demographic research, law enforcement, online advertising, as well as human-computer interaction. Recent research has questioned the fairness of this technology across gender and race. Specifically, the majority of the studies raised the concern of higher error rates of the face-based gender classification system for

    Updated: 2020-09-25
  • BWCFace: Open-set Face Recognition using Body-worn Camera
    arXiv.cs.CV Pub Date : 2020-09-24
    Ali Almadan; Anoop Krishnan; Ajita Rattani

    With computer vision reaching an inflection point in the past decade, face recognition technology has become pervasive in policing, intelligence gathering, and consumer applications. Recently, face recognition technology has been deployed on bodyworn cameras to keep officers safe, enabling situational awareness and providing evidence for trial. However, limited academic research has been conducted

    Updated: 2020-09-25
  • 3D Object Localization Using 2D Estimates for Computer Vision Applications
    arXiv.cs.CV Pub Date : 2020-09-24
    Taha Hasan Masood Siddique; Muhammad Usman

    A technique for object localization based on pose estimation and camera calibration is presented. The 3-dimensional (3D) coordinates are estimated by collecting multiple 2-dimensional (2D) images of the object and are utilized for the calibration of the camera. The calibration steps involving a number of parameter calculation including intrinsic and extrinsic parameters for the removal of lens distortion

    Updated: 2020-09-25
  • Unifying data for fine-grained visual species classification
    arXiv.cs.CV Pub Date : 2020-09-24
    Sayali Kulkarni; Tomer Gadot; Chen Luo; Tanya Birch; Eric Fegraus

    Wildlife monitoring is crucial to nature conservation and has been done by manual observations from motion-triggered camera traps deployed in the field. Widespread adoption of such in-situ sensors has resulted in unprecedented data volumes being collected over the last decade. A significant challenge exists to process and reliably identify what is in these images efficiently. Advances in computer vision

    Updated: 2020-09-25
  • Automatic identification of fossils and abiotic grains during carbonate microfacies analysis using deep convolutional neural networks
    arXiv.cs.CV Pub Date : 2020-09-24
    Xiaokang Liu; Haijun Song

    Petrographic analysis based on microfacies identification in thin sections is widely used in sedimentary environment interpretation and paleoecological reconstruction. Fossil recognition from microfacies is an essential procedure for petrographers to complete this task. Distinguishing the morphological and microstructural diversity of skeletal fragments requires extensive prior knowledge of fossil

    Updated: 2020-09-25
  • FTN: Foreground-Guided Texture-Focused Person Re-Identification
    arXiv.cs.CV Pub Date : 2020-09-24
    Donghaisheng Liu; Shoudong Han; Yang Chen; Chenfei Xia; Jun Zhao

    Person re-identification (Re-ID) is a challenging task as persons are often in different backgrounds. Most recent Re-ID methods treat the foreground and background information equally for person discriminative learning, but can easily lead to potential false alarm problems when different persons are in similar backgrounds or the same person is in different backgrounds. In this paper, we propose a Foreground-Guided

    Updated: 2020-09-25
  • Dense Forecasting of Wildfire Smoke Particulate Matter Using Sparsity Invariant Convolutional Neural Networks
    arXiv.cs.CV Pub Date : 2020-09-23
    Renhao Wang; Ashutosh Bhudia; Brandon Dos Remedios; Minnie Teng; Raymond Ng

    Accurate forecasts of fine particulate matter (PM 2.5) from wildfire smoke are crucial to safeguarding cardiopulmonary public health. Existing forecasting systems are trained on sparse and inaccurate ground truths, and do not take sufficient advantage of important spatial inductive biases. In this work, we present a convolutional neural network which preserves sparsity invariance throughout, and leverages

    Updated: 2020-09-25
  • Insights on Evaluation of Camera Re-localization Using Relative Pose Regression
    arXiv.cs.CV Pub Date : 2020-09-23
    Amir Shalev (Tel-Aviv University, Intel); Omer Achrack (Intel); Brian Fulkerson (Tel-Aviv University); Ben-Zion Bobrovsky (Tel-Aviv University)

    We consider the problem of relative pose regression in visual relocalization. Recently, several promising approaches have emerged in this area. We claim that even though they demonstrate on the same datasets using the same split to train and test, a faithful comparison between them was not available since on currently used evaluation metric, some approaches might perform favorably, while in reality

    Updated: 2020-09-25
  • ECOVNet: An Ensemble of Deep Convolutional Neural Networks Based on EfficientNet to Detect COVID-19 From Chest X-rays
    arXiv.cs.CV Pub Date : 2020-09-24
    Nihad Karim Chowdhury; Md. Muhtadir Rahman; Noortaz Rezoana; Muhammad Ashad Kabir

    This paper proposed an ensemble of deep convolutional neural networks (CNN) based on EfficientNet, named ECOVNet, to detect COVID-19 using a large chest X-ray data set. At first, the open-access large chest X-ray collection is augmented, and then ImageNet pre-trained weights for EfficientNet is transferred with some customized fine-tuning top layers that are trained, followed by an ensemble of model

    Updated: 2020-09-25
  • How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks
    arXiv.cs.CV Pub Date : 2020-09-24
    Keyulu Xu; Jingling Li; Mozhi Zhang; Simon S. Du; Ken-ichi Kawarabayashi; Stefanie Jegelka

    We study how neural networks trained by gradient descent extrapolate, i.e., what they learn outside the support of training distribution. Previous works report mixed empirical results when extrapolating with neural networks: while multilayer perceptrons (MLPs) do not extrapolate well in simple tasks, Graph Neural Networks (GNNs), a structured network with MLP modules, have some success in more complex

    Updated: 2020-09-25
  • A Gradient Flow Framework For Analyzing Network Pruning
    arXiv.cs.CV Pub Date : 2020-09-24
    Ekdeep Singh Lubana; Robert P. Dick

    Recent network pruning methods focus on pruning models early-on in training. To estimate the impact of removing a parameter, these methods use importance measures that were originally designed for pruning trained models. Despite lacking justification for their use early-on in training, models pruned using such measures result in surprisingly minimal accuracy loss. To better explain this behavior, we

    Updated: 2020-09-25
  • Learning Graph Normalization for Graph Neural Networks
    arXiv.cs.CV Pub Date : 2020-09-24
    Yihao Chen; Xin Tang; Xianbiao Qi; Chun-Guang Li; Rong Xiao

    Graph Neural Networks (GNNs) have attracted considerable attention and have emerged as a new promising paradigm to process graph-structured data. GNNs are usually stacked to multiple layers and the node representations in each layer are computed through propagating and aggregating the neighboring node features with respect to the graph. By stacking to multiple layers, GNNs are able to capture the long-range

    Updated: 2020-09-25
  • Interpreting and Boosting Dropout from a Game-Theoretic View
    arXiv.cs.CV Pub Date : 2020-09-24
    Hao Zhang; Sen Li; Yinchao Ma; Mingjie Li; Yichen Xie; Quanshi Zhang

    This paper aims to understand and improve the utility of the dropout operation from the perspective of game-theoretic interactions. We prove that dropout can suppress the strength of interactions between input variables of deep neural networks (DNNs). The theoretical proof is also verified by various experiments. Furthermore, we find that such interactions were strongly related to the over-fitting

    Updated: 2020-09-25
  • Residual Feature Distillation Network for Lightweight Image Super-Resolution
    arXiv.cs.CV Pub Date : 2020-09-24
    Jie Liu; Jie Tang; Gangshan Wu

    Recent advances in single image super-resolution (SISR) explored the power of convolutional neural network (CNN) to achieve a better performance. Despite the great success of CNN-based methods, it is not easy to apply these methods to edge devices due to the requirement of heavy computation. To solve this problem, various fast and lightweight CNN models have been proposed. The information distillation

    Updated: 2020-09-25
  • Eye Movement Feature Classification for Soccer Expertise Identification in Virtual Reality
    arXiv.cs.CV Pub Date : 2020-09-23
    Benedikt Hosp; Florian Schultz; Enkelejda Kasneci; Oliver Höner

    Latest research in expertise assessment of soccer players pronounced the importance of perceptual skills. Former research focused either on high experimental control or natural presentation mode. To assess perceptual skills of athletes, in an optimized manner, we captured omnidirectional in-field scenes, showed to 12 expert, 9 intermediate and 13 novice goalkeepers from soccer on virtual reality glasses

    Updated: 2020-09-25
  • Adversarial Brain Multiplex Prediction From a Single Network for High-Order Connectional Gender-Specific Brain Mapping
    arXiv.cs.CV Pub Date : 2020-09-24
    Ahmed Nebli; Islem Rekik

    Brain connectivity networks, derived from magnetic resonance imaging (MRI), non-invasively quantify the relationship in function, structure, and morphology between two brain regions of interest (ROIs) and give insights into gender-related connectional differences. However, to the best of our knowledge, studies on gender differences in brain connectivity were limited to investigating pairwise (i.e.

    Updated: 2020-09-25
  • Detection of Iterative Adversarial Attacks via Counter Attack
    arXiv.cs.CV Pub Date : 2020-09-23
    Matthias Rottmann; Mathis Peyron; Natasa Krejic; Hanno Gottschalk

    Deep neural networks (DNNs) have proven to be powerful tools for processing unstructured data. However for high-dimensional data, like images, they are inherently vulnerable to adversarial attacks. Small almost invisible perturbations added to the input can be used to fool DNNs. Various attacks, hardening methods and detection methods have been introduced in recent years. Notoriously, Carlini-Wagner

    Updated: 2020-09-25
  • Generative Modelling of 3D in-silico Spongiosa with Controllable Micro-Structural Parameters
    arXiv.cs.CV Pub Date : 2020-09-23
    Emmanuel Iarussi; Felix Thomsen; Claudio Delrieux

    Research in vertebral bone micro-structure generally requires costly procedures to obtain physical scans of real bone with a specific pathology under study, since no methods are available yet to generate realistic bone structures in-silico. Here we propose to apply recent advances in generative adversarial networks (GANs) to develop such a method. We adapted style-transfer techniques, which have been

    Updated: 2020-09-25
  • Augmented Convolutional LSTMs for Generation of High-Resolution Climate Change Projections
    arXiv.cs.CV Pub Date : 2020-09-23
    Nidhin Harilal; Udit Bhatia; Mayank Singh

    Projection of changes in extreme indices of climate variables such as temperature and precipitation are critical to assess the potential impacts of climate change on human-made and natural systems, including critical infrastructures and ecosystems. While impact assessment and adaptation planning rely on high-resolution projections (typically in the order of a few kilometers), state-of-the-art Earth

    Updated: 2020-09-24
  • X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers
    arXiv.cs.CV Pub Date : 2020-09-23
    Jaemin Cho; Jiasen Lu; Dustin Schwenk; Hannaneh Hajishirzi; Aniruddha Kembhavi

    Mirroring the success of masked language models, vision-and-language counterparts like ViLBERT, LXMERT and UNITER have achieved state of the art performance on a variety of multimodal discriminative tasks like visual question answering and visual grounding. Recent work has also successfully adapted such models towards the generative task of image captioning. This begs the question: Can these models

    Updated: 2020-09-24
  • A Linear Transportation $\mathrm{L}^p$ Distance for Pattern Recognition
    arXiv.cs.CV Pub Date : 2020-09-23
    Oliver M. Crook; Mihai Cucuringu; Tim Hurst; Carola-Bibiane Schönlieb; Matthew Thorpe; Konstantinos C. Zygalakis

    The transportation $\mathrm{L}^p$ distance, denoted $\mathrm{TL}^p$, has been proposed as a generalisation of Wasserstein $\mathrm{W}^p$ distances motivated by the property that it can be applied directly to colour or multi-channelled images, as well as multivariate time-series without normalisation or mass constraints. These distances, as with $\mathrm{W}^p$, are powerful tools in modelling data with

    Updated: 2020-09-24
  • Interactive Learning for Semantic Segmentation in Earth Observation
    arXiv.cs.CV Pub Date : 2020-09-23
    Gaston Lenczner; Adrien Chan-Hon-Tong; Nicola Luminari; Bertrand Le Saux; Guy Le Besnerais

    Dense pixel-wise classification maps output by deep neural networks are of extreme importance for scene understanding. However, these maps are often partially inaccurate due to a variety of possible factors. Therefore, we propose to interactively refine them within a framework named DISCA (Deep Image Segmentation with Continual Adaptation). It consists of continually adapting a neural network to a

    Updated: 2020-09-24
  • A Simple Yet Effective Method for Video Temporal Grounding with Cross-Modality Attention
    arXiv.cs.CV Pub Date : 2020-09-23
    Binjie Zhang; Yu Li; Chun Yuan; Dejing Xu; Pin Jiang; Ying Shan

    The task of language-guided video temporal grounding is to localize the particular video clip corresponding to a query sentence in an untrimmed video. Though progress has been made continuously in this field, some issues still need to be resolved. First, most of the existing methods rely on the combination of multiple complicated modules to solve the task. Second, due to the semantic gaps between the

    Updated: 2020-09-24
  • Learning Visual Voice Activity Detection with an Automatically Annotated Dataset
    arXiv.cs.CV Pub Date : 2020-09-23
    Sylvain Guy; Stéphane Lathuilière; Pablo Mesejo; Radu Horaud

    Visual voice activity detection (V-VAD) uses visual features to predict whether a person is speaking or not. V-VAD is useful whenever audio VAD (A-VAD) is inefficient either because the acoustic signal is difficult to analyze or because it is simply missing. We propose two deep architectures for V-VAD, one based on facial landmarks and one based on optical flow. Moreover, available datasets, used for

    Updated: 2020-09-24
  • Label-Efficient Multi-Task Segmentation using Contrastive Learning
    arXiv.cs.CV Pub Date : 2020-09-23
    Junichiro Iwasawa; Yuichiro Hirano; Yohei Sugawara

    Obtaining annotations for 3D medical images is expensive and time-consuming, despite its importance for automating segmentation tasks. Although multi-task learning is considered an effective method for training segmentation models using small amounts of annotated data, a systematic understanding of various subtasks is still lacking. In this study, we propose a multi-task segmentation model with a contrastive

    Updated: 2020-09-24
  • 2D-3D Geometric Fusion Network using Multi-Neighbourhood Graph Convolution for RGB-D Indoor Scene Classification
    arXiv.cs.CV Pub Date : 2020-09-23
    Albert Mosella-Montoro; Javier Ruiz-Hidalgo

    Multi-modal fusion has been proved to help enhance the performance of scene classification tasks. This paper presents a 2D-3D fusion stage that combines 3D Geometric features with 2D Texture features obtained by 2D Convolutional Neural Networks. To get a robust 3D Geometric embedding, a network that uses two novel layers is proposed. The first layer, Multi-Neighbourhood Graph Convolution, aims to learn

    Updated: 2020-09-24
  • Information-Theoretic Visual Explanation for Black-Box Classifiers
    arXiv.cs.CV Pub Date : 2020-09-23
    Jihun Yi; Eunji Kim; Siwon Kim; Sungroh Yoon

    In this work, we attempt to explain the prediction of any black-box classifier from an information-theoretic perspective. For this purpose, we propose two attribution maps: an information gain (IG) map and a point-wise mutual information (PMI) map. IG map provides a class-independent answer to "How informative is each pixel?", and PMI map offers a class-specific explanation by answering "How much does

    Updated: 2020-09-24
  • Multiple interaction learning with question-type prior knowledge for constraining answer search space in visual question answering
    arXiv.cs.CV Pub Date : 2020-09-23
    Tuong Do; Binh X. Nguyen; Huy Tran; Erman Tjiputra; Quang D. Tran; Thanh-Toan Do

    Different approaches have been proposed to Visual Question Answering (VQA). However, few works are aware of the behaviors of varying joint modality methods over question type prior knowledge extracted from data in constraining answer search space, of which information gives a reliable cue to reason about answers for questions asked in input images. In this paper, we propose a novel VQA model that utilizes

    Updated: 2020-09-24
  • Residual Embedding Similarity-Based Network Selection for Predicting Brain Network Evolution Trajectory from a Single Observation
    arXiv.cs.CV Pub Date : 2020-09-23
    Ahmet Serkan Goktas; Alaa Bessadok; Islem Rekik

    While existing predictive frameworks are able to handle Euclidean structured data (i.e, brain images), they might fail to generalize to geometric non-Euclidean data such as brain networks. Besides, these are rooted the sample selection step in using Euclidean or learned similarity measure between vectorized training and testing brain networks. Such sample connectomic representation might include irrelevant

    Updated: 2020-09-24
  • Multiplexed Illumination for Classifying Visually Similar Objects
    arXiv.cs.CV Pub Date : 2020-09-23
    Taihua Wang; Donald G. Dansereau

    Distinguishing visually similar objects like forged/authentic bills and healthy/unhealthy plants is beyond the capabilities of even the most sophisticated classifiers. We propose the use of multiplexed illumination to extend the range of objects that can be successfully classified. We construct a compact RGB-IR light stage that images samples under different combinations of illuminant position and

    Updated: 2020-09-24
  • Differential Viewpoints for Ground Terrain Material Recognition
    arXiv.cs.CV Pub Date : 2020-09-22
    Jia Xue; Hang Zhang; Ko Nishino; Kristin J. Dana

    Computational surface modeling that underlies material recognition has transitioned from reflectance modeling using in-lab controlled radiometric measurements to image-based representations based on internet-mined single-view images captured in the scene. We take a middle-ground approach for material recognition that takes advantage of both rich radiometric cues and flexible image capture. A key concept

    Updated: 2020-09-24
  • Robust and efficient post-processing for video object detection
    arXiv.cs.CV Pub Date : 2020-09-23
    Alberto Sabater; Luis Montesano; Ana C. Murillo

    Object recognition in video is an important task for plenty of applications, including autonomous driving perception, surveillance tasks, wearable devices or IoT networks. Object recognition using video data is more challenging than using still images due to blur, occlusions or rare object poses. Specific video detectors with high computational cost or standard image detectors together with a fast

    Updated: 2020-09-24
  • A Sparse Sampling-based framework for Semantic Fast-Forward of First-Person Videos
    arXiv.cs.CV Pub Date : 2020-09-21
    Michel Melo Silva; Washington Luis Souza Ramos; Mario Fernando Montenegro Campos; Erickson Rangel Nascimento

    Technological advances in sensors have paved the way for digital cameras to become increasingly ubiquitous, which, in turn, led to the popularity of the self-recording culture. As a result, the amount of visual data on the Internet is moving in the opposite direction of the available time and patience of the users. Thus, most of the uploaded videos are doomed to be forgotten and unwatched stashed away

    Updated: 2020-09-24
  • Unsupervised Feature Learning for Event Data: Direct vs Inverse Problem Formulation
    arXiv.cs.CV Pub Date : 2020-09-23
    Dimche Kostadinov; Davide Scaramuzza

    Event-based cameras record an asynchronous stream of per-pixel brightness changes. As such, they have numerous advantages over the standard frame-based cameras, including high temporal resolution, high dynamic range, and no motion blur. Due to the asynchronous nature, efficient learning of compact representation for event data is challenging. While it remains not explored the extent to which the spatial

    Updated: 2020-09-24
  • Few-shot Font Generation with Localized Style Representations and Factorization
    arXiv.cs.CV Pub Date : 2020-09-23
    Song Park; Sanghyuk Chun; Junbum Cha; Bado Lee; Hyunjung Shim

    Automatic few-shot font generation is in high demand because manual designs are expensive and sensitive to the expertise of designers. Existing few-shot font generation methods aim to learn to disentangle the style and content element from a few reference glyphs, and mainly focus on a universal style representation for each font style. However, such approach limits the model in representing diverse

    Updated: 2020-09-24
  • Generative Model without Prior Distribution Matching
    arXiv.cs.CV Pub Date : 2020-09-23
    Cong Geng; Jia Wang; Li Chen; Zhiyong Gao

    Variational Autoencoder (VAE) and its variations are classic generative models by learning a low-dimensional latent representation to satisfy some prior distribution (e.g., Gaussian distribution). Their advantages over GAN are that they can simultaneously generate high dimensional data and learn latent representations to reconstruct the inputs. However, it has been observed that a trade-off exists

    Updated: 2020-09-24
  • What is the Reward for Handwriting? -- Handwriting Generation by Imitation Learning
    arXiv.cs.CV Pub Date : 2020-09-23
    Keisuke Kanda; Brian Kenji Iwana; Seiichi Uchida

    Analyzing the handwriting generation process is an important issue and has been tackled by various generation models, such as kinematics based models and stochastic models. In this study, we use a reinforcement learning (RL) framework to realize handwriting generation with the careful future planning ability. In fact, the handwriting process of human beings is also supported by their future planning

    Updated: 2020-09-24
  • MAFF-Net: Filter False Positive for 3D Vehicle Detection with Multi-modal Adaptive Feature Fusion
    arXiv.cs.CV Pub Date : 2020-09-23
    Zehan Zhang; Ming Zhang; Zhidong Liang; Xian Zhao; Ming Yang; Wenming Tan; ShiLiang Pu

    3D vehicle detection based on multi-modal fusion is an important task of many applications such as autonomous driving. Although significant progress has been made, we still observe two aspects that need further improvement: First, the specific gain that camera images can bring to 3D detection is seldom explored by previous works. Second, many fusion algorithms run slowly, which is essential for

    Updated: 2020-09-24
  • Exploring global diverse attention via pairwise temporal relation for video summarization
    arXiv.cs.CV Pub Date : 2020-09-23
    Ping Li; Qinghao Ye; Luming Zhang; Li Yuan; Xianghua Xu; Ling Shao

    Video summarization is an effective way to facilitate video searching and browsing. Most of existing systems employ encoder-decoder based recurrent neural networks, which fail to explicitly diversify the system-generated summary frames while requiring intensive computations. In this paper, we propose an efficient convolutional neural network architecture for video SUMmarization via Global Diverse Attention

    Updated: 2020-09-24
  • Scene Graph to Image Generation with Contextualized Object Layout Refinement
    arXiv.cs.CV Pub Date : 2020-09-23
    Maor Ivgi; Yaniv Benny; Avichai Ben-David; Jonathan Berant; Lior Wolf

    Generating high-quality images from scene graphs, that is, graphs that describe multiple entities in complex relations, is a challenging task that attracted substantial interest recently. Prior work trained such models by using supervised learning, where the goal is to produce the exact target image layout for each scene graph. It relied on predicting object locations and shapes independently and in

    Updated: 2020-09-24
  • CLASS: Cross-Level Attention and Supervision for Salient Objects Detection
    arXiv.cs.CV Pub Date : 2020-09-23
    Tang Lv; Bo Li

    Salient object detection (SOD) is a fundamental computer vision task. Recently, with the revival of deep neural networks, SOD has made great progresses. However, there still exist two thorny issues that cannot be well addressed by existing methods, indistinguishable regions and complex structures. To address these two issues, in this paper we propose a novel deep network for accurate SOD, named CLASS

    Updated: 2020-09-24
  • LoRRaL: Facial Action Unit Detection Based on Local Region Relation Learning
    arXiv.cs.CV Pub Date : 2020-09-23
    Ziqiang Shi; Liu Liu; Rujie Liu; Xiaoyu Mi; Kentaro Murase

    End-to-end convolution representation learning has been proved to be very effective in facial action unit (AU) detection. Considering the co-occurrence and mutual exclusion between facial AUs, in this paper, we propose convolution neural networks with Local Region Relation Learning (LoRRaL), which can combine latent relationships among AUs for an end-to-end approach to facial AU occurrence detection

    Updated: 2020-09-24
  • Leveraging Local and Global Descriptors in Parallel to Search Correspondences for Visual Localization
    arXiv.cs.CV Pub Date : 2020-09-23
    Pengju Zhang; Yihong Wu; Bingxi Liu

    Visual localization to compute 6DoF camera pose from a given image has wide applications such as in robotics, virtual reality, augmented reality, etc. Two kinds of descriptors are important for the visual localization. One is global descriptors that extract the whole feature from each image. The other is local descriptors that extract the local feature from each image patch usually enclosing a key

    Updated: 2020-09-24
  • Hamming OCR: A Locality Sensitive Hashing Neural Network for Scene Text Recognition
    arXiv.cs.CV Pub Date : 2020-09-23
    Bingcong Li; Xin Tang; Xianbiao Qi; Yihao Chen; Rong Xiao

    Recently, inspired by the Transformer, self-attention-based scene text recognition approaches have achieved outstanding performance. However, we find that the model size grows rapidly as the lexicon grows. Specifically, the numbers of parameters in the softmax classification layer and the output embedding layer are proportional to the vocabulary size. This hinders the development of a lightweight text

    Updated: 2020-09-24
  • A Real-time Vision Framework for Pedestrian Behavior Recognition and Intention Prediction at Intersections Using 3D Pose Estimation
    arXiv.cs.CV Pub Date : 2020-09-23
    Ue-Hwan Kim; Dongho Ka; Hwasoo Yeo; Jong-Hwan Kim

    Minimizing traffic accidents between vehicles and pedestrians is one of the primary research goals in intelligent transportation systems. To achieve the goal, pedestrian behavior recognition and prediction of pedestrian's crossing or not-crossing intention play a central role. Contemporary approaches do not guarantee satisfactory performance due to lack of generalization, the requirement of manual

    Updated: 2020-09-24
  • Angular Luminance for Material Segmentation
    arXiv.cs.CV Pub Date : 2020-09-22
    Jia Xue; Matthew Purri; Kristin Dana

    Moving cameras provide multiple intensity measurements per pixel, yet often semantic segmentation, material recognition, and object recognition do not utilize this information. With basic alignment over several frames of a moving camera sequence, a distribution of intensities over multiple angles is obtained. It is well known from prior work that luminance histograms and the statistics of natural images

    Updated: 2020-09-24
  • Kernelized dense layers for facial expression recognition
    arXiv.cs.CV Pub Date : 2020-09-22
    M. Amine Mahmoudi; Aladine Chetouani; Fatma Boufera; Hedi Tabia

    The fully connected layer is an essential component of Convolutional Neural Networks (CNNs), which have demonstrated their efficiency in computer vision tasks. The CNN process usually starts with convolution and pooling layers that first break down the input images into features, and then analyze them independently. The result of this process feeds into a fully connected neural network structure which drives

    Updated: 2020-09-24
  • Efficient DWT-based fusion techniques using genetic algorithm for optimal parameter estimation
    arXiv.cs.CV Pub Date : 2020-09-22
    S. Kavitha; K. K. Thyagharajan

    Image fusion plays a vital role in medical imaging. Image fusion aims to integrate complementary as well as redundant information from multiple modalities into a single fused image without distortion or loss of information. In this research work, discrete wavelet transform (DWT) and undecimated discrete wavelet transform (UDWT)-based fusion techniques using genetic algorithm (GA) for optimal parameter

    Updated: 2020-09-24
  • Role of Orthogonality Constraints in Improving Properties of Deep Networks for Image Classification
    arXiv.cs.CV Pub Date : 2020-09-22
    Hongjun Choi; Anirudh Som; Pavan Turaga

    Standard deep learning models that employ the categorical cross-entropy loss are known to perform well at image classification tasks. However, many standard models thus obtained often exhibit issues like feature redundancy, low interpretability, and poor calibration. A body of recent work has emerged that has tried addressing some of these challenges by proposing the use of new regularization functions

    Updated: 2020-09-24
  • Fuzzy Simplicial Networks: A Topology-Inspired Model to Improve Task Generalization in Few-shot Learning
    arXiv.cs.CV Pub Date : 2020-09-23
    Henry Kvinge; Zachary New; Nico Courts; Jung H. Lee; Lauren A. Phillips; Courtney D. Corley; Aaron Tuor; Andrew Avila; Nathan O. Hodas

    Deep learning has shown great success in settings with massive amounts of data but has struggled when data is limited. Few-shot learning algorithms, which seek to address this limitation, are designed to generalize well to new tasks with limited data. Typically, models are evaluated on unseen classes and datasets that are defined by the same fundamental task as they are trained for (e.g. category membership)

    Updated: 2020-09-24
  • Whole Slide Images based Cancer Survival Prediction using Attention Guided Deep Multiple Instance Learning Networks
    arXiv.cs.CV Pub Date : 2020-09-23
    Jiawen Yao; Xinliang Zhu; Jitendra Jonnagaddala; Nicholas Hawkins; Junzhou Huang

    Traditional image-based survival prediction models rely on discriminative patch labeling, which makes those methods difficult to scale to large datasets. Recent studies have shown that the Multiple Instance Learning (MIL) framework is useful for histopathological images when no annotations are available in classification tasks. Different from the current image-based survival models that are limited to key patches

    Updated: 2020-09-24
  • Foreseeing Brain Graph Evolution Over Time Using Deep Adversarial Network Normalizer
    arXiv.cs.CV Pub Date : 2020-09-23
    Zeynep Gurler; Ahmed Nebli; Islem Rekik

    Foreseeing the brain evolution as a complex highly inter-connected system, widely modeled as a graph, is crucial for mapping dynamic interactions between different anatomical regions of interest (ROIs) in health and disease. Interestingly, brain graph evolution models remain almost absent in the literature. Here we design an adversarial brain network normalizer for representing each brain network as

    Updated: 2020-09-24
Contents have been reproduced by permission of the publishers.