• arXiv.cs.CV Pub Date : 2020-09-24
Yue Wang; Alireza Fathi; Jiajun Wu; Thomas Funkhouser; Justin Solomon

A common dilemma in 3D object detection for autonomous driving is that high-quality, dense point clouds are only available during training, but not testing. We use knowledge distillation to bridge the gap between a model trained on high-quality inputs at training time and another tested on low-quality inputs at inference time. In particular, we design a two-stage training pipeline for point cloud object
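The entry above names knowledge distillation as its bridging mechanism. As a rough illustrative sketch only (generic soft-target distillation, not the paper's two-stage point-cloud pipeline; all names are hypothetical), the teacher-to-student loss is commonly written as a temperature-softened KL divergence:

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()                    # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 as in standard knowledge distillation."""
    p = softmax(teacher_logits, T)     # soft targets from the teacher
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p) - np.log(q)))) * T * T

# A student that matches the teacher incurs zero loss; a mismatched one does not.
loss_match = distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
loss_off   = distillation_loss([0.0, 2.0, 0.0], [2.0, 0.5, -1.0])
```

In the paper's setting the teacher would see the high-quality point clouds and the student the low-quality ones; this sketch only shows the loss shape.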

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-24
Lu Liu; Tianyi Zhou; Guodong Long; Jing Jiang; Chengqi Zhang

The goal of zero-shot learning (ZSL) is to train a model to classify samples of classes that were not seen during training. To address this challenging task, most ZSL methods relate unseen test classes to seen (training) classes via a pre-defined set of attributes that can describe all classes in the same semantic space, so the knowledge learned on the training classes can be adapted to unseen classes
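The attribute mechanism described above can be sketched generically (this is the classic attribute-space nearest-neighbour scheme, not this paper's method; the attribute matrix below is entirely made up):

```python
import numpy as np

# Hypothetical class-attribute matrix: rows are classes, columns are
# attributes (striped, four-legged, flies) shared by seen and unseen classes.
attributes = {
    "zebra": np.array([1, 1, 0]),
    "horse": np.array([0, 1, 0]),
    "eagle": np.array([0, 0, 1]),
}

def classify_unseen(predicted_attributes, candidate_classes):
    """Assign the unseen class whose attribute vector is nearest (L2)
    to the attributes predicted for the test sample."""
    return min(candidate_classes,
               key=lambda c: np.linalg.norm(attributes[c] - predicted_attributes))

# A model trained on seen classes predicts attributes for a test image;
# unseen classes are then ranked purely in the shared attribute space.
pred = np.array([0.9, 0.8, 0.1])
label = classify_unseen(pred, ["zebra", "eagle"])
```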

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-21
Tooba Aamir; Hai Dong; Athman Bouguettaya

We propose a heuristics-based social-sensor cloud service selection and composition model to reconstruct mosaic scenes. The proposed approach leverages crowdsourced social media images to create an image mosaic to reconstruct a scene at a designated location and an interval of time. The novel approach relies on the set of features defined on the basis of the image metadata to determine the relevance

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-24
Léa Berthomier; Bruno Pradel; Lior Perez

Nowcasting is a field of meteorology which aims at forecasting weather on a short term of up to a few hours. In the meteorology landscape, this field is rather specific as it requires particular techniques, such as data extrapolation, where conventional meteorology is generally based on physical modeling. In this paper, we focus on cloud cover nowcasting, which has various application areas such as

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-24
Jing Tan; Pengfei Xiong; Yuwen He; Kuntao Xiao; Zhengyi Lv

Salient object segmentation aims at distinguishing various salient objects from backgrounds. Despite the lack of semantic consistency, salient objects often have obvious texture and location characteristics in a local area. Based on this prior, we propose a novel Local Context Attention Network (LCANet) to generate locally reinforced feature maps in a uniform representational architecture. The proposed

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-24
Alin Banka; Inis Buzi; Islem Rekik

Graph embedding is a powerful method to represent graph neurological data (e.g., brain connectomes) in a low dimensional space for brain connectivity mapping, prediction and classification. However, existing embedding algorithms have two major limitations. First, they primarily focus on preserving one-to-one topological relationships between nodes (i.e., regions of interest (ROIs) in a connectome)

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-24
Mustafa Saglam; Islem Rekik

The individual brain can be viewed as a highly-complex multigraph (i.e. a set of graphs also called connectomes), where each graph represents a unique connectional view of pairwise brain region (node) relationships such as function or morphology. Due to its multifold complexity, understanding how brain disorders alter not only a single view of the brain graph, but its multigraph representation at the

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-24
Xin Lu; Quanquan Li; Buyu Li; Junjie Yan

Modern object detection methods can be divided into one-stage approaches and two-stage ones. One-stage detectors are more efficient owing to straightforward architectures, but the two-stage detectors still take the lead in accuracy. Although recent works try to improve the one-stage detectors by imitating the structural design of the two-stage ones, the accuracy gap is still significant. In this paper

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-24
Anoop Krishnan; Ali Almadan; Ajita Rattani

Automated gender classification has important applications in many domains, such as demographic research, law enforcement, online advertising, as well as human-computer interaction. Recent research has questioned the fairness of this technology across gender and race. Specifically, the majority of the studies raised the concern of higher error rates of the face-based gender classification system for

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-24
Ali Almadan; Anoop Krishnan; Ajita Rattani

With computer vision reaching an inflection point in the past decade, face recognition technology has become pervasive in policing, intelligence gathering, and consumer applications. Recently, face recognition technology has been deployed on bodyworn cameras to keep officers safe, enabling situational awareness and providing evidence for trial. However, limited academic research has been conducted

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-24
Taha Hasan Masood Siddique; Muhammad Usman

A technique for object localization based on pose estimation and camera calibration is presented. The 3-dimensional (3D) coordinates are estimated by collecting multiple 2-dimensional (2D) images of the object and are utilized for the calibration of the camera. The calibration steps involve a number of parameter calculations, including intrinsic and extrinsic parameters, for the removal of lens distortion
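The intrinsic/extrinsic parameters mentioned above combine in the standard pinhole projection model. As a minimal sketch under assumed, illustrative values (not the paper's calibration, and ignoring lens distortion), a 3D point projects to pixels via x = K(RX + t):

```python
import numpy as np

# Intrinsic matrix: focal lengths fx, fy and principal point (cx, cy).
# These numbers are illustrative only.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

# Extrinsics: identity rotation, camera 5 units in front of the world origin.
R = np.eye(3)
t = np.array([0.0, 0.0, 5.0])

def project(X):
    """Project world point X to pixel coordinates via x = K (R X + t)."""
    x_cam = R @ X + t          # world frame -> camera frame
    u, v, w = K @ x_cam        # camera frame -> homogeneous pixel coords
    return np.array([u / w, v / w])

# The world origin lies on the optical axis, so it lands on the principal point.
px = project(np.array([0.0, 0.0, 0.0]))
```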

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-24
Sayali Kulkarni; Tomer Gadot; Chen Luo; Tanya Birch; Eric Fegraus

Wildlife monitoring is crucial to nature conservation and has been done by manual observations from motion-triggered camera traps deployed in the field. Widespread adoption of such in-situ sensors has resulted in unprecedented data volumes being collected over the last decade. A significant challenge exists to process and reliably identify what is in these images efficiently. Advances in computer vision

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-24
Xiaokang Liu; Haijun Song

Petrographic analysis based on microfacies identification in thin sections is widely used in sedimentary environment interpretation and paleoecological reconstruction. Fossil recognition from microfacies is an essential procedure for petrographers to complete this task. Distinguishing the morphological and microstructural diversity of skeletal fragments requires extensive prior knowledge of fossil

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-24
Donghaisheng Liu; Shoudong Han; Yang Chen; Chenfei Xia; Jun Zhao

Person re-identification (Re-ID) is a challenging task as persons are often in different backgrounds. Most recent Re-ID methods treat the foreground and background information equally for person discriminative learning, but can easily lead to potential false alarm problems when different persons are in similar backgrounds or the same person is in different backgrounds. In this paper, we propose a Foreground-Guided

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-23
Renhao Wang; Ashutosh Bhudia; Brandon Dos Remedios; Minnie Teng; Raymond Ng

Accurate forecasts of fine particulate matter (PM 2.5) from wildfire smoke are crucial to safeguarding cardiopulmonary public health. Existing forecasting systems are trained on sparse and inaccurate ground truths, and do not take sufficient advantage of important spatial inductive biases. In this work, we present a convolutional neural network which preserves sparsity invariance throughout, and leverages

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-23
Amir Shalev (Tel-Aviv University, Intel); Omer Achrack (Intel); Brian Fulkerson (Tel-Aviv University); Ben-Zion Bobrovsky (Tel-Aviv University)

We consider the problem of relative pose regression in visual relocalization. Recently, several promising approaches have emerged in this area. We claim that even though they demonstrate on the same datasets using the same train/test split, a faithful comparison between them was not available, since on the currently used evaluation metric some approaches might perform favorably, while in reality

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-24
Nihad Karim Chowdhury; Md. Muhtadir Rahman; Noortaz Rezoana; Muhammad Ashad Kabir

This paper proposes an ensemble of deep convolutional neural networks (CNNs) based on EfficientNet, named ECOVNet, to detect COVID-19 using a large chest X-ray data set. At first, the open-access large chest X-ray collection is augmented, and then ImageNet pre-trained weights for EfficientNet are transferred with some customized fine-tuning top layers that are trained, followed by an ensemble of model

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-24
Keyulu Xu; Jingling Li; Mozhi Zhang; Simon S. Du; Ken-ichi Kawarabayashi; Stefanie Jegelka

We study how neural networks trained by gradient descent extrapolate, i.e., what they learn outside the support of training distribution. Previous works report mixed empirical results when extrapolating with neural networks: while multilayer perceptrons (MLPs) do not extrapolate well in simple tasks, Graph Neural Networks (GNNs), a structured network with MLP modules, have some success in more complex
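One concrete fact behind this line of work is that a ReLU MLP is piecewise linear, so far enough along any fixed ray its activation pattern freezes and the function becomes exactly linear. A small self-contained demo of that behaviour (a random network, not one from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((16, 2)), rng.standard_normal(16)
W2, b2 = rng.standard_normal(16), 0.0

def mlp(x):
    """A small random ReLU MLP mapping R^2 -> R."""
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

# Far from the origin along a fixed direction d, every ReLU is frozen
# on or off, so the network is linear there: second differences vanish.
d = np.array([1.0, 0.5])
f = [mlp(t * d) for t in (10000.0, 10001.0, 10002.0)]
second_diff = f[2] - 2 * f[1] + f[0]   # ~0 in the linear regime
```

This illustrates why MLP extrapolation defaults to linear behaviour outside the training support, which is part of what the paper analyzes.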

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-24
Ekdeep Singh Lubana; Robert P. Dick

Recent network pruning methods focus on pruning models early-on in training. To estimate the impact of removing a parameter, these methods use importance measures that were originally designed for pruning trained models. Despite lacking justification for their use early-on in training, models pruned using such measures result in surprisingly minimal accuracy loss. To better explain this behavior, we
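A canonical example of the importance measures referred to above is weight magnitude. As a hedged sketch (plain magnitude pruning, not the paper's contribution):

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights.

    |w| is a classic importance measure designed for *trained* models;
    work like this paper asks why it also works early in training.
    """
    w = np.asarray(weights, dtype=float)
    k = int(sparsity * w.size)          # number of weights to remove
    threshold = np.sort(np.abs(w).ravel())[k - 1] if k > 0 else -np.inf
    mask = np.abs(w) > threshold
    return w * mask, mask

# Keeps the two largest-magnitude weights (-0.9 and 0.3), zeroes the rest.
pruned, mask = magnitude_prune([0.05, -0.9, 0.3, -0.01], sparsity=0.5)
```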

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-24
Yihao Chen; Xin Tang; Xianbiao Qi; Chun-Guang Li; Rong Xiao

Graph Neural Networks (GNNs) have attracted considerable attention and have emerged as a new promising paradigm to process graph-structured data. GNNs are usually stacked to multiple layers and the node representations in each layer are computed through propagating and aggregating the neighboring node features with respect to the graph. By stacking to multiple layers, GNNs are able to capture the long-range

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-24
Hao Zhang; Sen Li; Yinchao Ma; Mingjie Li; Yichen Xie; Quanshi Zhang

This paper aims to understand and improve the utility of the dropout operation from the perspective of game-theoretic interactions. We prove that dropout can suppress the strength of interactions between input variables of deep neural networks (DNNs). The theoretical proof is also verified by various experiments. Furthermore, we find that such interactions are strongly related to the over-fitting
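For reference, the dropout operation under study is typically implemented in its "inverted" form, which rescales surviving activations so the expected value is unchanged. A minimal generic sketch (standard dropout, nothing specific to this paper):

```python
import numpy as np

def dropout(x, p, rng, training=True):
    """Inverted dropout: zero each activation with probability p and
    rescale survivors by 1/(1-p), keeping E[output] == E[input]."""
    if not training or p == 0.0:
        return x
    keep = rng.random(x.shape) >= p
    return x * keep / (1.0 - p)

rng = np.random.default_rng(0)
x = np.ones(100_000)
y = dropout(x, p=0.5, rng=rng)
# The empirical mean of y stays close to the mean of x (here, 1.0).
```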

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-24
Jie Liu; Jie Tang; Gangshan Wu

Recent advances in single image super-resolution (SISR) explored the power of convolutional neural network (CNN) to achieve a better performance. Despite the great success of CNN-based methods, it is not easy to apply these methods to edge devices due to the requirement of heavy computation. To solve this problem, various fast and lightweight CNN models have been proposed. The information distillation

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-23
Benedikt Hosp; Florian Schultz; Enkelejda Kasneci; Oliver Höner

Recent research on expertise assessment of soccer players has emphasized the importance of perceptual skills. Former research focused either on high experimental control or a natural presentation mode. To assess athletes' perceptual skills in an optimized manner, we captured omnidirectional in-field scenes and showed them to 12 expert, 9 intermediate, and 13 novice soccer goalkeepers on virtual reality glasses

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-24
Ahmed Nebli; Islem Rekik

Brain connectivity networks, derived from magnetic resonance imaging (MRI), non-invasively quantify the relationship in function, structure, and morphology between two brain regions of interest (ROIs) and give insights into gender-related connectional differences. However, to the best of our knowledge, studies on gender differences in brain connectivity were limited to investigating pairwise (i.e.

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-23
Matthias Rottmann; Mathis Peyron; Natasa Krejic; Hanno Gottschalk

Deep neural networks (DNNs) have proven to be powerful tools for processing unstructured data. However for high-dimensional data, like images, they are inherently vulnerable to adversarial attacks. Small almost invisible perturbations added to the input can be used to fool DNNs. Various attacks, hardening methods and detection methods have been introduced in recent years. Notoriously, Carlini-Wagner

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-23
Emmanuel Iarussi; Felix Thomsen; Claudio Delrieux

Research in vertebral bone micro-structure generally requires costly procedures to obtain physical scans of real bone with a specific pathology under study, since no methods are available yet to generate realistic bone structures in-silico. Here we propose to apply recent advances in generative adversarial networks (GANs) to develop such a method. We adapted style-transfer techniques, which have been

Updated: 2020-09-25
• arXiv.cs.CV Pub Date : 2020-09-23
Nidhin Harilal; Udit Bhatia; Mayank Singh

Projection of changes in extreme indices of climate variables such as temperature and precipitation are critical to assess the potential impacts of climate change on human-made and natural systems, including critical infrastructures and ecosystems. While impact assessment and adaptation planning rely on high-resolution projections (typically in the order of a few kilometers), state-of-the-art Earth

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Jaemin Cho; Jiasen Lu; Dustin Schwenk; Hannaneh Hajishirzi; Aniruddha Kembhavi

Mirroring the success of masked language models, vision-and-language counterparts like ViLBERT, LXMERT and UNITER have achieved state of the art performance on a variety of multimodal discriminative tasks like visual question answering and visual grounding. Recent work has also successfully adapted such models towards the generative task of image captioning. This begs the question: Can these models

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Oliver M. Crook; Mihai Cucuringu; Tim Hurst; Carola-Bibiane Schönlieb; Matthew Thorpe; Konstantinos C. Zygalakis

The transportation $\mathrm{L}^p$ distance, denoted $\mathrm{TL}^p$, has been proposed as a generalisation of Wasserstein $\mathrm{W}^p$ distances motivated by the property that it can be applied directly to colour or multi-channelled images, as well as multivariate time-series without normalisation or mass constraints. These distances, as with $\mathrm{W}^p$, are powerful tools in modelling data with
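As background for the $\mathrm{W}^p$ distances that $\mathrm{TL}^p$ generalises: in one dimension, the Wasserstein distance between two equal-size empirical measures reduces to matching sorted samples. A hedged sketch of that special case (ordinary $\mathrm{W}^p$, not the $\mathrm{TL}^p$ distance itself):

```python
import numpy as np

def wasserstein_p(a, b, p=2):
    """W_p between two 1-D empirical measures with equal sample counts:
    the optimal coupling simply matches sorted samples."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    return float(np.mean(np.abs(a - b) ** p) ** (1.0 / p))

# Translating a distribution by c moves it exactly W_p = c away.
a = np.array([0.0, 1.0, 2.0])
dist = wasserstein_p(a, a + 3.0)
```

The appeal of $\mathrm{TL}^p$ noted in the abstract is that it lifts this idea to signals such as multi-channel images without requiring normalisation or equal mass.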

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Gaston Lenczner; Adrien Chan-Hon-Tong; Nicola Luminari; Bertrand Le Saux; Guy Le Besnerais

Dense pixel-wise classification maps output by deep neural networks are of extreme importance for scene understanding. However, these maps are often partially inaccurate due to a variety of possible factors. Therefore, we propose to interactively refine them within a framework named DISCA (Deep Image Segmentation with Continual Adaptation). It consists of continually adapting a neural network to a

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Binjie Zhang; Yu Li; Chun Yuan; Dejing Xu; Pin Jiang; Ying Shan

The task of language-guided video temporal grounding is to localize the particular video clip corresponding to a query sentence in an untrimmed video. Though progress has been made continuously in this field, some issues still need to be resolved. First, most of the existing methods rely on the combination of multiple complicated modules to solve the task. Second, due to the semantic gaps between the

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Sylvain Guy; Stéphane Lathuilière; Pablo Mesejo; Radu Horaud

Visual voice activity detection (V-VAD) uses visual features to predict whether a person is speaking or not. V-VAD is useful whenever audio VAD (A-VAD) is inefficient either because the acoustic signal is difficult to analyze or because it is simply missing. We propose two deep architectures for V-VAD, one based on facial landmarks and one based on optical flow. Moreover, available datasets, used for

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Junichiro Iwasawa; Yuichiro Hirano; Yohei Sugawara

Obtaining annotations for 3D medical images is expensive and time-consuming, despite its importance for automating segmentation tasks. Although multi-task learning is considered an effective method for training segmentation models using small amounts of annotated data, a systematic understanding of various subtasks is still lacking. In this study, we propose a multi-task segmentation model with a contrastive

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Albert Mosella-Montoro; Javier Ruiz-Hidalgo

Multi-modal fusion has been proved to help enhance the performance of scene classification tasks. This paper presents a 2D-3D fusion stage that combines 3D Geometric features with 2D Texture features obtained by 2D Convolutional Neural Networks. To get a robust 3D Geometric embedding, a network that uses two novel layers is proposed. The first layer, Multi-Neighbourhood Graph Convolution, aims to learn

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Jihun Yi; Eunji Kim; Siwon Kim; Sungroh Yoon

In this work, we attempt to explain the prediction of any black-box classifier from an information-theoretic perspective. For this purpose, we propose two attribution maps: an information gain (IG) map and a point-wise mutual information (PMI) map. IG map provides a class-independent answer to "How informative is each pixel?", and PMI map offers a class-specific explanation by answering "How much does
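The two quantities named above have textbook information-theoretic forms; a hedged sketch with toy numbers (these generic definitions may differ in detail from the maps the paper actually computes per pixel):

```python
import math

def pmi(p_class_given_input, p_class):
    """Point-wise mutual information: log p(c|x) / p(c).
    Positive when the evidence raises the probability of class c."""
    return math.log(p_class_given_input / p_class)

def information_gain(posterior, prior):
    """KL(posterior || prior): class-independent informativeness of the evidence."""
    return sum(q * math.log(q / p) for q, p in zip(posterior, prior) if q > 0)

# Toy two-class example: the evidence strongly favours class 0.
prior = [0.5, 0.5]
posterior = [0.9, 0.1]
pmi_c0 = pmi(0.9, 0.5)            # > 0: supports class 0
pmi_c1 = pmi(0.1, 0.5)            # < 0: opposes class 1
ig = information_gain(posterior, prior)   # > 0: the evidence is informative
```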

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Tuong Do; Binh X. Nguyen; Huy Tran; Erman Tjiputra; Quang D. Tran; Thanh-Toan Do

Different approaches have been proposed to Visual Question Answering (VQA). However, few works consider how varying joint-modality methods behave with respect to question-type prior knowledge extracted from data, which constrains the answer search space and gives a reliable cue for reasoning about answers to questions asked about input images. In this paper, we propose a novel VQA model that utilizes

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Ahmet Serkan Goktas; Alaa Bessadok; Islem Rekik

While existing predictive frameworks are able to handle Euclidean structured data (i.e., brain images), they might fail to generalize to geometric non-Euclidean data such as brain networks. Besides, these frameworks root the sample selection step in a Euclidean or learned similarity measure between vectorized training and testing brain networks. Such a sample connectomic representation might include irrelevant

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Taihua Wang; Donald G. Dansereau

Distinguishing visually similar objects like forged/authentic bills and healthy/unhealthy plants is beyond the capabilities of even the most sophisticated classifiers. We propose the use of multiplexed illumination to extend the range of objects that can be successfully classified. We construct a compact RGB-IR light stage that images samples under different combinations of illuminant position and

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-22
Jia Xue; Hang Zhang; Ko Nishino; Kristin J. Dana

Computational surface modeling that underlies material recognition has transitioned from reflectance modeling using in-lab controlled radiometric measurements to image-based representations based on internet-mined single-view images captured in the scene. We take a middle-ground approach for material recognition that takes advantage of both rich radiometric cues and flexible image capture. A key concept

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Alberto Sabater; Luis Montesano; Ana C. Murillo

Object recognition in video is an important task for plenty of applications, including autonomous driving perception, surveillance tasks, wearable devices or IoT networks. Object recognition using video data is more challenging than using still images due to blur, occlusions or rare object poses. Specific video detectors with high computational cost or standard image detectors together with a fast

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-21
Michel Melo Silva; Washington Luis Souza Ramos; Mario Fernando Montenegro Campos; Erickson Rangel Nascimento

Technological advances in sensors have paved the way for digital cameras to become increasingly ubiquitous, which, in turn, led to the popularity of the self-recording culture. As a result, the amount of visual data on the Internet is moving in the opposite direction of the available time and patience of the users. Thus, most of the uploaded videos are doomed to be forgotten and unwatched stashed away

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Dimche Kostadinov; Davide Scaramuzza

Event-based cameras record an asynchronous stream of per-pixel brightness changes. As such, they have numerous advantages over standard frame-based cameras, including high temporal resolution, high dynamic range, and no motion blur. Due to this asynchronous nature, efficient learning of compact representations for event data is challenging. Moreover, it remains unexplored to what extent the spatial

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Song Park; Sanghyuk Chun; Junbum Cha; Bado Lee; Hyunjung Shim

Automatic few-shot font generation is in high demand because manual designs are expensive and sensitive to the expertise of designers. Existing few-shot font generation methods aim to learn to disentangle the style and content elements from a few reference glyphs, and mainly focus on a universal style representation for each font style. However, such an approach limits the model in representing diverse

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Cong Geng; Jia Wang; Li Chen; Zhiyong Gao

Variational Autoencoders (VAEs) and their variations are classic generative models that learn a low-dimensional latent representation satisfying some prior distribution (e.g., a Gaussian distribution). Their advantages over GANs are that they can simultaneously generate high-dimensional data and learn latent representations to reconstruct the inputs. However, it has been observed that a trade-off exists
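Two standard VAE ingredients referenced implicitly above are the reparameterization trick and the closed-form Gaussian KL term. A minimal generic sketch (textbook VAE machinery, not this paper's variant):

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z ~ N(mu, sigma^2) as mu + sigma * eps with eps ~ N(0, I),
    keeping the sampling step differentiable w.r.t. (mu, log_var)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, sigma^2) || N(0, I) ): the VAE prior-matching
    term, 0.5 * sum(sigma^2 + mu^2 - 1 - log sigma^2)."""
    return 0.5 * float(np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var))

rng = np.random.default_rng(0)
mu, log_var = np.zeros(4), np.zeros(4)
z = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)  # 0 when posterior equals the prior
```

The trade-off the abstract alludes to lives between this KL term and the reconstruction term of the evidence lower bound.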

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Keisuke Kanda; Brian Kenji Iwana; Seiichi Uchida

Analyzing the handwriting generation process is an important issue and has been tackled by various generation models, such as kinematics-based models and stochastic models. In this study, we use a reinforcement learning (RL) framework to realize handwriting generation with a careful future-planning ability. In fact, the handwriting process of human beings is also supported by their future planning

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Zehan Zhang; Ming Zhang; Zhidong Liang; Xian Zhao; Ming Yang; Wenming Tan; ShiLiang Pu

3D vehicle detection based on multi-modal fusion is an important task of many applications such as autonomous driving. Although significant progress has been made, we still observe two aspects that need further improvement: First, the specific gain that camera images can bring to 3D detection is seldom explored by previous works. Second, many fusion algorithms run slowly, which is essential for

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Ping Li; Qinghao Ye; Luming Zhang; Li Yuan; Xianghua Xu; Ling Shao

Video summarization is an effective way to facilitate video searching and browsing. Most existing systems employ encoder-decoder-based recurrent neural networks, which fail to explicitly diversify the system-generated summary frames while requiring intensive computations. In this paper, we propose an efficient convolutional neural network architecture for video SUMmarization via Global Diverse Attention

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Maor Ivgi; Yaniv Benny; Avichai Ben-David; Jonathan Berant; Lior Wolf

Generating high-quality images from scene graphs, that is, graphs that describe multiple entities in complex relations, is a challenging task that attracted substantial interest recently. Prior work trained such models by using supervised learning, where the goal is to produce the exact target image layout for each scene graph. It relied on predicting object locations and shapes independently and in

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Tang Lv; Bo Li

Salient object detection (SOD) is a fundamental computer vision task. Recently, with the revival of deep neural networks, SOD has made great progress. However, there still exist two thorny issues that cannot be well addressed by existing methods, indistinguishable regions and complex structures. To address these two issues, in this paper we propose a novel deep network for accurate SOD, named CLASS

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Ziqiang Shi; Liu Liu; Rujie Liu; Xiaoyu Mi; Kentaro Murase

End-to-end convolutional representation learning has been proven to be very effective in facial action unit (AU) detection. Considering the co-occurrence and mutual exclusion between facial AUs, in this paper we propose convolutional neural networks with Local Region Relation Learning (LoRRaL), which can combine latent relationships among AUs for an end-to-end approach to facial AU occurrence detection

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Pengju Zhang; Yihong Wu; Bingxi Liu

Visual localization to compute 6DoF camera pose from a given image has wide applications such as in robotics, virtual reality, augmented reality, etc. Two kinds of descriptors are important for the visual localization. One is global descriptors that extract the whole feature from each image. The other is local descriptors that extract the local feature from each image patch usually enclosing a key

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Bingcong Li; Xin Tang; Xianbiao Qi; Yihao Chen; Rong Xiao

Recently, inspired by the Transformer, self-attention-based scene text recognition approaches have achieved outstanding performance. However, we find that the model size expands rapidly as the lexicon grows. Specifically, the number of parameters in the softmax classification layer and the output embedding layer is proportional to the vocabulary size. This hinders the development of a lightweight text

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Ue-Hwan Kim; Dongho Ka; Hwasoo Yeo; Jong-Hwan Kim

Minimizing traffic accidents between vehicles and pedestrians is one of the primary research goals in intelligent transportation systems. To achieve the goal, pedestrian behavior recognition and prediction of pedestrian's crossing or not-crossing intention play a central role. Contemporary approaches do not guarantee satisfactory performance due to lack of generalization, the requirement of manual

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-22
Jia Xue; Matthew Purri; Kristin Dana

Moving cameras provide multiple intensity measurements per pixel, yet often semantic segmentation, material recognition, and object recognition do not utilize this information. With basic alignment over several frames of a moving camera sequence, a distribution of intensities over multiple angles is obtained. It is well known from prior work that luminance histograms and the statistics of natural images

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-22
M. Amine Mahmoudi; Aladine Chetouani; Fatma Boufera; Hedi Tabia

The fully connected layer is an essential component of Convolutional Neural Networks (CNNs), which demonstrates its efficiency in computer vision tasks. The CNN process usually starts with convolution and pooling layers that first break down the input images into features and then analyze them independently. The result of this process feeds into a fully connected neural network structure which drives

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-22
S. Kavitha; K. K. Thyagharajan

Image fusion plays a vital role in medical imaging. Image fusion aims to integrate complementary as well as redundant information from multiple modalities into a single fused image without distortion or loss of information. In this research work, discrete wavelet transform (DWT)- and undecimated discrete wavelet transform (UDWT)-based fusion techniques using a genetic algorithm (GA) for optimal parameter

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-22
Hongjun Choi; Anirudh Som; Pavan Turaga

Standard deep learning models that employ the categorical cross-entropy loss are known to perform well at image classification tasks. However, many standard models thus obtained often exhibit issues like feature redundancy, low interpretability, and poor calibration. A body of recent work has emerged that has tried addressing some of these challenges by proposing the use of new regularization functions
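For reference, the categorical cross-entropy loss named above has a short, numerically stable form when computed from raw logits. A generic sketch (the standard loss itself, not any of the new regularizers the paper proposes):

```python
import numpy as np

def categorical_cross_entropy(logits, label):
    """Standard CE loss from raw logits: -log softmax(logits)[label]."""
    z = np.asarray(logits, dtype=float)
    z = z - z.max()                        # numerical stability
    log_probs = z - np.log(np.exp(z).sum())
    return float(-log_probs[label])

# A confident, correct prediction yields a small loss;
# the same logits scored against the wrong label yield a large one.
small = categorical_cross_entropy([10.0, 0.0, 0.0], label=0)
large = categorical_cross_entropy([10.0, 0.0, 0.0], label=1)
```

The calibration and feature-redundancy issues mentioned in the abstract are attributed to training purely against this objective.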

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Henry Kvinge; Zachary New; Nico Courts; Jung H. Lee; Lauren A. Phillips; Courtney D. Corley; Aaron Tuor; Andrew Avila; Nathan O. Hodas

Deep learning has shown great success in settings with massive amounts of data but has struggled when data is limited. Few-shot learning algorithms, which seek to address this limitation, are designed to generalize well to new tasks with limited data. Typically, models are evaluated on unseen classes and datasets that are defined by the same fundamental task as they are trained for (e.g. category membership)

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Jiawen Yao; Xinliang Zhu; Jitendra Jonnagaddala; Nicholas Hawkins; Junzhou Huang

Traditional image-based survival prediction models rely on discriminative patch labeling, which makes those methods hard to scale to large datasets. Recent studies have shown the Multiple Instance Learning (MIL) framework is useful for histopathological images when no annotations are available in a classification task. Different from the current image-based survival models that are limited to key patches

Updated: 2020-09-24
• arXiv.cs.CV Pub Date : 2020-09-23
Zeynep Gurler; Ahmed Nebli; Islem Rekik

Foreseeing brain evolution as a complex, highly inter-connected system, widely modeled as a graph, is crucial for mapping dynamic interactions between different anatomical regions of interest (ROIs) in health and disease. Interestingly, brain graph evolution models remain almost absent from the literature. Here we design an adversarial brain network normalizer for representing each brain network as

Updated: 2020-09-24
Contents have been reproduced by permission of the publishers.
