Current journal: arXiv - CS - Computer Vision and Pattern Recognition
  • Multiview Detection with Feature Perspective Transformation
    arXiv.cs.CV Pub Date : 2020-07-14
    Yunzhong Hou; Liang Zheng; Stephen Gould

    Incorporating multiple camera views for detection alleviates the impact of occlusions in crowded scenes. In a multiview system, we need to answer two important questions when dealing with ambiguities that arise from occlusions. First, how should we aggregate cues from the multiple views? Second, how should we aggregate unreliable 2D and 3D spatial information that has been tainted by occlusions? To

    Updated: 2020-07-15
  • Transposer: Universal Texture Synthesis Using Feature Maps as Transposed Convolution Filter
    arXiv.cs.CV Pub Date : 2020-07-14
    Guilin Liu; Rohan Taori; Ting-Chun Wang; Zhiding Yu; Shiqiu Liu; Fitsum A. Reda; Karan Sapra; Andrew Tao; Bryan Catanzaro

    Conventional CNNs for texture synthesis consist of a sequence of (de)-convolution and up/down-sampling layers, where each layer operates locally and lacks the ability to capture the long-term structural dependency required by texture synthesis. Thus, they often simply enlarge the input texture, rather than perform reasonable synthesis. As a compromise, many recent methods sacrifice generalizability

    Updated: 2020-07-15
  • Modeling Artistic Workflows for Image Generation and Editing
    arXiv.cs.CV Pub Date : 2020-07-14
    Hung-Yu Tseng; Matthew Fisher; Jingwan Lu; Yijun Li; Vladimir Kim; Ming-Hsuan Yang

    People often create art by following an artistic workflow involving multiple stages that inform the overall design. If an artist wishes to modify an earlier decision, significant work may be required to propagate this new decision forward to the final artwork. Motivated by the above observations, we propose a generative model that follows a given artistic workflow, enabling both multi-stage image generation

    Updated: 2020-07-15
  • Multitask Learning Strengthens Adversarial Robustness
    arXiv.cs.CV Pub Date : 2020-07-14
    Chengzhi Mao; Amogh Gupta; Vikram Nitin; Baishakhi Ray; Shuran Song; Junfeng Yang; Carl Vondrick

    Although deep networks achieve strong accuracy on a range of computer vision benchmarks, they remain vulnerable to adversarial attacks, where imperceptible input perturbations fool the network. We present both theoretical and empirical analyses that connect the adversarial robustness of a model to the number of tasks that it is trained on. Experiments on two datasets show that attack difficulty increases

    Updated: 2020-07-15
  • MeTRAbs: Metric-Scale Truncation-Robust Heatmaps for Absolute 3D Human Pose Estimation
    arXiv.cs.CV Pub Date : 2020-07-12
    István Sárándi; Timm Linder; Kai O. Arras; Bastian Leibe

    Heatmap representations have formed the basis of human pose estimation systems for many years, and their extension to 3D has been a fruitful line of recent research. This includes 2.5D volumetric heatmaps, whose X and Y axes correspond to image space and Z to metric depth around the subject. To obtain metric-scale predictions, 2.5D methods need a separate post-processing step to resolve scale ambiguity

    Updated: 2020-07-15
  • Collaborative Unsupervised Domain Adaptation for Medical Image Diagnosis
    arXiv.cs.CV Pub Date : 2020-07-05
    Yifan Zhang; Ying Wei; Qingyao Wu; Peilin Zhao; Shuaicheng Niu; Junzhou Huang; Mingkui Tan

    Deep learning based medical image diagnosis has shown great potential in clinical medicine. However, it often suffers two major difficulties in real-world applications: 1) only limited labels are available for model training, due to expensive annotation costs over medical images; 2) labeled images may contain considerable label noise (e.g., mislabeling labels) due to diagnostic difficulties of diseases

    Updated: 2020-07-15
  • Alpha-Net: Architecture, Models, and Applications
    arXiv.cs.CV Pub Date : 2020-06-27
    Jishan Shaikh; Adya Sharma; Ankit Chouhan; Avinash Mahawar

    Deep learning network training is usually computationally expensive and intuitively complex. We present a novel network architecture for custom training and weight evaluations. We reformulate the layers as ResNet-similar blocks with certain inputs and outputs of their own, the blocks (called Alpha blocks) on their connection configuration form their own network, combined with our novel loss function

    Updated: 2020-07-15
  • Learning Accurate and Human-Like Driving using Semantic Maps and Attention
    arXiv.cs.CV Pub Date : 2020-07-10
    Simon Hecker; Dengxin Dai; Alexander Liniger; Luc Van Gool

    This paper investigates how end-to-end driving models can be improved to drive more accurately and human-like. To tackle the first issue we exploit semantic and visual maps from HERE Technologies and augment the existing Drive360 dataset with such. The maps are used in an attention mechanism that promotes segmentation confidence masks, thus focusing the network on semantic classes in the image that

    Updated: 2020-07-15
  • CenterNet3D:An Anchor free Object Detector for Autonomous Driving
    arXiv.cs.CV Pub Date : 2020-07-13
    Guojun Wang; Bin Tian; Yunfeng Ai; Tong Xu; Long Chen; Dongpu Cao

    Accurate and fast 3D object detection from point clouds is a key task in autonomous driving. Existing one-stage 3D object detection methods can achieve real-time performance, however, they are dominated by anchor-based detectors which are inefficient and require additional post-processing. In this paper, we eliminate anchors and model an object as a single point, the center point of its bounding box

    Updated: 2020-07-15
  • Breaking the Curse of Space Explosion: Towards Efficient NAS with Curriculum Search
    arXiv.cs.CV Pub Date : 2020-07-07
    Yong Guo; Yaofo Chen; Yin Zheng; Peilin Zhao; Jian Chen; Junzhou Huang; Mingkui Tan

    Neural architecture search (NAS) has become an important approach to automatically find effective architectures. To cover all possible good architectures, we need to search in an extremely large search space with billions of candidate architectures. More critically, given a large search space, we may face a very challenging issue of space explosion. However, due to the limitation of computational resources

    Updated: 2020-07-15
  • Deep Heterogeneous Autoencoder for Subspace Clustering of Sequential Data
    arXiv.cs.CV Pub Date : 2020-07-14
    Abubakar Siddique; Reza Jalil Mozhdehi; Henry Medeiros

    We propose an unsupervised learning approach using a convolutional and fully connected autoencoder, which we call deep heterogeneous autoencoder, to learn discriminative features from segmentation masks and detection bounding boxes. To learn the mask shape information and its corresponding location in an input image, we extract coarse masks from a pretrained semantic segmentation network as well as

    Updated: 2020-07-15
  • Wavelet-Based Dual-Branch Network for Image Demoireing
    arXiv.cs.CV Pub Date : 2020-07-14
    Lin Liu; Jianzhuang Liu; Shanxin Yuan; Gregory Slabaugh; Ales Leonardis; Wengang Zhou; Qi Tian

    When smartphone cameras are used to take photos of digital screens, usually moire patterns result, severely degrading photo quality. In this paper, we design a wavelet-based dual-branch network (WDNet) with a spatial attention mechanism for image demoireing. Existing image restoration methods working in the RGB domain have difficulty in distinguishing moire patterns from true scene texture. Unlike

    Updated: 2020-07-15
  • Towards Dense People Detection with Deep Learning and Depth images
    arXiv.cs.CV Pub Date : 2020-07-14
    David Fuentes-Jimenez; Cristina Losada-Gutierrez; David Casillas-Perez; Javier Macias-Guarasa; Roberto Martin-Lopez; Daniel Pizarro; Carlos A. Luna

    This paper proposes a DNN-based system that detects multiple people from a single depth image. Our neural network processes a depth image and outputs a likelihood map in image coordinates, where each detection corresponds to a Gaussian-shaped local distribution, centered at the person's head. The likelihood map encodes both the number of detected people and their 2D image positions, and can be used

    Updated: 2020-07-15
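    The likelihood-map idea in the abstract above (one Gaussian-shaped bump per detection, centered at the person's head) can be illustrated with a small sketch. This is not the authors' network or training code; the function names, the `sigma` value, the max-based merging of overlapping bumps, and the peak threshold are all assumptions chosen for illustration.

```python
import math


def gaussian_likelihood_map(height, width, heads, sigma=2.0):
    """Build a height x width grid with one Gaussian bump per head position.

    `heads` is a list of (row, col) positions. Hypothetical helper, not from the paper.
    """
    grid = [[0.0] * width for _ in range(height)]
    for cy, cx in heads:
        for y in range(height):
            for x in range(width):
                d2 = (y - cy) ** 2 + (x - cx) ** 2
                value = math.exp(-d2 / (2.0 * sigma ** 2))
                # Assumption: overlapping bumps are merged by keeping the maximum.
                grid[y][x] = max(grid[y][x], value)
    return grid


def local_maxima(grid, threshold=0.5):
    """Return (row, col) cells above `threshold` that exceed all 8 neighbours."""
    h, w = len(grid), len(grid[0])
    peaks = []
    for y in range(h):
        for x in range(w):
            v = grid[y][x]
            if v < threshold:
                continue
            neighbours = [
                grid[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
                if (j, i) != (y, x)
            ]
            if all(v > n for n in neighbours):
                peaks.append((y, x))
    return peaks


heads = [(3, 4), (10, 12)]
grid = gaussian_likelihood_map(16, 16, heads)
peaks = local_maxima(grid)
```

    Reading the peaks back out of the map recovers both the number of people (`len(peaks)`) and their 2D positions, which is the property the abstract attributes to the likelihood map.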
  • An Uncertainty-based Human-in-the-loop System for Industrial Tool Wear Analysis
    arXiv.cs.CV Pub Date : 2020-07-14
    Alexander Treiss; Jannis Walk; Niklas Kühl

    Convolutional neural networks have shown to achieve superior performance on image segmentation tasks. However, convolutional neural networks, operating as black-box systems, generally do not provide a reliable measure about the confidence of their decisions. This leads to various problems in industrial settings, amongst others, inadequate levels of trust from users in the model's outputs as well as

    Updated: 2020-07-15
  • Re-ranking for Writer Identification and Writer Retrieval
    arXiv.cs.CV Pub Date : 2020-07-14
    Simon Jordan; Mathias Seuret; Pavel Král; Ladislav Lenc; Jiří Martínek; Barbara Wiermann; Tobias Schwinger; Andreas Maier; Vincent Christlein

    Automatic writer identification is a common problem in document analysis. State-of-the-art methods typically focus on the feature extraction step with traditional or deep-learning-based techniques. In retrieval problems, re-ranking is a commonly used technique to improve the results. Re-ranking refines an initial ranking result by using the knowledge contained in the ranked result, e.g., by exploiting

    Updated: 2020-07-15
  • Pasadena: Perceptually Aware and Stealthy Adversarial Denoise Attack
    arXiv.cs.CV Pub Date : 2020-07-14
    Yupeng Cheng; Qing Guo; Felix Juefei-Xu; Xiaofei Xie; Shang-Wei Lin; Weisi Lin; Wei Feng; Yang Liu

    Image denoising techniques have been widely employed in multimedia devices as an image post-processing operation that can remove sensor noise and produce visually clean images for further AI tasks, e.g., image classification. In this paper, we investigate a new task, adversarial denoise attack, that stealthily embeds attacks inside the image denoising module. Thus it can simultaneously denoise input

    Updated: 2020-07-15
  • Unsupervised Multi-Target Domain Adaptation Through Knowledge Distillation
    arXiv.cs.CV Pub Date : 2020-07-14
    Le Thanh Nguyen-Meidine; Madhu Kiran; Jose Dolz; Eric Granger; Atif Bela; Louis-Antoine Blais-Morin

    Unsupervised domain adaptation (UDA) seeks to alleviate the problem of domain shift between the distribution of unlabeled data from the target domain w.r.t. labeled data from the source domain. While the single-target domain scenario is well studied in UDA literature, the Multi-Target Domain Adaptation (MTDA) setting remains largely unexplored despite its importance. For instance, in video surveillance

    Updated: 2020-07-15
  • UDBNET: Unsupervised Document Binarization Network via Adversarial Game
    arXiv.cs.CV Pub Date : 2020-07-14
    Amandeep Kumar; Shuvozit Ghose; Pinaki Nath Chowdhury; Partha Pratim Roy; Umapada Pal

    Degraded document image binarization is one of the most challenging tasks in the domain of document image analysis. In this paper, we present a novel approach towards document image binarization by introducing three-player min-max adversarial game. We train the network in an unsupervised setup by assuming that we do not have any paired-training data. In our approach, an Adversarial Texture Augmentation

    Updated: 2020-07-15
  • Towards Realistic 3D Embedding via View Alignment
    arXiv.cs.CV Pub Date : 2020-07-14
    Fangneng Zhan; Shijian Lu; Changgong Zhang; Feiying Ma; Xuansong Xie

    Recent advances in generative adversarial networks (GANs) have achieved great success in automated image composition that generates new images by embedding interested foreground objects into background images automatically. On the other hand, most existing works deal with foreground objects in two-dimensional (2D) images though foreground objects in three-dimensional (3D) models are more flexible with

    Updated: 2020-07-15
  • Unsupervised Human 3D Pose Representation with Viewpoint and Pose Disentanglement
    arXiv.cs.CV Pub Date : 2020-07-14
    Qiang Nie; Ziwei Liu; Yunhui Liu

    Learning a good 3D human pose representation is important for human pose related tasks, e.g. human 3D pose estimation and action recognition. Within all these problems, preserving the intrinsic pose information and adapting to view variations are two critical issues. In this work, we propose a novel Siamese denoising autoencoder to learn a 3D pose representation by disentangling the pose-dependent

    Updated: 2020-07-15
  • RGB-D Salient Object Detection with Cross-Modality Modulation and Selection
    arXiv.cs.CV Pub Date : 2020-07-14
    Chongyi Li; Runmin Cong; Yongri Piao; Qianqian Xu; Chen Change Loy

    We present an effective method to progressively integrate and refine the cross-modality complementarities for RGB-D salient object detection (SOD). The proposed network mainly solves two challenging issues: 1) how to effectively integrate the complementary information from RGB image and its corresponding depth map, and 2) how to adaptively select more saliency-related features. First, we propose a

    Updated: 2020-07-15
  • Video Object Segmentation with Episodic Graph Memory Networks
    arXiv.cs.CV Pub Date : 2020-07-14
    Xinkai Lu; Wenguan Wang; Martin Danelljan; Tianfei Zhou; Jianbing Shen; Luc Van Gool

    How to make a segmentation model to efficiently adapt to a specific video and to online target appearance variations are fundamentally crucial issues in the field of video object segmentation. In this work, a novel graph memory network is developed to address the novel idea of "learning to update the segmentation model". Specifically, we exploit an episodic memory network, organized as a fully connected

    Updated: 2020-07-15
  • Correlation filter tracking with adaptive proposal selection for accurate scale estimation
    arXiv.cs.CV Pub Date : 2020-07-14
    Luo Xiong; Yanjie Liang; Yan Yan; Hanzi Wang

    Recently, some correlation filter based trackers with detection proposals have achieved state-of-the-art tracking results. However, a large number of redundant proposals given by the proposal generator may degrade the performance and speed of these trackers. In this paper, we propose an adaptive proposal selection algorithm which can generate a small number of high-quality proposals to handle the problem

    Updated: 2020-07-15
  • Pose2RGBD. Generating Depth and RGB images from absolute positions
    arXiv.cs.CV Pub Date : 2020-07-14
    Mihai Cristian Pîrvu

    We propose a method at the intersection of Computer Vision and Computer Graphics fields, which automatically generates RGBD images using neural networks, based on previously seen and synchronized video, depth and pose signals. Since the models must be able to reconstruct both texture (RGB) and structure (Depth), it creates an implicit representation of the scene, as opposed to explicit ones, such as

    Updated: 2020-07-15
  • Improving Face Recognition by Clustering Unlabeled Faces in the Wild
    arXiv.cs.CV Pub Date : 2020-07-14
    Aruni RoyChowdhury; Xiang Yu; Kihyuk Sohn; Erik Learned-Miller; Manmohan Chandraker

    While deep face recognition has benefited significantly from large-scale labeled data, current research is focused on leveraging unlabeled data to further boost performance, reducing the cost of human annotation. Prior work has mostly been in controlled settings, where the labeled and unlabeled data sets have no overlapping identities by construction. This is not realistic in large-scale face recognition

    Updated: 2020-07-15
  • P-KDGAN: Progressive Knowledge Distillation with GANs for One-class Novelty Detection
    arXiv.cs.CV Pub Date : 2020-07-14
    Zhiwei Zhang; Shifeng Chen; Lei Sun

    One-class novelty detection is to identify anomalous instances that do not conform to the expected normal instances. In this paper, the Generative Adversarial Networks (GANs) based on encoder-decoder-encoder pipeline are used for detection and achieve state-of-the-art performance. However, deep neural networks are too over-parameterized to deploy on resource-limited devices. Therefore, Progressive

    Updated: 2020-07-15
  • Learning Semantics-enriched Representation via Self-discovery, Self-classification, and Self-restoration
    arXiv.cs.CV Pub Date : 2020-07-14
    Fatemeh Haghighi; Mohammad Reza Hosseinzadeh Taher; Zongwei Zhou; Michael B. Gotway; Jianming Liang

    Medical images are naturally associated with rich semantics about the human anatomy, reflected in an abundance of recurring anatomical patterns, offering unique potential to foster deep semantic representation learning and yield semantically more powerful models for different medical applications. But how exactly such strong yet free semantics embedded in medical images can be harnessed for self-supervised

    Updated: 2020-07-15
  • Self-Supervised Monocular Depth Estimation: Solving the Dynamic Object Problem by Semantic Guidance
    arXiv.cs.CV Pub Date : 2020-07-14
    Marvin Klingner; Jan-Aike Termöhlen; Jonas Mikolajczyk; Tim Fingscheidt

    Self-supervised monocular depth estimation presents a powerful method to obtain 3D scene information from single camera images, which is trainable on arbitrary image sequences without requiring depth labels, e.g., from a LiDAR sensor. In this work we present a new self-supervised semantically-guided depth estimation (SGDepth) method to deal with moving dynamic-class (DC) objects, such as moving cars

    Updated: 2020-07-15
  • REPrune: Filter Pruning via Representative Election
    arXiv.cs.CV Pub Date : 2020-07-14
    Mincheol Park; Woojeong Kim; Suhyun Kim

    Even though norm-based filter pruning methods are widely accepted, it is questionable whether the "smaller-norm-less-important" criterion is optimal in determining filters to prune. Especially when we can keep only a small fraction of the original filters, it is more crucial to choose the filters that can best represent the whole filters regardless of norm values. Our novel pruning method entitled

    Updated: 2020-07-15
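    For context, the "smaller-norm-less-important" criterion that the abstract above questions can be sketched in a few lines. This shows only the norm-based baseline, not REPrune's representative-election method, which the truncated abstract does not describe. The function names and the choice of the L1 norm are assumptions for illustration.

```python
def l1_norm(filter_weights):
    """Sum of absolute weights of one flattened filter."""
    return sum(abs(w) for w in filter_weights)


def select_filters_by_norm(filters, keep):
    """Return the sorted indices of the `keep` filters with the largest L1 norm.

    Hypothetical helper: filters with small norms are treated as unimportant
    and dropped, which is the baseline criterion, not REPrune itself.
    """
    ranked = sorted(range(len(filters)),
                    key=lambda i: l1_norm(filters[i]),
                    reverse=True)
    return sorted(ranked[:keep])


# Four flattened filters with L1 norms 0.3, 1.5, 0.07 and 1.6.
filters = [[0.1, -0.2], [1.0, 0.5], [-0.05, 0.02], [0.7, -0.9]]
kept = select_filters_by_norm(filters, keep=2)
```

    Note that this ranking looks at each filter in isolation, so two nearly identical large-norm filters would both be kept; that blind spot is the kind of concern the abstract raises about norm-only criteria.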
  • Rethinking Image Inpainting via a Mutual Encoder-Decoder with Feature Equalizations
    arXiv.cs.CV Pub Date : 2020-07-14
    Hongyu Liu; Bin Jiang; Yibing Song; Wei Huang; Chao Yang

    Deep encoder-decoder based CNNs have advanced image inpainting methods for hole filling. While existing methods recover structures and textures step-by-step in the hole regions, they typically use two encoder-decoders for separate recovery. The CNN features of each encoder are learned to capture either missing structures or textures without considering them as a whole. The insufficient utilization

    Updated: 2020-07-15
  • A Graph-based Interactive Reasoning for Human-Object Interaction Detection
    arXiv.cs.CV Pub Date : 2020-07-14
    Dongming Yang; Yuexian Zou

    Human-Object Interaction (HOI) detection devotes to learn how humans interact with surrounding objects via inferring triplets of < human, verb, object >. However, recent HOI detection methods mostly rely on additional annotations (e.g., human pose) and neglect powerful interactive reasoning beyond convolutions. In this paper, we present a novel graph-based interactive reasoning model called Interactive

    Updated: 2020-07-15
  • AQD: Towards Accurate Quantized Object Detection
    arXiv.cs.CV Pub Date : 2020-07-14
    Jing Liu; Bohan Zhuang; Peng Chen; Mingkui Tan; Chunhua Shen

    Network quantization aims to lower the bitwidth of weights and activations and hence reduce the model size and accelerate the inference of deep networks. Even though existing quantization methods have achieved promising performance on image classification, applying aggressively low bitwidth quantization on object detection while preserving the performance is still a challenge. In this paper, we demonstrate

    Updated: 2020-07-15
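    As a rough illustration of what "lowering the bitwidth" means in the abstract above, the sketch below applies a generic uniform quantizer to a list of values. It is not AQD's scheme, which the truncated abstract does not specify; the function name and the min/max range selection are assumptions.

```python
def quantize_uniform(values, bits):
    """Snap each float to one of 2**bits evenly spaced levels over [min, max].

    Generic uniform quantizer for illustration only (hypothetical helper).
    """
    if bits < 1:
        raise ValueError("bits must be at least 1")
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical; nothing to quantize.
        return list(values)
    steps = 2 ** bits - 1          # number of intervals between levels
    scale = (hi - lo) / steps      # width of one interval
    return [lo + round((v - lo) / scale) * scale for v in values]


weights = [0.0, 0.1, 0.4, 0.9, 1.0]
one_bit = quantize_uniform(weights, bits=1)
two_bit = quantize_uniform(weights, bits=2)
```

    With one bit every weight collapses to either the minimum or the maximum, which makes the accuracy cost of aggressive low-bitwidth quantization easy to see.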
  • 360° Depth Estimation from Multiple Fisheye Images with Origami Crown Representation of Icosahedron
    arXiv.cs.CV Pub Date : 2020-07-14
    Ren Komatsu; Hiromitsu Fujii; Yusuke Tamura; Atsushi Yamashita; Hajime Asama

    In this study, we present a method for all-around depth estimation from multiple omnidirectional images for indoor environments. In particular, we focus on plane-sweeping stereo as the method for depth estimation from the images. We propose a new icosahedron-based representation and ConvNets for omnidirectional images, which we name "CrownConv" because the representation resembles a crown made of origami

    Updated: 2020-07-15
  • Joint Layout Analysis, Character Detection and Recognition for Historical Document Digitization
    arXiv.cs.CV Pub Date : 2020-07-14
    Weihong Ma; Hesuo Zhang; Lianwen Jin; Sihang Wu; Jiapeng Wang; Yongpan Wang

    In this paper, we propose an end-to-end trainable framework for restoring historical documents content that follows the correct reading order. In this framework, two branches named character branch and layout branch are added behind the feature extraction network. The character branch localizes individual characters in a document image and recognizes them simultaneously. Then we adopt a post-processing

    Updated: 2020-07-15
  • Knowledge Distillation for Multi-task Learning
    arXiv.cs.CV Pub Date : 2020-07-14
    Wei-Hong Li; Hakan Bilen

    Multi-task learning (MTL) is to learn one single model that performs multiple tasks for achieving good performance on all tasks and lower cost on computation. Learning such a model requires to jointly optimize losses of a set of tasks with different difficulty levels, magnitudes, and characteristics (e.g. cross-entropy, Euclidean loss), leading to the imbalance problem in multi-task learning. To address

    Updated: 2020-07-15
  • JSENet: Joint Semantic Segmentation and Edge Detection Network for 3D Point Clouds
    arXiv.cs.CV Pub Date : 2020-07-14
    Zeyu Hu; Mingmin Zhen; Xuyang Bai; Hongbo Fu; Chiew-lan Tai

    Semantic segmentation and semantic edge detection can be seen as two dual problems with close relationships in computer vision. Despite the fast evolution of learning-based 3D semantic segmentation methods, little attention has been drawn to the learning of 3D semantic edge detectors, even less to a joint learning method for the two tasks. In this paper, we tackle the 3D semantic edge detection task

    Updated: 2020-07-15
  • Visual Tracking by TridentAlign and Context Embedding
    arXiv.cs.CV Pub Date : 2020-07-14
    Janghoon Choi; Junseok Kwon; Kyoung Mu Lee

    Recent advances in Siamese network-based visual tracking methods have enabled high performance on numerous tracking benchmarks. However, extensive scale variations of the target object and distractor objects with similar categories have consistently posed challenges in visual tracking. To address these persisting issues, we propose novel TridentAlign and context embedding modules for Siamese network-based

    Updated: 2020-07-15
  • Compare and Reweight: Distinctive Image Captioning Using Similar Images Sets
    arXiv.cs.CV Pub Date : 2020-07-14
    Jiuniu Wang; Wenjia Xu; Qingzhong Wang; Antoni B. Chan

    A wide range of image captioning models has been developed, achieving significant improvement based on popular metrics, such as BLEU, CIDEr, and SPICE. However, although the generated captions can accurately describe the image, they are generic for similar images and lack distinctiveness, i.e., cannot properly describe the uniqueness of each image. In this paper, we aim to improve the distinctiveness

    Updated: 2020-07-15
  • Alleviating Over-segmentation Errors by Detecting Action Boundaries
    arXiv.cs.CV Pub Date : 2020-07-14
    Yuchi Ishikawa; Seito Kasai; Yoshimitsu Aoki; Hirokatsu Kataoka

    We propose an effective framework for the temporal action segmentation task, namely an Action Segment Refinement Framework (ASRF). Our model architecture consists of a long-term feature extractor and two branches: the Action Segmentation Branch (ASB) and the Boundary Regression Branch (BRB). The long-term feature extractor provides shared features for the two branches with a wide temporal receptive

    Updated: 2020-07-15
  • BUNET: Blind Medical Image Segmentation Based on Secure UNET
    arXiv.cs.CV Pub Date : 2020-07-14
    Song Bian; Xiaowei Xu; Weiwen Jiang; Yiyu Shi; Takashi Sato

    The strict security requirements placed on medical records by various privacy regulations become major obstacles in the age of big data. To ensure efficient machine learning as a service schemes while protecting data confidentiality, in this work, we propose blind UNET (BUNET), a secure protocol that implements privacy-preserving medical image segmentation based on the UNET architecture. In BUNET,

    Updated: 2020-07-15
  • Topology-Change-Aware Volumetric Fusion for Dynamic Scene Reconstruction
    arXiv.cs.CV Pub Date : 2020-07-14
    Chao Li; Xiaohu Guo

    Topology change is a challenging problem for 4D reconstruction of dynamic scenes. In the classic volumetric fusion-based framework, a mesh is usually extracted from the TSDF volume as the canonical surface representation to help estimating deformation field. However, the surface and Embedded Deformation Graph (EDG) representations bring conflicts under topology changes since the surface mesh has fixed-connectivity

    Updated: 2020-07-15
  • Socially and Contextually Aware Human Motion and Pose Forecasting
    arXiv.cs.CV Pub Date : 2020-07-14
    Vida Adeli; Ehsan Adeli; Ian Reid; Juan Carlos Niebles; Hamid Rezatofighi

    Smooth and seamless robot navigation while interacting with humans depends on predicting human movements. Forecasting such human dynamics often involves modeling human trajectories (global motion) or detailed body joint movements (local motion). Prior work typically tackled local and global human movements separately. In this paper, we propose a novel framework to tackle both tasks of human motion

    Updated: 2020-07-15
  • Face to Purchase: Predicting Consumer Choices with Structured Facial and Behavioral Traits Embedding
    arXiv.cs.CV Pub Date : 2020-07-14
    Zhe Liu; Xianzhi Wang; Lina Yao; Jake An; Lei Bai; Ee-Peng Lim

    Predicting consumers' purchasing behaviors is critical for targeted advertisement and sales promotion in e-commerce. Human faces are an invaluable source of information for gaining insights into consumer personality and behavioral traits. However, consumer's faces are largely unexplored in previous research, and the existing face-related studies focus on high-level features such as personality traits

    Updated: 2020-07-15
  • Top-Related Meta-Learning Method for Few-Shot Detection
    arXiv.cs.CV Pub Date : 2020-07-14
    Qian Li; Nan Guo; Duo Wang; Xiaochun Ye

    Many meta-learning methods that depend on large amounts of data and more parameters have been proposed for few-shot detection, and they are costly. However, because of category imbalance and fewer features, previous methods have obvious problems: strong bias and poor classification for few-shot detection. Therefore, for meta-learning methods of few-shot detection, we propose a TCL which

    Updated: 2020-07-15
  • A Single Stream Network for Robust and Real-time RGB-D Salient Object Detection
    arXiv.cs.CV Pub Date : 2020-07-14
    Xiaoqi Zhao; Lihe Zhang; Youwei Pang; Huchuan Lu; Lei Zhang

    Existing RGB-D salient object detection (SOD) approaches concentrate on the cross-modal fusion between the RGB stream and the depth stream. They do not deeply explore the effect of the depth map itself. In this work, we design a single stream network to directly use the depth map to guide early fusion and middle fusion between RGB and depth, which saves the feature encoder of the depth stream and achieves

    Updated: 2020-07-15
  • DeepMSRF: A novel Deep Multimodal Speaker Recognition framework with Feature selection
    arXiv.cs.CV Pub Date : 2020-07-14
    Ehsan Asali; Farzan Shenavarmasouleh; Farid Ghareh Mohammadi; Prasanth Sengadu Suresh; Hamid R. Arabnia

    For recognizing speakers in video streams, significant research studies have been made to obtain a rich machine learning model by extracting high-level speaker's features such as facial expression, emotion, and gender. However, generating such a model is not feasible by using only single modality feature extractors that exploit either audio signals or image frames, extracted from video streams. In

    Updated: 2020-07-15
  • TCGM: An Information-Theoretic Framework for Semi-Supervised Multi-Modality Learning
    arXiv.cs.CV Pub Date : 2020-07-14
    Xinwei Sun; Yilun Xu; Peng Cao; Yuqing Kong; Lingjing Hu; Shanghang Zhang; Yizhou Wang

    Fusing data from multiple modalities provides more information to train machine learning systems. However, it is prohibitively expensive and time-consuming to label each modality with a large amount of data, which leads to a crucial problem of semi-supervised multi-modal learning. Existing methods suffer from either ineffective fusion across modalities or lack of theoretical guarantees under proper

    Updated: 2020-07-15
  • Meta-rPPG: Remote Heart Rate Estimation Using a Transductive Meta-Learner
    arXiv.cs.CV Pub Date : 2020-07-14
    Eugene Lee; Evan Chen; Chen-Yi Lee

    Remote heart rate estimation is the measurement of heart rate without any physical contact with the subject and is accomplished using remote photoplethysmography (rPPG) in this work. rPPG signals are usually collected using a video camera with a limitation of being sensitive to multiple contributing factors, e.g. variation in skin tone, lighting condition and facial structure. End-to-end supervised

    Updated: 2020-07-15
  • Vehicle Trajectory Prediction by Transfer Learning of Semi-Supervised Models
    arXiv.cs.CV Pub Date : 2020-07-14
    Nick Lamm; Shashank Jaiprakash; Malavika Srikanth; Iddo Drori

    In this work we show that semi-supervised models for vehicle trajectory prediction significantly improve performance over supervised models on state-of-the-art real-world benchmarks. Moving from supervised to semi-supervised models allows scaling-up by using unlabeled data, increasing the number of images in pre-training from Millions to a Billion. We perform ablation studies comparing transfer learning

    Updated: 2020-07-15
  • Semi-supervised Learning with a Teacher-student Network for Generalized Attribute Prediction
    arXiv.cs.CV Pub Date : 2020-07-14
    Minchul Shin

    This paper presents a study on semi-supervised learning to solve the visual attribute prediction problem. In many applications of vision algorithms, the precise recognition of visual attributes of objects is important but still challenging. This is because defining a class hierarchy of attributes is ambiguous, so training data inevitably suffer from class imbalance and label sparsity, leading to a

    Updated: 2020-07-15
  • Patch-wise Attack for Fooling Deep Neural Network
    arXiv.cs.CV Pub Date : 2020-07-14
    Lianli Gao; Qilong Zhang; Jingkuan Song; Xianglong Liu; Heng Tao Shen

    By adding human-imperceptible noise to clean images, the resultant adversarial examples can fool other unknown models. Features of a pixel extracted by deep neural networks (DNNs) are influenced by its surrounding regions, and different DNNs generally focus on different discriminative regions in recognition. Motivated by this, we propose a patch-wise iterative algorithm -- a black-box attack towards

    Updated: 2020-07-15
  • Personalized Face Modeling for Improved Face Reconstruction and Motion Retargeting
    arXiv.cs.CV Pub Date : 2020-07-14
    Bindita Chaudhuri; Noranart Vesdapunt; Linda Shapiro; Baoyuan Wang

    Traditional methods for image-based 3D face reconstruction and facial motion retargeting fit a 3D morphable model (3DMM) to the face, which has limited modeling capacity and fail to generalize well to in-the-wild data. Use of deformation transfer or multilinear tensor as a personalized 3DMM for blendshape interpolation does not address the fact that facial expressions result in different local and

    Updated: 2020-07-15
  • JNR: Joint-based Neural Rig Representation for Compact 3D Face Modeling
    arXiv.cs.CV Pub Date : 2020-07-14
    Noranart Vesdapunt; Mitch Rundle; HsiangTao Wu; Baoyuan Wang

    In this paper, we introduce a novel approach to learn a 3D face model using a joint-based face rig and a neural skinning network. Thanks to the joint-based representation, our model enjoys some significant advantages over prior blendshape-based models. First, it is very compact such that we are orders of magnitude smaller while still keeping strong modeling capacity. Second, because each joint has

    Updated: 2020-07-15
  • Water level prediction from social media images with a multi-task ranking approach
    arXiv.cs.CV Pub Date : 2020-07-14
    P. Chaudhary; S. D'Aronco; J. P. Leitao; K. Schindler; J. D. Wegner

    Floods are among the most frequent and catastrophic natural disasters and affect millions of people worldwide. It is important to create accurate flood maps to plan (offline) and conduct (real-time) flood mitigation and flood rescue operations. Arguably, images collected from social media can provide useful information for that task, which would otherwise be unavailable. We introduce a computer vision

    Updated: 2020-07-15
  • DETCID: Detection of Elongated Touching Cells with Inhomogeneous Illumination using a Deep Adversarial Network
    arXiv.cs.CV Pub Date : 2020-07-13
    Ali Memariani; Ioannis A. Kakadiaris

    Clostridioides difficile infection (C. diff) is the most common cause of death due to secondary infection in hospital patients in the United States. Detection of C. diff cells in scanning electron microscopy (SEM) images is an important task to quantify the efficacy of treatments under development. However, detecting C. diff cells in SEM images is a challenging problem due to the presence of inhomogeneous

    Updated: 2020-07-15
  • Embedded Encoder-Decoder in Convolutional Networks Towards Explainable AI
    arXiv.cs.CV Pub Date : 2020-06-19
    Amirhossein Tavanaei

    Understanding intermediate layers of a deep learning model and discovering the driving features of stimuli have recently attracted much interest. Explainable artificial intelligence (XAI) provides a new way to open an AI black box and make its decisions transparent and interpretable. This paper proposes a new explainable convolutional neural network (XCNN) which represents important and driving visual

    Updated: 2020-07-15
  • A Bayesian Evaluation Framework for Ground Truth-Free Visual Recognition Tasks
    arXiv.cs.CV Pub Date : 2020-06-20
    Derek S. Prijatelj (University of Notre Dame, Notre Dame, USA); Mel McCurrie (Perceptive Automata, Boston, USA); Walter J. Scheirer (University of Notre Dame, Notre Dame, USA)

    An interesting development in automatic visual recognition has been the emergence of tasks where it is not possible to assign ground truth labels to images, yet still feasible to collect annotations that reflect human judgements about them. Such tasks include subjective visual attribute assignment and the labeling of ambiguous scenes. Machine learning-based predictors for these tasks rely on supervised

    Updated: 2020-07-15
  • Measuring Performance of Generative Adversarial Networks on Devanagari Script
    arXiv.cs.CV Pub Date : 2020-06-21
    Amogh G. Warkhandkar; Baasit Sharief; Omkar B. Bhambure

    How neural networks following the adversarial philosophy create a generative model is a fascinating field. Multiple papers have already explored the architectural aspect and proposed systems with potentially good results; however, very few papers are available that implement it on a real-world example. Traditionally, people use the famous MNIST dataset as a "Hello, World!" example for implementing

    Updated: 2020-07-15
  • Deep Image Orientation Angle Detection
    arXiv.cs.CV Pub Date : 2020-06-21
    Subhadip Maji; Smarajit Bose

    Estimating and rectifying the orientation angle of any image is a challenging task. Initial work used hand-engineered features for this purpose; after the advent of deep learning, convolution-based neural networks showed significant improvement on this problem. However, this paper shows that the combination of a CNN and a custom loss function specially designed for angles leads

    Updated: 2020-07-15
  • CPL-SLAM: Efficient and Certifiably Correct Planar Graph-Based SLAM Using the Complex Number Representation
    arXiv.cs.CV Pub Date : 2020-06-25
    Taosha Fan; Hanlin Wang; Michael Rubenstein; Todd Murphey

    In this paper, we consider the problem of planar graph-based simultaneous localization and mapping (SLAM) that involves both poses of the autonomous agent and positions of observed landmarks. We present CPL-SLAM, an efficient and certifiably correct algorithm to solve planar graph-based SLAM using the complex number representation. We formulate and simplify planar graph-based SLAM as the maximum likelihood

    Updated: 2020-07-15
Contents have been reproduced by permission of the publishers.