
-
Table of Contents IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2021-02-03
Presents the table of contents for this issue of the publication.
-
Cover IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2021-02-03
Presents a listing of the editorial board, board of governors, current staff, committee members, and/or society editors for this issue of the publication.
-
Cover IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2021-02-03
Presents the table of contents for this issue of the publication.
-
Table of Contents IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2021-01-11
Presents the table of contents for this issue of the publication.
-
Cover IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2021-01-11
Presents a listing of the editorial board, board of governors, current staff, committee members, and/or society editors for this issue of the publication.
-
The Perils and Pitfalls of Block Design for EEG Classification Experiments IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-11-19 Ren Li; Jared S. Johansen; Hamad Ahmed; Thomas V. Ilyevsky; Ronnie B. Wilbur; Hari M. Bharadwaj; Jeffrey Mark Siskind
A recent paper [1] claims to classify brain processing evoked in subjects watching ImageNet stimuli as measured with EEG and to employ a representation derived from this processing to construct a novel object classifier. That paper, together with a series of subsequent papers [2] , [3] , [4] , [5] , [6] , [7] , [8] , claims to achieve successful results on a wide variety of computer-vision tasks, including
-
Table of Contents IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-11-03
Presents the table of contents for this issue of the publication.
-
Cover IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-11-03
Presents a listing of the editorial board, board of governors, current staff, committee members, and/or society editors for this issue of the publication.
-
Efficient Global Multi-object Tracking Under Minimum-cost Circulation Framework. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-23 Congchao Wang,Yizhi Wang,Guoqiang Yu
We developed a minimum-cost circulation framework for solving the global data association problem, which plays a key role in the tracking-by-detection paradigm of multi-object tracking. The problem was extensively studied under the minimum-cost flow framework, which is theoretically attractive as being flexible and globally solvable. However, the high computational burden has been a long-standing obstacle
-
Heterogeneous Graph Attention Network for Unsupervised Multiple-Target Domain Adaptation. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-23 Xu Yang,Cheng Deng,Tongliang Liu,Dacheng Tao
Domain adaptation, which transfers knowledge from a label-rich source domain to unlabeled target domains, is a challenging task in machine learning. Prior domain adaptation methods focus on the pairwise adaptation assumption with a single source and a single target domain, while little work concerns the scenario of one source domain and multiple target domains. Applying pairwise adaptation methods
-
Deep Coarse-to-fine Dense Light Field Reconstruction with Flexible Sampling and Geometry-aware Fusion. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-23 Jing Jin,Junhui Hou,Jie Chen,Huanqiang Zeng,Sam Kwong,Jingyi Yu
A densely-sampled light field (LF) is highly desirable in various applications. However, it is costly to acquire such data. Although many computational methods have been proposed to reconstruct a densely-sampled LF from a sparsely-sampled one, they still suffer from either low reconstruction quality, low computational efficiency, or the restriction on the regularity of the sampling pattern. To this
-
End-to-End Optimized Versatile Image Compression With Wavelet-Like Transform. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-23 Haichuan Ma,Dong Liu,Ning Yan,Houqiang Li,Feng Wu
Built on deep networks, end-to-end optimized image compression has made impressive progress in the past few years. Previous studies usually adopt a compressive auto-encoder, where the encoder part first converts an image into latent features, and then quantizes the features before encoding them into bits. Both the conversion and the quantization incur information loss, resulting in a difficulty to optimally
-
Additive Tree-Structured Conditional Parameter Spaces in Bayesian Optimization: A Novel Covariance Function and a Fast Implementation. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-22 Xingchen Ma,Matthew B Blaschko
Bayesian optimization (BO) is a sample-efficient global optimization algorithm for black-box functions which are expensive to evaluate. Existing literature on model-based optimization in conditional parameter spaces is usually built on trees. In this work, we generalize the additive assumption to tree-structured functions and propose an additive tree-structured covariance function, showing improved
-
Ray-Space Epipolar Geometry for Light Field Cameras. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-22 Qi Zhang,Qing Wang,Hongdong Li,Jingyi Yu
Light field essentially represents rays in space. The epipolar geometry between two light fields is an important relationship that captures ray-ray correspondences and relative configuration of two views. Unfortunately, so far little work has been done in deriving a formal epipolar geometry model that is specifically tailored for light field cameras. This is primarily due to the high-dimensional nature
-
Scalable Variational Gaussian Processes for Crowdsourcing: Glitch Detection in LIGO. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-21 Pablo Morales-Alvarez,Pablo Ruiz,Scotty Coughlin,Rafael Molina Soriano,Aggelos Katsaggelos
In recent years, crowdsourcing has been transforming the way classification sets are obtained. Instead of relying on a single expert, crowdsourcing shares the effort among a large number of collaborators. This is being applied in the laureate Laser Interferometer Gravitational-Wave Observatory (LIGO) in order to detect glitches which might hinder the identification of gravitational waves. Probabilistic
-
Spherical Principal Curves. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-21 Jongmin Lee,Jang-Hyun Kim,Hee-Seok Oh
This paper presents a new approach for dimension reduction of data observed on spherical surfaces. Several dimension reduction techniques have been developed in recent years for non-Euclidean data analysis. As a pioneering work, Hauberg (2016) attempted to implement principal curves on Riemannian manifolds. However, this approach uses approximations to process data on Riemannian manifolds, resulting in
-
Visual Grounding via Accumulated Attention. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-21 Chaorui Deng,Qi Wu,Qingyao Wu,Fan Lyu,Fuyuan Hu,Mingkui Tan
Visual Grounding (VG) aims to locate the most relevant object or region in an image, based on a natural language query. In real-world VG applications, however, we usually have to deal with ambiguous queries and images with complicated scene structures. Identifying the target based on highly redundant and correlated information can be very challenging, leading to unsatisfactory performance. To tackle
-
Differential Viewpoints for Ground Terrain Material Recognition. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-18 Jia Xue,Hang Zhang,Ko Nishino,Kristin Dana
Computational surface modeling that underlies material recognition has transitioned from reflectance modeling using in-lab controlled radiometric measurements to image-based representations based on internet-mined single-view images captured in the scene. We take a middle-ground approach for material recognition that takes advantage of both rich radiometric cues and flexible image capture. A key concept
-
Efficient and Stable Graph Scattering Transforms via Pruning. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-18 Vassilis N Ioannidis,Siheng Chen,Georgios B Giannakis
Graph convolutional networks (GCNs) have well-documented performance in various graph learning tasks, but their analysis is still in its infancy. Graph scattering transforms (GSTs) offer training-free deep GCN models, and are amenable to generalization and stability analyses. The price paid by GSTs is exponential complexity that increases with the number of layers. This discourages deployment of GSTs
-
EgoCom: A Multi-person Multi-modal Egocentric Communications Dataset. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-18 Curtis Northcutt,Shengxin Zha,Steven Lovegrove,Richard Newcombe
Multi-modal datasets in artificial intelligence (AI) often capture a third-person perspective, but our embodied human intelligence evolved with sensory input from the egocentric, first-person perspective. Towards embodied AI, we introduce the Egocentric Communications (EgoCom) dataset to advance the state-of-the-art in conversational AI, natural language, audio speech analysis, computer vision, and
-
Disentangling Monocular 3D Object Detection: From Single to Multi-Class Recognition. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-18 Andrea Simonelli,Samuel Rota Bulo,Lorenzo Porzi,Manuel Lopez Antequera,Peter Kontschieder
In this paper we introduce a method for multi-class, monocular 3D object detection from a single RGB image, which exploits a novel disentangling transformation and a novel, self-supervised confidence estimation method for predicted 3D bounding boxes. The proposed disentangling transformation isolates the contribution made by different groups of parameters to a given loss, without changing its nature
-
Fashion Retrieval via Graph Reasoning Networks on a Similarity Pyramid. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-18 Yiming Gao,Zhanghui Kuang,Guanbin Li,Ping Luo,Yimin Chen,Liang Lin,Wayne Zhang
Matching clothing images from customers and online shopping stores has rich applications in E-commerce. Existing algorithms mostly encode an image as a global feature vector and perform retrieval via global representation matching. However, discriminative local information on clothes is submerged in this global representation, resulting in sub-optimal performance. To address this issue, we propose
-
Large-Scale Nonlinear AUC Maximization via Triply Stochastic Gradients. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-18 Zhiyuan Dang,Xiang Li,Bin Gu,Cheng Deng,Heng Huang
Learning to improve AUC performance for imbalanced data is an important machine learning research problem. Most methods of AUC maximization assume that the model function is linear in the original feature space. However, this assumption is not suitable for nonlinear separable problems. Although there have been several nonlinear methods of AUC maximization, scaling up nonlinear AUC maximization is still
-
Multi-task Learning with Coarse Priors for Robust Part-aware Person Re-identification. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-18 Changxing Ding,Kan Wang,Pengfei Wang,Dacheng Tao
Part-level representations are important for robust person re-identification (ReID), but in practice feature quality suffers due to the body part misalignment problem. In this paper, we present a robust, compact, and easy-to-use method called the Multi-task Part-aware Network (MPN), which is designed to extract semantically aligned part-level features from pedestrian images. MPN solves the body part
-
Grid Anchor based Image Cropping: A New Benchmark and An Efficient Model. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-15 Hui Zeng,Lida Li,Zisheng Cao,Lei Zhang
Image cropping aims to improve the composition and aesthetic quality of an image by removing extraneous content from it. Most of the existing image cropping databases provide only one or several human-annotated bounding boxes as the groundtruths, which can hardly reflect the non-uniqueness and flexibility of image cropping in practice. The employed evaluation metrics such as intersection-over-union
-
Neural Rendering for Game Character Auto-creation. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-15 Tianyang Shi,Zhengxia Zou,Zhenwei Shi,Yi Yuan
Many role-playing games feature character creation systems where players are allowed to edit the facial appearance of their in-game characters. This paper proposes a novel method to automatically create game characters based on a single face photo. We frame this "artistic creation" process under a self-supervised learning paradigm by leveraging the differentiable neural rendering. Considering the rendering
-
Quasi-globally Optimal and Near/True Real-time Vanishing Point Estimation in Manhattan World. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-10 Haoang Li,Ji Zhao,Jean-Charles Bazin,Yun-Hui Liu
Image lines projected from parallel 3D lines intersect at the vanishing point (VP). Manhattan world holds for the scenes with three orthogonal VPs. In Manhattan world, given several image lines, we aim to cluster them by three unknown-but-sought VPs. VP estimation can be reformulated as computing the rotation between the Manhattan frame and camera frame. To estimate three degrees of freedom (DOF) of
-
Leveraging Instance-, Image- and Dataset-Level Information for Weakly Supervised Instance Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-10 Yun Liu,Yu-Huan Wu,Pei-Song Wen,Yu-Jun Shi,Yu Qiu,Ming-Ming Cheng
Weakly supervised semantic instance segmentation with only image-level supervision, instead of relying on expensive pixel wise masks or bounding box annotations, is an important problem to alleviate the data-hungry nature of deep learning. In this paper, we tackle this challenging problem by aggregating the image-level information of all training images into a large knowledge graph and exploiting semantic
-
Incremental Density-based Clustering on Multicore Processors. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-10 Son Mai,Jon Jacobsen,Sihem Amer-Yahia,Ivor Spence,Phuong Tran,Ira Assent,Quoc Viet Hung Nguyen
The density-based clustering algorithm is a fundamental data clustering technique with many real-world applications. However, when the database is frequently changed, how to effectively update clustering results rather than reclustering from scratch remains a challenging task. In this work, we introduce IncAnyDBC, a unique parallel incremental data clustering approach to deal with this problem. First
-
Active Surveillance via Group Sparse Bayesian Learning. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-10 Hongbin Pei,Bo Yang,Jiming Liu,Kevin Chang
The key to the effective control of a diffusion system lies in how accurately we can predict its unfolding dynamics based on the observation of its current state. However, in real-world applications, it is often infeasible to conduct a timely yet comprehensive observation due to resource constraints. In view of this practical challenge, the goal of this work is to develop a novel computational
-
Kernel-based Density Map Generation for Dense Object Counting. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-09 Jia Wan,Qingzhong Wang,Antoni B Chan
Crowd counting is an essential topic in computer vision due to its practical usage in surveillance systems. The typical design of crowd counting algorithms is divided into two steps. First, the ground-truth density maps of crowd images are generated from the ground-truth dot maps (density map generation), e.g., by convolving with a Gaussian kernel. Second, deep learning models are designed to predict
-
DE-GAN: A Conditional Generative Adversarial Network for Document Enhancement. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-07 Mohamed Ali Souibgui,Yousri Kessentini
Documents often exhibit various forms of degradation, which make them hard to read and substantially deteriorate the performance of an OCR system. In this paper, we propose an effective end-to-end framework named Document Enhancement Generative Adversarial Networks (DE-GAN) that uses conditional GANs (cGANs) to restore severely degraded document images. To the best of our knowledge, this practice
-
A mathematical model for universal semantics. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-07 Weinan E,Yajun Zhou
We characterize the meaning of words with language-independent numerical fingerprints, through a mathematical analysis of recurring patterns in texts. Approximating texts by Markov processes on a long-range time scale, we are able to extract topics, discover synonyms, and sketch semantic fields from a particular document of moderate length, without consulting external knowledge-base or thesaurus. Our
-
Generative Imputation and Stochastic Prediction. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-07 Mohammad Kachuee,Kimmo Karkkainen,Orpaz Goldstein,Sajad Darabi,Majid Sarrafzadeh
In many machine learning applications, we are faced with incomplete datasets. In the literature, missing data imputation techniques have been mostly concerned with filling missing values. However, the existence of missing values is synonymous with uncertainties not only over the distribution of missing values but also over target class assignments that require careful consideration. In this paper,
-
Table of Contents IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-02
Presents the table of contents for this issue of the publication.
-
Cover IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-02
Presents a listing of the editorial board, board of governors, current staff, committee members, and/or society editors for this issue of the publication.
-
Guest Editors’ Introduction to the Special Issue on RGB-D Vision: Methods and Applications IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-03 Mohammed Bennamoun; Yulan Guo; Federico Tombari; Kamal Youcef-Toumi; Ko Nishino
The twenty-six papers in this special issue focus on Red Green Blue (RGB)-D vision, an emerging research topic in computer vision, with a number of applications in robotics, entertainment, biometrics and multimedia. Compared to 2D images and 3D data (including depth images, point clouds and meshes), RGB-D images represent both the photometric and geometric information of a scene. Moreover, low-cost
-
Evolving Career Opportunities Need Your Skills IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-02
Advertisement.
-
IEEE Computer Society Has You Covered! IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-02
Advertisement.
-
Cover IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-02
These instructions give guidelines for preparing papers for this publication. Presents information for authors publishing in this journal.
-
Cover IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-02
Presents the table of contents for this issue of the publication (back cover).
-
MS-TCN++: Multi-Stage Temporal Convolutional Network for Action Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-04 Shi-Jie Li,Yazan AbuFarha,Yun Liu,Ming-Ming Cheng,Juergen Gall
With the success of deep learning in classifying short trimmed videos, more attention has been focused on temporally segmenting and classifying activities in long untrimmed videos. State-of-the-art approaches for action segmentation utilize several layers of temporal convolution and temporal pooling. Despite the capabilities of these approaches in capturing temporal dependencies, their predictions
-
A Topological Loss Function for Deep-Learning based Image Segmentation using Persistent Homology. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-04 James Clough,Nicholas Byrne,Ilkay Oksuz,Veronika A Zimmer,Julia A Schnabel,Andrew King
We introduce a method for training neural networks to perform image or volume segmentation in which prior knowledge about the topology of the segmented object can be explicitly provided and then incorporated into the training process. By using the differentiable properties of persistent homology, a concept used in topological data analysis, we can specify the desired topology of segmented objects in
-
Semantic Object Accuracy for Generative Text-to-Image Synthesis. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-02 Tobias Hinz,Stefan Heinrich,Stefan Ge Wermter
Generative adversarial networks conditioned on textual image descriptions are capable of generating realistic-looking images. However, current methods still struggle to generate images based on complex image captions from a heterogeneous domain. Furthermore, quantitatively evaluating these text-to-image models is challenging, as most evaluation metrics only judge image quality but not the conformity
-
Densely Residual Laplacian Super-Resolution. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-02 Saeed Anwar,Nick Barnes
Super-resolution convolutional neural networks have recently demonstrated high-quality restoration for single images. However, existing algorithms often require very deep architectures and long training times. Furthermore, current convolutional neural networks for super-resolution are unable to exploit features at multiple scales and weigh them equally or at only a static scale, limiting their learning
-
Fuzzy-match repair guided by quality estimation. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-02 John E Ortega,Mikel L Forcada,Felipe Sanchez-Martinez
Computer-aided translation tools based on translation memories are widely used to assist professional translators. A translation memory (TM) consists of a set of translation units (TU) made up of source- and target-language segment pairs. For the translation of a new source segment s', these tools search the TM and retrieve the TUs (s,t) whose source segments are more similar to s'. The translator
-
Real-time Globally Consistent Dense 3D Reconstruction with Online Texturing. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-02 Lei Han,Siyuan Gu,Dawei Zhong,Shuxue Quan,Lu Fang
High-quality reconstruction of 3D geometry and texture plays a vital role in providing immersive perception of the real world. Additionally, online computation enables the practical usage of 3D reconstruction for interaction. We present an RGBD-based globally-consistent dense 3D reconstruction approach, accompanying high-resolution (< 1 cm) geometric reconstruction and high-quality (the spatial resolution
-
Towards Partial Supervision for Generic Object Counting in Natural Scenes. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-01 Hisham Cholakkal,Guolei Sun,Salman Khan,Fahad Shahbaz Khan,Ling Shao,Luc Van Gool
Generic object counting in natural scenes is a challenging computer vision problem. Existing approaches either rely on instance-level supervision or absolute count information to train a generic object counter. We introduce a partially supervised setting that significantly reduces the supervision level required for generic object counting. We propose two novel frameworks, named lower-count (LC) and
-
Building and Interpreting Deep Similarity Models. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-01 Oliver Eberle,Jochen Buttner,Florian Krautli,Klaus-Robert Mueller,Matteo Valleriani,Gregoire Montavon
Many learning algorithms such as kernel machines, nearest neighbors, clustering, or anomaly detection, are based on distances or similarities. Before similarities are used for training an actual machine learning model, we would like to verify that they are bound to meaningful patterns in the data. In this paper, we propose to make similarities interpretable by augmenting them with an explanation. We
-
GeoNet++: Iterative Geometric Neural Network with Edge-Aware Refinement for Joint Depth and Surface Normal Estimation. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-01 Xiaojuan Qi,Zhengzhe Liu,Renjie Liao,Philip H S Torr,Raquel Urtasun,Jiaya Jia
In this paper, we propose a geometric neural network with edge-aware refinement (GeoNet++) to jointly predict both depth and surface normal maps from a single image. Building on top of two-stream CNNs, GeoNet++ captures the geometric relationships between depth and surface normals with the proposed depth-to-normal and normal-to-depth modules. In particular, the "depth-to-normal" module exploits the
-
GMFAD: Towards Generalized Visual Recognition via Multi-Layer Feature Alignment and Disentanglement. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-09-01 Haoliang Li,Shiqi Wang,Renjie Wan,Alex Kot Chichung
The deep-learning-based approaches which have been repeatedly proven to bring benefits to visual recognition tasks usually make a strong assumption that the training and test data are drawn from similar feature spaces and distributions. However, such an assumption may not always hold in various practical application scenarios on visual recognition tasks. Inspired by the hierarchical organization of
-
DATA: Differentiable ArchiTecture Approximation with Distribution Guided Sampling. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-08-31 Xinbang Zhang,Jianlong Chang,Yiwen Guo,Gaofeng Meng,Zhouchen Lin,Shiming Xiang,Chunhong Pan
Neural architecture search (NAS) is inherently subject to the gap of architectures during searching and validating. To bridge this gap effectively, we develop Differentiable ArchiTecture Approximation (DATA) with Ensemble Gumbel-Softmax (EGS) estimator and Architecture Distribution Constraint (ADC) to automatically approximate architectures during searching and validating in a differentiable manner
-
You Only Search Once: Single Shot Neural Architecture Search via Direct Sparse Optimization. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-08-31 Xinbang Zhang,Zehao Huang,Naiyan Wang,Shiming Xiang,Chunhong Pan
Recently, Neural Architecture Search (NAS) has attracted great interest in both academia and industry. However, it remains challenging because of its huge and non-continuous search space. Instead of applying evolutionary algorithms or reinforcement learning as in previous works, this paper proposes a Direct Sparse Optimization NAS (DSO-NAS) method. The motivation behind DSO-NAS is to address the task in the
-
Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-08-27 Rene Ranftl,Katrin Lasinger,David Hafner,Konrad Schindler,Vladlen Koltun
The success of monocular depth estimation relies on large and diverse training sets. Due to the challenges associated with acquiring dense ground-truth depth across different environments at scale, a number of datasets with distinct characteristics and biases have emerged. We develop tools that enable mixing multiple datasets during training, even if their annotations are incompatible. In particular
-
Streaming convolutional neural networks for end-to-end learning with multi-megapixel images. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-08-26 Johannes Henricus Francisca Maria Pinckaers,Bram van Ginneken,Geert Litjens
Due to memory constraints on current hardware, most convolutional neural networks (CNNs) are trained on sub-megapixel images. For example, most popular datasets in computer vision contain images much less than a megapixel in size (0.09MP for ImageNet and 0.001MP for CIFAR-10). In some domains such as medical imaging, multi-megapixel images are needed to identify the presence of disease accurately. We
-
Locally Connected Network for Monocular 3D Human Pose Estimation. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-08-24 Hai Ci,Xiaoxuan Ma,Chunyu Wang,Yizhou Wang
We present an approach to estimate 3D human pose from a monocular image. The method consists of two steps: it first estimates a 2D pose from an image and then recovers the corresponding 3D pose. This work focuses on the second step. The Graph Convolutional Network (GCN) has recently become the de facto standard for human pose related tasks such as action recognition. However, in this work, we show
-
Meta-Transfer Learning through Hard Tasks. IEEE Trans. Pattern Anal. Mach. Intell. (IF 17.861) Pub Date : 2020-08-21 Qianru Sun,Yaoyao Liu,Zhaozheng Chen,Tat-Seng Chua,Bernt Schiele
Meta-learning has been proposed as a framework to address the challenging few-shot learning setting. The key idea is to leverage a large number of similar few-shot tasks in order to learn how to adapt a base-learner to a new task for which only a few labeled samples are available. As deep neural networks (DNNs) tend to overfit using a few samples only, typical meta-learning models use shallow neural