Current Journal: Pattern Recognition
  • Auto-weighted Multi-view Co-clustering via Fast Matrix Factorization
    Pattern Recogn. (IF 5.898) Pub Date : 2020-01-21
    Feiping Nie; Shaojun Shi; Xuelong Li

    Multi-view clustering is a hot research topic in machine learning and pattern recognition; however, clustering multi-view data sets still incurs high computational complexity. Although a number of approaches have been proposed to improve computational efficiency, most of them do not consider the data duality between features and samples. In this paper, we propose a novel co-clustering approach termed Fast Multi-view Bilateral K-means (FMVBKM), which clusters the rows and columns of the input data matrix simultaneously. Specifically, FMVBKM applies the relaxed K-means clustering technique to multi-view data clustering. In addition, to decrease the information loss in matrix factorization, we further introduce a new co-clustering method named Fast Multi-view Matrix Tri-Factorization (FMVMTF). Extensive experimental results on six benchmark data sets show that, in comparison with state-of-the-art multi-view clustering methods, the two proposed approaches achieve comparable clustering performance with high computational efficiency.

    Updated: 2020-01-22
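
    A generic multi-view matrix tri-factorization objective of the kind the FMVBKM/FMVMTF entry above builds on can be written as follows. This is only an illustrative formulation; the exact constraints, relaxations and auto-weighting scheme are those defined in the paper.

\min_{F,\,\{S_v\},\,\{G_v\}} \; \sum_{v=1}^{V} w_v \left\| X_v - F S_v G_v^{\top} \right\|_F^2,
\qquad F \in \{0,1\}^{n \times c}, \quad G_v \in \{0,1\}^{d_v \times k_v},

    where X_v \in \mathbb{R}^{n \times d_v} is the data matrix of view v, F is a row (sample) cluster indicator shared across views, G_v is a column (feature) cluster indicator of view v, S_v is a small c \times k_v connection matrix, and w_v is a per-view weight.
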
  • PoseConvGRU: A Monocular Approach for Visual Ego-motion Estimation by Learning
    Pattern Recogn. (IF 5.898) Pub Date : 2020-01-21
    Guangyao Zhai; Liang Liu; Linjian Zhang; Yong Liu; Yunliang Jiang

    Visual ego-motion estimation is one of the longstanding problems of estimating the movement of a camera from images. Learning-based ego-motion estimation methods have received increasing attention owing to their robustness to image noise and their independence from camera calibration. In this work, we propose a data-driven, learning-based approach to visual ego-motion estimation for a monocular camera. We use an end-to-end learning approach that allows the model to learn a map from input image pairs to the corresponding ego-motion, which is parameterized as a 6-DoF transformation matrix. We introduce a two-module long-term recurrent convolutional neural network called PoseConvGRU. The feature-encoding module encodes the short-term motion feature in an image pair, while the memory-propagating module captures the long-term motion feature across consecutive image pairs. The visual memory is implemented with convolutional gated recurrent units, which allow information to propagate over time. At each time step, two consecutive RGB images are stacked together to form a 6-channel tensor from which the feature-encoding module learns to extract motion information and estimate poses. The sequence of output maps is then passed through the memory-propagating module to generate the relative transformation pose of each image pair. In addition, we have designed a series of data augmentation methods to avoid overfitting and to improve the performance of the model in challenging scenarios such as high-speed or reverse driving. We evaluate the performance of our proposed approach on the KITTI Visual Odometry benchmark and the Malaga 2013 dataset. The experiments show that the proposed method performs competitively with state-of-the-art monocular geometric and learning methods, and they encourage further exploration of learning-based methods for estimating camera ego-motion, even though geometric methods also demonstrate promising results.

    Updated: 2020-01-22
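
    The entry above stacks two consecutive RGB frames into a 6-channel tensor, encodes each pair with a CNN, propagates memory with a recurrent unit, and regresses a 6-DoF relative pose. The sketch below follows that general pattern only: it uses an ordinary GRU instead of the paper's convolutional GRU, and all layer sizes and names (TinyPoseRNN, hidden, etc.) are illustrative assumptions rather than the authors' architecture.

import torch
import torch.nn as nn

class TinyPoseRNN(nn.Module):
    """Illustrative encoder + recurrent pose regressor (not the paper's PoseConvGRU)."""
    def __init__(self, hidden=256):
        super().__init__()
        # Feature-encoding module: consumes a 6-channel stack of two RGB frames.
        self.encoder = nn.Sequential(
            nn.Conv2d(6, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Memory-propagating module: here an ordinary GRU over the pair sequence.
        self.gru = nn.GRU(32, hidden, batch_first=True)
        # Regress a 6-DoF relative pose (3 translation + 3 rotation parameters).
        self.head = nn.Linear(hidden, 6)

    def forward(self, pairs):           # pairs: (B, T, 6, H, W)
        B, T = pairs.shape[:2]
        feats = self.encoder(pairs.flatten(0, 1)).view(B, T, -1)
        mem, _ = self.gru(feats)
        return self.head(mem)           # (B, T, 6) relative poses

poses = TinyPoseRNN()(torch.randn(2, 5, 6, 64, 64))
print(poses.shape)  # torch.Size([2, 5, 6])
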
  • MIMN-DPP: Maximum-Information and Minimum-Noise Determinantal Point Processes for Unsupervised Hyperspectral Band Selection
    Pattern Recogn. (IF 5.898) Pub Date : 2020-01-21
    Weizhao Chen; Zhijing Yang; Jinchang Ren; Jiangzhong Cao; Nian Cai; Huimin Zhao; Peter Yuen

    Band selection plays an important role in hyperspectral imaging for reducing the data volume and improving the efficiency of data acquisition and analysis, whilst significantly lowering the cost of the imaging system. Without category labels, it is challenging to select an effective and low-redundancy band subset. In this paper, a new unsupervised band selection algorithm is proposed based on a new band search criterion and an improved Determinantal Point Process (DPP). First, to preserve the original information of the hyperspectral image, a novel band search criterion is designed for finding bands with high information entropy and low noise. Unfortunately, finding the optimal solution of this search criterion to select a low-redundancy band subset is an NP-hard problem. To address this, we consider the correlation of bands in both the original hyperspectral image and its spatial information to construct a double-graph model describing the relationships between spectral bands. In addition, an improved DPP algorithm is proposed for the approximate search of a low-redundancy band subset from the double-graph model. Experimental results on several well-known datasets show that the proposed band selection algorithm achieves better performance than many other state-of-the-art methods.

    Updated: 2020-01-21
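
    The entry above selects a low-redundancy band subset with a determinantal point process (DPP) defined over band relationships. The snippet below is a generic greedy, log-determinant-based selection over an RBF similarity kernel between bands; it ignores the paper's information/noise search criterion and double-graph construction and only illustrates how a DPP-style objective penalizes redundant (highly similar) bands.

import numpy as np

def greedy_dpp_bands(X, k, gamma=1.0):
    """Greedily pick k bands maximizing the log-determinant of the kernel submatrix.

    X: (n_pixels, n_bands) hyperspectral data flattened to pixels x bands.
    Returns the indices of the selected bands.
    """
    B = X.shape[1]
    # RBF similarity kernel between spectral bands (columns of X).
    sq = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)
    L = np.exp(-gamma * sq / sq.mean())
    selected = []
    for _ in range(k):
        best, best_val = None, -np.inf
        for j in range(B):
            if j in selected:
                continue
            idx = selected + [j]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            # bands nearly identical to already chosen ones give a tiny
            # determinant, so they are naturally skipped
            if sign > 0 and logdet > best_val:
                best, best_val = j, logdet
        selected.append(best)
    return selected

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))           # toy "image": 200 pixels, 20 bands
X[:, 5] = X[:, 4] + 1e-3                 # a redundant band
print(greedy_dpp_bands(X, k=5))
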
  • Multi-Scale Differential Feature for ECG Biometrics with Collective Matrix Factorization
    Pattern Recogn. (IF 5.898) Pub Date : 2020-01-20
    Kuikui Wang; Gongping Yang; Yuwen Huang; Yilong Yin

    Electrocardiogram (ECG) biometrics has recently received considerable attention and is considered a promising biometric trait. Although some promising results on ECG biometrics have been reported, it remains challenging to perform ECG-based recognition robustly and precisely. To address these issues, this paper presents a novel ECG biometrics framework: Multi-Scale Differential Feature for ECG biometrics with Collective Matrix Factorization (CMF). First, we extract the Multi-Scale Differential Feature (MSDF) from the one-dimensional ECG signal and then fuse MSDF with 1DMRLBP to generate the MSDF-1DMRLBP, which acts as the base feature of the ECG signal. Second, to extract discriminative information from the intermediate base features, we leverage the CMF technique to generate the final robust ECG representations by simultaneously embedding MSDF-1DMRLBP and label information. Consequently, the final robust features can preserve the intra-subject and inter-subject similarities. Extensive experiments are conducted on four ECG databases, and the results demonstrate that the proposed method outperforms the state of the art in terms of both accuracy and efficiency.

    Updated: 2020-01-21
  • A Novel Density-Based Clustering Algorithm Using Nearest Neighbor Graph
    Pattern Recogn. (IF 5.898) Pub Date : 2020-01-17
    Hao Li; Xiaojie Liu; Tao Li; Rundong Gan

    Density-based clustering has several desirable properties, such as the ability to identify and handle noise samples, to discover clusters of arbitrary shapes, and to automatically determine the number of clusters. Identifying the core samples within the dense regions of a dataset is a key step of density-based clustering algorithms. Unlike many other algorithms that estimate the density of each sample using various density estimators and then choose core samples based on a threshold, in this paper we present a novel approach for identifying local high-density samples that utilizes the inherent properties of the nearest neighbor graph (NNG). After using the density estimator to filter noise samples, the proposed algorithm, ADBSCAN (in which “A” stands for “Adaptive”), performs a DBSCAN-like clustering process. Experimental results on artificial and real-world datasets demonstrate a significant performance improvement over existing density-based clustering algorithms.

    Updated: 2020-01-17
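
    The entry above identifies locally high-density samples from the structure of the nearest neighbor graph (NNG) rather than from a user-set density threshold. One simple NNG-derived density surrogate is the in-degree of each point in the directed k-NN graph, i.e., how often a point is chosen as a neighbor by other points; the sketch below computes it with plain NumPy. This illustrates the general idea only, not the ADBSCAN criterion from the paper.

import numpy as np

def knn_indegree(X, k=5):
    """In-degree of each sample in the directed k-nearest-neighbor graph.

    Points lying in dense regions tend to be chosen as neighbors more often,
    so a high in-degree is a simple surrogate for high local density.
    """
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                  # exclude self-neighbors
    knn = np.argsort(d, axis=1)[:, :k]           # k nearest neighbors of each point
    return np.bincount(knn.ravel(), minlength=len(X))

rng = np.random.default_rng(0)
dense = rng.normal(0, 0.3, size=(50, 2))         # a tight cluster
sparse = rng.uniform(-4, 4, size=(20, 2))        # scattered background points
X = np.vstack([dense, sparse])
indeg = knn_indegree(X, k=5)
print(indeg[:50].mean(), indeg[50:].mean())      # cluster points score higher on average
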
  • Unsupervised Representation Learning by Discovering Reliable Image Relations
    Pattern Recogn. (IF 5.898) Pub Date : 2020-01-17
    Timo Milbich; Omair Ghori; Ferran Diego; Björn Ommer

    Learning robust representations that allow relations between images to be reliably established is of paramount importance for virtually all of computer vision. Annotating the quadratic number of pairwise relations between training images is simply not feasible, while unsupervised inference is prone to noise, thus leaving the vast majority of these relations unreliable. To nevertheless find those relations which can be reliably utilized for learning, we follow a divide-and-conquer strategy: we find reliable similarities by extracting compact groups of images and reliable dissimilarities by partitioning these groups into subsets, converting the complicated overall problem into a few reliable local subproblems. For each of the subsets we obtain a representation by learning a mapping to a target feature space so that their reliable relations are kept. Transitivity relations between the subsets are then exploited to consolidate the local solutions into a concerted global representation. While iterating between grouping, partitioning, and learning, we can successively use more and more reliable relations, which, in turn, improves our image representation. In experiments, our approach shows state-of-the-art performance on unsupervised classification on ImageNet with 46.0% and competes favorably on different transfer learning tasks on PASCAL VOC.

    Updated: 2020-01-17
  • Appropriateness of Performance Indices for Imbalanced Data Classification: An Analysis
    Pattern Recogn. (IF 5.898) Pub Date : 2020-01-17
    Sankha Subhra Mullick; Shounak Datta; Sourish Gunesh Dhekane; Swagatam Das

    Indices quantifying the performance of classifiers under class imbalance often suffer from distortions depending on the constitution of the test set or the class-specific classification accuracy, creating difficulties in assessing the merit of the classifier. We identify two fundamental conditions that a performance index must satisfy to be resilient, respectively, to changes in the number of test instances from each class and to changes in the number of classes in the test set. In light of these conditions, we theoretically analyze, under the effect of class imbalance, four indices commonly used for evaluating binary classifiers and five popular indices for multi-class classifiers. For indices violating any of the conditions, we also suggest remedial modifications and normalizations. We further investigate the capability of the indices to retain information about the classification performance over all the classes, even when the classifier exhibits extreme performance on some classes. Simulation studies are performed on high-dimensional deep representations of a subset of the ImageNet dataset using four state-of-the-art classifiers tailored for handling class imbalance. Finally, based on our theoretical findings and empirical evidence, we recommend the appropriate indices that should be used to evaluate the performance of classifiers in the presence of class imbalance.

    Updated: 2020-01-17
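
    The entry above studies how performance indices are distorted by the class constitution of the test set. The toy computation below shows ordinary accuracy rewarding a majority-biased classifier while the class-averaged (balanced) accuracy does not; it only illustrates the kind of distortion analyzed in the paper, not the authors' conditions or recommended indices.

import numpy as np

def accuracy(cm):
    return np.trace(cm) / cm.sum()

def balanced_accuracy(cm):
    # mean of per-class recalls: insensitive to how many samples each class has
    return np.mean(np.diag(cm) / cm.sum(axis=1))

# Confusion matrix on an imbalanced test set: 950 negatives, 50 positives.
# Rows = true class, columns = predicted class.
cm = np.array([[940, 10],    # negatives: almost all correct
               [ 40, 10]])   # positives: mostly missed
print(accuracy(cm))          # 0.95, looks excellent
print(balanced_accuracy(cm)) # ~0.59, reveals the poor minority-class recall
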
  • Invariant Subspace Learning for Time Series Data Based on Dynamic Time Warping Distance
    Pattern Recogn. (IF 5.898) Pub Date : 2020-01-17
    Huiqi Deng; Weifu Chen; Qi Shen; Andy J. Ma; Pong C. Yuen; Guocan Feng

    Low-dimensional and compact representation of time series data is important for mining and storage. In practice, time series data are vulnerable to various temporal transformations, such as shift and temporal scaling, which are unavoidable in the process of data collection. If a learning algorithm directly calculates the difference between such transformed data based on the Euclidean distance, the measurement cannot faithfully reflect the similarity, and hence the algorithm cannot learn the underlying discriminative features. To solve this problem, we develop a novel subspace learning algorithm based on the dynamic time warping (DTW) distance, an elastic distance defined in a DTW space. The algorithm aims to minimize the reconstruction error in the DTW space. However, since the DTW space is a semi-pseudo metric space, it is difficult to generalize common subspace learning algorithms to such spaces. In this work, we introduce warp operators with which the DTW reconstruction error can be approximated by the reconstruction error between transformed series and their reconstructions in a subspace. The warp operators align time series data with their linear representations in the DTW space, which is particularly important for misaligned time series, so that the subspace can be learned to obtain an intrinsic basis (dictionary) for the representation of the data. The warp operators and the subspace are optimized alternately until reaching equilibrium. Experimental results show that the proposed algorithm outperforms traditional subspace learning algorithms and temporal transform-invariance based methods (including SIDL, kernel PCA, and SPMC), and obtains results competitive with state-of-the-art algorithms such as the BOSS algorithm.

    Updated: 2020-01-17
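
    The dynamic time warping (DTW) distance used throughout the entry above is the classic elastic distance computed by dynamic programming. A minimal reference implementation is sketched below (squared local cost, no warping-window constraint); it is unrelated to the paper's subspace learning itself and simply shows why DTW is insensitive to temporal shifts where the Euclidean distance is not.

import numpy as np

def dtw_distance(a, b):
    """Classic DTW between two 1-D series via dynamic programming."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # extend the cheapest of the three admissible warping moves
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return np.sqrt(D[n, m])

t = np.linspace(0, 2 * np.pi, 60)
x = np.sin(t)
y = np.sin(t + 0.5)                      # a temporally shifted copy
print(dtw_distance(x, y))                # small: DTW absorbs the shift
print(np.linalg.norm(x - y))             # larger: Euclidean distance does not
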
  • Abnormality Detection in Retinal Image by Individualized Background Learning
    Pattern Recogn. (IF 5.898) Pub Date : 2020-01-16
    Benzhi Chen; Lisheng Wang; Xiuying Wang; Jian Sun; Yijie Huang; Dagan Feng; Zongben Xu

    Computer-aided lesion detection (CAD) techniques, which offer the potential for automatic early screening of retinal pathologies, are widely studied in retinal image analysis. While many CAD approaches based on lesion samples or lesion features can detect pre-defined lesion types well, it remains challenging to detect various abnormal regions (namely abnormalities) in retinal images. In this paper, we identify diverse abnormalities in a retinal test image by finely learning its individualized retinal background (IRB), on which retinal lesions are superimposed. A total of 3150 normal retinal images are collected as priors for IRB learning. A preprocessing step is applied to all retinal images for spatial, scale and color normalization. Retinal blood vessels, which vary between individuals, are specifically suppressed in all images. A multi-scale sparse coding based learning (MSSCL) algorithm and a repeated learning strategy are proposed for finely learning the IRB. In the MSSCL algorithm, a background space is constructed by sparsely encoding the test image in a multi-scale manner using the dictionary learned from normal retinal images; this space contains more complete IRB information than any single-scale coding result. From the background space, the IRB can be learned well by low-rank approximation, and thus different salient lesions can be separated and detected. The MSSCL algorithm is iteratively repeated on the modified test image in which the detected salient lesions are suppressed, so as to further improve the accuracy of the IRB and suppress lesions in the IRB. Consequently, a high-accuracy IRB can be learned, and thus both salient lesions and weak lesions that have low contrast with the background can be clearly separated. The effectiveness and contributions of the proposed method are validated by experiments on different clinical datasets and by comparisons with state-of-the-art CAD methods.

    Updated: 2020-01-17
  • Towards Interpretable and Robust Hand Detection via Pixel-wise Prediction
    Pattern Recogn. (IF 5.898) Pub Date : 2020-01-16
    Dan Liu; Libo Zhang; Tiejian Luo; Lili Tao; Yanjun Wu

    The lack of interpretability of existing CNN-based hand detection methods makes it difficult to understand the rationale behind their predictions. In this paper, we propose a novel neural network model which, for the first time, introduces interpretability into hand detection. The main improvements include: (1) Detecting hands at the pixel level to explain which pixels are the basis for its decisions and to improve the transparency of the model. (2) An explainable Highlight Feature Fusion block that highlights distinctive features among multiple layers and learns discriminative ones to gain robust performance. (3) A transparent representation, the rotation map, that learns rotation features instead of complex and non-transparent rotation and derotation layers. (4) Auxiliary supervision that accelerates the training process, saving more than 10 hours in our experiments. Experimental results on the VIVA and Oxford hand detection and tracking datasets show that our method achieves accuracy competitive with state-of-the-art methods at higher speed.

    Updated: 2020-01-17
  • Fingerprint Pore Matching Using Deep Features
    Pattern Recogn. (IF 5.898) Pub Date : 2020-01-16
    Feng Liu; Yuanhao Zhao; Guojie Liu; Linlin Shen

    As a popular living fingerprint feature, sweat pores have been adopted to build robust high-resolution automated fingerprint recognition systems (AFRSs). Pore matching is an important step in high-resolution fingerprint recognition. This paper proposes a novel pore matching method with high recognition accuracy. The method mainly addresses the pore representation problem in the state-of-the-art direct pore matching method. By making full use of the diversity and large quantity of sweat pores on fingerprints, deep convolutional networks are carefully designed to learn a deep feature (denoted DeepPoreID) for each pore, so that the inter-class difference and intra-class similarity of pore patch pairs can be captured by deep learning. The DeepPoreID is then used to describe the local feature of each pore and is finally integrated into the classical direct pore matching method. More specifically, pore patches cropped from both query and template fingerprint images are fed into the trained networks to generate DeepPoreIDs for pore representation. The similarity between DeepPoreIDs is then obtained by calculating the Euclidean distance between them. Subsequently, one-to-many coarse pore correspondences are established by comparing these similarities. Finally, the classical Weighted RANdom SAmple Consensus (WRANSAC) is employed to pick true pore correspondences from the coarse ones. Experiments carried out on two public high-resolution fingerprint databases show the effectiveness of the proposed DeepPoreID, especially for fingerprint matching with small image sizes. Meanwhile, better recognition accuracy is achieved by the proposed method when compared with existing state-of-the-art methods, with improvements of about 35% in equal error rate (EER) and about 30% in FMR1000 over the best result evaluated on the database with an image size of 320×240 pixels.

    Updated: 2020-01-17
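
    The entry above compares pore descriptors by Euclidean distance and keeps one-to-many coarse correspondences before a WRANSAC verification. The snippet below implements only that coarse matching step on random stand-in descriptors; the network that would produce real DeepPoreIDs and the WRANSAC step are omitted.

import numpy as np

def coarse_pore_matches(query_desc, template_desc, k=3):
    """One-to-many coarse correspondences by Euclidean distance.

    query_desc: (nq, d) descriptors of query pores.
    template_desc: (nt, d) descriptors of template pores.
    Returns an (nq, k) array: the k closest template pores for each query pore.
    """
    d2 = ((query_desc[:, None, :] - template_desc[None, :, :]) ** 2).sum(-1)
    return np.argsort(d2, axis=1)[:, :k]

rng = np.random.default_rng(0)
q = rng.normal(size=(10, 128))          # stand-ins for query DeepPoreIDs
t = rng.normal(size=(40, 128))          # stand-ins for template DeepPoreIDs
print(coarse_pore_matches(q, t))        # candidate correspondences for verification
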
  • Scene Recognition: A Comprehensive Survey
    Pattern Recogn. (IF 5.898) Pub Date : 2020-01-15
    Lin Xie; Feifei Lee; Li Liu; Koji Kotani; Qiu Chen

    With the success of deep learning in the field of computer vision, object recognition has made important breakthroughs, and its recognition accuracy has been drastically improved. However, scene recognition performance is still not fully satisfactory because of the complex configurations of scenes. Over the past several years, scene recognition algorithms have undergone an important evolution as a result of the development of machine learning and Deep Convolutional Neural Networks (DCNN). This paper reviews many of the most popular and effective approaches to scene recognition, which is expected to benefit future research and practical applications. We seek to establish relationships among different algorithms and determine the critical components that lead to remarkable performance. Through the analysis of some representative schemes, motivations and insights are identified, which will help facilitate the design of better recognition architectures. In addition, currently available scene datasets and benchmarks are presented for evaluation and comparison. Finally, potential problems and promising directions are highlighted.

    Updated: 2020-01-15
  • Object Recognition Based on Convex Hull Alignment
    Pattern Recogn. (IF 5.898) Pub Date : 2020-01-13
    Robert Cupec; Ivan Vidović; Damir Filko; Petra Đurović

    A common approach to recognition of objects in cluttered scenes is to generate hypotheses about objects present in the scene by matching local descriptors of point features. These hypotheses are then evaluated by measuring how well they explain a particular part of the scene. In this paper, we investigate an alternative approach, which is based on alignment of convex hulls of segments detected in a depth image with convex hulls of target 3D object models or their parts. This alignment is performed using the Convex Template Instance descriptor. This descriptor was originally proposed for fruit recognition and classification of segmented objects. We have adapted this approach to recognize objects in complex scenes. Furthermore, we propose a novel three-level hypothesis evaluation strategy which can be used to achieve highly efficient object recognition. The proposed approach is evaluated by comparison with nine state-of-the-art approaches using three challenging benchmark datasets.

    Updated: 2020-01-13
  • HscoreNet: A Deep Network for Estrogen and Progesterone Scoring Using Breast IHC Images
    Pattern Recogn. (IF 5.898) Pub Date : 2020-01-10
    Monjoy Saha; Indu Arun; Rosina Ahmed; Sanjoy Chatterjee; Chandan Chakraborty
    Updated: 2020-01-11
  • Unsupervised Domain Adaptive Re-Identification: Theory and Practice
    Pattern Recogn. (IF 5.898) Pub Date : 2020-01-10
    Liangchen Song; Cheng Wang; Lefei Zhang; Bo Du; Qian Zhang; Chang Huang; Xinggang Wang

    We study the problem of unsupervised domain adaptive re-identification (re-ID), which is an active topic in computer vision but lacks a theoretical foundation. We first extend existing unsupervised domain adaptive classification theories to re-ID tasks. Concretely, we introduce some assumptions on the extracted feature space and then derive several loss functions guided by these assumptions. To optimize them, a novel self-training scheme for unsupervised domain adaptive re-ID tasks is proposed. It iteratively makes guesses for the unlabeled target data based on an encoder and trains the encoder based on the guessed labels. Extensive experiments on unsupervised domain adaptive person re-ID and vehicle re-ID tasks, with comparisons to the state of the art, confirm the effectiveness of the proposed theories and self-training framework. Our code is available at https://github.com/LcDog/DomainAdaptiveReID.

    Updated: 2020-01-11
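
    The entry above alternates between guessing labels for the unlabeled target data with the current encoder and retraining the encoder on those guesses. The loop below mimics that self-training pattern with a stand-in linear "encoder" and k-means pseudo-labels on toy data; the actual encoder, losses and label-guessing scheme of the paper are those in its code release.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Unlabeled "target-domain" data: two blobs standing in for two identities.
X = np.vstack([rng.normal(0, 1, (100, 16)), rng.normal(3, 1, (100, 16))])

features = X.copy()                      # initial "encoder" output: raw features
for it in range(3):
    # 1) guess labels for the unlabeled data by clustering the current features
    pseudo = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
    # 2) retrain the "encoder" (here a linear model) on the guessed labels
    clf = LogisticRegression(max_iter=1000).fit(X, pseudo)
    # 3) re-encode the data; scores of the retrained model act as new features
    features = clf.decision_function(X).reshape(-1, 1)
    print(f"iter {it}: cluster sizes = {np.bincount(pseudo)}")
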
  • Multi-Task CNN for Restoring Corrupted Fingerprint Images
    Pattern Recogn. (IF 5.898) Pub Date : 2020-01-10
    Wei Jing Wong; Shang-Hong Lai

    Fingerprint image enhancement is one of the fundamental modules in an automated fingerprint recognition system (AFRS). While the performance of AFRSs advances with sophisticated fingerprint matching algorithms, poor fingerprint image quality remains a major obstacle to accurate fingerprint recognition. In this paper, we present a multi-task convolutional neural network (CNN) based method to recover fingerprint ridge structures from corrupted fingerprint images. By learning from the noise and corruption caused by various undesirable conditions of the finger and sensor, the proposed CNN model consists of two streams that reconstruct the fingerprint image and the orientation field simultaneously. The enhanced fingerprint is further refined using the orientation field information. Moreover, we create a deliberately corrupted fingerprint image dataset with associated ground truth images to facilitate the supervised learning of the proposed CNN model. Experimental results show significant improvements in both image quality and fingerprint matching accuracy after applying the proposed fingerprint image enhancement technique to several well-known fingerprint datasets.

    Updated: 2020-01-11
  • Towards Explaining Anomalies: A Deep Taylor Decomposition of One-Class Models
    Pattern Recogn. (IF 5.898) Pub Date : 2020-01-09
    Jacob Kauffmann; Klaus-Robert Müller; Grégoire Montavon

    Detecting anomalies in data is a common machine learning task, with numerous applications in the sciences and industry. In practice, it is not always sufficient to reach high detection accuracy; one would also like to understand why a given data point has been predicted to be anomalous. We propose a principled approach for one-class SVMs (OC-SVM) that draws on the novel insight that these models can be rewritten as distance/pooling neural networks. This ‘neuralization’ step lets us apply deep Taylor decomposition (DTD), a methodology that leverages the model structure in order to quickly and reliably explain decisions in terms of input features. The proposed method (called ‘OC-DTD’) is applicable to a number of common distance-based kernel functions, and it outperforms baselines such as sensitivity analysis, distance to nearest neighbor, or edge detection.

    Updated: 2020-01-09
  • Discriminative Distribution Alignment: A Unified Framework for Heterogeneous Domain Adaptation
    Pattern Recogn. (IF 5.898) Pub Date : 2020-01-09
    Yuan Yao; Yu Zhang; Xutao Li; Yunming Ye

    Heterogeneous domain adaptation (HDA) aims to leverage knowledge from a source domain for helping learn an accurate model in a heterogeneous target domain. HDA is exceedingly challenging since the feature spaces of domains are distinct. To tackle this issue, we propose a unified learning framework called Discriminative Distribution Alignment (DDA) for deriving a domain-invariant subspace. The proposed DDA can simultaneously match the discriminative directions of domains, align the distributions across domains, and enhance the separability of data during adaptation. To achieve this, DDA trains an adaptive classifier by both reducing the distribution divergence and enlarging distances between class centroids. Based on the proposed DDA framework, we further develop two methods, by embedding the cross-entropy loss and squared loss into this framework, respectively. We conduct experiments on the tasks of categorization across domains and modalities. Experimental results clearly demonstrate that the proposed DDA outperforms several state-of-the-art models.

    Updated: 2020-01-09
  • Identifying the best data-driven feature selection method for boosting reproducibility in classification tasks
    Pattern Recogn. (IF 5.898) Pub Date : 2020-01-09
    Nicolas Georges; Islem Mhiri; Islem Rekik; Alzheimer’s Disease Neuroimaging Initiative

    Considering the proliferation of extremely high-dimensional data in many domains, including computer vision and healthcare applications such as computer-aided diagnosis (CAD), advanced techniques are needed for reducing the data dimensionality and identifying the most relevant features for a given classification task, such as distinguishing between healthy and disordered brain states. Despite the existence of many works on boosting classification accuracy using a particular feature selection (FS) method, choosing the best one from a large pool of existing FS techniques for boosting feature reproducibility within a dataset of interest remains a formidable challenge. Notably, good performance of a particular FS method does not necessarily imply that the experiment is reproducible and that the identified features are optimal for the entirety of the samples. Essentially, this paper presents the first attempt to address the following challenge: “Given a set of different feature selection methods {FS1,⋯,FSK} and a dataset of interest, how can we identify the most reproducible and ‘trustworthy’ connectomic features that would produce reliable biomarkers capable of accurately differentiating between two specific conditions?” To this aim, we propose the FS-Select framework, which explores the relationships among the different FS methods using a multi-graph architecture based on the feature reproducibility power, average accuracy, and feature stability of each FS method. By extracting the ‘central’ graph node, we identify the most reliable and reproducible FS method for the target brain state classification task, along with the most discriminative features fingerprinting these brain states. To evaluate the reproducibility power of FS-Select, we perturbed the training set by using different cross-validation strategies on a multi-view small-scale connectomic dataset (late mild cognitive impairment vs Alzheimer’s disease) and a large-scale dataset of autistic vs healthy subjects. Our experiments revealed reproducible connectional features fingerprinting disordered brain states.

    Updated: 2020-01-09
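
    The entry above scores feature selection (FS) methods partly by how reproducible their selected features are when the training set is perturbed. A common way to quantify this is the average pairwise Jaccard overlap of the feature sets selected on different folds, sketched below with a simple correlation-ranking selector standing in for an arbitrary FS method; FS-Select's multi-graph scoring is considerably more involved.

import numpy as np
from itertools import combinations

def select_top_k(X, y, k):
    """Stand-in FS method: rank features by absolute correlation with the label."""
    scores = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    return set(np.argsort(scores)[::-1][:k])

def reproducibility(X, y, k=5, n_folds=5, seed=0):
    """Average pairwise Jaccard overlap of feature sets selected across folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    sets = []
    for f in folds:
        train = np.setdiff1d(np.arange(len(y)), f)   # leave one fold out
        sets.append(select_top_k(X[train], y[train], k))
    overlaps = [len(a & b) / len(a | b) for a, b in combinations(sets, 2)]
    return np.mean(overlaps)

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200)
X = rng.normal(size=(200, 50))
X[:, :5] += 2.0 * y[:, None]             # five genuinely informative features
print(reproducibility(X, y, k=5))        # close to 1 when selection is stable
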
  • UcoSLAM: Simultaneous Localization and Mapping by Fusion of KeyPoints and Squared Planar Markers
    Pattern Recogn. (IF 5.898) Pub Date : 2020-01-08
    Rafael Muñoz-Salinas; R. Medina-Carnicer

    Simultaneous Localization and Mapping (SLAM) is the process of creating a map of the environment while simultaneously navigating in it. Most SLAM approaches use natural features (e.g. keypoints) that are unstable over time, repetitive in many cases, or insufficient in number for robust tracking (e.g. in indoor buildings). Other researchers, on the other hand, have proposed the use of artificial landmarks, such as squared fiducial markers, placed in the environment to help tracking and relocalization. This paper proposes a novel SLAM approach that fuses natural and artificial landmarks in order to achieve long-term robust tracking in many scenarios. Our method has been compared to the state-of-the-art methods ORB-SLAM2 [1], LDSO [2] and SPM-SLAM [3] on the public datasets Kitti [4], Euroc-MAV [5], TUM [6] and SPM [3], obtaining better precision, robustness and speed. Our tests also show that the combination of markers and keypoints achieves better accuracy than either of them independently.

    Updated: 2020-01-08
  • A reduced universum twin support vector machine for class imbalance learning
    Pattern Recogn. (IF 5.898) Pub Date : 2020-01-07
    B. Richhariya; M. Tanveer

    In most real-world datasets, there is an imbalance in the number of samples belonging to different classes. Various pattern classification problems, such as fault or disease detection, involve class-imbalanced data. The support vector machine (SVM) classifier becomes biased towards the majority class due to class imbalance. Moreover, the existing SVM-based techniques for class imbalance use no information about the distribution of the data. Motivated by the idea of incorporating prior information about the data distribution, a reduced universum twin support vector machine for class imbalance learning (RUTSVM-CIL) is proposed in this paper. For the first time, universum learning is incorporated with SVM to solve the problem of class imbalance. Oversampling and undersampling of the data are performed to remove the imbalance between the classes. The universum data points are used to give prior information about the data. To reduce the computation time of our universum-based algorithm, we use a small rectangular kernel matrix. The reduced kernel matrix needs less storage space and is thus applicable to large-scale imbalanced datasets. Comprehensive experiments are performed on various synthetic, real-world and large-scale imbalanced datasets. In comparison to existing approaches for class imbalance, the proposed RUTSVM-CIL gives better generalization performance on most of the benchmark datasets. Moreover, the computation cost of RUTSVM-CIL is very low, making it suitable for real-world applications.

    Updated: 2020-01-07
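
    The "reduced kernel" idea mentioned in the entry above replaces the full n × n kernel matrix by a rectangular n × m matrix computed against a small random subset of m reference points, cutting both storage and training cost. A plain NumPy sketch of that construction with an RBF kernel is given below; it shows only the kernel reduction, not the universum twin SVM itself.

import numpy as np

def reduced_rbf_kernel(X, m=50, gamma=0.5, seed=0):
    """Rectangular kernel K(X, X_sub) against a random subset of m reference points."""
    rng = np.random.default_rng(seed)
    ref = X[rng.choice(len(X), size=m, replace=False)]       # reduced reference set
    sq = ((X[:, None, :] - ref[None, :, :]) ** 2).sum(-1)    # squared distances
    return np.exp(-gamma * sq), ref

X = np.random.default_rng(1).normal(size=(1000, 8))
K, ref = reduced_rbf_kernel(X, m=50)
print(K.shape)          # (1000, 50) instead of the full (1000, 1000) kernel matrix
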
  • Automatic Characteristic-Calibrated Registration (ACC-REG): Hippocampal Surface Registration using Eigen-graphs
    Pattern Recogn. (IF 5.898) Pub Date : 2020-01-07
    Hei Long Chan; Tsz Chun Yam; Lok Ming Lui

    In this paper, we propose an efficient algorithm, ACC-REG, to automatically extract intrinsic key characteristics of hippocampal mesh surfaces and hence compute an accurate registration mapping between them. Given a pair of hippocampal surface meshes, the proposed algorithm constructs an eigen-graph, an intrinsic feature of the surface, on each surface as its representative. The eigen-graphs are then calibrated along the longitudinal direction of the hippocampal surfaces. Accurately corresponding intrinsic characteristics on each hippocampus can thus be extracted. As a result, the two surfaces can be registered with improved accuracy and low computational cost. Experiments on ADNI data demonstrate the effectiveness of the proposed ACC-REG model over existing methods.

    Updated: 2020-01-07
  • Systematic review of 3D facial expression recognition methods
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-11
    Gilderlane Ribeiro Alexandre; José Marques Soares; George André Pereira Thé

    The three-dimensional representation of the human face has emerged as a viable and effective way to characterize the facial surface for expression classification. The rapid progress in the area continually demands an up-to-date characterization to guide and support research decisions, especially for newcomer researchers. This systematic literature review focuses on three major aspects of 3D facial expression recognition methods: face representation, preprocessing and classification experiments. The investigation of 49 specialized studies revealed the preferred types of data and regions of interest for face representation in recent years, as well as a trend towards keypoint-independent methods. In addition, it brings to light current weaknesses in the reporting of preprocessing techniques and identifies challenges concerning the current possibility of fair comparison among multiple methods. The presented findings outline essential research decisions whose careful reporting is of great value to this research community.

    Updated: 2020-01-04
  • MobileFAN: Transferring deep hidden representation for face alignment
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-19
    Yang Zhao; Yifan Liu; Chunhua Shen; Yongsheng Gao; Shengwu Xiong

    Facial landmark detection is a crucial prerequisite for many face analysis applications. Deep learning-based methods currently dominate approaches to facial landmark detection. However, such works generally introduce a large number of parameters, resulting in high memory cost. In this paper, we aim for a lightweight yet effective solution to facial landmark detection. To this end, we propose an effective lightweight model, namely the Mobile Face Alignment Network (MobileFAN), using the simple MobileNetV2 backbone as the encoder and three deconvolutional layers as the decoder. The proposed MobileFAN, with only 8% of the model size and lower computational cost, achieves superior or equivalent performance compared with state-of-the-art models. Moreover, by transferring the geometric structural information of a face graph from a large complex model to our proposed MobileFAN through feature-aligned distillation and feature-similarity distillation, the performance of MobileFAN is further improved in effectiveness and efficiency for face alignment. Extensive experimental results on three challenging facial landmark estimation benchmarks, including COFW, 300W and WFLW, show the superiority of our proposed MobileFAN against state-of-the-art methods.

    Updated: 2020-01-04
  • Noise-robust dictionary learning with slack block-Diagonal structure for face recognition
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-22
    Zhe Chen; Xiao-Jun Wu; He-Feng Yin; Josef Kittler

    The strict ‘0-1’ block-diagonal structure has been widely used for learning structured representations in face recognition problems. However, it is questionable and unreasonable to assume that the within-class representations are the same. To circumvent this problem, in this paper we propose a slack block-diagonal (SBD) structure for representation, in which the target structure matrix is dynamically updated while its block-diagonal nature is preserved. Furthermore, in order to depict the noise in face images more precisely, we propose a robust dictionary learning algorithm based on a mixed-noise model that utilizes the above SBD structure (SBD2L). SBD2L assumes that there are two forms of noise in the data, drawn from Laplacian and Gaussian distributions, respectively. Moreover, SBD2L introduces a low-rank constraint on the representation matrix to enhance the dictionary's robustness to noise. Extensive experiments on four benchmark databases show that the proposed SBD2L achieves better classification results than several state-of-the-art dictionary learning methods.

    Updated: 2020-01-04
  • A novel classification-selection approach for the self updating of template-based face recognition systems
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-27
    Giulia Orrù; Gian Luca Marcialis; Fabio Roli

    The growing need for security has notably increased the number of possible facial recognition applications, especially due to the success of the Internet of Things (IoT) paradigm. However, although handcrafted and deep learning-inspired facial features have reached a significant level of compactness and expressive power, facial recognition performance still suffers from intra-class variations such as ageing, facial expressions, lighting changes, and pose. These variations cannot be captured in a single acquisition and require multiple acquisitions of long duration, which are expensive and need a high level of collaboration from the users. Among others, self-update algorithms have been proposed in order to mitigate these problems. Self-updating aims to add novel templates to the users' galleries from among the inputs submitted during system operation. Consequently, computational complexity and storage space tend to be among the critical requirements of these algorithms. The present paper deals with the above problems through a novel template-based self-update algorithm, able to keep over time the expressive power of a limited set of templates stored in the system database. The rationale behind the proposed approach is the working hypothesis that a dominating mode characterises the features' distribution for a given client. Therefore, the key point is to select the best templates around that mode. We propose two methods, which are tested on systems based on handcrafted features and on state-of-the-art deep-learning-inspired autoencoders. Three benchmark data sets are used. Experimental results confirm that, with effective and compact feature sets that support our working hypothesis, the proposed classification-selection approaches overcome the problem of manual updating and, where present, of stringent computational requirements.

    Updated: 2020-01-04
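
    The working hypothesis in the entry above is that a client's features cluster around a dominating mode, so the templates worth keeping are those closest to that mode. The snippet below keeps the m gallery samples nearest to the feature-space medoid as a crude stand-in for that selection step; the paper's two classification-selection methods are more elaborate.

import numpy as np

def select_templates(feats, m=5):
    """Keep the m gallery samples closest to the medoid of the client's features."""
    d = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
    medoid = np.argmin(d.sum(axis=1))            # sample with the smallest total distance
    return np.argsort(d[medoid])[:m]             # m nearest samples, medoid included

rng = np.random.default_rng(0)
gallery = np.vstack([
    rng.normal(0, 0.2, (30, 64)),                # acquisitions near the dominating mode
    rng.normal(2, 1.0, (5, 64)),                 # a few atypical acquisitions
])
print(select_templates(gallery, m=5))            # indices drawn from the first 30
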
  • A paired sparse representation model for robust face recognition from a single sample
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-25
    Fania Mokhayeri; Eric Granger

    Sparse representation-based classification (SRC) has been shown to achieve a high level of accuracy in face recognition (FR). However, matching faces captured in unconstrained video against a gallery with a single reference facial still per individual typically yields low accuracy. For improved robustness to intra-class variations, SRC techniques for FR have recently been extended to incorporate variational information from an external generic set into an auxiliary dictionary. Despite their success in handling linear variations, non-linear variations (e.g., pose and expressions) between probe and reference facial images cannot be accurately reconstructed with a linear combination of images in the gallery and auxiliary dictionaries because they do not share the same type of variations. In order to account for non-linear variations due to pose, a paired sparse representation model is introduced allowing for joint use of variational information and synthetic face images. The proposed model, called synthetic plus variational model, reconstructs a probe image by jointly using (1) a variational dictionary and (2) a gallery dictionary augmented with a set of synthetic images generated over a wide diversity of pose angles. The augmented gallery dictionary is then encouraged to pair the same sparsity pattern with the variational dictionary for similar pose angles by solving a newly formulated simultaneous sparsity-based optimization problem. Experimental results obtained on Chokepoint and COX-S2V datasets, using different face representations, indicate that the proposed approach can outperform state-of-the-art SRC-based methods for still-to-video FR with a single sample per person.

    Updated: 2020-01-04
  • Deformable face net for pose invariant face recognition
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-25
    Mingjie He; Jie Zhang; Shiguang Shan; Meina Kan; Xilin Chen

    Unconstrained face recognition remains a challenging task due to various factors such as pose, expression, illumination, partial occlusion, etc. In particular, the most significant appearance variations stem from pose, which leads to severe performance degradation. In this paper, we propose a novel Deformable Face Net (DFN) to handle pose variations in face recognition. The deformable convolution module attempts to simultaneously learn face recognition oriented alignment and identity-preserving feature extraction. The displacement consistency loss (DCL) is proposed as a regularization term to enforce the learnt displacement fields for aligning faces to be locally consistent in both orientation and amplitude, since faces possess strong structure. Moreover, the identity consistency loss (ICL) and the pose-triplet loss (PTL) are designed to minimize the intra-class feature variation caused by different poses and to maximize the inter-class feature distance under the same poses. The proposed DFN can effectively handle pose-invariant face recognition (PIFR). Extensive experiments show that the proposed DFN outperforms state-of-the-art methods, especially on datasets with large poses.

    Updated: 2020-01-04
  • A robust statistics approach for plane detection in unorganized point clouds
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-16
    Abner M. C. Araújo; Manuel M. Oliveira

    Plane detection is a key component of many applications, such as industrial reverse engineering and self-driving cars. However, existing plane-detection techniques are sensitive to noise and to user-defined parameters. We introduce a fast deterministic technique for plane detection in unorganized point clouds that is robust to noise and virtually independent of parameter tuning. It is based on a novel planarity test drawn from robust statistics and on a split-and-merge strategy. Its parameter values are automatically adjusted to fit the local distribution of samples in the input dataset, thus leading to good reconstruction of even small planar regions. We demonstrate the effectiveness of our solution on several real datasets, comparing its performance to state-of-the-art plane detection techniques, and showing that it achieves better accuracy while still being one of the fastest.

    Updated: 2020-01-04
  • Discovering influential factors in variational autoencoders
    Pattern Recogn. (IF 5.898) Pub Date : 2019-12-15
    Shiqi Liu; Jingxin Liu; Qian Zhao; Xiangyong Cao; Huibin Li; Deyu Meng; Hongying Meng; Sheng Liu

    In the field of machine learning, it is still a critical issue to identify and supervise the learned representation, without manual intervention or the help of intuition, in order to extract useful knowledge or serve downstream tasks. In this work, we focus on supervising the influential factors extracted by the variational autoencoder (VAE). The VAE is designed to learn independent low-dimensional representations, but it faces the problem that pre-set factors are sometimes ignored. We argue that the mutual information between the input and each learned factor of the representation is a necessary indicator for discovering the influential factors. We find that the VAE objective tends to induce sparsity of this mutual information across factor dimensions when their number exceeds the intrinsic dimension of the data, which results in non-influential factors whose contribution to data reconstruction can be ignored. We show that mutual information also influences the lower bound of the VAE's reconstruction error and downstream classification performance. To make this indicator applicable, we design an algorithm for calculating the mutual information for the VAE and prove its consistency. Experimental results on the MNIST, CelebA and DEAP datasets show that mutual information can help determine influential factors, some of which are interpretable and can be used for further generation and classification tasks, and can help discover the factor that is connected with emotion in the DEAP dataset.

    Updated: 2020-01-04
  • Connectivity-based cylinder detection in unorganized point clouds
    Pattern Recogn. (IF 5.898) Pub Date : 2019-12-19
    Abner M.C. Araújo; Manuel M. Oliveira

    Cylinder detection is an important step in reverse engineering of industrial sites, as such environments often contain a large number of cylindrical pipes and tanks. However, existing techniques for cylinder detection require the specification of several parameters which are difficult to adjust because their values depend on the noise level of the input point cloud. Also, these solutions often expect the cylinders to be either parallel or perpendicular to the ground. We present a cylinder-detection technique that is robust to noise, contains parameters which require little to no fine-tuning, and can handle cylinders with arbitrary orientations. Our approach is based on a robust linear-time circle-detection algorithm that naturally discards outliers, allowing our technique to handle datasets with various density and noise levels while using a set of default parameter values. It works by projecting the point cloud onto a set of directions over the unit hemisphere and detecting circular projections formed by samples defining connected components in 3D. The extracted cylindrical surfaces are obtained by fitting a cylinder to each connected component. We compared our technique against the state-of-the-art methods on both synthetic and real datasets containing various densities and noise levels, and show that it outperforms existing techniques in terms of accuracy and robustness to noise, while still maintaining a competitive running time.

    Updated: 2020-01-04
  • A repeatable and robust local reference frame for 3D surface matching
    Pattern Recogn. (IF 5.898) Pub Date : 2019-12-25
    Sheng Ao; Yulan Guo; Jindong Tian; Yong Tian; Dong Li

    Local reference frames (LRFs) have been widely used for 3D local surface description. In this work, we propose a repeatable LRF with strong robustness to different nuisances. Unlike existing LRF methods, the proposed LRF uses a subset of the neighboring points within the support region to calculate the z-axis, and performs an effective feature transformation on the neighboring points to define the x-axis. Specifically, the feature transformation is applied to the data on a projection plane based on three point distribution characteristics via weighted strategies: the z-height, the distance to the center, and the average length to 1-ring neighbors. Covariance analysis is then applied to the transformed points to obtain the eigenvector with the largest eigenvalue, which points towards the direction of maximum variance. Using a sign disambiguation technique, the modified eigenvector is used to define the final x-axis. Furthermore, a scale strategy is proposed to improve the robustness of the LRF with respect to mesh decimation. The proposed LRF was rigorously tested on six public benchmark datasets covering three different application contexts, i.e., 3D shape retrieval, 3D object recognition and registration. Experiments show that our method achieves significantly higher repeatability and stronger robustness than state-of-the-art methods under Gaussian noise, shot noise and mesh resolution variation. Finally, the descriptor matching results on four typical datasets further demonstrate the effectiveness of our LRF.

    Updated: 2020-01-04
  • Wavelet-based segmentation on the sphere
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-04
    Xiaohao Cai; Christopher G.R. Wallis; Jennifer Y.H. Chan; Jason D. McEwen

    Segmentation, a powerful technique in pattern recognition, is the process of identifying object outlines within images. There are a number of efficient algorithms for segmentation in Euclidean space that depend on the variational approach and partial differential equation modelling. Wavelets have been used successfully in various problems in image processing, including segmentation, inpainting, noise removal, super-resolution image restoration, and many others. Wavelets on the sphere have been developed to solve such problems for data defined on the sphere, which arise in numerous fields such as cosmology and geophysics. In this work, we propose a wavelet-based method to segment images on the sphere, accounting for the underlying geometry of spherical data. Our method is a direct extension of the tight-frame based segmentation method used to automatically identify tube-like structures such as blood vessels in medical imaging. It is compatible with any arbitrary type of wavelet frame defined on the sphere, such as axisymmetric wavelets, directional wavelets, curvelets, and hybrid wavelet constructions. Such an approach allows the desirable properties of wavelets to be naturally inherited in the segmentation process. In particular, directional wavelets and curvelets, which were designed to efficiently capture directional signal content, provide additional advantages in segmenting images containing prominent directional and curvilinear features. We present several numerical experiments, applying our wavelet-based segmentation method, as well as the common K-means method, to real-world spherical images, including an Earth topographic map, a light probe image, solar datasets, and spherical retina images. These experiments demonstrate the superiority of our method and show that it is capable of segmenting different kinds of spherical images, including those with prominent directional features. Moreover, our algorithm is efficient, usually converging within a few iterations.

    Updated: 2020-01-04
  • Graph regularized low-rank representation for submodule clustering
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-28
    Tong Wu

    In this paper, a new submodule clustering method for imaging (2-D) data is proposed. Unlike most existing clustering methods that first convert such data into vectors as a preprocessing step, the proposed method arranges the data samples as lateral slices of a third-order tensor. Our algorithm is based on the union-of-free-submodules model, and the samples are represented using the t-product in the third-order tensor space. First, we impose a low-rank constraint on the representation tensor to capture the principal information of the data. By incorporating manifold regularization into the tensor factorization, the proposed method explicitly exploits the local manifold structure of the data. Meanwhile, a segmentation-dependent term is employed to integrate the two pipeline steps of affinity learning and spectral clustering into a unified optimization framework. The proposed method can be efficiently solved based on the alternating direction method of multipliers and spectral clustering. Finally, a nonlinear extension is proposed to handle data drawn from a mixture of nonlinear manifolds. Extensive experimental results on five real-world image datasets confirm the effectiveness of the proposed methods.

    Updated: 2020-01-04
  • UNIC: A fast nonparametric clustering
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-19
    Nadiia Leopold; Oliver Rose

    Clustering is among the tools for exploring, analyzing, and deriving information from data. In the case of large data sets, the real burden in applying clustering algorithms can be their complexity and their demand for control parameters. We present a new fast nonparametric clustering algorithm, UNIC, to address these challenges. To identify clusters, the algorithm evaluates the distances between selected points and the other points in the set. While assessing these distances, it employs methods of robust statistics to identify the cluster borders. The performance of the proposed algorithm is assessed in an experimental study and compared with several existing clustering methods over a variety of benchmark data sets.

    Updated: 2020-01-04
  • Concatenation hashing: A relative position preserving method for learning binary codes
    Pattern Recogn. (IF 5.898) Pub Date : 2019-12-06
    Zhenyu Weng; Yuesheng Zhu

    Hashing methods perform efficient nearest neighbor search by mapping high-dimensional data to binary codes. Compared to projection-based hashing methods, hashing methods that adopt the clustering technique can encode the complex relationships of the data into binary codes. However, their search performance is affected by the cluster boundaries: two similar data points may be assigned to two different clusters and then encoded into very different binary codes. In this paper, we propose a new hashing method based on the clustering technique that can alleviate this boundary effect. It stems from the observation that the relative positions of any two close data points to each cluster center are close. An alternating optimization is developed to simultaneously discover the cluster structure of the data and learn the hash functions that preserve the relative positions of the data to each cluster center. To integrate the information in each cluster, the corresponding binary code of each data point is obtained by concatenating the substrings learnt by the hash functions in each cluster. The experiments show that our method is competitive with or better than state-of-the-art hashing methods.

    Updated: 2020-01-04
  • Ensemble Selection based on Classifier Prediction Confidence
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-04
    Tien Thanh Nguyen; Anh Vu Luong; Manh Truong Dang; Alan Wee-Chung Liew; John McCall

    Ensemble selection is one of the most studied topics in ensemble learning because a selected subset of base classifiers may perform better than the whole ensemble system. In recent years, a great many ensemble selection methods have been introduced. However, many of these lack flexibility: either a fixed subset of classifiers is pre-selected for all test samples (static approach), or the selection of classifiers depends upon the performance of techniques that define the region of competence (dynamic approach). In this paper, we propose an ensemble selection method that takes into account each base classifier's confidence during classification and the overall credibility of the base classifier in the ensemble. In other words, a base classifier is selected to predict for a test sample if the confidence in its prediction is higher than its credibility threshold. The credibility thresholds of the base classifiers are found by minimizing the empirical 0–1 loss over the entire set of training observations. In this way, our approach integrates both the static and dynamic aspects of ensemble selection. Experiments on 62 datasets demonstrate that the proposed method achieves much better performance in comparison to some ensemble methods.

    Updated: 2020-01-04
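
    The selection rule described in the entry above uses a base classifier's prediction for a test sample only when its confidence exceeds that classifier's credibility threshold. The fragment below applies such a rule at prediction time given per-classifier probability estimates and thresholds; learning the thresholds (by minimizing the empirical 0–1 loss, per the entry) is not shown, and the fallback to the full ensemble is an added assumption, not necessarily the paper's rule.

import numpy as np

def selective_vote(probas, thresholds):
    """Majority vote among base classifiers confident enough for this sample.

    probas: (n_classifiers, n_classes) predicted class probabilities for one sample.
    thresholds: (n_classifiers,) credibility threshold of each base classifier.
    Falls back to using every classifier if none is confident enough.
    """
    confident = probas.max(axis=1) > thresholds
    if not confident.any():
        confident[:] = True                      # fall back to the full ensemble
    votes = probas[confident].argmax(axis=1)
    return np.bincount(votes).argmax()

# Three base classifiers, three classes, one test sample.
probas = np.array([[0.34, 0.33, 0.33],           # uncertain classifier
                   [0.10, 0.80, 0.10],           # confident, votes class 1
                   [0.15, 0.70, 0.15]])          # confident, votes class 1
thresholds = np.array([0.60, 0.60, 0.60])
print(selective_vote(probas, thresholds))        # -> 1 (uncertain classifier ignored)
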
  • Sign consistency for the linear programming discriminant rule
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-05
    Zhen Zhang; Shengzheng Wang; Wei Bian

    Linear discriminant analysis (LDA) is an important conventional model for data classification. Classical theory shows that LDA is Bayes consistent for a fixed data dimensionality p and a large training sample size n. However, in high-dimensional settings when p ≫ n, LDA is difficult due to the inconsistent estimation of the covariance matrix and the mean vectors of populations. Recently, a linear programming discriminant (LPD) rule was proposed for high-dimensional linear discriminant analysis, based on the sparsity assumption over the discriminant function. It is shown that the LPD rule is Bayes consistent in high-dimensional settings. In this paper, we further show that the LPD rule is sign consistent under the sparsity assumption. Such sign consistency ensures the LPD rule to select the optimal discriminative features for high-dimensional data classification problems. Evaluations on both synthetic and real data validate our result on the sign consistency of the LPD rule.

    Updated: 2020-01-04
  • Correlation classifiers based on data perturbation: New formulations and algorithms
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-11
    Zhizheng Liang; Xuewen Chen; Lei Zhang; Jin Liu; Yong Zhou

    This paper develops a novel framework for a family of correlation classifiers derived from uncertain convex programs under data perturbation. Under this framework, correlation classifiers are built from the pessimistic viewpoint of possible data perturbations, and a max-min optimization problem is formulated by simplifying the original model in terms of adaptive uncertainty regions. Under proper conditions, the proposed model can be reformulated as a minimization problem. A proximal majorization-minimization optimization (PMMO) scheme based on Bregman divergences is devised to solve the model, which may be nonconvex or nonsmooth, and it allows the convergence rate of the solution sequence to be established even in the nonconvex case. For specific loss functions, accelerated first-order methods can be applied to the convex variants of the model to obtain fast convergence rates in terms of the objective function. Extensive experiments on several data sets demonstrate the feasibility and validity of the proposed model.
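
    As a rough illustration of the algorithmic pattern only (a special case of PMMO with the squared Euclidean distance as the Bregman term, not the paper's general scheme), the sketch below runs a proximal majorization-minimization loop for a smooth least-squares loss plus an l1 penalty.

        import numpy as np

        def prox_mm(A, b, lam=0.1, iters=200):
            step = 1.0 / np.linalg.norm(A, 2) ** 2           # 1/L, L = Lipschitz const. of the gradient
            w = np.zeros(A.shape[1])
            for _ in range(iters):
                grad = A.T @ (A @ w - b)                     # gradient of 0.5 * ||A w - b||^2
                z = w - step * grad                          # minimizer of the quadratic majorizer
                w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)   # prox of the l1 term
            return w

        rng = np.random.RandomState(0)
        A = rng.randn(100, 30)
        w_true = np.zeros(30); w_true[:3] = [2.0, -1.5, 1.0]
        b = A @ w_true + 0.01 * rng.randn(100)
        w_hat = prox_mm(A, b)                                # recovers a sparse solution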

    Updated: 2020-01-04
  • Prototype learning and collaborative representation using Grassmann manifolds for image set classification
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-20
    Dong Wei; Xiaobo Shen; Quansen Sun; Xizhan Gao; Wenzhu Yan

    Image set classification using manifolds is becoming increasingly attractive since it accounts for non-Euclidean geometry. However, despite the success of dictionary learning for image set classification on manifolds, learning an over-complete dictionary remains challenging. This paper proposes a novel prototype subspace learning method, in which a set of images is represented by a linear subspace and then mapped onto a Grassmann manifold. With this subspace representation, class prototypes and intra-class differences can be represented as principal components and variation subspaces, respectively. An isometric mapping further maps the manifolds into a symmetric space via collaborative representation, which permits a closed-form solution. The proposed method is evaluated for face recognition, object recognition and action recognition. Extensive experimental results on the Honda, Extended YaleB, ETH-80 and Cambridge-Gesture datasets verify the superiority of the proposed method over state-of-the-art methods.
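
    The representation step can be sketched as follows (only the Grassmann embedding and a standard projection distance, not the paper's collaborative-representation classifier): each image set becomes an orthonormal basis, i.e. a point on a Grassmann manifold.

        import numpy as np

        def grassmann_point(image_set, k=5):
            # image_set: (n_images, dim) matrix of vectorized images
            X = image_set - image_set.mean(0)
            U, _, _ = np.linalg.svd(X.T, full_matrices=False)
            return U[:, :k]                                   # dim x k orthonormal basis

        def projection_distance(U1, U2):
            k = U1.shape[1]
            return np.sqrt(max(k - np.linalg.norm(U1.T @ U2, "fro") ** 2, 0.0))

        rng = np.random.RandomState(0)
        set_a, set_b = rng.randn(40, 100), rng.randn(40, 100)
        d = projection_distance(grassmann_point(set_a), grassmann_point(set_b))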

    Updated: 2020-01-04
  • Dissimilarity-based representations for one-class classification on time series
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-20
    Stefano Mauceri; James Sweeney; James McDermott

    In several real-world classification problems, it can be impractical to collect samples from classes other than the one of interest, hence the need for classifiers trained on a single class. There is a rich literature on binary and multi-class time series classification, but much less on one-class learning. In this study, we investigate the little-explored one-class time series classification problem. We represent each time series as a vector of dissimilarities from a set of time series referred to as prototypes. Based on this approach, we evaluate the Cartesian product of 12 dissimilarity measures and 8 prototype methods (strategies for selecting prototypes). Finally, a one-class nearest neighbor classifier is used on the dissimilarity-based representations (DBR). Experimental results show that DBR are competitive overall when compared with a strong baseline on the datasets of the UCR/UEA archive. Additionally, DBR enable dimensionality reduction and visual exploration of datasets.
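
    A minimal sketch of the pipeline, assuming random prototype selection and a Euclidean dissimilarity (the paper studies 8 selection strategies and 12 dissimilarity measures): represent each series by its distances to the prototypes, then score test series with a one-class nearest-neighbour rule.

        import numpy as np

        def dbr(series, prototypes):
            # Euclidean dissimilarity for illustration; DTW etc. would slot in here
            return np.array([[np.linalg.norm(s - p) for p in prototypes] for s in series])

        rng = np.random.RandomState(0)
        train = rng.randn(50, 120)                    # normal-class series only
        prototypes = train[rng.choice(len(train), 5, replace=False)]
        train_rep = dbr(train, prototypes)

        def one_class_score(test_series, k=1):
            rep = dbr(test_series, prototypes)
            d = np.linalg.norm(rep[:, None, :] - train_rep[None, :, :], axis=2)
            return np.sort(d, axis=1)[:, :k].mean(1)  # small score -> looks normal

        scores = one_class_score(rng.randn(10, 120) + 2.0)   # shifted series score higher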

    Updated: 2020-01-04
  • CARs-Lands: An associative classifier for large-scale datasets
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-25
    Mehrdad Almasi; Mohammad Saniee Abadeh

    Associative classifiers are among the most efficient classifiers for large datasets. However, they cannot be applied directly to large-scale data problems. Associative classifiers discover frequent rules, rare rules, or both in order to produce an efficient classifier. Rule discovery requires exploring a large solution space in a well-organized manner; hence, training associative classifiers on large-scale datasets is impractical because of memory and time-complexity constraints. The proposed method, CARs-Lands, presents an efficient distributed associative classifier. In CARs-Lands, a modified dataset is first generated. This new dataset consists of sub-datasets that are well suited to producing classification association rules (CARs) in parallel. The dataset produced by CARs-Lands contains two types of instances: main instances and neighbor instances. Main instances can be either real instances of the training dataset or meta-instances that are not in the training dataset; each main instance has several neighbor instances from the training dataset, which together form a sub-dataset. These sub-datasets are used for parallel local association rule mining. In CARs-Lands, local association rules lead to more accurate prediction because each test instance is classified by the association rules of its nearest neighbors in the training dataset. The proposed approach is evaluated in terms of accuracy on six real-world large-scale datasets against five recent and well-known methods. Experimental results show that the proposed classification method has high prediction accuracy and is highly competitive with other classification methods.
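
    The data-restructuring idea can be sketched as follows (a hypothetical simplification in which k-means centroids stand in for the main/meta-instances): attach each main instance's nearest training neighbours to form small sub-datasets that a local rule miner could process in parallel.

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.neighbors import NearestNeighbors

        rng = np.random.RandomState(0)
        X = rng.rand(10000, 8)
        y = (X[:, 0] + X[:, 1] > 1).astype(int)

        # main instances: here simply k-means centroids
        centroids = KMeans(n_clusters=20, n_init=10, random_state=0).fit(X).cluster_centers_
        nn = NearestNeighbors(n_neighbors=500).fit(X)
        _, idx = nn.kneighbors(centroids)             # 500 neighbour instances per main instance

        sub_datasets = [(X[i], y[i]) for i in idx]    # 20 small, overlapping sub-datasets
        # each (Xs, ys) pair would now be handed to a local association-rule miner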

    Updated: 2020-01-04
  • Generalized support vector data description for anomaly detection
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-20
    Mehmet Turkoz; Sangahn Kim; Youngdoo Son; Myong K. Jeong; Elsayed A. Elsayed

    Traditional anomaly detection procedures assume that normal observations are obtained from a single distribution. However, due to the complexity of modern industrial processes, observations may belong to multiple operating modes with different distributions. In such cases, traditional anomaly detection procedures may trigger false alarms while the process is in fact operating normally in another mode. We propose a generalized support vector-based anomaly detection procedure, called generalized support vector data description, which can be used to detect anomalies in multimodal processes. The proposed procedure constructs a hypersphere for each class so as to include as many of its observations as possible while keeping observations from other classes as far away as possible. In addition, we introduce a generalized Bayesian framework that not only considers the prior information from each mode but also highlights the relationships among the modes. The effectiveness of the proposed procedure is demonstrated through various simulation studies and real-life applications.
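
    A minimal sketch of the multimodal idea only (independent one-class models per mode, not the paper's coupled GSVDD formulation or its generalized Bayesian framework): a point is flagged as anomalous only if every mode rejects it.

        import numpy as np
        from sklearn.svm import OneClassSVM

        rng = np.random.RandomState(0)
        mode_a = rng.randn(200, 2)                     # two normal operating modes
        mode_b = rng.randn(200, 2) + 6.0
        models = [OneClassSVM(nu=0.05, gamma="scale").fit(m) for m in (mode_a, mode_b)]

        def is_anomaly(x):
            x = np.atleast_2d(x)
            return all(m.predict(x)[0] == -1 for m in models)   # rejected by every mode

        print(is_anomaly([0.1, -0.2]))   # inside mode A -> False
        print(is_anomaly([3.0, 3.0]))    # far from both modes -> likely flagged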

    Updated: 2020-01-04
  • F-measure curves: A tool to visualize classifier performance under imbalance
    Pattern Recogn. (IF 5.898) Pub Date : 2019-12-16
    Roghayeh Soleymani; Eric Granger; Giorgio Fumera

    Learning from imbalanced data is a challenging problem in many real-world machine learning applications, due in part to the performance bias of most classification systems. This bias may exist for three reasons: (1) classification systems are often optimized and compared using performance measures that are unsuitable for imbalance problems; (2) most learning algorithms are designed and tested on a fixed imbalance level, which may differ from operational scenarios; (3) the preferred trade-off between correctly classifying each class differs from one application to another. This paper investigates specialized performance evaluation metrics and tools for the imbalance problem, including scalar metrics that assume a given operating condition (skew level and relative preference of classes), and global evaluation curves or metrics that consider a range of operating conditions. We focus on the case in which the scalar F-measure is preferred over other scalar metrics, and propose a new global evaluation space for the F-measure that is analogous to the cost curves for expected cost. In this space, a classifier is represented as a curve that shows its performance over all of its decision thresholds and a range of possible imbalance levels, for the desired preference of true positive rate to precision. Curves obtained in the F-measure space are compared to those of existing spaces (ROC, precision-recall and cost). The proposed F-measure space makes it possible to visualize and compare classifiers' performance under different operating conditions more easily than in ROC and precision-recall spaces. It allows the optimal decision threshold of a soft classifier to be set and the best classifier among a group to be selected. It also makes it possible to empirically improve the performance of ensemble learning methods specialized for class imbalance, by selecting and combining the base classifiers using a modified version of the iterative Boolean combination algorithm that is optimized with the F-measure instead of the AUC. Experiments on a real-world dataset for video face recognition show the advantages of evaluating and comparing classifiers in the F-measure space versus the ROC, precision-recall, and cost spaces. In addition, the F-measure performance of the Bagging ensemble method is shown to improve considerably when the modified iterative Boolean combination algorithm is used.
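
    A minimal sketch of the kind of curve the space is built from: F1 as a function of the decision threshold at several imbalance levels, simulated here by reweighting the negative class (the paper's space is richer, with a preference parameter and classifier combination).

        import numpy as np

        def f1_at(scores_pos, scores_neg, thr, neg_weight=1.0):
            tp = np.sum(scores_pos >= thr)
            fn = np.sum(scores_pos < thr)
            fp = neg_weight * np.sum(scores_neg >= thr)
            return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

        rng = np.random.RandomState(0)
        scores_pos = rng.normal(1.0, 1.0, 200)         # classifier scores for positives
        scores_neg = rng.normal(-1.0, 1.0, 200)        # ... and for negatives
        thresholds = np.linspace(-3, 3, 61)

        for skew in (1, 5, 20):                        # negatives per positive
            curve = [f1_at(scores_pos, scores_neg, t, neg_weight=skew) for t in thresholds]
            print(f"skew 1:{skew}  best F1 = {max(curve):.3f}")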

    Updated: 2020-01-04
  • Locality-constrained affine subspace coding for image classification and retrieval
    Pattern Recogn. (IF 5.898) Pub Date : 2019-12-24
    Bingbing Zhang; Qilong Wang; Xiaoxiao Lu; Fasheng Wang; Peihua Li

    Feature coding is a key component of the bag of visual words (BoVW) model, which is designed to improve image classification and retrieval performance. In the feature coding process, each feature of an image is nonlinearly mapped via a dictionary of visual words to form a high-dimensional sparse vector. Inspired by the well-known locality-constrained linear coding (LLC), we present a locality-constrained affine subspace coding (LASC) method to address the limitation whereby LLC fails to consider the local geometric structure around visual words. LASC is distinguished from all the other coding methods since it constructs a dictionary consisting of an ensemble of affine subspaces. As such, the local geometric structure of a manifold is explicitly modeled by such a dictionary. In the process of coding, each feature is linearly decomposed and weighted to form the first-order LASC vector with respect to its top-k neighboring subspaces. To further boost performance, we propose the second-order LASC vector based on information geometry. We use the proposed coding method to perform both image classification and image retrieval tasks and the experimental results show that the method achieves superior or competitive performance in comparison to state-of-the-art methods.
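
    The first-order coding step might be sketched as below, under simplifying assumptions (a fixed dictionary of affine subspaces and ad-hoc proximity weights; the paper learns the dictionary and adds a second-order vector based on information geometry).

        import numpy as np

        def lasc_first_order(x, means, bases, k=2):
            # residual distance of x to each affine subspace (mean + orthonormal basis)
            resid = []
            for m, B in zip(means, bases):
                r = (x - m) - B @ (B.T @ (x - m))
                resid.append(np.linalg.norm(r))
            resid = np.array(resid)
            nearest = np.argsort(resid)[:k]                      # top-k neighbouring subspaces
            w = np.exp(-resid[nearest]); w /= w.sum()            # hypothetical proximity weights
            code = [np.zeros(B.shape[1]) for B in bases]
            for j, i in enumerate(nearest):
                code[i] = w[j] * (bases[i].T @ (x - means[i]))   # weighted local coordinates
            return np.concatenate(code)

        rng = np.random.RandomState(0)
        means = [rng.randn(64) for _ in range(8)]
        bases = [np.linalg.qr(rng.randn(64, 4))[0] for _ in range(8)]
        v = lasc_first_order(rng.randn(64), means, bases)        # length 8 * 4 = 32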

    Updated: 2020-01-04
  • Writer-aware CNN for parsimonious HMM-based offline handwritten Chinese text recognition
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-14
    Zi-Rui Wang; Jun Du; Jia-Ming Wang

    Recently, the hybrid convolutional neural network hidden Markov model (CNN-HMM) has been introduced for offline handwritten Chinese text recognition (HCTR) and has achieved state-of-the-art performance. However, modeling each character in the large Chinese vocabulary with a uniform, fixed number of hidden states incurs high memory and computational costs and makes the tens of thousands of HMM state classes easily confusable. Another key issue of CNN-HMM for HCTR is the diversity of writing styles, which leads to model strain and a significant performance decline for specific writers. To address these issues, we propose a writer-aware CNN based on a parsimonious HMM (WCNN-PHMM). First, the PHMM is designed using a data-driven state-tying algorithm to greatly reduce the total number of HMM states; this not only yields a compact CNN through state sharing of the same or similar radicals among different Chinese characters but also improves recognition accuracy due to more accurate modeling of the tied states and lower confusion among them. Second, the WCNN integrates each convolutional layer with an adaptive layer fed by a writer-dependent vector, namely the writer code, to capture writer-specific variability that is irrelevant to recognition and thereby improve performance. The parameters of the writer-adaptive layers are jointly optimized with the other network parameters in the training stage, while a multiple-pass decoding strategy is adopted to learn the writer code and generate recognition results. Validated on the ICDAR 2013 competition set of the CASIA-HWDB database, the more compact WCNN-PHMM with a 7360-class vocabulary achieves a relative character error rate (CER) reduction of 16.6% over the conventional CNN-HMM without language modeling. By adopting a powerful hybrid language model (an N-gram language model combined with a recurrent neural network language model), the CER of WCNN-PHMM is reduced to 3.17%. Moreover, the state-tying results of the PHMM explicitly show the information sharing among similar characters and the reduced confusion among tied state classes. Finally, we visualize the learned writer codes and demonstrate their strong relationship with the writing styles of different writers. To the best of our knowledge, WCNN-PHMM yields the best results on the ICDAR 2013 competition set, demonstrating its power when the size of the character vocabulary is enlarged.
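
    One plausible form of the writer-adaptive layer is sketched below in PyTorch (an assumption for illustration; the exact parameterization in WCNN-PHMM may differ): the writer code is mapped to a channel-wise bias that is added to the convolutional feature map.

        import torch
        import torch.nn as nn

        class WriterAdaptiveConv(nn.Module):
            def __init__(self, in_ch, out_ch, code_dim):
                super().__init__()
                self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
                self.adapt = nn.Linear(code_dim, out_ch)     # writer code -> channel bias

            def forward(self, x, writer_code):
                h = torch.relu(self.conv(x))
                bias = self.adapt(writer_code)               # (batch, out_ch)
                return h + bias[:, :, None, None]            # broadcast over height and width

        layer = WriterAdaptiveConv(1, 32, code_dim=50)
        feats = layer(torch.randn(4, 1, 64, 64), torch.randn(4, 50))   # (4, 32, 64, 64)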

    Updated: 2020-01-04
  • Automatic processing of Historical Arabic Documents: A comprehensive Survey
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-29
    Mohamed Ibn Khedher; Houda Jmila; Mounim A. El-Yacoubi

    Nowadays, there is a huge amount of Historical Arabic Documents (HAD) in national libraries and archives around the world. Analyzing this type of data manually is a difficult and costly task, so an automatic process is required to exploit these documents more rapidly. Processing historical documents is a recent research subject that has seen remarkable growth in recent years, and processing Historical Arabic Documents is particularly challenging: first, because of the complicated nature of the Arabic script compared with other scripts, and second, because the documents are ancient. This paper focuses on this difficult problem and provides a comprehensive survey of existing research work. First, we describe in detail the challenges that make the automatic processing of Historical Arabic Documents a difficult task. Second, we classify this task into four applications of automatic HAD processing: i) analyzing the document to extract the main text, ii) identifying the writer of the document, iii) recognizing words or parts of the document against a reference dataset, and iv) retrieving and extracting specific data from the document. For each application, existing approaches are surveyed and qualitatively described. Finally, we focus on available datasets and describe how they can be used in each application.

    Updated: 2020-01-04
  • Human activity recognition from UAV-captured video sequences
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-29
    Hazar Mliki; Fatma Bouhlel; Mohamed Hammami

    This research paper introduces a new approach for human activity recognition from UAV-captured video sequences. The proposed approach involves two phases: an offline phase and an inference phase. A scene stabilization step is performed alongside these two phases. The offline phase aims to generate a human/non-human model as well as a human activity model using a convolutional neural network. The inference phase makes use of the generated models to detect humans and recognize their activities. Our main contribution lies in adapting convolutional neural networks, normally dedicated to the classification task, to detect humans. In addition, the classification of human activities is carried out according to two scenarios: instantaneous classification of individual video frames and classification of entire video sequences. Through an experimental evaluation of the proposed human detection and human activity classification methods on the UCF-ARG dataset, we validate these contributions and show the performance of our methods compared with existing ones.

    Updated: 2020-01-04
  • Dynamic imposter based online instance matching for person search
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-29
    Ju Dai; Pingping Zhang; Huchuan Lu; Hongyu Wang

    Person search aims to locate the target person matching a given query within a list of unconstrained whole images. It is a challenging task due to the unavailability of pedestrian bounding boxes, the limited samples for each labeled identity, and the large number of unlabeled persons in existing datasets. To address these issues, we propose a novel end-to-end learning framework for person search. The proposed framework handles pedestrian detection and person re-identification concurrently. To achieve this co-learning goal and utilize the information of unlabeled persons, a novel yet extremely efficient Dynamic Imposter based Online Instance Matching (DI-OIM) loss is formulated. The DI-OIM loss is inspired by the observation that pedestrians appearing in the same image obviously have different identities. We therefore assign dynamic pseudo-labels to the unlabeled persons. The pseudo-labeled persons, together with the labeled persons, can then be used to learn powerful feature representations. Experiments on the CUHK-SYSU and PRW datasets demonstrate that our method outperforms other state-of-the-art algorithms. Moreover, it is more memory-efficient than existing methods.

    Updated: 2020-01-04
  • Non-rigid object tracking via deep multi-scale spatial-temporal discriminative saliency maps
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-25
    Pingping Zhang; Wei Liu; Dong Wang; Yinjie Lei; Hongyu Wang; Chunhua Shen; Huchuan Lu

    In this paper, we propose a novel effective non-rigid object tracking framework based on the spatial-temporal consistent saliency detection. In contrast to most existing trackers that utilize a bounding box to specify the tracked target, the proposed framework can extract accurate regions of the target as tracking outputs. It achieves a better description of the non-rigid objects and reduces the background pollution for the tracking model. Furthermore, our model has several unique characteristics. First, a tailored fully convolutional neural network (TFCN) is developed to model the local saliency prior for a given image region, which not only provides the pixel-wise outputs but also integrates the semantic information. Second, a novel multi-scale multi-region mechanism is proposed to generate local saliency maps that effectively consider visual perceptions with different spatial layouts and scale variations. Subsequently, the local saliency maps are fused via a weighted entropy method, resulting in a discriminative saliency map. Finally, we present a non-rigid object tracking algorithm based on the predicted saliency maps. By utilizing a spatial-temporal consistent saliency map (STCSM), we conduct the target-background classification and use an online fine-tuning scheme for model updating. Extensive experiments demonstrate that the proposed algorithm achieves competitive performance in both saliency detection and visual tracking, especially outperforming other related trackers on the non-rigid object tracking datasets. Source codes and compared results are released at https://github.com/Pchank/TFCNTracker.

    Updated: 2020-01-04
  • Deep reinforcement hashing with redundancy elimination for effective image retrieval
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-20
    Juexu Yang; Yuejie Zhang; Rui Feng; Tao Zhang; Weiguo Fan

    Hashing is one of the most promising techniques in approximate nearest neighbor search due to its time efficiency and low cost in memory. Recently, with the help of deep learning, deep supervised hashing can perform representation learning and compact hash code learning jointly in an end-to-end style, and obtains better retrieval accuracy compared to non-deep methods. However, most deep hashing methods are trained with a pair-wise loss or triplet loss in a mini-batch style, which makes them inefficient at data sampling and cannot preserve the global similarity information. Besides that, many existing methods generate hash codes with redundant or even harmful bits, which is a waste of space and may lower the retrieval accuracy. In this paper, we propose a novel deep reinforcement hashing model with redundancy elimination called Deep Reinforcement De-Redundancy Hashing (DRDH), which can fully exploit large-scale similarity information and eliminate redundant hash bits with deep reinforcement learning. DRDH conducts hash code inference in a block-wise style, and uses Deep Q Network (DQN) to eliminate redundant bits. Very promising results have been achieved on four public datasets, i.e., CIFAR-10, NUS-WIDE, MS-COCO, and Open-Images-V4, which demonstrate that our method can generate highly compact hash codes and yield better retrieval performance than those of state-of-the-art methods.

    Updated: 2020-01-04
  • AI-GAN: Asynchronous interactive generative adversarial network for single image rain removal
    Pattern Recogn. (IF 5.898) Pub Date : 2019-12-06
    Xin Jin; Zhibo Chen; Weiping Li

    Single image rain removal plays an important role in numerous multimedia applications. Existing algorithms usually tackle the deraining problem as one of signal removal, which leads to over-smoothing and unexpected artifacts in the de-rained images. This paper addresses the deraining problem from the completely different perspective of feature-wise disentanglement, and introduces interactions and constraints between two disentangled latent spaces. Specifically, we propose an Asynchronous Interactive Generative Adversarial Network (AI-GAN) that progressively disentangles a rainy image into background and rain spaces at the feature level through a two-branch structure. Each branch employs a two-stage synthesis strategy and interacts asynchronously with the other by exchanging feed-forward information and sharing feedback gradients, achieving complementary adversarial optimization. Here, 'adversarial' refers not only to the interplay between the generator and the discriminator, but also to the fact that the two generators are entangled and interact with each other during optimization. Extensive experimental results demonstrate that AI-GAN outperforms state-of-the-art deraining methods and benefits various typical multimedia applications such as image/video coding, action recognition, and person re-identification.

    Updated: 2020-01-04
  • Statistical bootstrap-based principal mode component analysis for dynamic background subtraction
    Pattern Recogn. (IF 5.898) Pub Date : 2019-12-16
    Benson S.Y. Lam; Amanda M.Y. Chu; H. Yan

    Background subtraction is needed to extract foreground information from a video sequence for further processing in many applications, such as surveillance tracking. However, due to the presence of a dynamic background and noise, extracting the foreground accurately from a video sequence remains challenging. A novel projection method, Principal Mode Component Analysis (PMCA), is proposed to capture the most repetitive patterns of a video sequence, which are one of the key characteristics of the video background. The patterns are captured by applying the bootstrapping method together with the statistical mode measure. The bootstrapping method can model the distribution of almost any statistic of the dynamic background and complicated noise, which differs from current methods that restrict the distribution to a closed-form function. We introduce a mathematical relaxation that formulates the statistical mode measure for continuous video data. A fast exhaustive search method is proposed to find the globally optimal solution of the PMCA; it adopts a simplification procedure that makes the optimization independent of the video size. The proposed method is computationally much more tractable than existing ones. We compare the proposed method with 10 different methods, including several state-of-the-art techniques, on 19 real-world video sequences from two popular datasets. Experimental results show that the proposed method performs best in 16 cases and second best in 2 cases.
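
    Only to illustrate the bootstrap-plus-mode ingredient (not the PMCA projection or its fast exhaustive search), the sketch below estimates a per-pixel background as the bootstrap-averaged mode of each pixel's history and thresholds the difference to obtain a foreground mask; the threshold is a hypothetical choice.

        import numpy as np

        def bootstrap_mode_background(frames, n_boot=20, bins=32, seed=0):
            # frames: (T, H, W) grayscale video stack with values in [0, 255]
            rng = np.random.RandomState(seed)
            T = frames.shape[0]
            modes = []
            for _ in range(n_boot):
                sample = frames[rng.randint(0, T, T)]             # bootstrap resample of frames
                hist = np.apply_along_axis(
                    lambda v: np.histogram(v, bins=bins, range=(0, 255))[0], 0, sample)
                modes.append((np.argmax(hist, axis=0) + 0.5) * 255.0 / bins)
            return np.mean(modes, axis=0)                         # (H, W) background estimate

        rng = np.random.RandomState(1)
        video = np.clip(100 + 10 * rng.randn(60, 24, 32), 0, 255)
        background = bootstrap_mode_background(video)
        foreground_mask = np.abs(video[-1] - background) > 30     # hypothetical threshold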

    Updated: 2020-01-04
  • Deep cascaded cross-modal correlation learning for fine-grained sketch-based image retrieval
    Pattern Recogn. (IF 5.898) Pub Date : 2019-12-11
    Yanfei Wang; Fei Huang; Yuejie Zhang; Rui Feng; Tao Zhang; Weiguo Fan

    Fine-grained Sketch-based Image Retrieval (FG-SBIR), which utilizes hand-drawn sketches to search the target object images, has recently drawn much attention. It is a challenging task because sketches and images belong to different modalities and sketches are highly abstract and ambiguous. Existing solutions to this problem either focus on visual comparisons between sketches and images and ignore the multimodal characteristics of annotated images, or treat the retrieval as a one-time process. In this paper, we formulate FG-SBIR as a coarse-to-fine process, and propose a Deep Cascaded Cross-modal Ranking Model (DCCRM) that can exploit all the beneficial multimodal information in sketches and annotated images and improve both the retrieval efficiency and the top-K ranked effectiveness. Our goal concentrates on constructing deep representations for sketches, images, and descriptions, and learning the optimized deep correlations across such different domains. Thus for a given query sketch, its relevant images with fine-grained instance-level similarities in a specific category can be returned, and the strict requirement of the instance-level retrieval for FG-SBIR is satisfied. Very positive results have been obtained in our experiments by using a large quantity of public data.

    Updated: 2020-01-04
  • No-reference stereoscopic image quality assessment using a multi-task CNN and registered distortion representation
    Pattern Recogn. (IF 5.898) Pub Date : 2019-12-16
    Yiqing Shi; Wenzhong Guo; Yuzhen Niu; Jiamei Zhan

    Scene discrepancy between the left and right views presents more challenges for image quality assessment (IQA) of stereoscopic images than of monocular ones. Existing no-reference stereoscopic IQA (NR-SIQA) metrics cannot achieve good performance on asymmetrically distorted stereoscopic images. In this paper, we propose an NR-SIQA index that first addresses scene discrepancy by means of image registration. It then uses a registered distortion representation, based on the left and registered right views, to represent the distortion in the stereoscopic image. Because different distortion types influence image quality differently, a multi-task convolutional neural network (CNN) is employed to learn image quality prediction and distortion-type identification simultaneously. We first design a one-column multi-task CNN model that learns from the registered distortion representation, and then extend it to a three-column model that also learns from the left and right views. Our experimental results validate the effectiveness of the proposed registered distortion representation and multi-task CNN architecture. The proposed one- and three-column models outperform state-of-the-art NR-SIQA metrics, especially on asymmetrically distorted stereoscopic images.
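
    A minimal multi-task head in the spirit described is sketched below in PyTorch (shared trunk, a quality-regression output and a distortion-type output); the registered-distortion input and the actual one-/three-column architectures are not reproduced here.

        import torch
        import torch.nn as nn

        class MultiTaskIQA(nn.Module):
            def __init__(self, n_distortions=5):
                super().__init__()
                self.trunk = nn.Sequential(
                    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten())
                self.quality = nn.Linear(64, 1)                   # regression head
                self.distortion = nn.Linear(64, n_distortions)    # classification head

            def forward(self, x):
                h = self.trunk(x)
                return self.quality(h).squeeze(1), self.distortion(h)

        model = MultiTaskIQA()
        q, d = model(torch.randn(8, 3, 64, 64))
        loss = nn.functional.mse_loss(q, torch.rand(8)) + \
               nn.functional.cross_entropy(d, torch.randint(0, 5, (8,)))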

    Updated: 2020-01-04
  • No-reference mesh visual quality assessment via ensemble of convolutional neural networks and compact multi-linear pooling
    Pattern Recogn. (IF 5.898) Pub Date : 2019-12-25
    Ilyass Abouelaziz; Aladine Chetouani; Mohammed El Hassouni; Longin Jan Latecki; Hocine Cherifi

    Blind, or no-reference, quality evaluation is a challenging issue because it must be performed without access to the original content. In this work, we propose a deep learning based method for no-reference mesh visual quality assessment. For a given 3D model, we first compute its mesh saliency. We then extract views from the 3D mesh and the corresponding mesh saliency, split the views into small patches, and filter the patches using a saliency threshold so that only the salient patches are used as input data. Next, three pre-trained deep convolutional neural networks are employed for feature learning: VGG, AlexNet, and ResNet. Each network is fine-tuned and produces a feature vector. Compact Multi-linear Pooling (CMP) is then used to fuse the retrieved vectors into a global feature representation. Finally, fully connected layers followed by a regression module estimate the quality score. Extensive experiments on four mesh quality datasets and comparisons with existing methods demonstrate the effectiveness of our method in terms of correlation with subjective scores.

    Updated: 2020-01-04
  • MDFN: Multi-scale deep feature learning network for object detection
    Pattern Recogn. (IF 5.898) Pub Date : 2019-12-14
    Wenchi Ma; Yuanwei Wu; Feng Cen; Guanghui Wang

    This paper proposes an innovative object detector that leverages deep features learned in high-level layers. Compared with features produced in earlier layers, deep features are better at expressing semantic and contextual information. The proposed deep feature learning scheme shifts the focus from concrete features with details to abstract ones with semantic information. It considers not only individual objects and local contexts but also their relationships by building a multi-scale deep feature learning network (MDFN). MDFN efficiently detects objects by introducing information square and cubic inception modules into the high-level layers, employing parameter sharing to enhance computational efficiency. MDFN provides a multi-scale object detector by integrating multi-box, multi-scale and multi-level technologies. Although MDFN employs a simple framework with a relatively small base network (VGG-16), it achieves better or competitive detection results compared with detectors built on macro hierarchical structures that are either very deep or very wide for stronger feature extraction. The proposed technique is evaluated extensively on the KITTI, PASCAL VOC, and COCO datasets, achieving the best results on KITTI and leading performance on PASCAL VOC and COCO. This study reveals that deep features provide prominent semantic information and a variety of contextual contents, which contribute to the superior performance of MDFN in detecting small or occluded objects. In addition, the MDFN model is computationally efficient, making a good trade-off between accuracy and speed.

    Updated: 2020-01-04
  • Encoding features robust to unseen modes of variation with attentive long short-term memory
    Pattern Recogn. (IF 5.898) Pub Date : 2019-12-18
    Wissam J. Baddar; Yong Man Ro

    Long short-term memory (LSTM) is a type of recurrent neural network that is efficient for encoding spatio-temporal features in dynamic sequences. Recent work has shown that the LSTM retains information related to the mode of variation in the input dynamic sequence, which reduces the discriminability of the encoded features. To encode features robust to unseen modes of variation, we devise an LSTM adaptation named attentive mode variational LSTM. The proposed attentive mode variational LSTM utilizes the concept of attention to separate the input dynamic sequence into two parts: (1) task-relevant dynamic sequence features and (2) task-irrelevant static sequence features. The task-relevant dynamic features are used to encode and emphasize the dynamics in the input sequence, while the task-irrelevant static sequence features are utilized to encode the mode of variation in the input dynamic sequence. Finally, the attentive mode variational LSTM suppresses the effect of mode variation with a shared output gate, resulting in spatio-temporal features that are robust to unseen variations. The effectiveness of the proposed attentive mode variational LSTM has been verified on two tasks: facial expression recognition and human action recognition. Comprehensive and extensive experiments verify that the proposed method encodes spatio-temporal features robust to variations unseen during training.

    Updated: 2020-01-04
  • Semi-supervised cross-modal image generation with generative adversarial networks
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-12
    Dan Li; Changde Du; Huiguang He

    Cross-modal image generation is an important aspect of the multi-modal learning. Existing methods usually use the semantic feature to reduce the modality gap. Although these methods have achieved notable progress, there are still some limitations: (1) they usually use single modality information to learn the semantic feature; (2) they require the training data to be paired. To overcome these problems, we propose a novel semi-supervised cross-modal image generation method, which consists of two semantic networks and one image generation network. Specifically, in the semantic networks, we use image modality to assist non-image modality for semantic feature learning by using a deep mutual learning strategy. In the image generation network, we introduce an additional discriminator to reduce the image reconstruction loss. By leveraging large amounts of unpaired data, our method can be trained in a semi-supervised manner. Extensive experiments demonstrate the effectiveness of the proposed method.

    Updated: 2020-01-04
  • Active emulation of computer codes with Gaussian processes – Application to remote sensing
    Pattern Recogn. (IF 5.898) Pub Date : 2019-11-13
    Daniel Heestermans Svendsen; Luca Martino; Gustau Camps-Valls

    Many fields of science and engineering rely on running simulations with complex and computationally expensive models to understand the processes involved in the system of interest. Nevertheless, the high cost involved hampers reliable and exhaustive simulations, and such codes very often incorporate heuristics that, ironically, make them less tractable and transparent. This paper introduces an active learning methodology for adaptively constructing surrogate models, i.e. emulators, of such costly computer codes in a multi-output setting. The proposed technique is sequential and adaptive, and is based on the optimization of a suitable acquisition function. It aims to achieve accurate approximations, model tractability, and compact yet expressive simulated datasets. To this end, the proposed Active Multi-Output Gaussian Process Emulator (AMOGAPE) combines the predictive capacity of Gaussian Processes (GPs) with the design of an acquisition function that favors sampling in low-density and fluctuating regions of the approximation functions. Comparing different acquisition functions, we illustrate the promising performance of the method for constructing emulators on toy examples as well as on a widely used remote sensing transfer code.
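
    The sequential emulation loop can be sketched as follows, using only the predictive standard deviation as the acquisition function (AMOGAPE's acquisition also targets fluctuating regions and handles multiple outputs); the toy simulator below is a stand-in for a real transfer code.

        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF

        def expensive_code(x):                         # stand-in for the costly simulator
            return np.sin(3 * x) + 0.3 * x ** 2

        candidates = np.linspace(-3, 3, 200)[:, None]
        X = np.array([[-2.0], [0.0], [2.0]])           # small initial design
        y = expensive_code(X).ravel()

        for _ in range(10):
            gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0),
                                          normalize_y=True).fit(X, y)
            _, std = gp.predict(candidates, return_std=True)
            x_new = candidates[np.argmax(std)]         # acquisition: most uncertain point
            X = np.vstack([X, x_new])
            y = np.append(y, expensive_code(x_new)[0])

        print(f"emulator built from {len(X)} runs of the code")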

    Updated: 2020-01-04
Contents have been reproduced by permission of the publishers.