-
ADFNet: accumulated decoder features for real-time semantic segmentation IET Comput. Vis. (IF 1.516) Pub Date : 2020-12-15 Hyunguk Choi; Hoyeon Ahn; Joonmo Kim; Moongu Jeon
Semantic segmentation is one of the important technologies in autonomous driving, and ensuring its real-time and high performance is of utmost importance for the safety of pedestrians and passengers. To improve its performance using deep neural networks that operate in real-time, the authors propose a simple and efficient method called ADFNet using accumulated decoder features. ADFNet operates by only
-
Partial disentanglement of hierarchical variational auto-encoder for texture synthesis IET Comput. Vis. (IF 1.516) Pub Date : 2020-12-15 Marek Jakab; Lukas Hudec; Wanda Benesova
Multiple research studies have recently demonstrated deep networks can generate realistic-looking textures and stylised images from a single texture example. However, they suffer from some drawbacks. Generative adversarial networks are in general difficult to train. Multiple feature variations, encoded in their latent representation, require a priori information to generate images with specific features
-
GLStyleNet: exquisite style transfer combining global and local pyramid features IET Comput. Vis. (IF 1.516) Pub Date : 2020-12-15 Zhizhong Wang; Lei Zhao; Sihuan Lin; Qihang Mo; Huiming Zhang; Wei Xing; Dongming Lu
Recent studies using deep neural networks have shown remarkable success in style transfer, especially for artistic and photo-realistic images. However, these methods cannot solve more sophisticated problems. The approaches using global statistics fail to capture small, intricate textures and maintain correct texture scales of the artworks, and the others based on local patches are defective on global
-
Multi-mode neural network for human action recognition IET Comput. Vis. (IF 1.516) Pub Date : 2020-12-15 Haohua Zhao; Weichen Xue; Xiaobo Li; Zhangxuan Gu; Li Niu; Liqing Zhang
Video data are of two different intrinsic modes, in-frame and temporal. It is beneficial to incorporate static in-frame features to acquire dynamic features for video applications. However, some existing methods, such as recurrent neural networks, do not perform well, and others, such as 3D convolutional neural networks (CNNs), are both memory-consuming and time-consuming. This study proposes
-
Detecting dense text in natural images IET Comput. Vis. (IF 1.516) Pub Date : 2020-12-15 Dianzhuan Jiang; Shengsheng Zhang; Yaping Huang; Qi Zou; Xingyuan Zhang; Mengyang Pu; Junbo Liu
Most existing text detection methods are mainly motivated by deep learning-based object detection approaches, which may result in serious overlapping between detected text lines, especially in dense text scenarios. This is a problem because text boxes, unlike general objects in natural scenes, do not commonly overlap. Moreover, text detection requires higher localisation accuracy than object detection
-
Robust locality preserving projections using angle-based adaptive weight method IET Comput. Vis. (IF 1.516) Pub Date : 2020-12-15 Yunlong Gao; Shuxin Zhong; Kangli Hu; Jinyan Pan
Locality preserving projections (LPP) method is a classical manifold learning method for dimensionality reduction. However, LPP is sensitive to outliers since squared L2-norm may exaggerate the distance of outliers. Besides, the normalisation constraint of LPP may impair its robustness during embedding. Motivated by this observation, the authors propose a novel robust LPP using angle-based adaptive
-
Converting video classification problem to image classification with global descriptors and pre-trained network IET Comput. Vis. (IF 1.516) Pub Date : 2020-12-15 Saeedeh Zebhi; SMT Al-Modarresi; Vahid Abootalebi
Motion history image (MHI) is a spatio-temporal template in which temporal motion information is collapsed into a single image where intensity is a function of recency of motion. It also contains spatial information. Energy image (EI) based on the magnitude of optical flow is a temporal template that shows only temporal information of motion. Each video can be described in these templates. So, four
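As an illustration of the MHI idea mentioned in this abstract, the following is a minimal sketch of the commonly used update rule: pixels that moved in the current frame are set to a maximum value tau, and all other pixels decay by one step. It is not taken from the paper; the function names, the frame-difference threshold, and the toy frames are assumptions made for this example.

```python
# Minimal MHI sketch (hypothetical helper names; not the paper's implementation).
# Frames are grey-level images stored as lists of rows of ints.

def update_mhi(mhi, motion_mask, tau):
    """One update step: moving pixels become tau, others decay by 1 (never below 0)."""
    return [
        [tau if moving else max(0, value - 1)
         for value, moving in zip(mhi_row, mask_row)]
        for mhi_row, mask_row in zip(mhi, motion_mask)
    ]

def motion_history_image(frames, threshold, tau):
    """Collapse a frame sequence into one image whose intensity encodes recency of motion."""
    height, width = len(frames[0]), len(frames[0][0])
    mhi = [[0] * width for _ in range(height)]
    for prev, curr in zip(frames, frames[1:]):
        # Assumed motion detector: simple thresholded frame difference.
        mask = [
            [abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev, curr)
        ]
        mhi = update_mhi(mhi, mask, tau)
    return mhi

# Toy sequence: one bright pixel moving left to right across a 1x4 image.
frames = [
    [[255, 0, 0, 0]],
    [[0, 255, 0, 0]],
    [[0, 0, 255, 0]],
    [[0, 0, 0, 255]],
]
mhi = motion_history_image(frames, threshold=50, tau=3)
print(mhi)  # [[1, 2, 3, 3]]
```

In the toy output, larger values mark pixels where motion occurred more recently, which is the "intensity as a function of recency" property described above.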
-
Referring expression comprehension model with matching detection and linguistic feedback IET Comput. Vis. (IF 1.516) Pub Date : 2020-12-15 Jianming Wang; Enjie Cui; Kunliang Liu; Yukuan Sun; Jiayu Liang; Chunmiao Yuan; Xiaojie Duan; Guanghao Jin; Tae-Sun Chung
The task of referring expression comprehension (REC) is to localise an image region of a specific object described by a natural language expression, and all existing REC methods assume that the object described by the referring expression must be located in the given image. However, this assumption is not correct in some real applications. For example, a visually impaired user might tell his robot
-
Combination of temporal-channels correlation information and bilinear feature for action recognition IET Comput. Vis. (IF 1.516) Pub Date : 2020-12-15 Jiahui Cai; Jianguo Hu; Shiren Li; Jialing Lin; Jun Wang
In this study, the authors focus on improving the spatio–temporal representation ability of three-dimensional (3D) convolutional neural networks (CNNs) in the video domain. They observe two unfavourable issues: (i) the convolutional filters are dedicated only to learning local representations along input channels. Also, they treat channel-wise features equally, without emphasising the important features;
-
Domain-invariant adversarial learning with conditional distribution alignment for unsupervised domain adaptation IET Comput. Vis. (IF 1.516) Pub Date : 2020-12-15 Xingmei Wang; Boxuan Sun; Hongbin Dong
Unsupervised domain adaptation aims to reduce the divergence between the source domain and the target domain. The final objective is to learn domain-invariant features from both domains that minimise the expected error on the target domain. The divergence between domains, which is also called domain shift, is mainly between the distributions of the domains' samples. Additionally, the label shift is also
-
Creative and diverse artwork generation using adversarial networks IET Comput. Vis. (IF 1.516) Pub Date : 2020-12-15 Haibo Chen; Lei Zhao; Lihong Qiu; Zhizhong Wang; Huiming Zhang; Wei Xing; Dongming Lu
Existing style transfer methods have achieved great success in artwork generation by transferring artistic styles onto everyday photographs while keeping their contents unchanged. Despite this success, these methods have one inherent limitation: they cannot produce newly created image contents, lacking creativity and flexibility. On the other hand, generative adversarial networks (GANs) can synthesise
-
Diversified Fisher kernel: encoding discrimination in Fisher features to compete deep neural models for visual classification task IET Comput. Vis. (IF 1.516) Pub Date : 2020-12-15 Sarah Ahmed; Tayyaba Azim
Fisher kernels derived from stochastic probabilistic models such as restricted and deep Boltzmann machines have shown competitive visual classification results in comparison to widely popular deep discriminative models. This genre of Fisher kernels bridges the gap between shallow and deep learning paradigm by inducing the characteristics of deep architecture into Fisher kernel, further deployed for
-
Moving shadow detection via binocular vision and colour clustering IET Comput. Vis. (IF 1.516) Pub Date : 2020-12-15 Lei Lu; Ming Xu; Jeremy S. Smith; Yuyao Yan
A pedestrian segmentation algorithm in the presence of cast shadows is presented in this study. The novelty of this algorithm lies in the fusion of multi-view and multi-plane homographic projections of foregrounds and the use of the fused data to guide colour clustering. This brings about an advantage over the existing binocular algorithms in that it can remove cast shadows while keeping pedestrians’
-
Human-like evaluation method for object motion detection algorithms IET Comput. Vis. (IF 1.516) Pub Date : 2020-12-15 Abimael Guzman-Pando; Mario Ignacio Chacon-Murguia; Lucia B. Chacon-Diaz
This study proposes a new method to evaluate the performance of algorithms for moving object detection (MODA) in video sequences. The proposed method is based on human performance metric intervals, instead of ideal metric values (0 or 1) which are commonly used in the literature. These intervals are proposed to establish a more reliable evaluation and comparison, and to identify areas of improvement
-
YOLOpeds: efficient real-time single-shot pedestrian detection for smart camera applications IET Comput. Vis. (IF 1.516) Pub Date : 2020-11-16 Christos Kyrkou
Deep-learning-based pedestrian detectors can enhance the capabilities of smart camera systems in a wide spectrum of machine vision applications including video surveillance, autonomous driving, robots and drones, smart factory, and health monitoring. However, such complex paradigms do not scale easily and are not traditionally implemented in resource-constrained smart cameras for on-device processing
-
Modelling large scale camera networks for identification and tracking: an abstract framework IET Comput. Vis. (IF 1.516) Pub Date : 2020-11-16 Lakshmi Mohan; Vivek Menon
In this study, the authors discuss a novel approach for multi-camera-based unobtrusive identification and tracking of occupants in wide-area, multi-building scenarios. Considering the scalability issues in adopting a centralised approach to monitor wide-area scenarios, they proposed a distributed approach to occupant identification and tracking. The key technical idea underlying their approach is to
-
Harnessing feedback region proposals for multi-object tracking IET Comput. Vis. (IF 1.516) Pub Date : 2020-11-16 Aswathy Prasanna Kumar; Deepak Mishra
In the tracking-by-detection approach of online multiple object tracking (MOT), a major challenge is how to associate object detections on the new video frame with previously tracked objects. Two important aspects that directly influence the performance of MOT are quality of detection and accuracy in data association. The authors propose an efficient and unified MOT framework for improved object detection
-
Dual attention module and multi-label based fully convolutional network for crowd counting IET Comput. Vis. (IF 1.516) Pub Date : 2020-11-16 Suyu Wang; Bin Yang; Bo Liu; Guanghui Zheng
High-density crowd counting in natural scenes is an extremely difficult and challenging research subject in computer vision. Although algorithms based on convolutional neural networks have achieved significantly better results than traditional algorithms, most of them tend to focus on the local features of images and find it difficult to obtain the rich global contextual dependencies. To solve this
-
Drone swarm patrolling with uneven coverage requirements IET Comput. Vis. (IF 1.516) Pub Date : 2020-11-16 Claudio Piciarelli; Gian Luca Foresti
Swarms of drones are increasingly used in many practical scenarios, such as surveillance, environmental monitoring, and search and rescue in hard-to-access areas. While a single drone can be guided by a human operator, the deployment of a swarm of multiple drones requires proper algorithms for automatic task-oriented control. In this study, the authors focus on visual coverage optimisation
-
Decentralised indoor smart camera mapping and hierarchical navigation for autonomous ground vehicles IET Comput. Vis. (IF 1.516) Pub Date : 2020-11-16 Taylor J.L. Whitaker; Samantha-Jo Cunningham; Christophe Bobda
In this work, the authors propose a novel decentralised coordination scheme for autonomous ground vehicles to enable map building and path planning with a network of smart overhead cameras. Decentralised indoor smart camera mapping and hierarchical navigation supports the automatic generation of waypoint graphs for each camera in an environment and allows path planning through the environment across
-
Generic wavelet-based image decomposition and reconstruction framework for multi-modal data analysis in smart camera applications IET Comput. Vis. (IF 1.516) Pub Date : 2020-11-16 Yijun Yan; Yiguang Liu; Mingqiang Yang; Huimin Zhao; Yanmei Chai; Jinchang Ren
Effective acquisition, analysis and reconstruction of multi-modal data such as colour and multi-/hyper-spectral imagery is crucial in smart camera applications, where wavelet-based coding and compression of images are highly demanded. Many existing discrete wavelet filtering banks have fixed coefficients hence their performance is highly dependent on the signal/image being processed. To tackle this
-
Learning across views for stereo image completion IET Comput. Vis. (IF 1.516) Pub Date : 2020-11-16 Wei Ma; Mana Zheng; Wenguang Ma; Shibiao Xu; Xiaopeng Zhang
Stereo image completion (SIC) is to fill holes existing in a pair of stereo images. SIC is more complicated than single image repairing, which needs to complete the pair of images while keeping their stereoscopic consistency. In recent years, deep learning has been introduced into single image repairing but seldom used for SIC. The authors present a novel deep learning-based approach for SIC. In their
-
Catadioptric hyperspectral imaging, an unmixing approach IET Comput. Vis. (IF 1.516) Pub Date : 2020-11-16 Didem Ozisik Baskurt; Yalin Bastanlar; Yasemin Yardimci Cetin
Hyperspectral imaging systems provide dense spectral information on the scene under investigation by collecting data from a high number of contiguous bands of the electromagnetic spectrum. The low spatial resolutions of these sensors frequently give rise to the mixing problem in remote sensing applications. Several unmixing approaches are developed in order to handle the challenging mixing problem
-
Stroke controllable style transfer based on dilated convolutions IET Comput. Vis. (IF 1.516) Pub Date : 2020-11-16 Zhaopan Xu; Juan Zhang; Yu Zhang; Mingquan Zhou; Kang Li; Shengling Geng; Xiaojuan Zhang
Transferring a photo to a stylised image with beautiful texture has become one of the most popular topics in computer vision and the application of image processing. Controlling the stroke size of the texture is one of the challenging problems in this task. Recent representative methods for such problem introduce a pyramid model to regulate receptive fields in the network. Meanwhile, dilated convolutions
-
Deep emotion recognition based on audio–visual correlation IET Comput. Vis. (IF 1.516) Pub Date : 2020-11-16 Noushin Hajarolasvadi; Hasan Demirel
Human emotion recognition has been studied by means of unimodal channels over the last decade. However, efforts continue to answer tempting questions about how variant modalities can complement each other. This study proposes a multimodal approach using three-dimensional (3D) convolutional neural networks (CNNs) to model human emotion through a modality-referenced system while investigating the solution
-
Algorithm using supervised subspace learning and non-local representation for pose variation recognition IET Comput. Vis. (IF 1.516) Pub Date : 2020-11-16 Mengmeng Liao; Changzhi Wang; Xiaodong Gu
Pose variation has been one of the challenges of face recognition. To solve this challenge, the authors propose a classification algorithm using supervised subspace learning and non-local representation (SSLNR). In SSLNR, they first propose a supervised subspace learning algorithm (SSLA). SSLA includes three different terms. The first term is the difference term, which can reduce the intra-class differences
-
Identification of crop diseases using improved convolutional neural networks IET Comput. Vis. (IF 1.516) Pub Date : 2020-11-16 Long Wang; Jun Sun; Xiaohong Wu; Jifeng Shen; Bing Lu; Wenjun Tan
Conventional AlexNet has the problems of slow training speed, single characteristic scale and low recognition accuracy. To solve these problems, a convolutional neural network identification model based on the Inception module and dilated convolution is proposed in this study. The Inception module combined with dilated convolution could extract disease characteristics at different scales and increase
-
Subgraph and object context-masked network for scene graph generation IET Comput. Vis. (IF 1.516) Pub Date : 2020-11-16 Zhenxing Zheng; Zhendong Li; Gaoyun An; Songhe Feng
Scene graph generation aims to recognise objects and their semantic relationships in an image and can help computers understand visual scenes. To improve relationship prediction, geometry information is essential and usually incorporated into relationship features. Existing methods use coordinates of objects to encode their spatial layout. However, in this way, they neglect the context of objects. In
-
Image stylisation: from predefined to personalised IET Comput. Vis. (IF 1.516) Pub Date : 2020-10-08 Ignacio Garcia-Dorado; Pascal Getreuer; Bartlomiej Wronski; Peyman Milanfar
The authors present a framework for interactive design of new image stylisations using a wide range of predefined filter blocks. Both novel and off-the-shelf image filtering and rendering techniques are extended and combined to allow the user to unleash their creativity to intuitively invent, modify, and tune new styles from a given set of filters. In parallel to this manual design, they propose a
-
Advances in colour transfer IET Comput. Vis. (IF 1.516) Pub Date : 2020-10-08 Francois Pitié
Colour grading is an essential step in movie post-production, which is done in the industry by experienced artists on expensive edit hardware and software suites. This paper presents a review of the advances made to automate this process. The review looks in particular at how the state-of-the-art in optimal transport and deep learning has advanced some of the fundamental problems of colour transfer
-
Motion-based frame interpolation for film and television effects IET Comput. Vis. (IF 1.516) Pub Date : 2020-10-08 Anil Kokaram; Davinder Singh; Simon Robinson; Damien Kelly; Bill Collis; Kim Libreri
Frame interpolation is the process of synthesising a new frame in-between existing frames in an image sequence. It has emerged as a key algorithmic module in motion picture effects. In the context of this special issue, this study provides a review of the technology used to create in-between frames and presents a Bayesian framework that generalises frame interpolation algorithms using the concept of
-
Deep quantised portrait matting IET Comput. Vis. (IF 1.516) Pub Date : 2020-10-08 Zhan Zhang; Yuehai Wang; Jianyi Yang
Portrait matting is of vital importance for many applications such as portrait editing, background replacement, ecommerce demonstration, and augmented reality. The portrait matt can be accessed by predicting the α value of the original picture. Previous deep matting methods usually adopt a segmentation network to tackle portrait matting tasks. However, these traditional methods will introduce unpleasant
-
Going beyond free viewpoint: creating animatable volumetric video of human performances IET Comput. Vis. (IF 1.516) Pub Date : 2020-10-08 Anna Hilsmann; Philipp Fechteler; Wieland Morgenstern; Wolfgang Paier; Ingo Feldmann; Oliver Schreer; Peter Eisert
An end-to-end pipeline for the creation of high-quality animatable volumetric video of human performances is presented. Going beyond the application of free-viewpoint video, the authors allow re-animation and alteration of an actor's performance through the enrichment of the captured data with semantics and animation properties. Hybrid geometry- and video-based animation methods are applied that allow
-
Interactive facial animation with deep neural networks IET Comput. Vis. (IF 1.516) Pub Date : 2020-10-08 Wolfgang Paier; Anna Hilsmann; Peter Eisert
Creating realistic animations of human faces is still a challenging task in computer graphics. While computer graphics (CG) models capture much variability in a small parameter vector, they usually do not meet the necessary visual quality. This is due to the fact that geometry-based animation often does not allow fine-grained deformations and fails in difficult areas (mouth, eyes) to produce realistic
-
Offline mobile diagnosis system for citrus pests and diseases using deep compression neural network IET Comput. Vis. (IF 1.516) Pub Date : 2020-10-08 Jie You; Joonwhoan Lee
This study presents an offline mobile diagnosis system for citrus pests and diseases by compression convolutional neural network. Recently, with the growth of labelled data, the deep neural network incites the revolutionary change with a quantum leap in various fields. Benefiting from the backpropagation method, the proper network structure can automatically extract high-level representations and find
-
Motion boundary emphasised optical flow method for human action recognition IET Comput. Vis. (IF 1.516) Pub Date : 2020-10-08 Cheng Peng; Haozhi Huang; Ah-Chung Tsoi; Sio-Long Lo; Yun Liu; Zi-yi Yang
This study proposes a three-stream model using two different types of deep convolutional neural networks (CNNs): (i) a spatial stream with a CNN on images; (ii) a ResNet (residual network) on optical flows; and, (iii) a ResNet on the concatenation of motion features. This model is applied to four datasets: (i) UCF Sports; (ii) Youtube Sports; (iii) SBU action interaction; and (iv) a subset of the UCF-1M
-
Accurate and fast single shot multibox detector IET Comput. Vis. (IF 1.516) Pub Date : 2020-10-08 Lie Guo; Dongxing Wang; Linhui Li; Jindun Feng
With the development of deep learning, the performance of object detection has made great progress. However, there are still some challenging problems, such as the detection accuracy of small objects and the efficiency of the detector. This study proposes an accurate and fast single shot multibox detector, which includes context comprehensive enhancement (CCE) module and feature enhancement module
-
Spatial–temporal representation for video re-identification via key images IET Comput. Vis. (IF 1.516) Pub Date : 2020-10-08 Wanru Song; Changhong Chen; Qingqing Zhao; Feng Liu
Video-based person re-identification aims to verify the pedestrian identity from image sequences. The sequences are captured by cameras located in different directions at different times. Existing studies have certain limitations in the case of occlusions and pose variations. To solve the aforementioned problems, this study proposes a new two-stage framework, from which the key-image-based fusion spatial–temporal
-
Beyond top-N accuracy indicator: a comprehensive evaluation indicator of CNN models in image classification IET Comput. Vis. (IF 1.516) Pub Date : 2020-10-08 Yuntao Liu; Yong Dou; Peng Qiao
Nowadays, a large number of deep convolutional neural network (CNN) models are applied to image classification tasks. However, the authors find that the most widely used evaluation indicator, the Top-N Accuracy indicator, cannot discriminate these models effectively. In this study, they propose a new indicator called Maximum-Spanning-Confusion-Tree indicator to solve this problem. The Maximum-Spa
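For readers unfamiliar with the baseline indicator named in this abstract, the following is a minimal sketch of how Top-N accuracy is usually computed: a sample counts as correct when its true label is among the N highest-scoring classes. It illustrates only this standard metric, not the indicator proposed in the paper; the function name and the toy scores are assumptions made for this example.

```python
# Minimal Top-N accuracy sketch (hypothetical function name and toy data).

def top_n_accuracy(scores, labels, n):
    """Fraction of samples whose true label is among the n highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda k: row[k], reverse=True)
        if label in ranked[:n]:
            hits += 1
    return hits / len(labels)

# Toy class scores for four samples over three classes.
scores = [
    [0.1, 0.7, 0.2],  # true class 1: ranked first
    [0.5, 0.3, 0.2],  # true class 1: ranked second
    [0.2, 0.3, 0.5],  # true class 0: ranked third
    [0.6, 0.3, 0.1],  # true class 0: ranked first
]
labels = [1, 1, 0, 0]

print(top_n_accuracy(scores, labels, 1))  # 0.5
print(top_n_accuracy(scores, labels, 2))  # 0.75
```

Because the metric only checks membership in the top N, two models with very different confusion patterns can receive the same score, which is consistent with the limitation the abstract points to.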
-
Skeleton-based attention-aware spatial–temporal model for action detection and recognition IET Comput. Vis. (IF 1.516) Pub Date : 2020-08-06 Ran Cui; Aichun Zhu; Jingran Wu; Gang Hua
Action detection and recognition are popular subjects of research in the field of computer vision. The task of action detection can be regarded as the sum of action location and recognition. Action features described by using information concerning the human skeleton have the advantages of robustness against external factors and requiring a small amount of calculation. This study proposes a skeleton-based
-
Efficient complex ISAR object recognition using adaptive deep relation learning IET Comput. Vis. (IF 1.516) Pub Date : 2020-08-06 Chunsheng Liu; Zhongmei Wang
Complex inverse synthetic aperture radar (ISAR) object recognition is a critical and challenging problem in computer vision tasks. An efficient complex object recognition method for ISAR images is proposed based on adaptive deep relation learning. (i) An adaptive multimodal mechanism is proposed to greatly improve the multimodal sampling and transformation capabilities of convolutional neural networks
-
Regularising neural networks for future trajectory prediction via inverse reinforcement learning framework IET Comput. Vis. (IF 1.516) Pub Date : 2020-08-06 Dooseop Choi; Kyoungwook Min; Jeongdan Choi
Predicting distant future trajectories of agents in a dynamic scene is challenging because the future trajectory of an agent is affected not only by their past trajectory but also the scene contexts. To tackle this problem, the authors propose a model based on recurrent neural networks, and a novel method for training this model. The proposed model is based on an encoder–decoder architecture where
-
Adversarial examples detection through the sensitivity in space mappings IET Comput. Vis. (IF 1.516) Pub Date : 2020-08-06 Xurong Li; Shouling Ji; Juntao Ji; Zhenyu Ren; Chunming Wu; Bo Li; Ting Wang
Adversarial examples (AEs) against deep neural networks (DNNs) raise wide concerns about the robustness of DNNs. Existing detection mechanisms are often limited to a given attack algorithm. Therefore, it is highly desirable to develop a robust detection approach that remains effective for a large group of attack algorithms. In addition, most of the existing defences only perform well for small images
-
Architecture to improve the accuracy of automatic image annotation systems IET Comput. Vis. (IF 1.516) Pub Date : 2020-08-06 Artin Ghostan Khatchatoorian; Mansour Jamzad
Automatic image annotation (AIA) is an image retrieval mechanism to extract relevant semantic tags from visual content. So far, the accuracy improvement of newly developed methods of this kind has been about 1 or 2% in the F1-score, and the architectures seem to have room for improvement. Therefore, the authors designed a more detailed architecture for AIA and suggested new algorithms for its main parts
-
RTL3D: real-time LIDAR-based 3D object detection with sparse CNN IET Comput. Vis. (IF 1.516) Pub Date : 2020-08-06 Lin Yan; Kai Liu; Evgeny Belyaev; Meiyu Duan
LIDAR (light detection and ranging) based real-time 3D perception is crucial for applications such as autonomous driving. However, most of the convolutional neural network (CNN) based methods are time-consuming and computation-intensive. These drawbacks are mainly attributed to the highly variable density of LIDAR point cloud and the complexity of their pipelines. To find a balance between speed and
-
Orthogonal random projection for tensor completion IET Comput. Vis. (IF 1.516) Pub Date : 2020-08-06 Yali Feng; Guoxu Zhou
The low-rank tensor completion problem aims to recover missing data from partially observable data. However, most of the existing tensor completion algorithms based on Tucker decomposition cannot avoid using the singular value decomposition (SVD) operation to calculate the Tucker factors, so they are not suitable for the completion of large-scale data. To solve this problem, the authors propose a new
-
Polyp detection using CNNs in colonoscopy video IET Comput. Vis. (IF 1.516) Pub Date : 2020-08-06 Azadeh Haj-Manouchehri; Hossein Mahvash Mohammadi
Polyps are a group of cells growing on the inner surface of the colon. Over time, some polyps can lead to colon cancer, which is often fatal if found in its later stages. Colon cancer can be prevented if the polyps are identified and removed in their early stages. Colonoscopy is a very effective screening method to remove polyps and it largely prevents colon cancer. However, some polyps may not be
-
STDC-Flow: large displacement flow field estimation using similarity transformation-based dense correspondence IET Comput. Vis. (IF 1.516) Pub Date : 2020-08-06 Congxuan Zhang; Zhen Chen; Fan Xiong; Wen Liu; Ming Li; Liyue Ge
In order to improve the accuracy and robustness of optical flow computation under large displacements and motion occlusions, the authors present in this study a large displacement flow field estimation approach using similarity transformation-based dense correspondence, named STDC-Flow approach. First, the authors compute an initial nearest-neighbour field by using the STDC-Flow of the consecutive
-
Optimisation-based training of evolutionary convolution neural network for visual classification applications IET Comput. Vis. (IF 1.516) Pub Date : 2020-08-06 Shanshan Tu; Sadaqat ur Rehman; Muhammad Waqas; Obaid ur Rehman; Zhongliang Yang; Basharat Ahmad; Zahid Halim; Wei Zhao
Training of the convolution neural network (CNN) is a problem of global optimisation. This study proposed a hybrid modified particle swarm optimisation (MPSO) and conjugate gradient (CG) algorithm for efficient training of CNN. The training involves MPSO–CG to avoid trapping in local minima. Particularly, improvements in the MPSO by introducing a novel approach for control parameters, improved parameters
-
Pose-invariant face recognition based on matching the occlusion free regions aligned by 3D generic model IET Comput. Vis. (IF 1.516) Pub Date : 2020-08-06 Arezoo Sadeghzadeh; Hossein Ebrahimnezhad
Face recognition systems perform accurately in a controlled environment, but an unconstrained environment dramatically degrades their performance. In this study, a novel pose-invariant face recognition system is proposed based on the occlusion free regions. This method utilises a gallery set of frontal face images and can handle large pose variations. For a 2D probe face image with an arbitrary pose
-
Accurate scale estimation for visual tracking with significant deformation IET Comput. Vis. (IF 1.516) Pub Date : 2020-08-06 Lutao Chu; Huiyun Li; Zhiheng Yang
Scale variation of a target frequently appears in tasks of visual tracking. Accurate scale estimation is challenging due to deformation, occlusion, rotation, change in the view angle and diversity of tracking object categories. Most tracking methods employ an exhaustive search of scales to estimate the target scales. However, only finite and discrete scales are usually searched due to the expensive