-
Hierarchical matrix factorization for interpretable collaborative filtering Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-07 Kai Sugahara, Kazushi Okamoto
Matrix factorization (MF) is a simple collaborative filtering technique that achieves superior recommendation accuracy by decomposing the user-item interaction matrix into user and item latent matrices. Because the model typically learns each interaction independently, it may overlook the underlying shared dependencies between users and items, resulting in less stable and interpretable recommendations
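The decomposition the abstract describes can be sketched in a few lines. The toy example below (dimensions, learning rate, and the plain-SGD training loop are illustrative assumptions, not the authors' hierarchical method) factorizes a small rating matrix into user and item latent matrices:

```python
import numpy as np

# Toy matrix factorization: approximate the rating matrix R by U @ V.T,
# where U holds user latent vectors and V holds item latent vectors.
rng = np.random.default_rng(0)
n_users, n_items, k = 4, 5, 2

R = rng.integers(1, 6, size=(n_users, n_items)).astype(float)  # observed ratings
U = rng.normal(scale=0.1, size=(n_users, k))   # user latent matrix
V = rng.normal(scale=0.1, size=(n_items, k))   # item latent matrix

lr, reg = 0.05, 0.01
for _ in range(200):                           # plain SGD over every entry
    for u in range(n_users):
        for i in range(n_items):
            err = R[u, i] - U[u] @ V[i]        # each interaction learned independently
            U[u] += lr * (err * V[i] - reg * U[u])
            V[i] += lr * (err * U[u] - reg * V[i])

rmse = np.sqrt(np.mean((R - U @ V.T) ** 2))
```

Note that the inner loop updates each (user, item) interaction in isolation — exactly the independence the abstract argues can hide shared dependencies between users and items.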
-
A siamese-based verification system for open-set architecture attribution of synthetic images Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-05 Lydia Abady, Jun Wang, Benedetta Tondi, Mauro Barni
Despite the wide variety of methods developed for synthetic image attribution, most of them can only attribute images generated by models or architectures included in the training set and do not work with unseen architectures, hindering their applicability in real-world scenarios. In this paper, we propose a verification framework that relies on a Siamese Network to address the problem of open-set attribution
-
Multifractal characterization and recognition of animal behavior based on deep wavelet transform Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-04 Kexin Meng, Shanjie Yang, Piercarlo Cattani, Shijiao Gao, Shuli Mei
This study conducts an in-depth exploration of the multifractal characteristics of dairy cows' behavioral data, aiming to reveal their complexity and representation in behavioral patterns. By means of Multifractal Detrended Fluctuation Analysis (MFDFA) in conjunction with deep wavelet transform, we extract multifractal indices that precisely depict the differences and dynamic changes of cows' behavior
-
Towards high-fidelity facial UV map generation in real-world Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-02 Yuanming Li, Jeong-gi Kwak, Bon-hwa Ku, David Han, Hanseok Ko
We present a framework for completing high-fidelity 3D facial UV maps from a single face image. Despite the success of Generative Adversarial Networks (GANs) in this area, generating accurate UV maps from in-the-wild images remains challenging. Our approach involves a novel network called “Map and Edit” that combines a 2D generative model and a 3D prior to explicitly control the generation of multi-view
-
Adaptive regularized ensemble for evolving data stream classification Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-01 Aldo M. Paim, Fabrício Enembreck
Extracting knowledge from data streams requires fast incremental algorithms that are able to handle unlimited processing and ever-changing data with finite memory. A strategy for this challenge is the use of ensembles owing to their ability to tackle concept drift and achieve highly accurate predictions. However, ensembles often require a lot of computational resources. In this study, we propose a
-
Channel-spatial knowledge distillation for efficient semantic segmentation Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-01 Ayoub Karine, Thibault Napoléon, Maher Jridi
In this paper, we propose a new lightweight Channel-Spatial Knowledge Distillation (CSKD) method to handle the task of efficient image semantic segmentation. More precisely, we investigate the KD approach that trains a compressed neural network, called the student, under the supervision of a heavy one, called the teacher. In this context, we propose to improve the distillation mechanism by capturing the contextual
-
Frame-part-activated deep reinforcement learning for Action Prediction Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-01 Lei Chen, Zhanjie Song
In this paper, we propose a frame-part-activated deep reinforcement learning (FPA-DRL) method for action prediction. Most existing methods for action prediction utilize the evolution of whole frames to model actions, which cannot avoid the noise of the current action, especially in early prediction. Moreover, the loss of structural information of the human body diminishes the capacity of features to describe
-
Continual learning for adaptive social network identification Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-28 Simone Magistri, Daniele Baracchi, Dasara Shullani, Andrew D. Bagdanov, Alessandro Piva
The popularity of social networks as primary mediums for sharing visual content has made it crucial for forensic experts to identify the original platform of multimedia content. Various methods address this challenge, but the constant emergence of new platforms and updates to existing ones often render forensic tools ineffective shortly after release. This necessitates the regular updating of methods
-
SPACE: Senti-Prompt As Classifying Embedding for sentiment analysis Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-28 Jinyoung Kim, Youngjoong Ko
In natural language processing, the general approach to sentiment analysis involves a pre-training and fine-tuning paradigm using pre-trained language models combined with classifier models. Recently, numerous studies have applied prompts not only to downstream generation but also to classification tasks. However, to fully utilize the advantages of prompts and incorporate the context-dependent
-
Adaptive watermarking with self-mutual check parameters in deep neural networks Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-24 Zhenzhe Gao, Zhaoxia Yin, Hongjian Zhan, Heng Yin, Yue Lu
Artificial Intelligence has found wide application, but also poses risks due to unintentional or malicious tampering during deployment. Regular checks are therefore necessary to detect and prevent such risks. Fragile watermarking is a technique used to identify tampering in AI models. However, previous methods have faced challenges including risks of omission, additional information transmission, and
-
GBCA: Graph Convolution Network and BERT combined with Co-Attention for fake news detection Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-23 Zhen Zhang, Qiyun Lv, Xiyuan Jia, Wenhao Yun, Gongxun Miao, Zongqing Mao, Guohua Wu
Social media has evolved into a widely influential information source in contemporary society. However, the widespread use of social media also enables the rapid spread of fake news, which can pose a significant threat to national and social stability. Current fake news detection methods primarily rely on graph neural networks, which analyze the dissemination patterns of news articles. Nevertheless
-
Attention based multi-task interpretable graph convolutional network for Alzheimer’s disease analysis Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-22 Shunqin Jiang, Qiyuan Feng, Hengxin Li, Zhenyun Deng, Qinghong Jiang
Alzheimer’s Disease impairs the memory and cognitive function of patients, and early intervention can effectively mitigate its deterioration. Most existing methods for Alzheimer’s analysis rely solely on medical images, ignoring the impact of some clinical indicators associated with the disease. Furthermore, these methods have thus far failed to identify the specific brain regions affected by the disease
-
Forensic analysis of AI-compression traces in spatial and frequency domain Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-22 Sandra Bergmann, Denise Moussa, Fabian Brand, André Kaup, Christian Riess
The classical JPEG compression is a rich source of cues for forensic image analysis. However, this compression standard will in the near future be complemented by a new, highly efficient learning-based compression standard called JPEG-AI. JPEG-AI is fundamentally different from classical JPEG. Hence, its forensic traces can also be expected to be fundamentally different. We argue that there is a pressing
-
Less is more: A minimalist approach to robust GAN-generated face detection Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-22 Tanusree Ghosh, Ruchira Naskar
Hyper-realistic images that are indistinguishable from authentic images to regular viewers have become extremely easy to generate and highly accessible. Furthermore, the increasing pervasiveness of social media networks in our daily lives has facilitated the easy dissemination of fake news accompanied by such synthetic images. Hyper-realistic artificial face images are often illicitly used as profile
-
Learning interactions across sentiment and emotion with graph attention network and position encodings Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-16 Ao Jia, Yazhou Zhang, Sagar Uprety, Dawei Song
Sentiment classification and emotion recognition are two closely related tasks in NLP. However, most recent studies have treated them as two separate tasks, where the shared knowledge is neglected. In this paper, we propose a multi-task interactive graph attention network with position encodings, termed MIP-GAT, to improve the performance of each task by simultaneously leveraging similarities
-
PNSP: Overcoming catastrophic forgetting using Primary Null Space Projection in continual learning Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-15 DaiLiang Zhou, YongHong Song
Continual Learning (CL) plays a crucial role in enhancing learning performance for both new and previous tasks in continuous data streams, thus contributing to the advancement of cognitive computing. However, CL faces a fundamental challenge known as the stability-plasticity quandary. In this research, we present an innovative and effective CL algorithm called Primary Null Space Projection (PNSP) to
-
CrossFormer: Cross-guided attention for multi-modal object detection Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-15 Seungik Lee, Jaehyeong Park, Jinsun Park
Object detection is one of the essential tasks in a variety of real-world applications such as autonomous driving and robotics. In a real-world scenario, unfortunately, there are numerous challenges such as illumination changes, adverse weather conditions, and geographical changes, to name a few. To tackle the problem, we propose a novel multi-modal object detection model that is built upon a hierarchical
-
A lightness-aware loss for low-light image enhancement Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-14 Dian Xie, Huajun Xing, Liangyu Chen, Shijie Hao
Current low-light image enhancement methods have made great progress on improving the visibility of low-light images. Nevertheless, they pay less attention to preserving visual naturalness and therefore often introduce over-enhancement and local artifacts into their results. To address this issue, it is useful to introduce additional multi-view information of an image into enhancement models, such
-
Human Gait Recognition by using Two Stream Neural Network along with Spatial and Temporal Features Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-11 Asif Mehmood, Javeria Amin, Muhammad Sharif, Seifedine Kadry
Human Gait Recognition (HGR) refers to a biometric technique that is broadly used to recognize an individual by their walking pattern. There are some key factors, such as angle variation, clothing variation, foot shadows, and carrying conditions, that affect human gait. In this work, a new approach is proposed for HGR that contains five major steps. In the first step, the
-
M[formula omitted]TTS: Multi-modal text-to-speech of multi-scale style control for dubbing Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-10 Yan Liu, Li-Fang Wei, Xinyuan Qian, Tian-Hao Zhang, Song-Lu Chen, Xu-Cheng Yin
Dubbing refers to the procedure of recording characters by professional voice actors in films and games. It is more expressive and immersive than conventional Text-to-Speech (TTS) technologies and requires synchronization and style consistency of audio and video. Previous dubbing methods use video to provide either a global style vector or a local prosody embedding, limiting the expressiveness of the
-
CustomDepth: Customizing point-wise depth categories for depth completion Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-10 Shenglun Chen, Xinchen Ye, Hong Zhang, Haojie Li, Zhihui Wang
Classification-based depth completion methods have achieved remarkable performance. However, the result is still coarse due to the limitation of using unified depth categories to represent depth distribution. In this work, we propose CustomDepth which can customize exclusive depth categories for each image point to boost performance. To this end, CustomDepth introduces a depth subdivision module that
-
Feature enhancement and coarse-to-fine detection for RGB-D tracking Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-10 Xue-Feng Zhu, Tianyang Xu, Xiao-Jun Wu, Josef Kittler
Existing RGB-D tracking algorithms advance the performance by constructing typical appearance models from RGB-only tracking frameworks. However, they make no attempt to exploit the complementary visual information in the multi-modal input. This paper addresses this deficit and presents a novel algorithm to boost the performance of RGB-D tracking by taking advantage of collaborative clues. To guarantee
-
Machine learning for low signal-to-noise ratio detection Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-09 Fred Lacy, Angel Ruiz-Reyes, Anthony Brescia
-
Data-efficient 3D instance segmentation by transferring knowledge from synthetic scans Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-07 Xiaodong Wu, Ruiping Wang, Xilin Chen
The 3D comprehension ability of indoor environments is critical for robots. While deep learning-based methods have improved performance, they require significant amounts of annotated training data. Nevertheless, the cost of scanning and annotating point cloud data in real scenes is high, leading to data scarcity. Consequently, there is an urgent need to investigate data-efficient methods for point
-
On characterizing the evolution of embedding space of neural networks using algebraic topology Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-07 S. Suresh, B. Das, V. Abrol, S. Dutta Roy
We study how the topology of feature embedding space changes as it passes through the layers of a well-trained deep neural network (DNN) through Betti numbers. Motivated by existing studies using simplicial complexes on shallow fully connected networks (FCN), we present an extended analysis using Cubical homology instead, with a variety of popular deep architectures and real image datasets. We demonstrate
-
Hierarchical reinforcement learning for chip-macro placement in integrated circuit Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-07 Zhentao Tan, Yadong Mu
The complexity of chip design has consistently grown, adhering to Moore’s law. In this paper, we examine a crucial step in integrated circuit design called chip macro placement. Traditionally, human experts are consulted to optimize placement for reduced power consumption, but this requires significant effort. Recently, machine learning-based methods have emerged to address this task, showing promising
-
Enhanced blind face inpainting via structured mask prediction Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-07 Honglei Li, Yifan Zhang, Wenmin Wang
Blind face inpainting is the task of automatically recovering an occluded face image without given masks indicating missing areas. Popular inpainting methods assume that the occlusion patterns are known with given occlusion masks. Previous blind inpainting methods, ignoring the structure in faces and occlusions, treat occlusion detection as an independent pixel prediction problem. To overcome the limitations
-
N-QGNv2: Predicting the optimum quadtree representation of a depth map from a monocular camera Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-05 Daniel Braun, Olivier Morel, Cédric Demonceaux, Pascal Vasseur
Self-supervised monocular depth prediction is a widely researched field that aims to provide a better scene understanding. However, most existing methods prioritize prediction accuracy over computation cost, which can hinder the deployment of these methods in real-world applications. Our objective is to propose a solution that efficiently compresses the depth map while maintaining a high level of accuracy
-
Even small correlation and diversity shifts pose dataset-bias issues Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-03 Alceu Bissoto, Catarina Barata, Eduardo Valle, Sandra Avila
Distribution shifts hinder the deployment of deep learning in real-world problems. Distribution shifts appear when train and test data come from different sources, which commonly happens in practice. Despite shifts occurring concurrently in many forms (e.g., correlation and diversity shifts) and intensities, the literature focuses only on severe and isolated shifts. In this work, we propose a comprehensive
-
Subdivided Mask Dispersion Framework for semi-supervised semantic segmentation Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-02 Yooseung Wang, Jaehyuk Jang, Changick Kim
Learning the relationship between weak and strong perturbations has been considered a major part of semi-supervised semantic segmentation. We observed two problems with a publicly used perturbation method, which randomly generates a mask with a single large bounding box. A single large bounding box can entirely cover the important object components in an image, hindering the model from capturing
-
An analytic study on clustering driven self-supervised speaker verification Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-29 Abderrahim Fathan, Jahangir Alam
One of the most widely used self-supervised speaker verification system training methods is to optimize the speaker embedding network in a discriminative fashion using clustering algorithm-driven Pseudo-Labels. Although the pseudo-labels-based self-supervised training scheme showed impressive performance, recent studies have shown that label noise can significantly impact performance. In this paper
-
Listwise learning to rank method combining approximate NDCG ranking indicator with Conditional Generative Adversarial Networks Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-26 Jinzhong Li, Huan Zeng, Cunwei Xiao, Chunjuan Ouyang, Hua Liu
Some previous empirical studies have shown that listwise learning to rank approaches generally outperform pointwise or pairwise learning to rank techniques. Listwise methods that directly optimize information retrieval indicators are an essential and popular class of learning to rank methods. However, the existing learning to rank approaches based
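NDCG, the indicator named in the title, can be computed directly for a single ranked list. The sketch below uses the standard exponential-gain definition; the smoothed, differentiable approximation that the paper optimizes is not shown:

```python
import numpy as np

# NDCG@k: DCG of the predicted ranking, normalized by the DCG of the
# ideal (relevance-sorted) ranking, so a perfect ranking scores 1.0.
def dcg_at_k(rels, k):
    rels = np.asarray(rels, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rels.size + 2))   # log2(rank + 1)
    return np.sum((2 ** rels - 1) / discounts)

def ndcg_at_k(rels, k):
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0

# rels[i] is the graded relevance of the document ranked at position i
score = ndcg_at_k([3, 2, 3, 0, 1], k=5)
```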
-
Channel-level Matching Knowledge Distillation for object detectors via MSE Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-24 Zetao Jiang, Qinyang Huang, Huijuan Zhang
Knowledge distillation (KD) has been widely used in different tasks as a practical model compression technique. Due to the poor performance of directly using Mean Square Error (MSE) between the intermediate features of the teacher and student, most feature-based detector distillation methods are primarily concerned with proposing diverse attention mechanisms and employing MSE to guide the student in
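The baseline the abstract criticizes — a plain MSE between teacher and student intermediate features — can be sketched as follows. The shapes and the linear channel projection are illustrative assumptions, not the paper's channel-level matching scheme:

```python
import numpy as np

# Feature-based distillation baseline: align student channels to the
# teacher's channel count, then take MSE over the feature maps.
rng = np.random.default_rng(1)
B, Ct, Cs, H, W = 2, 8, 4, 7, 7          # batch, teacher/student channels, spatial

teacher_feat = rng.normal(size=(B, Ct, H, W))
student_feat = rng.normal(size=(B, Cs, H, W))

proj = rng.normal(scale=0.1, size=(Ct, Cs))            # learnable 1x1-style projection
student_aligned = np.einsum('tc,bchw->bthw', proj, student_feat)

kd_loss = np.mean((teacher_feat - student_aligned) ** 2)
```

In a real training loop `kd_loss` would be added to the student's task loss and backpropagated through `proj` and the student network.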
-
A deep learning-based global and segmentation-based semantic feature fusion approach for indoor scene classification Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-24 Ricardo Pereira, Tiago Barros, Luís Garrote, Ana Lopes, Urbano J. Nunes
This work proposes a novel approach that uses a semantic segmentation mask to obtain a 2D spatial layout of the segmentation-categories across the scene, designated by segmentation-based semantic features (SSFs). These features represent, per segmentation-category, the pixel count, as well as the 2D average position and respective standard deviation values. Moreover, a two-branch network, GSFApp, that
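The per-category statistics listed above are simple to compute from a segmentation mask. This is a minimal sketch assuming coordinates normalized by the image size; the paper's exact normalization and feature ordering are not specified here:

```python
import numpy as np

# Segmentation-based semantic features: per category, the pixel count
# plus the mean and standard deviation of the (x, y) pixel positions.
def ssf(mask, num_classes):
    H, W = mask.shape
    ys, xs = np.mgrid[0:H, 0:W]
    feats = []
    for c in range(num_classes):
        sel = mask == c
        n = int(sel.sum())
        if n == 0:                                   # category absent in this scene
            feats.append([0, 0, 0, 0, 0])
            continue
        feats.append([n,
                      xs[sel].mean() / W, ys[sel].mean() / H,  # 2D average position
                      xs[sel].std() / W, ys[sel].std() / H])   # spread per axis
    return np.array(feats, dtype=float)

mask = np.zeros((4, 4), dtype=int)
mask[2:, :] = 1                                      # bottom half is category 1
F = ssf(mask, num_classes=3)                         # shape (3, 5)
```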
-
Adversarial mimicry attacks against image splicing forensics: An approach for jointly hiding manipulations and creating false detections Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-24 Giulia Boato, Francesco G.B. De Natale, Gianluca De Stefano, Cecilia Pasquini, Fabio Roli
-
Gauss-like Logarithmic Kernel Function to improve the performance of kernel machines on the small datasets Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-24 Betul Hicdurmaz, Nurullah Calik, Serpil Ustebay
The support vector machine is one of the most widely used machine learning algorithms, with a comprehensive mathematical infrastructure. The power behind the algorithm is the kernel trick, which enables the model to handle non-linear data distributions by using functions that satisfy the Mercer condition. Undoubtedly, the radial basis function (RBF) is among the most widely used of these functions. The RBF kernel
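For reference, the RBF kernel mentioned above is k(x, y) = exp(-γ‖x − y‖²), and the Mercer condition amounts to the kernel matrix being symmetric positive semidefinite. A small self-check (the value of `gamma` is arbitrary):

```python
import numpy as np

# Gram matrix of the RBF kernel over a set of points X (rows = samples).
def rbf_kernel(X, gamma=0.5):
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T     # pairwise squared distances
    return np.exp(-gamma * np.clip(d2, 0, None))     # clip guards tiny negatives

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = rbf_kernel(X)
# K is symmetric, has ones on the diagonal (k(x, x) = 1), and is PSD.
```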
-
Example forgetting and rehearsal in continual learning Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-24 Beatrix Benkő
A major challenge of training neural networks on different tasks in a sequential manner is catastrophic forgetting, where earlier experiences are forgotten while learning a new one. In recent years, rehearsal-based methods have become popular top-performing alleviation approaches. Rehearsal builds upon maintaining and repeatedly using for training a small buffer of data selected across encountered
-
Detection of visual pursuits using 1D convolutional neural networks Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-24 Alex Torquato S. Carneiro, Flavio Luiz Coutinho, Carlos H. Morimoto
The visual pursuit of moving targets is a natural behaviour that has been exploited in, for example, medical diagnosis, law enforcement, and human computer interaction. Most proposed algorithms to detect this behaviour are based on some kind of motion similarity metric that assumes small or no distortion between the trajectory described by the target being pursued and the sensor measurements. We propose
-
Adaptive proposal network based on generative adversarial learning for weakly supervised temporal sentence grounding Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-23 Weikang Wang, Yuting Su, Jing Liu, Peiguang Jing
Temporal sentence grounding aims to locate the moment most related to the given natural language query. Noticing the time-consuming labeling process of the temporal bounding boxes, recent works started to focus on the weakly supervised temporal sentence grounding (WTSG) with only video-text pairwise annotations. Existing WTSG methods mainly adopted anchor-based structure to generate moment candidates
-
Text-guided Fourier Augmentation for long-tailed recognition Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-23 Weiqiu Wang, Zining Chen, Fei Su, Zhicheng Zhao
Real-world data often exhibits a long-tailed distribution in practical scenarios. However, deep learning models usually face challenges when it comes to effectively identifying infrequent classes amidst the abundance of prevalent ones. The fundamental issue lies in the scarcity of available information for tail classes. A highly intuitive approach is to uncover a greater amount of valuable information
-
A synthetic human-centric dataset generation pipeline for active robotic vision Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-22 Charalampos Georgiadis, Nikolaos Passalis, Nikos Nikolaidis
Active vision aims to equip computer vision methods with the ability to dynamically adjust the capturing sensor’s viewpoint, position, or parameters in real time. This dynamic capability allows for improving the accuracy of the perception process. However, training and evaluating an active vision model often requires a large number of annotated images captured under different sensor and environmental
-
Reducing redundancy in the bottleneck representation of autoencoders Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-15 Firas Laakom, Jenni Raitoharju, Alexandros Iosifidis, Moncef Gabbouj
Autoencoders (AEs) are a type of unsupervised neural networks, which can be used to solve various tasks, e.g., dimensionality reduction, image compression, and image denoising. An AE has two goals: (i) compress the original input to a low-dimensional space at the bottleneck of the network topology using an encoder, (ii) reconstruct the input from the representation at the bottleneck using a decoder
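The two goals above — compress to a bottleneck, then reconstruct — have a closed-form optimum in the linear case: a truncated SVD, where the encoder projects onto the top-d principal directions and the decoder maps back. A sketch (the data and bottleneck width are arbitrary):

```python
import numpy as np

# Linear autoencoder via truncated SVD: encoder = projection onto the
# top-d right singular vectors, decoder = the transpose mapping back.
rng = np.random.default_rng(2)
X = rng.normal(size=(50, 10))          # 50 samples, 10 features
d = 3                                  # bottleneck width

U, S, Vt = np.linalg.svd(X, full_matrices=False)
Z = X @ Vt[:d].T                       # (50, 10) -> (50, 3): compress
X_hat = Z @ Vt[:d]                     # (50, 3) -> (50, 10): reconstruct

err = np.mean((X - X_hat) ** 2)        # reconstruction error at the bottleneck
```

Nonlinear AEs replace both maps with neural networks, but the trade-off is the same: a narrower bottleneck forces more compression at the cost of reconstruction error — and, as this paper argues, its dimensions can become redundant.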
-
EIGAN: An explicitly and implicitly feature-aligned GAN for degraded image classification Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-15 Jing Hu, Weiwei Zhong, Meiqi Zhang, Susu Kang, Ouyang Yan
The implementation of classification networks encounters a substantial decline in performance when subjected to degraded images due to factors such as blur, noise, and low resolution. Existing methods focus on addressing a specific kind of degraded images and thus cannot simultaneously adapt to multiple degradation scenarios. Besides, insufficient attention has been given to the causes of the decline
-
A multi-aspect framework for explainable sentiment analysis Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-07 Jothi Prakash V., Arul Antran Vijay S.
The demand for explainable sentiment analysis has intensified, emphasizing the need for models that are both accurate and interpretable. This research introduces the Multi-Aspect Framework for Explainable Sentiment Analysis (MAFESA), a groundbreaking model that seamlessly integrates aspect extraction, sentiment prediction, and explainability. By harnessing the power of Latent Dirichlet Allocation (LDA)
-
MCFP: A multi-target 3D perception method with weak dependence on 2D detectors Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-12 Haoran Guo, Mingyun He, Fan Li, Kexin He, Lina Chen
3D perception for multi-target in complex traffic environments plays an important role in autonomous driving. Nowadays, some mainstream models use 2D detectors to provide auxiliary information for 3D perception. But it is a challenge that the accuracy of 2D detectors has a sensitive impact on the final results. In this paper, we propose MCFP, a multi-target 3D perception model with weak dependence
-
Debiased Visual Question Answering via the perspective of question types Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-11 Tianyu Huai, Shuwen Yang, Junhang Zhang, Jiabao Zhao, Liang He
Visual Question Answering (VQA) aims to answer questions according to the given image. However, current VQA models tend to rely solely on textual information from the questions and ignore the visual information in the images to get answers, which is caused by bias that is generated during the training phase. Previous studies have shown that bias in VQA is mainly caused by the text modality, and our
-
Sequential visual and semantic consistency for semi-supervised text recognition Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-11 Mingkun Yang, Biao Yang, Minghui Liao, Yingying Zhu, Xiang Bai
Scene text recognition (STR) is a challenging task that requires large-scale annotated data for training. However, collecting and labeling real text images is expensive and time-consuming, which limits the availability of real data. Therefore, most existing STR methods resort to synthetic data, which may introduce domain discrepancy and degrade the performance of STR models. To alleviate this problem
-
An effective deep learning adversarial defense method based on spatial structural constraints in embedding space Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-11 Junzhong Miao, Xiangzhan Yu, Zhichao Hu, Yanru Song, Likun Liu, Zhigang Zhou
Deep neural networks are highly vulnerable to adversarial samples. Most existing adversarial defense methods do not consider the distribution of adversarial samples. We argue that the scarcity of adversarial samples in the natural sample set prevents deep neural networks from learning a complete and effective representation of them. This causes the spatial structures between the natural
-
Time-aware and task-transferable adversarial attack for perception of autonomous vehicles Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-11 Yantao Lu, Haining Ren, Weiheng Chai, Senem Velipasalar, Yilan Li
With rapid development of self-driving vehicles, recent work in adversarial machine learning started to study adversarial examples (AEs) for perception of autonomous driving (AD). However, generating practical AEs for the perception module remains a significant challenge. Traditional adversarial attacks tend to focus on a single computer vision task, making it difficult to compromise multiple perception
-
Recovering a clean background: A parallel deep network architecture for single-image deraining Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-10 Nanrun Zhou, Jibin Deng, Meng Pang
Deep convolutional neural networks have been popularly applied in single-image deraining recently. Nevertheless, as the network becomes deeper, it is easy to cause training over-fitting and performance saturation, particularly in the case of insufficient training data. In this paper, we report the design of a new network, namely parallel deraining convolutional neural network (PARDNet), for single-image
-
CATNet: Cross-modal fusion for audio–visual speech recognition Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-09 Xingmei Wang, Jiachen Mi, Boquan Li, Yixu Zhao, Jiaxiang Meng
Automatic speech recognition (ASR) is a typical pattern recognition technology that converts human speech into text. With the aid of advanced deep learning models, the performance of speech recognition is significantly improved. Especially, the emerging Audio–Visual Speech Recognition (AVSR) methods achieve satisfactory performance by combining audio-modal and visual-modal information. However,
-
Multilevel depression status detection based on fine-grained prompt learning Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-08 Jun Zhang, Yanrong Guo
As a common psychological disorder, depression is generally detected based on scales and interviews, which are often affected by subjective or environmental factors. In order to assist in the diagnosis, automatic depression detection (ADD) is developed to provide an objective, efficient, and convenient technique based on the analysis of different psychophysiological data. Among them, text is one of
-
Patterns of vehicle lights: Addressing complexities of camera-based vehicle light datasets and metrics Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-08 Ross Greer, Akshay Gopalkrishnan, Maitrayee Keskar, Mohan M. Trivedi
This paper explores the representation of vehicle lights in computer vision and its implications for various pattern recognition tasks in autonomous driving. Different representations for vehicle lights, including bounding boxes, center points, corner points, and segmentation masks, are discussed in terms of their strengths and weaknesses toward a variety of domain tasks, as well as associated data
-
Multi-perspective thought navigation for source-free entity linking Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-05 Bohua Peng, Wei He, Bin Chen, Aline Villavicencio, Chengfu Wu
Neural entity-linking models excel at bridging the lexical gap of multiple facets of facts, such as entity-related claims or evidence documents. Despite advancements in self-supervised learning and pretrained language models, challenges persist in entity linking, particularly in interpretability and transferability. Moreover, these models need many aligned documents to adapt to emerging entities, which
-
TU2Net-GAN: A temporal precipitation nowcasting model with multiple decoding modules Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-05 XuDong Ling, ChaoRong Li, Peng Yang, Yuanyuan Huang, Fengqing Qin
With the Earth’s temperature rising and abnormal weather events becoming frequent, the mechanisms of precipitation formation have become increasingly complex, leading to more significant spatiotemporal variability. This increased variability often results in severe flooding events. Despite extensive research on deep learning methods for rainfall prediction, challenges such as forecasting uncertainty
-
Pyramid hybrid pooling quantization for efficient fine-grained image retrieval Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-05 Ziyun Zeng, Jinpeng Wang, Bin Chen, Tao Dai, Shu-Tao Xia, Zhi Wang
-
GLOCAL: A self-supervised learning framework for global and local motion estimation Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-05 Yihao Zheng, Kunming Luo, Shuaicheng Liu, Zun Li, Ye Xiang, Lifang Wu, Bing Zeng, Chang Wen Chen
-
SemanticFormer: Hyperspectral image classification via semantic transformer Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-05 Yan Liu, Xixi Wang, Bo Jiang, Lan Chen, Bin Luo
Hyperspectral image (HSI) classification is an active research problem in computer vision and multimedia field. Contrary to traditional image data, HSIs contain rich spectral, spatial and semantic information. Thus, how to extract discriminative features for HSIs by integrating spectral, spatial and semantic cues together is the core issue in addressing the HSI classification task. Existing works mainly focus on exploring spectral
-
Micro-expression spotting with a novel wavelet convolution magnification network in long videos Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-01-05 Jianxiong Zhou, Ying Wu
Facial Micro-Expressions (MEs) are transient and spontaneous, reflecting a person's authentic internal emotions, and hold significant value in many fields. Due to the presence of many background disturbances, including irrelevant motion (such as blinks and head movements) and noise in long videos, it is challenging to spot subtle MEs among these disturbances. To spot subtle MEs, a novel Wavelet