Alignment Relation is What You Need for Diagram Parsing IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-13 Xinyu Zhang, Lingling Zhang, Xin Hu, Jun Liu, Shaowei Wang, Qianying Wang
-
Enhancing Information Maximization with Distance-Aware Contrastive Learning for Source-Free Cross-Domain Few-Shot Learning IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-12 Huali Xu, Li Liu, Shuaifeng Zhi, Shaojing Fu, Zhuo Su, Ming-Ming Cheng, Yongxiang Liu
-
Part-Object Progressive Refinement Network for Zero-Shot Learning IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-12 Man Liu, Chunjie Zhang, Huihui Bai, Yao Zhao
-
YOLOH: You Only Look One Hourglass for Real-time Object Detection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-12 Shaobo Wang, Renhai Chen, Hongyue Wu, Xiaozhe Li, Zhiyong Feng
-
A Large-Scale Network Construction and Lightweighting Method for Point Cloud Semantic Segmentation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-12 Jiawei Han, Kaiqi Liu, Wei Li, Guangzhi Chen, Wenguang Wang, Feng Zhang
-
COFT-AD: COntrastive Fine-Tuning for Few-Shot Anomaly Detection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-12 Jingyi Liao, Xun Xu, Manh Cuong Nguyen, Adam Goodge, Chuan Sheng Foo
-
Hierarchical Prior-based Super Resolution for Point Cloud Geometry Compression IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-12 Dingquan Li, Kede Ma, Jing Wang, Ge Li
-
Self-Supervised Monocular Depth Estimation with Positional Shift Depth Variance and Adaptive Disparity Quantization IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-12 Juan Luis Gonzalez Bello, Jaeho Moon, Munchurl Kim
-
Layer-Specific Knowledge Distillation for Class Incremental Semantic Segmentation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-12 Qilong Wang, Yiwen Wu, Liu Yang, Wangmeng Zuo, Qinghua Hu
-
Semi-supervised 3D Shape Segmentation via Self Refining IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-12 Zhenyu Shu, Teng Wu, Jiajun Shen, Shiqing Xin, Ligang Liu
-
Efficient Single Correspondence Voting for Point Cloud Registration IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-12 Xuejun Xing, Zhengda Lu, Yiqun Wang, Jun Xiao
-
Weakly-Supervised RGBD Video Object Segmentation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-12 Jinyu Yang, Mingqi Gao, Feng Zheng, Xiantong Zhen, Rongrong Ji, Ling Shao, Aleš Leonardis
-
Surface-SOS: Self-Supervised Object Segmentation via Neural Surface Representation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-12 Xiaoyun Zheng, Liwei Liao, Jianbo Jiao, Feng Gao, Ronggang Wang
-
Comprehensive Attribute Prediction Learning for Person Search by Language IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-12 Kai Niu, Linjiang Huang, Yuzhou Long, Yan Huang, Liang Wang, Yanning Zhang
-
NesTD-Net: Deep NESTA-Inspired Unfolding Network With Dual-Path Deblocking Structure for Image Compressive Sensing IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-08 Hongping Gan, Zhen Guo, Feng Liu
Deep compressive sensing (CS) has become a prevalent technique for image acquisition and reconstruction. However, existing deep learning (DL)-based CS methods often encounter challenges such as block artifacts and information loss during iterative reconstruction, particularly at low sampling rates, resulting in a reduction of reconstructed details. To address these issues, we propose NesTD-Net, an
-
CSFwinformer: Cross-Space-Frequency Window Transformer for Mirror Detection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-07 Zhifeng Xie, Sen Wang, Qiucheng Yu, Xin Tan, Yuan Xie
Mirror detection is a challenging task since mirrors do not possess a consistent visual appearance. Even the Segment Anything Model (SAM), which boasts superior zero-shot performance, cannot accurately detect the position of mirrors. Existing methods determine the position of the mirror under hypothetical conditions, such as the correspondence between objects inside and outside the mirror, and the
-
VRT: A Video Restoration Transformer IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-07 Jingyun Liang, Jiezhang Cao, Yuchen Fan, Kai Zhang, Rakesh Ranjan, Yawei Li, Radu Timofte, Luc Van Gool
-
Image Understands Point Cloud: Weakly Supervised 3D Semantic Segmentation via Association Learning IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-07 Tianfang Sun, Zhizhong Zhang, Xin Tan, Yanyun Qu, Yuan Xie
Weakly supervised point cloud semantic segmentation methods, which require 1% or fewer labels while aiming to achieve almost the same performance as fully supervised approaches, have recently attracted extensive research attention. A typical solution in this framework is to use self-training or pseudo-labeling to mine the supervision from the point cloud itself while ignoring the critical information
-
HeadDiff: Exploring Rotation Uncertainty With Diffusion Models for Head Pose Estimation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-07 Yaoxing Wang, Hao Liu, Yaowei Feng, Zhendong Li, Xiangjuan Wu, Congcong Zhu
In this paper, we propose a probabilistic regression diffusion model for head pose estimation, dubbed HeadDiff, which addresses rotation uncertainty, especially when faces are captured in the wild. Unlike conventional image-to-pose methods which cannot explicitly establish the rotational manifold of head poses, our HeadDiff aims to ensure the pose rotation via the diffusion process
-
PointCAT: Contrastive Adversarial Training for Robust Point Cloud Recognition IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-07 Qidong Huang, Xiaoyi Dong, Dongdong Chen, Hang Zhou, Weiming Zhang, Kui Zhang, Gang Hua, Yueqiang Cheng, Nenghai Yu
-
HyperE2VID: Improving Event-Based Video Reconstruction via Hypernetworks IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-07 Burak Ercan, Onur Eker, Canberk Saglam, Aykut Erdem, Erkut Erdem
Event-based cameras are becoming increasingly popular for their ability to capture high-speed motion with low latency and high dynamic range. However, generating videos from events remains challenging due to the highly sparse and varying nature of event data. To address this, in this study, we propose HyperE2VID, a dynamic neural network architecture for event-based video reconstruction. Our approach
-
Task-Specific Normalization for Continual Learning of Blind Image Quality Models IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-07 Weixia Zhang, Kede Ma, Guangtao Zhai, Xiaokang Yang
In this paper, we present a simple yet effective continual learning method for blind image quality assessment (BIQA) with improved quality prediction accuracy, plasticity-stability trade-off, and task-order/-length robustness. The key step in our approach is to freeze all convolution filters of a pre-trained deep neural network (DNN) for an explicit promise of stability, and learn task-specific normalization
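The mechanism sketched in this abstract (freeze the shared backbone for stability, learn only small per-task normalization parameters for plasticity) can be illustrated with a minimal NumPy toy. This is not the authors' implementation: the class, the random linear "backbone", and all names below are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

class TaskNormModel:
    """Toy continual learner: frozen shared features, per-task affine norm."""

    def __init__(self, in_dim, feat_dim):
        # Frozen "backbone": a fixed random linear map stands in for
        # pre-trained convolution filters that are never updated.
        self.W = rng.standard_normal((in_dim, feat_dim))
        self.task_params = {}  # task_id -> (gamma, beta)

    def add_task(self, task_id):
        # Only these small per-task parameters would be trained.
        d = self.W.shape[1]
        self.task_params[task_id] = (np.ones(d), np.zeros(d))

    def forward(self, x, task_id):
        h = x @ self.W                              # frozen shared features
        mu, sigma = h.mean(0), h.std(0) + 1e-6
        gamma, beta = self.task_params[task_id]
        return gamma * (h - mu) / sigma + beta      # task-specific affine norm

model = TaskNormModel(in_dim=8, feat_dim=4)
model.add_task("task_A")
model.add_task("task_B")
x = rng.standard_normal((16, 8))
out_a = model.forward(x, "task_A")
out_b = model.forward(x, "task_B")
```

Because the backbone is frozen, adding a task can never degrade earlier tasks (an explicit stability guarantee), while the per-task affine parameters supply the plasticity.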
-
Mutual Information Driven Equivariant Contrastive Learning for 3D Action Representation Learning IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-07 Lilang Lin, Jiahang Zhang, Jiaying Liu
Self-supervised contrastive learning has proven to be successful for skeleton-based action recognition. For contrastive learning, data transformations are found to fundamentally affect the learned representation quality. However, traditional invariant contrastive learning is detrimental to the performance on the downstream task if the transformation carries important information for the task. In this
-
Consensus-Agent Deep Reinforcement Learning for Face Aging IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-07 Ling Lin, Hao Liu, Jinqiao Liang, Zhendong Li, Jiao Feng, Hu Han
Face aging tasks aim to simulate changes in the appearance of faces over time. However, due to the lack of data on different ages under the same identity, existing models are commonly trained using mapping between age groups. This makes it difficult for most existing aging methods to accurately capture the correspondence between individual identities and aging features, leading to generating faces
-
Unsupervised Modality-Transferable Video Highlight Detection With Representation Activation Sequence Learning IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-07 Tingtian Li, Zixun Sun, Xinyu Xiao
Identifying highlight moments of raw video materials is crucial for improving the efficiency of editing videos that are pervasive on internet platforms. However, the extensive work of manually labeling footage has created obstacles to applying supervised methods to videos of unseen categories. The absence of an audio modality that contains valuable cues for highlight detection in many videos also makes
-
Context Recovery and Knowledge Retrieval: A Novel Two-Stream Framework for Video Anomaly Detection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-07 Congqi Cao, Yue Lu, Yanning Zhang
Video anomaly detection aims to find the events in a video that do not conform to the expected behavior. The prevalent methods mainly detect anomalies by snippet reconstruction or future frame prediction error. However, the error is highly dependent on the local context of the current snippet and lacks the understanding of normality. To address this issue, we propose to detect anomalous events not
-
Toward Robust Referring Image Segmentation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-05 Jianzong Wu, Xiangtai Li, Xia Li, Henghui Ding, Yunhai Tong, Dacheng Tao
Referring Image Segmentation (RIS) is a fundamental vision-language task that outputs object masks based on text descriptions. Many works have achieved considerable progress for RIS, including different fusion method designs. In this work, we explore an essential question, “What if the text description is wrong or misleading?” For example, the described objects are not in the image. We term such a
-
RGBT Tracking via Challenge-Based Appearance Disentanglement and Interaction IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-05 Lei Liu, Chenglong Li, Yun Xiao, Rui Ruan, Minghao Fan
RGB and thermal source data suffer from both shared and specific challenges, and how to explore and exploit them plays a critical role in representing the target appearance in RGBT tracking. In this paper, we propose a novel approach, which performs target appearance representation disentanglement and interaction via both modality-shared and modality-specific challenge attributes, for robust RGBT tracking
-
Exploring Hierarchical Information in Hyperbolic Space for Self-Supervised Image Hashing IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-05 Rukai Wei, Yu Liu, Jingkuan Song, Yanzhao Xie, Ke Zhou
In real-world datasets, visually related images often form clusters, and these clusters can be further grouped into larger categories with more general semantics. These inherent hierarchical structures can help capture the underlying distribution of data, making it easier to learn robust hash codes that lead to better retrieval performance. However, existing methods fail to make use of this hierarchical
-
Sparse Action Tube Detection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-04 Yixuan Li, Zhenzhi Wang, Zhifeng Li, Limin Wang
Action tube detection is a challenging task as it requires not only locating action instances in each frame but also linking them in time. Existing action tube detection methods often employ multi-stage pipelines with complex designs and a time-consuming linking procedure. In this paper, we present a simple end-to-end action tube detection method, termed Sparse Tube Detector (STDet). Unlike those dense
-
SAAN: Similarity-aware attention flow network for change detection with VHR remote sensing images IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-01 Haonan Guo, Xin Su, Chen Wu, Bo Du, Liangpei Zhang
-
FDSR: An Interpretable Frequency Division Stepwise Process Based Single-Image Super-Resolution Network IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-28 Pengcheng Xu, Qun Liu, Huanan Bao, Ruhui Zhang, Lihua Gu, Guoyin Wang
Deep learning has excelled in single-image super-resolution (SISR) applications, yet the lack of interpretability in most deep learning-based SR networks hinders their applicability, especially in fields like medical imaging that require transparent computation. To address these problems, we present an interpretable frequency division SR network that operates in the image frequency domain. It comprises
-
Exploring the Application of Large-Scale Pre-Trained Models on Adverse Weather Removal IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-28 Zhentao Tan, Yue Wu, Qiankun Liu, Qi Chu, Le Lu, Jieping Ye, Nenghai Yu
Image restoration under adverse weather conditions (e.g., rain, snow, and haze) is a fundamental computer vision problem that has important implications for various downstream applications. Distinct from early methods that are specially designed for specific types of weather, recent works tend to simultaneously remove various adverse weather effects based on either spatial feature representation learning
-
SC_LPR: Semantically Consistent LiDAR Place Recognition Based on Chained Cascade Network in Long-Term Dynamic Environments IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-23 Dong Kong, Xu Li, Qimin Xu, Yue Hu, Peizhou Ni
-
Unsupervised Spectral Demosaicing With Lightweight Spectral Attention Networks IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-22 Kai Feng, Haijin Zeng, Yongqiang Zhao, Seong G. Kong, Yuanyang Bu
This paper presents a deep learning-based spectral demosaicing technique trained in an unsupervised manner. Many existing deep learning-based techniques rely on supervised learning with synthetic images and often underperform on real-world images, especially as the number of spectral bands increases. This paper presents a comprehensive unsupervised spectral demosaicing (USD) framework based on the
-
Zero-Shot Video Grounding With Pseudo Query Lookup and Verification IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-19 Yu Lu, Ruijie Quan, Linchao Zhu, Yi Yang
Video grounding, the process of identifying a specific moment in an untrimmed video based on a natural language query, has become a popular topic in video understanding. However, fully supervised learning approaches for video grounding that require large amounts of annotated data can be expensive and time-consuming. Recently, zero-shot video grounding (ZS-VG) methods that leverage pre-trained object
-
Multimodal Action Quality Assessment IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-19 Ling-An Zeng, Wei-Shi Zheng
Action quality assessment (AQA) aims to assess how well an action is performed. Previous works model AQA using only visual information, ignoring audio. We argue that although AQA depends heavily on visual information, audio provides useful complementary cues for improving score regression accuracy, especially for sports with background music, such as figure skating
-
Dual-View Curricular Optimal Transport for Cross-Lingual Cross-Modal Retrieval IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-19 Yabing Wang, Shuhui Wang, Hao Luo, Jianfeng Dong, Fan Wang, Meng Han, Xun Wang, Meng Wang
Current research on cross-modal retrieval is mostly English-oriented, owing to the availability of a large number of English-oriented human-labeled vision-language corpora. To overcome the scarcity of non-English labeled data, cross-lingual cross-modal retrieval (CCR) has attracted increasing attention. Most CCR methods construct pseudo-parallel vision-language corpora via Machine Translation (MT) to
-
CLASH: Complementary Learning with Neural Architecture Search for Gait Recognition IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-16 Huanzhang Dou, Pengyi Zhang, Yuhan Zhao, Lu Jin, Xi Li
-
BPMTrack: Multi-Object Tracking With Detection Box Application Pattern Mining IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-16 Yan Gao, Haojun Xu, Jie Li, Xinbo Gao
The key to multi-object tracking is its stability and the retention of identity information. A common problem with most detection-based approaches is trusting and using all the detector outputs for the association. However, some settings of detectors can affect stable long-range tracking. Based on the principle of reducing the association noise in the detection processing step, we propose a new framework
-
TC-SfM: Robust Track-Community-Based Structure-From-Motion IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-16 Lei Wang, Linlin Ge, Shan Luo, Zihan Yan, Zhaopeng Cui, Jieqing Feng
Structure-from-Motion (SfM) aims to recover 3D scene structures and camera poses based on the correspondences between input images, and thus the ambiguity caused by duplicate structures (i.e., different structures with strong visual resemblance) always results in incorrect camera poses and 3D structures. To deal with the ambiguity, most existing studies resort to additional constraint information or
-
Neuromorphic Synergy for Video Binarization IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-15 Shijie Lin, Xiang Zhang, Lei Yang, Lei Yu, Bin Zhou, Xiaowei Luo, Wenping Wang, Jia Pan
Bimodal objects, such as the checkerboard pattern used in camera calibration, markers for object tracking, and text on road signs, to name a few, are prevalent in our daily lives and serve as a visual form to embed information that can be easily recognized by vision systems. While binarization from intensity images is crucial for extracting the embedded information in the bimodal objects, few previous
-
NormAUG: Normalization-Guided Augmentation for Domain Generalization IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-15 Lei Qi, Hongpeng Yang, Yinghuan Shi, Xin Geng
Deep learning has made significant advancements in supervised learning. However, models trained in this setting often face challenges due to domain shift between training and test sets, resulting in a significant drop in performance during testing. To address this issue, several domain generalization methods have been developed to learn robust and domain-invariant features from multiple training domains
-
Enhancing Face Recognition With Detachable Self-Supervised Bypass Networks IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-15 Mingjie He, Jie Zhang, Shiguang Shan, Xilin Chen
Attributed to the development of deep networks and abundant data, automatic face recognition (FR) has quickly reached human-level capacity in the past few years. However, the FR problem is not perfectly solved in case of large poses and uncontrolled occlusions. In this paper, we propose a novel bypass enhanced representation learning (BERL) method to improve face recognition under unconstrained scenarios
-
Training and Testing Texture Similarity Metrics for Structurally Lossless Compression IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-15 Kaixuan Zhang, Zhaochen Shi, Jana Zujovic, Huib de Ridder, René van Egmond, David L. Neuhoff, Thrasyvoulos N. Pappas
We present a systematic approach for training and testing structural texture similarity metrics (STSIMs) so that they can be used to exploit texture redundancy for structurally lossless image compression. The training and testing is based on a set of image distortions that reflect the characteristics of the perturbations present in natural texture images. We conduct empirical studies to determine the
-
Pseudo Label Association and Prototype-Based Invariant Learning for Semi-Supervised NIR-VIS Face Recognition IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-15 Weipeng Hu, Yiming Yang, Haifeng Hu
The remarkable success of existing Near-InfraRed and VISible (NIR-VIS) approaches owes to sufficient labeled training data. However, collecting and tagging data from different domains is a time-consuming and expensive task. In this paper, we tackle the NIR-VIS face recognition problem in a semi-supervised manner, termed semi-supervised NIR-VIS Heterogeneous Face Recognition (NIR-VIS-sHFR). To cope
-
Toward Generalized Few-Shot Open-Set Object Detection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-15 Binyi Su, Hua Zhang, Jingzhi Li, Zhong Zhou
Open-set object detection (OSOD) aims to detect the known categories and reject unknown objects in a dynamic world, and has attracted significant attention. However, previous approaches only consider this problem in data-abundant conditions, while neglecting the few-shot scenes. In this paper, we seek a solution for the generalized few-shot open-set object detection (G-FOOD), which aims to avoid detecting
-
Progressive Frame-Proposal Mining for Weakly Supervised Video Object Detection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-15 Mingfei Han, Yali Wang, Mingjie Li, Xiaojun Chang, Yi Yang, Yu Qiao
In this paper, we focus on the weakly supervised video object detection problem, where each training video is only tagged with object labels, without any bounding box annotations of objects. To effectively train object detectors from such weakly-annotated videos, we propose a Progressive Frame-Proposal Mining (PFPM) framework by exploiting discriminative proposals in a coarse-to-fine manner. First
-
Uncertainty Modeling for Gaze Estimation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-15 Wenqi Zhong, Chen Xia, Dingwen Zhang, Junwei Han
-
Dual Branch Multi-Level Semantic Learning for Few-Shot Segmentation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-14 Yadang Chen, Ren Jiang, Yuhui Zheng, Bin Sheng, Zhi-Xin Yang, Enhua Wu
Few-shot semantic segmentation aims to segment novel-class objects in a query image with only a few annotated examples in support images. Although progress has been made recently by combining prototype-based metric learning, existing methods still face two main challenges. First, various intra-class objects between the support and query images or semantically similar inter-class objects can seriously
-
Disentangled Cross-Modal Transformer for RGB-D Salient Object Detection and Beyond IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-14 Hao Chen, Feihong Shen, Ding Ding, Yongjian Deng, Chao Li
Previous multi-modal transformers for RGB-D salient object detection (SOD) generally directly connect all patches from two modalities to model cross-modal correlation and perform multi-modal combination without differentiation, which can lead to confusing and inefficient fusion. Instead, we disentangle the cross-modal complementarity from two views to reduce cross-modal fusion ambiguity: 1) Context
-
Plug-and-Play Image Reconstruction Is a Convergent Regularization Method IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-13 Andrea Ebner, Markus Haltmeier
Non-uniqueness and instability are characteristic features of image reconstruction methods. As a result, it is necessary to develop regularization methods that can be used to compute reliable approximate solutions. A regularization method provides a family of stable reconstructions that converge to a specific solution of the noise-free problem as the noise level tends to zero. The standard regularization
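The plug-and-play idea this abstract analyzes can be sketched as a toy NumPy iteration, where a denoiser replaces the proximal operator in a gradient scheme. This is purely illustrative of the generic PnP template, not the paper's construction: the subsampling operator, the moving-average "denoiser", and all names here are stand-ins.

```python
import numpy as np

rng = np.random.default_rng(1)

def denoise(x):
    # Stand-in denoiser: 3-tap moving average with edge padding.
    p = np.pad(x, 1, mode="edge")
    return (p[:-2] + p[1:-1] + p[2:]) / 3.0

n = 64
A = np.eye(n)[::2]                       # forward operator: keep every other sample
x_true = np.sin(np.linspace(0, 4 * np.pi, n))
y = A @ x_true + 0.05 * rng.standard_normal(A.shape[0])

x = np.zeros(n)
step = 0.9
for _ in range(200):
    grad = A.T @ (A @ x - y)             # data-fidelity gradient step
    x = denoise(x - step * grad)         # PnP: denoiser in place of the prox

residual = np.linalg.norm(A @ x - y)
```

The convergence question the paper studies is exactly about such schemes: under what conditions on the denoiser do the iterates form a stable family of reconstructions that converges to a solution of the noise-free problem as the noise level tends to zero.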
-
UCL-Dehaze: Toward Real-World Image Dehazing via Unsupervised Contrastive Learning IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-09 Yongzhen Wang, Xuefeng Yan, Fu Lee Wang, Haoran Xie, Wenhan Yang, Xiao-Ping Zhang, Jing Qin, Mingqiang Wei
While the wisdom of training an image dehazing model on synthetic hazy data can alleviate the difficulty of collecting real-world hazy/clean image pairs, it brings the well-known domain shift problem. From a different yet new perspective, this paper explores contrastive learning with an adversarial training effort to leverage unpaired real-world hazy and clean images, thus alleviating the domain shift
-
Active Factor Graph Network for Group Activity Recognition IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-09 Zhao Xie, Chang Jiao, Kewei Wu, Dan Guo, Richang Hong
Group activity recognition aims to identify a consistent group activity from different actions performed by respective individuals. Most existing methods focus on learning the interaction between each two individuals (i.e., second-order interaction). In this work, we argue that the second-order interactive relation is insufficient to address this task. We propose a third-order active factor graph network
-
Learning Domain Invariant Prompt for Vision-Language Models IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-09 Cairong Zhao, Yubin Wang, Xinyang Jiang, Yifei Shen, Kaitao Song, Dongsheng Li, Duoqian Miao
Prompt learning stands out as one of the most efficient approaches for adapting powerful vision-language foundational models like CLIP to downstream datasets by tuning learnable prompt vectors with very few samples. However, despite its success in achieving remarkable performance on in-domain data, prompt learning still faces the significant challenge of effectively generalizing to novel classes and
-
Learning Generalizable Models via Disentangling Spurious and Enhancing Potential Correlations IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-08 Na Wang, Lei Qi, Jintao Guo, Yinghuan Shi, Yang Gao
Domain generalization (DG) intends to train a model on multiple source domains to ensure that it can generalize well to an arbitrary unseen target domain. The acquisition of domain-invariant representations is pivotal for DG as they possess the ability to capture the inherent semantic information of the data, mitigate the influence of domain shift, and enhance the generalization capability of the model
-
Hybrid Perturbation Strategy for Semi-Supervised Crowd Counting IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-08 Xin Wang, Yue Zhan, Yang Zhao, Tangwen Yang, Qiuqi Ruan
This paper proposes a simple yet effective semi-supervised method for crowd counting based on consistency regularization, with a hybrid perturbation strategy used to generate strong, diverse perturbations and to enhance information mining from unlabeled images. Conventional CNN-based counting methods are sensitive to texture perturbations and imperceptible noises raised by adversarial attacks, therefore
-
Automatic Quaternion-Domain Color Image Stitching IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-08 Jiaxue Li, Yicong Zhou
Taking advantage of the quaternion representation of the color image, this paper proposes a quaternion perceptual seamline detection model to generate the seamline in the quaternion domain. It considers seamline detection as a quaternion-domain color image labeling problem and minimizes the local-area quaternion perceptual difference cost to obtain the optimal seamline. To assess seamline quality
-
Universal and Scalable Weakly-Supervised Domain Adaptation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-08 Xuan Liu, Ying Huang, Hao Wang, Zheng Xiao, Shigeng Zhang
Domain adaptation leverages labeled data from a source domain to learn an accurate classifier for an unlabeled target domain. Since data collected in practical applications usually contain noise, weakly-supervised domain adaptation, which tolerates source domains with label noise and/or feature noise, has attracted widespread attention from researchers. Several weakly-supervised
-
Bayesian Statistical Analysis for Bacterial Detection in Pulmonary Endomicroscopic Fluorescence Lifetime Imaging IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-02-07 Mehmet Demirel, Bethany Mills, Erin Gaughan, Kevin Dhaliwal, James R. Hopgood
Pneumonia, a respiratory disease often caused by bacterial infection in the distal lung, requires rapid and accurate identification, especially in settings such as critical care. Initiating or de-escalating antimicrobials should ideally be guided by the quantification of pathogenic bacteria for effective treatment. Optical endomicroscopy is an emerging technology with the potential to expedite bacterial