-
Spatiotemporal feature learning for no-reference gaming content video quality assessment J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-03-13 Ngai-Wing Kwong, Yui-Lam Chan, Sik-Ho Tsang, Ziyin Huang, Kin-Man Lam
Recently, over-the-top live gaming content video (GCV) services have significantly contributed to overall internet traffic. Consequently, there is a growing demand for GCV quality assessment (GCVQA) to maintain service quality. Although recent literature has proposed a few GCVQA methods, these mainly focus on extracting spatial features and performing temporal fusion separately, limiting their performance
-
LIIS: Low-light image instance segmentation J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-03-13 Wei Li, Ya Huang, Xinyuan Zhang, Guijin Han
Image features in low-light scenes become hard to distinguish and full of noise, which makes the performance of current popular instance segmentation models drastically degraded. We propose a two-stage approach for instance segmentation of low-light images with enhancement followed by segmentation. Stage-I corresponds to the Low-Light Image Enhancement (LLIE) process. We propose a post-processing Detail
-
Unsupervised single image dehazing — A contour approach J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-03-12 Chintan Dave, Hetal Patel, Ahlad Kumar
Small particles present in the air scatter light and degrade the visual clarity of images. This degradation takes the form of attenuated light intensity and poor contrast, which ultimately affect the image's quality. Thus, image dehazing is a necessity for better visualization and image analysis. The proposed method uses an unsupervised
-
Dense-sparse representation matters: A point-based method for volumetric medical image segmentation J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-03-11 Yun Jiang, Bingxi Liu, Zequn Zhang, Yao Yan, Huanting Guo, Yuhang Li
Deep learning methods utilizing Convolutional Neural Networks (CNNs) and Transformers have achieved remarkable success in volumetric medical image analysis. Despite this success, the symmetrical structure of numerous networks pays insufficient attention to the encoding phase, and the large amount of memory occupied by voxels leads to unnecessary redundancy in the network. In this paper, we present a novel
-
Adapting projection-based LiDAR semantic segmentation to natural domains J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-03-05 Kelian J.L. Massa, Hans Grobler
In this paper, an approach to the semantic segmentation of 3D LiDAR point clouds obtained from natural scenes is introduced. Using a state-of-the-art projection-based semantic segmentation model as the core segmentation network, several recent advances in projection-based 3D semantic segmentation methods are aggregated into a single model. These adaptions include: scan unfolding, soft-kNN post-processing
-
Lightweight Patch-Wise Casformer for dynamic scene deblurring J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-03-04 Ziyi Chen, Guangmang Cui, Zihan Li, Jufeng Zhao
In dynamic scenes, motion blur can often occur, which is non-uniform and can be difficult to remove. Recently, the Transformer has shown excellent performance in various image-related tasks such as classification, recognition, and segmentation. Using a Transformer-based backbone network has also shown potential advantages in image deblurring. However, the computational complexity of Transformers increases
-
Occupancy map-based low complexity motion prediction for video-based point cloud compression J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-03-04 Yihan Wang, Yongfang Wang, Tengyao Cui, Zhijun Fang
This paper proposes an occupancy map-based low complexity motion prediction method for video-based point cloud compression (V-PCC). We propose to utilize the occupancy map, direction gradient, and regional dispersion to divide the attribute maps into static, complex, and common blocks. Then, we propose an early termination method for static blocks, an adaptive motion search range method for complex
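The three-way block classification described above can be sketched as follows; the thresholds, feature summaries, and per-class search ranges are illustrative assumptions for this sketch, not the paper's actual values.

```python
# Hypothetical sketch of occupancy-map-based block classification for V-PCC
# motion prediction. Thresholds and the dispersion measure are assumptions.

def classify_block(occupancy, gradients):
    """Classify an attribute-map block as 'static', 'complex', or 'common'.

    occupancy : list of 0/1 flags for the pixels of the block
    gradients : list of direction-gradient magnitudes for the block
    """
    occ_ratio = sum(occupancy) / len(occupancy)
    dispersion = max(gradients) - min(gradients)  # crude regional dispersion
    if occ_ratio == 0.0:
        return "static"      # unoccupied: motion search terminates early
    if dispersion > 8.0:
        return "complex"     # high variation: wider adaptive search range
    return "common"

def search_range(block_class, base_range=64):
    """Pick a motion-search range per block class (early termination = 0)."""
    return {"static": 0, "complex": base_range, "common": base_range // 2}[block_class]
```

The early-termination idea is that fully unoccupied blocks carry no reconstructed geometry, so skipping their motion search costs nothing in quality.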
-
FFLDGA-Net: Image retrieval method based on Feature Fusion Learnable Descriptor Graph Attention Network J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-03-04 Xiaoyu Hu, Xingmei Wang, Dongmei Yang, Wei Ren, Jinli Wang, Bo Liu
Image retrieval aims to retrieve and return the image in the database that is most similar to the query image. However, the performance of image retrieval models is often hindered by the limited dimensionality of images, which lacks depth information about objects. To address this issue, we propose a novel image retrieval model called FFLDGA-Net (Feature Fusion-based Learnable Descriptor Graph Attention
-
TransGANomaly: Transformer based Generative Adversarial Network for Video Anomaly Detection J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-03-04 Nazia Aslam, Maheshkumar H. Kolekar
Video anomaly detection aims to identify a set of abnormal events in videos. Deep reconstruction and prediction-based models have been employed to detect anomalies. Deep reconstruction models sometimes recreate the abnormal events along with the normal ones. However, the prediction-based approaches have demonstrated encouraging results. This paper presents a video vision transformer (ViViT) based generative
-
Integrating category-related key regions with a dual-stream network for remote sensing scene classification J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-03-02 Fen Xiao, Xiang Li, Wei Li, Junjie Shi, Ningru Zhang, Xieping Gao
Remote sensing image scene classification has made great progress with deep learning. Due to complex backgrounds and the large number of objects with inhomogeneous sizes, remote sensing image scene classification remains challenging. In fact, only a few regions are expected to be representative of the scene. In this paper, we propose a dual-stream framework for remote sensing classification
-
A 4-channelled hazy image input generation and deep learning-based single image dehazing J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-03-01 Pavan Kumar Balla, Arvind Kumar, Rajoo Pandey
Images captured in foggy weather are often degraded, which may affect the object detection process. Hence, in recent times, several schemes have been designed to eliminate the haze effect and enhance the performance of computer vision systems. Although these methods reduce the haze effect, they also produce over-saturation and over-degradation on most occasions. These problems occur due to inadequate utilization
-
Non-iterative reversible information hiding in the secret sharing domain J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-02-28 Feng-Qing Liu, Tao Liu, Bin Yan, Jeng-Shyang Pan, Hong-Mei Yang
Long running time due to multiple iterations is the main drawback of existing information hiding in the sharing domain (IHSD) algorithm. To address this problem, we propose non-iterative reversible information hiding in the secret sharing domain (NIIHSD). We calculate the coefficients of the polynomial that make the least significant bit of the shadow pixel equal to ‘0’ or ‘1’ and store them in two
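For context, IHSD schemes build on (k, n)-threshold polynomial secret sharing over a prime field; a minimal sketch of share (shadow-pixel) generation over GF(251) is below. The paper's actual contribution — choosing the coefficients so each shadow pixel's least significant bit equals a target '0' or '1' without iterating — is not reproduced here.

```python
# Minimal polynomial sharing over GF(251), the setting IHSD schemes operate
# in. This only illustrates share generation; the coefficient selection that
# pins the shadow pixel's LSB is the paper's method and is not shown.

P = 251  # largest prime below 256, common for 8-bit pixel sharing

def make_shares(secret, coeffs, xs):
    """Evaluate f(x) = secret + c1*x + c2*x^2 + ... (mod P) at each x."""
    shares = []
    for x in xs:
        acc, xp = secret % P, 1
        for c in coeffs:
            xp = (xp * x) % P
            acc = (acc + c * xp) % P
        shares.append(acc)
    return shares
```

Any k of the n shares suffice to interpolate the polynomial and recover the secret pixel, which is why the coefficients are free parameters the hiding scheme can exploit.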
-
COC-UFGAN: Underwater image enhancement based on color opponent compensation and dual-subnet underwater fusion generative adversarial network J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-02-28 Zhenkai Liu, Xinxiao Fu, Chi Lin, Haiyong Xu
Due to the complex underwater environment and the attenuation of light, underwater images suffer various distortions such as color loss, low contrast, noise, blur, and haze-like effects, which bring significant challenges to underwater image applications. To alleviate these problems, a novel color opponent compensation and dual-subnet underwater fusion generative adversarial network (COC-UFGAN)
-
Blind image quality assessment with semi-supervised learning J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-02-28 Xiwen Li, Zhihua Wang, Binwei Xu
Blind image quality assessment (BIQA) aims to automatically predict the perceptual quality of an image without requiring access to its pristine reference counterpart. BIQA models are typically developed through supervised learning, optimizing and testing them by comparing their predictions to human ratings, usually expressed as mean opinion scores (MOS), which can be labor-intensive to collect. The
-
Audio-visual saliency prediction for movie viewing in immersive environments: Dataset and benchmarks J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-02-28 Zhao Chen, Kao Zhang, Hao Cai, Xiaoying Ding, Chenxi Jiang, Zhenzhong Chen
In this paper, an eye-tracking dataset of movie viewing in the immersive environment is developed, which contains 256 movie clips with 2K QHD resolution and corresponding movie genre labels from IMDb (Internet Movie Database). The dataset provides the audio-visual clues for studying the human visual attention when watching movie using a VR headset, by recording the eye movements using integrated eye
-
Dual contrastive attention-guided deformable convolutional network for single image super-resolution J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-02-27 Fengjuan Qiao, Yonggui Zhu, Guofang Li, Bin Li
With its powerful ability to model geometric transformations, the deformable convolutional network brings great improvements for single image super-resolution (SISR). Nevertheless, its location-variant sampling method leads to an escalation in spatial variance as the deformable convolutional layers are stacked, consequently resulting in limited performance. Hence, we propose a novel and effective approach
-
AI-assisted deepfake detection using adaptive blind image watermarking J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-02-24 Ling-Yuan Hsu
This paper proposes a new adaptive blind watermarking technology for deepfake detection, which can embed deepfake detection information into the image and verify the image's authenticity without requiring additional information. The proposed scheme utilizes mixed modulation combined with partly sign-altered mean value to embed a set of coefficients that enhance robustness against attacks while maintaining
-
AMP-BCS: AMP-based image block compressed sensing with permutation of sparsified DCT coefficients J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-02-19 Junhui Li, Xingsong Hou, Huake Wang, Shuhao Bi, Xueming Qian
Block compressive sensing (BCS), an emerging approach for signal acquisition and reconstruction, combines high-speed sampling and compression, making it widely applicable in various imaging tasks. However, image BCS generally faces the following issues: challenges in accurate sampling rate allocation (SRA) and block artifact removal, and poor reconstruction algorithms. In this paper, we propose an approximate
-
LFSimCC: Spatial fusion lightweight network for human pose estimation J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-02-16 Qian Zheng, Hualing Guo, Yunhua Yin, Bin Zheng, Hongxu Jiang
To address the limitations of existing 2D human pose estimation methods in terms of speed and lightweight, we propose a method called Lightweight Fusion SimCC (LFSimCC). LFSimCC incorporates two modules: LiteFNet, which enhances multi-scale spatial information fusion, and LKC-GAU, which improves the modeling capability of spatial information. Specifically, LiteFNet utilizes a combination of self-attention
-
Learning dual attention enhancement feature for visible–infrared person re-identification J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-02-15 Guoqing Zhang, Yinyin Zhang, Hongwei Zhang, Yuhao Chen, Yuhui Zheng
Most previous visible–infrared person re-identification methods emphasized learning modality-shared features to narrow the modality differences, while neglecting the benefits of modality-specific features for feature embedding and narrowing the modality gap. To tackle this issue, our paper designs a method based on dual attention enhancement features to use shallow and deep features simultaneously
-
Multi frame multi-head attention learning on deep features for recognizing Indian classical dance poses J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-02-15 Anil Kumar D., Kishore P.V.V., Chaithanya T.R., Sravani K.
The aim of this work is to develop a classifier for the Indian classical dance (ICD) online Bharatanatyam videos using deep learning techniques. Bharatanatyam is the most ancient and popular of all eight types of Indian classical dance forms recognized by the government of India. Many ICD enthusiasts struggle to synchronize their minds to the lyrics and complex dance poses during live performances
-
Multiple correlation filters with gaussian constraint for fast online tracking J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-02-12 Jianyi Liu, Xingxing Huang, Xinyu Shu, Xudong Dong
Correlation filter based online tracking methods can usually achieve high real-time performance by leveraging the well-known FFT. However, they are also apt to generate “corrupted training samples” in scenarios with a complex background, which trigger model drift and rapidly deteriorate tracking accuracy. Existing methods usually consider this problem from a certain aspect
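The real-time performance this abstract attributes to the FFT comes from evaluating a circular cross-correlation whose response peaks at the target's displacement. A toy, naive O(n²) sketch of that principle (real trackers compute the same product in the Fourier domain, which is where the speedup comes from; the signals below are made up):

```python
# Toy illustration of why correlation filters localize a target: the
# circular cross-correlation response peaks at the target's displacement.
# This naive loop is only a sketch; FFT-based trackers compute it as an
# element-wise product of spectra.

def circular_xcorr(template, signal):
    n = len(signal)
    return [sum(template[i] * signal[(i + s) % n] for i in range(n))
            for s in range(n)]

template = [0, 1, 3, 1, 0, 0, 0, 0]   # appearance model
signal   = [0, 0, 0, 0, 1, 3, 1, 0]   # same pattern, shifted by 3
response = circular_xcorr(template, signal)
shift = max(range(len(response)), key=response.__getitem__)  # argmax = 3
```

The "corrupted training samples" problem the abstract mentions arises when background clutter enters the template update, flattening or displacing this response peak.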
-
Double branch synergies with modal reinforcement for weakly supervised temporal action detection J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-02-12 Chuanxu Wang, Jing Wang, Wenting Xu
Weakly supervised Temporal Action localization (WTAL) aims to locate the action instances and identify their corresponding labels. Most current methods rely on a Multi-Instance Learning (MIL) framework to predict start and end boundaries of each action in a video. However, they have shortcomings of incomplete positioning and context confusion. Therefore, we propose an algorithm of Double Branch Synergies
-
Multiple object tracking with segmentation and interactive multiple model J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-02-07 Ke Qi, Wenhao Xu, Wenbin Chen, Xi Tao, Peijia Chen
Multiple object tracking (MOT) is a sophisticated computer vision task that aims to detect and track the trajectories of all objects within a given scene. MOT necessitates establishing unique identifiers for each object in the scene. Currently, the majority of MOT works adopt tracking-by-detection, using re-identification techniques to associate objects based on appearance or motion features
-
Copy Move Forgery detection and localisation robust to rotation using block based Discrete Cosine Transform and eigenvalues J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-02-07 A.U. Shehin, Deepa Sankar
The contemporary era faces a widespread issue with digital image forgery, posing a significant challenge due to its ease and the broad reach enabled by high-speed internet. This manipulation of images carries substantial socio-political implications globally. Hence, robust digital image forensic methods are critical for detecting such forgeries. This article presents an innovative algorithm specifically
-
Correlation-attention guided regression network for efficient crowd counting J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-02-07 Xin Zeng, Huake Wang, Qiang Guo, Yunpeng Wu
As a valuable component of intelligent video surveillance, crowd counting has received lots of attention. In practice, however, crowd counting always suffers from the problem of the scale change of pedestrians. To mitigate this limitation, we propose a novel correlation-attention guided regression network to estimate the number of people, termed CGR-Net. To make the generation process of spatial attention
-
MRN-LOD: Multi-exposure Refinement Network for Low-light Object Detection J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-02-06 Kavinder Singh, Anil Singh Parihar
Low-light conditions present a myriad of intricacies for object detection, with many existing methods relying primarily on image enhancement before detection. Sometimes, the enhancement methods are unable to improve the detection performance in low-light conditions. In this paper, we present a new Multi-exposure refinement network for low-light object detection (MRN-LOD) to avoid the need for enhancement
-
Context-aided unicity matching for person re-identification J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-02-05 Min Cao, Cong Ding, Chen Chen, Silong Peng
Most existing person re-identification methods compute the matching relations between person images based on similarity ranking. This lacks a global viewpoint and context consideration, inevitably leading to ambiguous matching results and sub-optimal performance. Based on the natural assumption that images belonging to the same identity should not match with images belonging to different identities
-
Blind quality assessment of light field image based on view and focus stacks J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-02-05 Fucui Li, Mengmeng Ye, Feng Shao
Light field imaging can capture rich information of real scenes, but various distortions will inevitably be introduced in the process of light field image processing. Therefore, it is very important to effectively evaluate the quality of light field images. Due to the lack of commercial light field displays, this paper proposes a blind quality assessment method of light field image based on view and
-
Methods for countering attacks on image watermarking schemes: Overview J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-02-02 Anna Melman, Oleg Evsutin
Image watermarking is an effective and promising technology. Robust watermarks that are resistant to various attacks allow authors and owners of digital images to protect their rights to digital content, control its distribution and confirm its authenticity. Most of the modern algorithms for robust image watermarking aim to achieve resistance to a large number of different attacks. However, some authors
-
An active contour model based on Jeffreys divergence and clustering technology for image segmentation J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-02-01 Pengqiang Ge, Yiyang Chen, Guina Wang, Guirong Weng
Classic active contour models (ACMs) generally use Euclidean distance to measure the gap between the true image and the fitted one, which may cause issues such as edge leakage and falling into false boundaries. In addition, some existing ACMs are sensitive to noise and to different initial contours. To resolve these problems, this study proposes an ACM based on Jeffreys divergence (KJD) and clustering technique
-
A channel-wise contextual module for learned intra video compression J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-02-01 Yanrui Zhan, Shuhua Xiong, Xiaohai He, Bowen Tang, Honggang Chen
In the multimedia era, exploding volumes of image and video data highlight the importance of video compression for storage and transmission. The All-Intra structure is a coding mode in HEVC and VVC in which each frame is encoded using intra coding; in this paper, learned All-Intra coding is explored on the basis of research into learned image compression. A channel-wise contextual module based on channel
-
Multi-image super-resolution based low complexity deep network for image compressive sensing reconstruction J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-30 Qiming Xiong, Zhirong Gao, Jiayi Ma, Yong Ma
-
Low-complexity [formula omitted]-compression of light field images with a deep-decompression stage J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-30 M. Umair Mukati, Xi Zhang, Xiaolin Wu, Søren Forchhammer
To enrich the functionalities of traditional cameras, light field cameras record both the intensity and direction of light rays, so that images can be rendered with user-defined camera parameters via computations. The added capability and flexibility are gained at the cost of gathering typically more than 100 perspectives of the same scene, resulting in large data volume. To cope with this issue, several
-
UnifiedTT: Visual tracking with unified transformer J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-29 Peng Yu, Zhuolei Duan, Sujie Guan, Min Li, Shaobo Deng
Target tracking is an important research task in computer vision. Existing tracking algorithms based on Siamese networks often suffer from the problem of information redundancy between adjacent frames and a lack of ability to capture global dependencies. When similar backgrounds appear around the target, the tracking performance usually significantly decreases. Although target tracking algorithms based
-
CTHD-Net: CNN-Transformer hybrid dehazing network via residual global attention and gated boosting strategy J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-29 Haiyan Li, Renchao Qiao, Pengfei Yu, Haijiang Li, Mingchuan Tan
Single image dehazing is one of crucial tasks in the field of computer vision. However, existing methods are challenged on how to handle unevenly distributed haze, capture global contextual information, and filter noise while preserving details. To overcome these limitations, a novel dehazing network with residual global attention and gated boosting strategy based on a CNN-Transformer hybrid architecture
-
Coarse-to-fine underwater image enhancement with lightweight CNN and attention-based refinement J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-28 Ali Khandouzi, Mehdi Ezoji
-
Deep-MDS framework for recovering the 3D shape of 2D landmarks from a single image J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-26 Shima Kamyab, Zohreh Azimifar
Using 3D reconstruction techniques within computer vision frameworks can result in more robust and accurate solutions. However, the main challenge lies in the high computation and memory resources required by such methods. To reduce the complexity of these frameworks, a practical solution is to use geometric landmarks instead of the entire image. Therefore, in this paper the problem of 3D shape recovery
-
Learning informative and discriminative semantic features for robust facial expression recognition J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-23 Yumei Tan, Haiying Xia, Shuxiang Song
Facial expression recognition (FER) becomes challenging in real-world scenarios, which requires learning informative and discriminative features from challenging datasets to obtain robust facial expression recognition. In this paper, we propose an Informative and Discriminative Semantic Features Learning (IDSFL) network for FER against occlusion and head pose in the wild. Specifically, IDSFL aims to
-
Self-supervised learning monocular depth estimation from internet photos J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-26 Xiaocan Lin, Nan Li
Monocular depth estimation (MDE) is a fundamental problem in computer vision. Recently, self-supervised learning (SSL) approaches have attracted significant attention due to their ability to train an MDE network without ground-truth depth data. However, the performance of most existing SSL-MDE methods is still limited by the available real training datasets, which are either binocular stereo pairs or monocular
-
DRC: Chromatic aberration intensity priors for underwater image enhancement J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-23 Qian Liu, Zongxin He, Dehuan Zhang, Weishi Zhang, Zifan Lin, Ferdous Sohel
Underwater imaging technology is a crucial tool for monitoring marine flora and fauna. However, selective light absorption and scattering properties of water make underwater imagery frequently appear blurred and exhibit color biases, hindering the extraction of vital aquacultural insights. To address this challenge, we propose a method, namely DRC, which is a holistic approach to enhancing underwater
-
Context-based modeling for accurate logo detection in complex environments J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-17 Zhixiang Jia, Sujuan Hou, Peng Li
Logo detection involves the tasks of locating and classifying logo objects in images and videos, and has been widely applied in the real world. However, most existing approaches rely on general object detection strategies that do not fully utilize the unique characteristics of logos. This can lead to sub-optimal performance in complex environments, especially when logos are small or have varying sizes
-
Exploring a context-gated network for effective image deraining J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-17 Tianyu Song, Pengpeng Li, Shumin Fan, Jiyu Jin, Guiyue Jin, Lei Fan
The existing deraining methods have obtained noteworthy improvements, but it is a challenging problem to extend the methods for complicated rain conditions where rain streaks exhibit different distribution densities, sizes, shapes, etc. The main challenges are the ability to fully explore and utilize the multi-scale context information of rain streaks that maintain both global structure completeness
-
Learning a Holistic-Specific color transformer with Couple Contrastive constraints for underwater image enhancement and beyond J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-17 Debin Wei, Hongji Xie, Zengxi Zhang, Tiantian Yan
Underwater images suffer from different types of degradation due to medium characteristics and interfere with underwater tasks. While deep learning methods based on the Convolutional Neural Network (CNN) excel at detection tasks, they have inherent limitations when it comes to handling long-range dependencies. The enhanced images generated by these methods often have problems such as color cast, artificial
-
Texture-aware and color-consistent learning for underwater image enhancement J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-12 Shuteng Hu, Zheng Cheng, Guodong Fan, Min Gan, C.L. Philip Chen
Texture and color are pivotal factors for evaluating the quality of underwater image enhancement. However, current methods for enhancing underwater images still exhibit deficiencies in the restoration of texture and color. Diverging from previous approaches, we conduct heuristic modeling specifically targeting color and texture, culminating in the proposal of a texture-aware and color-consistent network
-
Surveillance video synopsis framework base on tube set J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-11 Yunzuo Zhang, Pengfei Zhu, Tingting Zheng, Puze Yu, Jianming Wang
Video synopsis technology can shorten the length of the video, which has attracted wide attention. However, due to the limitation of object extraction technology and the difficulty of preserving interactivity, synopsis videos will lose the semantic information of the original video. To address the above problems, we propose a video synopsis framework based on tube sets. Firstly, we propose a video
-
Fast HEVC inter-frame coding based on LSTM neural network technology J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-10 Chang Liu
High Efficiency Video Coding (HEVC) is the most commonly used video coding standard. However, its high coding complexity is a heavy burden for real-time video applications, and coding tools designed within traditional coding frameworks have reached their limits. Furthermore, existing low-complexity video coding methods have not thoroughly analyzed the characteristics of compressed video, making it impossible
-
Subspace learning machine (SLM): Methodology and performance evaluation J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-11 Hongyu Fu, Yijing Yang, Vinod K. Mishra, C.-C. Jay Kuo
Inspired by the feedforward multilayer perceptron (FF-MLP), decision tree (DT) and extreme learning machine (ELM), a new classification model, called the subspace learning machine (SLM), is proposed in this work. SLM first identifies a discriminant subspace, S0, by examining the discriminant power of each input feature. Then, it uses probabilistic projections of features in S0 to yield 1D subspaces
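The first SLM step — ranking input features by discriminant power to form the subspace S0 — can be sketched with a Fisher-style score. The scoring function below is an assumption chosen for illustration, not the paper's exact measure, and the two-class setting is a simplification.

```python
# Hedged sketch of selecting a discriminant subspace S0: score each feature
# by a Fisher-style ratio of between-class to within-class variance (an
# illustrative assumption) and keep the top-k feature indices.

def fisher_score(feature_a, feature_b):
    """Between-class over within-class spread for one feature, two classes."""
    ma = sum(feature_a) / len(feature_a)
    mb = sum(feature_b) / len(feature_b)
    va = sum((v - ma) ** 2 for v in feature_a) / len(feature_a)
    vb = sum((v - mb) ** 2 for v in feature_b) / len(feature_b)
    return (ma - mb) ** 2 / (va + vb + 1e-12)

def select_subspace(class_a, class_b, k):
    """class_a / class_b: lists of samples, each a list of feature values.
    Return the indices of the k most discriminant features (a basis for S0)."""
    d = len(class_a[0])
    scores = [fisher_score([s[j] for s in class_a], [s[j] for s in class_b])
              for j in range(d)]
    return sorted(range(d), key=scores.__getitem__, reverse=True)[:k]
```

Once S0 is fixed, the abstract's next step (probabilistic projections of features in S0 onto 1D subspaces) operates only on the selected coordinates.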
-
A domain generalized person re-identification algorithm based on meta-bond domain alignment J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-12 Baohua Zhang, Dongyang Wu, Xiaoqi Lu, Yongxiang Li, Yu Gu, Jianjun Li, Jingyu Wang
The Domain Generalization (DG) model is an important tool to improve the robustness of person re-identification algorithms, but the domain gap makes it difficult to transfer knowledge across domains effectively. To solve the above problems, this paper proposes a generalization model based on Meta-Bond Domain Alignment (M-BDA). To learn a generalizable model, a meta-learning strategy is introduced
-
Contextual recovery network for low-light image enhancement with texture recovery J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-11 Zhen Wang, Xiaohuan Zhang
Low-light image enhancement has been a challenging topic in computer vision. In order to recover colors and detailed textures in images, several data-driven enhancement based methods have been developed and obtained encouraging results. However, the network generalization ability is not satisfactory due to the uncertainty of the collected data. In order to address this issue, we propose a network with
-
Human skin detection: An unsupervised machine learning way J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-10 ABM Rezbaul Islam, Ali Alammari, Bill Buckles
Researchers have searched for decades for an efficient skin detection method. However, current methods have not overcome the significant challenges of skin detection, such as variation in illumination, the various skin tones of different ethnic groups, and many others. This research proposes a clustering and region-growing-based skin detection method to overcome these limitations. Together
-
SICNet: Learning selective inter-slice context via Mask-Guided Self-knowledge distillation for NPC segmentation J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-10 Jinhong Zhang, Bin Li, Qianhui Qiu, Hongqiang Mo, Lianfang Tian
Accurate segmentation of nasopharyngeal carcinoma (NPC) in magnetic resonance (MR) images is crucial for radiotherapy planning. However, vanilla 2D/3D deep convolutional networks fail to gain satisfying NPC segmentation results, since 3D methods suffer from inter-slice discontinuities and 2D methods lack inter-slice context learning. To address this problem, this paper proposes a 2.5D learning Selective
-
MFCANet: A road scene segmentation network based on Multi-Scale feature fusion and context information aggregation J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-10 Yunfeng Wang, Yi Zhou, Hao Wu, Xiyu Liu, Xiaodi Zhai, Kuizhi Sun, Chengliang Tian, Haixia Zhao, Tao Li, Wenguang Jia, Yan Zhang
Road scene segmentation is the basic task of autonomous driving. Recent representative scene segmentation methods adopt the full convolutional network based on the encoder-decoder. However, the framework can cause the loss of image fine-grained information in the process of down-sampling, feature extraction and feature fusion, resulting in blurred boundary details and chaotic segmentation effect. In
-
A hierarchical probabilistic underwater image enhancement model with reinforcement tuning J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-10 Wei Song, Zhihao Shen, Minghua Zhang, Yan Wang, Antonio Liotta
Underwater Image Enhancement (UIE) is a challenging problem due to the complex underwater environment. Traditional UIE methods can hardly adapt to various underwater environments. Deep learning-based UIE methods are more powerful but often rely on a great deal of real-world underwater images with distortion-free reference images. This gives rise to two issues: First, the reference images are highly
-
Neighbor2Global: Self-supervised image denoising for Poisson-Gaussian noise J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-05 Qiuqiu Chen, Yuanxiu Xing, Linlin Song
-
Subdomain alignment based open-set domain adaptation image classification J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-03 Kangkang Ji, Qingliang Zhang, Songhao Zhu
Domain adaptation has achieved great success in using labeled source domain samples to identify unlabeled target domain samples. Here, we aim to solve open-set domain adaptation, which differs from closed-set domain adaptation in that the target domain contains categories that do not appear in the source domain. To solve this problem, this paper proposes an open-set domain adaptation model
-
Style Elimination and Information Restitution for generalizable person re-identification J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-04 Qian Zhao, Wentao Yu, Tangyu Ji
Domain generalizable person re-identification (DG ReID) aims to obtain a model that can be applied directly to unseen domains once trained on a set of source domains (datasets collected from different camera networks). Among current DG ReID methods, instance normalization is a promising solution to reduce the effect of domain bias, but it inevitably filters out some discriminative information. Besides
-
A survey on just noticeable distortion estimation and its applications in video coding J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-04 Guoxiang Wang, Hongkui Wang, Hui Li, Li Yu, Haibing Yin, Haifeng Xu, Zhen Ye, Junfeng Song
With the ongoing explosion in video data delivery, perceptual video coding (PVC) plays an increasingly significant role in video compression. The just noticeable distortion (JND) directly reflects the tolerance limit of the human visual system (HVS) to coding distortion, making JND-based PVC the most important branch of video coding. This paper provides an extensive overview of JND estimation
-
REQA: Coarse-to-fine assessment of image quality to alleviate the range effect J. Visual Commun. Image Represent. (IF 2.6) Pub Date : 2024-01-03 Bingheng Li, Fushuo Huo
Blind image quality assessment (BIQA) of User Generated Content (UGC) suffers from the range effect, which indicates that on the overall quality range, mean opinion score (MOS) and predicted MOS (pMOS) are well correlated, but when focusing on a particular narrow range, the correlation is lower. To tackle this problem, a novel method is proposed from coarse-grained metric to fine-grained prediction
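The range effect this abstract describes can be demonstrated with synthetic numbers: predictions that track MOS well over the full quality range can correlate much more weakly once restricted to a narrow MOS band. The scores below are made-up illustrative data, not from the paper.

```python
# Synthetic demonstration of the range effect: Pearson's r between MOS and
# predicted MOS (pMOS) is high over the full range but drops on a narrow band.

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

mos  = [1, 2, 3, 4, 5, 6, 7, 8]
pmos = [1.1, 2.3, 3.6, 3.4, 4.5, 6.3, 7.2, 7.9]   # tracks MOS globally
full = pearson(mos, pmos)            # high correlation on the full range
narrow = pearson(mos[2:5], pmos[2:5])  # noticeably lower on a narrow band
```

A coarse-to-fine assessor addresses exactly this gap: the coarse stage places an image on the global scale, and the fine stage refines the score within the narrow band.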