-
Dyhand: dynamic hand gesture recognition using BiLSTM and soft attention methods Vis. Comput. (IF 3.5) Pub Date : 2024-03-18
Abstract Hand gesture recognition is an essential task in computer vision. It is the most intuitive and natural medium for communication when dealing with computers. Recently, with the advent of innovative technologies and high-performing computer systems, there has been a surge in research on gesture recognition. Traditional approaches to modelling skeletons are typically based on hand-crafted
-
MakeupDiffuse: a double image-controlled diffusion model for exquisite makeup transfer Vis. Comput. (IF 3.5) Pub Date : 2024-03-18
Abstract Makeup transfer is a challenging task, involving the transfer of a reference makeup style onto the source face while preserving the original appearance. Current GAN-based methods, representing makeup styles through reduced-dimensional matrices, often generate smooth high-frequency attributes and imprecise images. Additionally, these models are difficult to train and prone to model collapse
-
Self-supervised facial expression recognition with fine-grained feature selection Vis. Comput. (IF 3.5) Pub Date : 2024-03-17 Heng-Yu An, Rui-Sheng Jia
-
Refined dense face alignment through image matching Vis. Comput. (IF 3.5) Pub Date : 2024-03-17 Chunlu Li, Feipeng Da
-
LSDNet: lightweight stochastic depth network for human pose estimation Vis. Comput. (IF 3.5) Pub Date : 2024-03-16 Hengrui Zhang, Yongfeng Qi, Huili Chen, Panpan Cao, Anye Liang, Shengcong Wen
-
A lightweight multi-granularity asymmetric motion mode video frame prediction algorithm Vis. Comput. (IF 3.5) Pub Date : 2024-03-16 Jie Yan, Guihe Qin, Minghui Sun, Yanhua Liang, Zhonghan Zhang, Yinghui Xu
-
Multi-camera tracking of mechanically thrown objects for automated in-plant logistics by cognitive robots in Industry 4.0 Vis. Comput. (IF 3.5) Pub Date : 2024-03-16 Nauman Qadeer, Jamal Hussain Shah, Muhammad Sharif, Fadl Dahan, Fahad Ahmed Khokhar, Rubina Ghazal
-
Robust extrinsic symmetry estimation in 3D point clouds Vis. Comput. (IF 3.5) Pub Date : 2024-03-15 Rajendra Nagar
-
SLOD2+WIN: semantics-aware addition and LoD of 3D window details for LoD2 CityGML models with textures Vis. Comput. (IF 3.5) Pub Date : 2024-03-15 Xingzi Zhang, Kan Chen, Henry Johan, Marius Erdt
-
HAPiCLR: heuristic attention pixel-level contrastive loss representation learning for self-supervised pretraining Vis. Comput. (IF 3.5) Pub Date : 2024-03-15
Abstract Recent self-supervised contrastive learning methods are powerful and efficient for robust representation learning, pulling semantic features from different cropping views of the same image while pushing other features away from other images in the embedding vector space. However, model training for contrastive learning is quite inefficient. In the high-dimensional vector space of the images
-
Image deblocking algorithm based on GC and SSR Vis. Comput. (IF 3.5) Pub Date : 2024-03-13 Zhe Li, Hui Lv, Libo Cheng, Xiaoning Jia
-
Spectral normalization and dual contrastive regularization for image-to-image translation Vis. Comput. (IF 3.5) Pub Date : 2024-03-13
Abstract Existing image-to-image (I2I) translation methods achieve state-of-the-art performance by incorporating the patch-wise contrastive learning into generative adversarial networks. However, patch-wise contrastive learning only focuses on the local content similarity but neglects the global structure constraint, which affects the quality of the generated images. In this paper, we propose a new
-
CurveML: a benchmark for evaluating and training learning-based methods of classification, recognition, and fitting of plane curves Vis. Comput. (IF 3.5) Pub Date : 2024-03-12
Abstract We propose CurveML, a benchmark for evaluating and comparing methods for the classification and identification of plane curves represented as point sets. The dataset is composed of 520k curves, of which 280k are generated from specific families characterised by distinctive shapes, and 240k are obtained from Bézier or composite Bézier curves. The dataset was generated starting from the parametric
-
Ghost-Unet: multi-stage network for image deblurring via lightweight subnet learning Vis. Comput. (IF 3.5) Pub Date : 2024-03-10 Ziliang Feng, Ju Zhang, Xusong Ran, Donglu Li, Chengfang Zhang
-
A multi-target cow face detection model in complex scenes Vis. Comput. (IF 3.5) Pub Date : 2024-03-08
Abstract The development of intelligent agriculture has accelerated the automation and scale of cattle farming. Recognizing cattle faces as a significant biological characteristic is crucial for accurate reproduction and health tracking. We propose a multi-target cow face detection model (MT-CF-DM) in complex scenes based on the YOLOv7 framework. The backbone of our proposed model consists of the GhostNet
-
A video compression-cum-classification network for classification from compressed video streams Vis. Comput. (IF 3.5) Pub Date : 2024-03-08
Abstract Video analytics can achieve increased speed and efficiency by operating directly on the compressed video format, thereby alleviating the decoding burden on the analytics server. The encoded video streams are rich in semantic binary information and this information can be utilized more efficiently to train the classifiers. Motivated by the same notion, a deep learning-based video compressi
-
Memory-based gradient-guided progressive propagation network for video deblurring Vis. Comput. (IF 3.5) Pub Date : 2024-03-06
Abstract Video deblurring is a challenging visual task because it requires handling temporal correlations among frames and dealing with various sources of uncertainty in motion blur. To tackle these challenges and effectively capture the spatiotemporal relationships in video sequences, we propose a memory-based gradient-guided progressive propagation network. Our network combines the memory and deblurring
-
TripleFormer: improving transformer-based image classification method using multiple self-attention inputs Vis. Comput. (IF 3.5) Pub Date : 2024-03-01 Yu Gong, Peng Wu, Renjie Xu, Xiaoming Zhang, Tao Wang, Xuan Li
-
Channel and spatial attention-guided network for deep high dynamic range imaging with large motions Vis. Comput. (IF 3.5) Pub Date : 2024-03-01 Pingwei Zhang, Wenbiao Zhou, Luyao Fan
-
Transferable adversarial sample purification by expanding the purification space of diffusion models Vis. Comput. (IF 3.5) Pub Date : 2024-02-13 Jun Ji, Song Gao, Wei Zhou
-
Efficient odd–even multigrid for pointwise incompressible fluid simulation on GPU Vis. Comput. (IF 3.5) Pub Date : 2024-02-13 Luan Lyu, Wei Cao, Xiaohua Ren, Enhua Wu, Zhi-Xin Yang
-
CAF-AHGCN: context-aware attention fusion adaptive hypergraph convolutional network for human-interpretable prediction of gigapixel whole-slide image Vis. Comput. (IF 3.5) Pub Date : 2024-02-13 Meiyan Liang, Xing Jiang, Jie Cao, Bo Li, Lin Wang, Qinghui Chen, Cunlin Zhang, Yuejin Zhao
-
Grownbb: Gromov–Wasserstein learning of neural best buddies for cross-domain correspondence Vis. Comput. (IF 3.5) Pub Date : 2024-02-12 Ruolan Tang, Weiwei Wang, Yu Han, Xiangchu Feng
-
StairNetV3: depth-aware stair modeling using deep learning Vis. Comput. (IF 3.5) Pub Date : 2024-02-12 Chen Wang, Zhongcai Pei, Shuang Qiu, Yachun Wang, Zhiyong Tang
-
Improved biharmonic kernel signature for 3D non-rigid shape matching and retrieval Vis. Comput. (IF 3.5) Pub Date : 2024-02-12 Yuhuan Yan, Mingquan Zhou, Dan Zhang, Shengling Geng
-
MaCo: efficient unsupervised low-light image enhancement via illumination-based magnitude control Vis. Comput. (IF 3.5) Pub Date : 2024-02-09 Yiqi Shi, Duo Liu, Liguo Zhang, Xuezhi Xia, Jianguo Sun
-
Masked cross-attention and multi-head channel attention guiding single-stage generative adversarial networks for text-to-image generation Vis. Comput. (IF 3.5) Pub Date : 2024-02-09
Abstract Although the text-to-image model aims to generate realistic images that correspond to the text description, generating high-quality, accurate images remains a significant challenge. Most existing text-to-image methods are implemented through a two-stage stacking model, where the generation process is initiated by creating an initial image with a basic outline and subsequently refined to
-
A novel robust digital image watermarking scheme based on attention U-Net++ structure Vis. Comput. (IF 3.5) Pub Date : 2024-02-08
Abstract With the advancement of the internet, digital image watermarking techniques have found widespread application across various domains, including copyright protection and information security. However, traditional digital image watermarking techniques are susceptible to geometric distortions due to their limited feature extraction capabilities and reliance on manually designed watermark embedding
-
Improved image dehazing model with color correction transform-based dark channel prior Vis. Comput. (IF 3.5) Pub Date : 2024-02-08 Jeena Thomas, Ebin Deni Raj
-
Parallel multi-image encryption based on cross-plane DNA manipulation and a novel 2D chaotic system Vis. Comput. (IF 3.5) Pub Date : 2024-02-07
Abstract In this paper, we propose a novel parallel multi-image encryption algorithm based on cross-plane DNA operations. Firstly, a two-dimensional chaotic system, 2D-SCIM, is constructed. Secondly, for a set of images, whether they are color images, grayscale images, or their combinations, we perform bit-plane decomposition according to the channels without limitations on quantity and arrangement
-
Attribute-guided face adversarial example generation Vis. Comput. (IF 3.5) Pub Date : 2024-02-07 Yan Gan, Xinyao Xiao, Tao Xiang
-
Image inpainting based on fusion structure information and pixelwise attention Vis. Comput. (IF 3.5) Pub Date : 2024-02-07 Dan Wu, Jixiang Cheng, Zhidan Li, Zhou Chen
-
Residual deep gated recurrent unit-based attention framework for human activity recognition by exploiting dilated features Vis. Comput. (IF 3.5) Pub Date : 2024-02-06 Ajeet Pandey, Piyush Kumar
-
A human activity recognition framework in videos using segmented human subject focus Vis. Comput. (IF 3.5) Pub Date : 2024-02-06 Shaurya Gupta, Dinesh Kumar Vishwakarma, Nitin Kumar Puri
-
A deep learning-based steganography method for high dynamic range images Vis. Comput. (IF 3.5) Pub Date : 2024-02-06 Yongqing Huo, Yan Qiao, Yaohui Liu
-
PCCFormer: Parallel coupled convolutional transformer for image super-resolution Vis. Comput. (IF 3.5) Pub Date : 2024-02-05 Bowen Hou, Gongyan Li
-
Multi-scale gradient wavelet-based image quality assessment Vis. Comput. (IF 3.5) Pub Date : 2024-02-05 Mobina Mobini, Mohammad Reza Faraji
-
Jointly modeling association and motion cues for robust infrared UAV tracking Vis. Comput. (IF 3.5) Pub Date : 2024-02-03
Abstract UAV tracking plays a crucial role in computer vision by enabling real-time monitoring of UAVs, enhancing safety and operational capabilities while expanding the potential applications of drone technology. Off-the-shelf deep learning-based trackers have not been able to effectively address challenges such as occlusion, complex motion, and background clutter for UAV objects in infrared modality
-
Graphical representation of data prediction potential: correlation graphs and correlation chains Vis. Comput. (IF 3.5) Pub Date : 2024-01-23 Adam Dudáš
-
A point cloud self-learning network based on contrastive learning for classification and segmentation Vis. Comput. (IF 3.5) Pub Date : 2024-01-23 Haoran Zhou, Wenju Wang, Gang Chen, Xiaolin Wang
-
Comparing dimensionality reduction techniques for visual analysis of the LSTM hidden activity on multi-dimensional time series modeling Vis. Comput. (IF 3.5) Pub Date : 2024-01-22 Lianen Ji, Shirong Qiu, Zhi Xu, Yue Liu, Guang Yang
-
ISA-GAN: inception-based self-attentive encoder–decoder network for face synthesis using delineated facial images Vis. Comput. (IF 3.5) Pub Date : 2024-01-22 Nand Kumar Yadav, Satish Kumar Singh, Shiv Ram Dubey
-
An improved iterative closest point algorithm based on the particle filter and K-means clustering for fine model matching Vis. Comput. (IF 3.5) Pub Date : 2024-01-22 Ahmad Reza Saleh, Hamid Reza Momeni
-
SMC-SRGAN-Lightning super-resolution algorithm based on optical micro-scanning thermal microscope image Vis. Comput. (IF 3.5) Pub Date : 2024-01-22 Meijing Gao, Yang Bai, Yunjia Xie, Bozhi Zhang, Shiyu Li, Zhilong Li
-
Outfit compatibility model using fully connected self-adjusting graph neural network Vis. Comput. (IF 3.5) Pub Date : 2024-01-22 Hong Liu, Li Li, Neng Yu, Kai Ma, Tao Peng, Xinrong Hu
-
A smart video analytical framework for sarcasm detection using novel adaptive fusion network and SarcasNet-99 model Vis. Comput. (IF 3.5) Pub Date : 2024-01-22 Jamuna S. Murthy, G. M. Siddesh
-
An efficient parallel fusion structure of distilled and transformer-enhanced modules for lightweight image super-resolution Vis. Comput. (IF 3.5) Pub Date : 2024-01-22 Guanqiang Wang, Mingsong Chen, Yongcheng Lin, Xianhua Tan, Chizhou Zhang, Wenxin Yao, Baihui Gao, Weidong Zeng
-
Improving cache placement for efficient cache-based rendering Vis. Comput. (IF 3.5) Pub Date : 2024-01-21 Yu-Ting Wu, I-Chao Shen
-
Defocus blur detection via adaptive cross-level feature fusion and refinement Vis. Comput. (IF 3.5) Pub Date : 2024-01-20 Zijian Zhao, Hang Yang, Peiyu Liu, Haitao Nie, Zhongbo Zhang, Chunyu Li
-
Regularity-constrained point cloud reconstruction of building models via global alignment Vis. Comput. (IF 3.5) Pub Date : 2024-01-20 Hang Yu, Juan Cao, Xiangrong Liu, Zhonggui Chen
-
Hearing with the eyes: modulating lyrics typography for music visualization Vis. Comput. (IF 3.5) Pub Date : 2024-01-19
Abstract In human–computer interaction (HCI), typography was initially used for visual communication, which enhanced visual interest in graphic design. The investigation of how to modulate visual elements (e.g., typography) to visualize sound (e.g., voice) has received substantial attention. Musical lyrics typography is a commonly used form of visual communication. However, the mapping of musical features
-
Pupil localization algorithm based on lightweight convolutional neural network Vis. Comput. (IF 3.5) Pub Date : 2024-01-18
Abstract Pupil localization is one of the most critical and essential requirements for eye gaze estimation and eye movement tracking. Because pupil images contain monotonous and uncomplicated information, the dataset uses a single class of labels to describe the image content, and convolutional neural networks can quickly and accurately identify the pupil position on the input image. On low-resolution
-
A motion denoising algorithm with Gaussian self-adjusting threshold for event camera Vis. Comput. (IF 3.5) Pub Date : 2024-01-18
Abstract Event cameras, characterized by their low power consumption, expansive dynamic range, and high temporal resolution, have attracted great attention in various computer vision tasks. Compared to frame-based cameras, event cameras exemplify a marked paradigmatic transition in data formation and output. However, the quality of event streams is compromised by background activity and hot pixels
-
FS-3DSSN: an efficient few-shot learning for single-stage 3D object detection on point clouds Vis. Comput. (IF 3.5) Pub Date : 2024-01-18 Alok Kumar Tiwari, G. K. Sharma
-
Salient-aware multiple instance learning optimized network for weakly supervised object detection Vis. Comput. (IF 3.5) Pub Date : 2024-01-18
Abstract In recent years, weakly supervised object detection networks have developed considerably. However, due to the lack of bounding box supervision, the framework based on multiple instance learning tends to activate a part of the object rather than the whole object, which severely affects the detection performance for nonrigid objects. To solve this problem, this paper uses traditional features
-
Defect detection in automotive glass based on modified YOLOv5 with multi-scale feature fusion and dual lightweight strategy Vis. Comput. (IF 3.5) Pub Date : 2024-01-17 Zhe Chen, Shihao Huang, Hui Lv, Zhixue Luo, Jinhao Liu
-
A multi-color and multistage collaborative network guided by refined transmission prior for underwater image enhancement Vis. Comput. (IF 3.5) Pub Date : 2024-01-12 Ting Ouyang, Yongjun Zhang, Haoliang Zhao, Zhongwei Cui, Yitong Yang, Yujie Xu
-
Attention-based network for passive non-light-of-sight reconstruction in complex scenes Vis. Comput. (IF 3.5) Pub Date : 2024-01-10
Abstract Passive non-line-of-sight (NLOS) reconstruction has achieved considerable success in diverse fields. However, the existing reconstruction methods ignore that complex scenes attenuate object-related information, and they treat object-related information and noise in measured images as equivalent, yielding low-quality recovery. We propose an attention-based encoder–decoder (AED) network to tackle this
-
Complexity aware center loss for facial expression recognition Vis. Comput. (IF 3.5) Pub Date : 2024-01-09 Huihui Li, Xu Yuan, Chunlin Xu, Rui Zhang, Xiaoyong Liu, Lianqi Liu
-
Lipschitz-agnostic, efficient and accurate rendering of implicit surfaces Vis. Comput. (IF 3.5) Pub Date : 2024-01-08 Rene Winchenbach, Michael Möller, Andreas Kolb