-
Material augmented semantic segmentation of point clouds for building elements Comput. Aided Civ. Infrastruct. Eng. (IF 9.6) Pub Date : 2024-04-17 Houhao Liang, Justin K. W. Yeoh, David K. H. Chua
Point clouds are utilized to enable automated engineering applications for their ability to represent spatial geometry. However, they inherently lack detailed surface textures, posing challenges in differentiating objects at the texture level. Hence, this study introduces a 2D–3D fusing approach, leveraging material properties recognized from registered images as an augmented feature to enhance deep
-
Multi-Relational Deep Hashing for Cross-Modal Search IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-16 Xiao Liang, Erkun Yang, Yanhua Yang, Cheng Deng
-
An efficient Bayesian method with intrusive homotopy surrogate model for stochastic model updating Comput. Aided Civ. Infrastruct. Eng. (IF 9.6) Pub Date : 2024-04-16 Hui Chen, Bin Huang, Heng Zhang, Kaiyi Xue, Ming Sun, Zhifeng Wu
This paper proposes a new stochastic model updating method based on the homotopy surrogate model (HSM) and Bayesian sampling. As a novel intrusive surrogate model, the HSM is established by the homotopy stochastic finite element (FE) method. Then combining the advanced delayed‐rejection adaptive Metropolis–Hastings sampling technology with HSM, the structural FE model can be updated by uncertain measurement
-
Integrated corridor management by cooperative traffic signal and ramp metering control Comput. Aided Civ. Infrastruct. Eng. (IF 9.6) Pub Date : 2024-04-16 Abdullah Al Farabi, Rasool Mohebifard, Ramin Niroumand, Ali Hajbabaie, Mohammed Hadi, Lily Elefteriadou
This paper formulates a cooperative traffic control methodology that integrates traffic signal timing and ramp metering decisions into an optimization model to improve traffic operations in a corridor network. A mixed integer linear model is formulated and is solved in real time within a model predictive controller framework, where the cell transmission model is used as the system state predictor.
-
GLPanoDepth: Global-to-Local Panoramic Depth Estimation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-15 Jiayang Bai, Haoyu Qin, Shuichang Lai, Jie Guo, Yanwen Guo
-
Traffic prediction via clustering and deep transfer learning with limited data Comput. Aided Civ. Infrastruct. Eng. (IF 9.6) Pub Date : 2024-04-15 Xiexin Zou, Edward Chung
This paper proposes a method based on the clustering algorithm, deep learning, and transfer learning (TL) for short‐term traffic prediction with limited data. To address the challenges posed by limited data and the complex and diverse traffic patterns observed in traffic networks, we propose a profile model based on few‐shot learning to extract each detector's unique profiles. These profiles are then
-
A lightweight Transformer‐based neural network for large‐scale masonry arch bridge point cloud segmentation Comput. Aided Civ. Infrastruct. Eng. (IF 9.6) Pub Date : 2024-04-15 Yixiong Jing, Brian Sheil, Sinan Acikgoz
Transformer architecture based on the attention mechanism achieves impressive results in natural language processing (NLP) tasks. This paper transfers the successful experience to a 3D point cloud segmentation task. Inspired by newly proposed 3D Transformer neural networks, this paper introduces a new Transformer‐based module, which is called Local Geo‐Transformer. To alleviate the heavy memory consumption
-
Learning with Noisy Correspondence Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-13 Zhenyu Huang, Peng Hu, Guocheng Niu, Xinyan Xiao, Jiancheng Lv, Xi Peng
-
Ensemble Quadratic Assignment Network for Graph Matching Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-13 Haoru Tan, Chuang Wang, Sitong Wu, Xu-Yao Zhang, Fei Yin, Cheng-Lin Liu
-
ISTR: Mask-Embedding-Based Instance Segmentation Transformer IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-12 Jie Hu, Yao Lu, Shengchuan Zhang, Liujuan Cao
-
Deep Variation Prior: Joint Image Denoising and Noise Variance Estimation Without Clean Data IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-12 Rihuan Ke
-
Saliency Guided Deep Neural Network for Color Transfer With Light Optimization IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-12 Yuming Fang, Pengwei Yuan, Chenlei Lv, Chen Peng, Jiebin Yan, Weisi Lin
-
Data‐driven out‐of‐order model for synchronized planning, scheduling, and execution in modular construction fit‐out management Comput. Aided Civ. Infrastruct. Eng. (IF 9.6) Pub Date : 2024-04-12 Yishuo Jiang, Mingxing Li, Benedict Jun Ma, Ray Y. Zhong, George Q. Huang
Fit‐out operations in modular construction exhibit unique features, such as limited room space and diversely distributed operations in the building. These features pose significant challenges to planning, scheduling, and execution (PSE) of fit‐out activities due to operational parallelism, distributional diversity, and narrower constrained time window in modular construction. Hence, logistics‐operation
-
Estimation of load for tunnel lining in elastic soil using physics‐informed neural network Comput. Aided Civ. Infrastruct. Eng. (IF 9.6) Pub Date : 2024-04-11 G. Wang, Q. Fang, J. Wang, Q. M. Li, J. Y. Chen, Y. Liu
A reverse calculation method termed soil and lining physics‐informed neural network (SL‐PINN) is proposed for the estimation of load for tunnel lining in elastic soil based on radial displacement measurements of the tunnel lining. To achieve efficient and accurate calculations, the framework of SL‐PINN is specially designed to consider the respective displacement characteristics of surrounding soil
-
Single Stage Adaptive Multi-Attention Network for Image Restoration IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-10 Anas Zafar, Danyal Aftab, Rizwan Qureshi, Xinqi Fan, Pingjun Chen, Jia Wu, Hazrat Ali, Shah Nawaz, Sheheryar Khan, Mubarak Shah
-
High-quality and Diverse Few-shot Image Generation via Masked Discrimination IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-10 Jingyuan Zhu, Huimin Ma, Jiansheng Chen, Jian Yuan
-
RefQSR: Reference-Based Quantization for Image Super-Resolution Networks IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-10 Hongjae Lee, Jun-Sang Yoo, Seung-Won Jung
Single image super-resolution (SISR) aims to reconstruct a high-resolution image from its low-resolution observation. Recent deep learning-based SISR models show high performance at the expense of increased computational costs, limiting their use in resource-constrained environments. As a promising solution for computationally efficient network design, network quantization has been extensively studied
-
Nonconvex Robust High-Order Tensor Completion Using Randomized Low-Rank Approximation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-10 Wenjin Qin, Hailin Wang, Feng Zhang, Weijun Ma, Jianjun Wang, Tingwen Huang
-
Smartphone‐based method for measuring maximum peak tensile and compressive strain Comput. Aided Civ. Infrastruct. Eng. (IF 9.6) Pub Date : 2024-04-10 Xixian Chen, Huan Li, Chenhao Zhao, Guangyi Zhou, Weijie Li, Xue Zhang, Xuefeng Zhao
This paper proposes an innovative smartphone‐based strain sensing method (named MaxCpture) for measuring maximum peak tensile and compressive strains. The MaxCpture method is able to record the maximum peak strain of a structure without continuous power supply and real‐time monitoring. This method combines the maximum peak strain sensor, a smartphone, and the microimage sensing algorithm. Crucially
-
Error-Aware Conversion from ANN to SNN via Post-training Parameter Calibration Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-08 Yuhang Li, Shikuang Deng, Xin Dong, Shi Gu
-
Source-Guided Target Feature Reconstruction for Cross-Domain Classification and Detection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-09 Yifan Jiao, Hantao Yao, Bing-Kun Bao, Changsheng Xu
Existing cross-domain classification and detection methods usually apply a consistency constraint between the target sample and its self-augmentation for unsupervised learning without considering the essential source knowledge. In this paper, we propose a Source-guided Target Feature Reconstruction (STFR) module for cross-domain visual tasks, which applies source visual words to reconstruct the target
-
CRetinex: A Progressive Color-Shift Aware Retinex Model for Low-Light Image Enhancement Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-08 Han Xu, Hao Zhang, Xunpeng Yi, Jiayi Ma
-
An advanced cooperative multi-hive drone swarm system for global dynamic multi-source information awareness J. Ind. Inf. Integr. (IF 15.7) Pub Date : 2024-04-08 Jinkun Men, Chunmeng Zhao
With the advancement of unmanned aerial vehicle technology, dynamic monitoring with drones has been widely adopted to enhance multi-source information awareness capabilities. The cooperative strategy among drones still poses a significant challenge. Redundant actions within the drone swarm system can lead to a noticeable decrease in awareness performance. In this work, an advanced cooperative multi-hive
-
Relationship-Incremental Scene Graph Generation by a Divide-and-Conquer Pipeline with Feature Adapter IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-08 Xuewei Li, Guangcong Zheng, Yunlong Yu, Naye Ji, Xi Li
-
Context‐aware hand gesture interaction for human–robot collaboration in construction Comput. Aided Civ. Infrastruct. Eng. (IF 9.6) Pub Date : 2024-04-08 Xin Wang, Dharmaraj Veeramani, Fei Dai, Zhenhua Zhu
Construction robots play a pivotal role in enabling intelligent processes within the construction industry. User‐friendly interfaces that facilitate efficient human–robot collaboration are essential for promoting robot adoption. However, most of the existing interfaces do not consider contextual information in the collaborative environment. The situation where humans and robots work together in the
-
A digital shadow approach for enhancing process monitoring in wire arc additive manufacturing using sensor fusion J. Ind. Inf. Integr. (IF 15.7) Pub Date : 2024-04-06 Haochen Mu, Fengyang He, Lei Yuan, Philip Commins, Donghong Ding, Zengxi Pan
-
FSODv2: A Deep Calibrated Few-Shot Object Detection Network Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-04 Qi Fan, Wei Zhuo, Chi-Keung Tang, Yu-Wing Tai
Traditional methods for object detection typically necessitate a substantial amount of training data, and creating high-quality training data is time-consuming. We propose a novel Few-Shot Object Detection network (FSODv2) in this paper that aims to detect objects from previously unseen categories using only a few annotated examples. Attention RPN, Multi-Relation Detector, and Contrastive Training
-
DriftRec: Adapting Diffusion Models to Blind JPEG Restoration IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-05 Simon Welker, Henry N. Chapman, Timo Gerkmann
In this work, we utilize the high-fidelity generation abilities of diffusion models to solve blind JPEG restoration at high compression levels. We propose an elegant modification of the forward stochastic differential equation of diffusion models to adapt them to this restoration task and name our method DriftRec. Comparing DriftRec against an $L_{2}$ regression baseline with the same network architecture
-
Generalizing to Out-of-Sample Degradations via Model Reprogramming IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-05 Runhua Jiang, Yahong Han
Existing image restoration models are typically designed for specific tasks and struggle to generalize to out-of-sample degradations not encountered during training. While zero-shot methods can address this limitation by fine-tuning model parameters on testing samples, their effectiveness relies on predefined natural priors and physical models of specific degradations. Nevertheless, determining out-of-sample
-
E-fulfillment cost management in omnichannel retailing: An exploratory study Comput. Ind. (IF 10.0) Pub Date : 2024-04-05 Miguel Rodríguez-García, Iria González-Romero, Ángel Ortiz-Bas, José Carlos Prado-Prado
The purpose of this study is twofold: investigating how omnichannel (OC) retailers manage e-fulfillment costs and establishing how these costs relate to the evolution of OC retailers' e-fulfillment strategies. Experts in e-fulfillment from 34 European OC retailers across various sectors participated in an exploratory survey. The study's results reveal that although e-fulfillment costs significantly
-
AI‐enabled airport runway pavement distress detection using dashcam imagery Comput. Aided Civ. Infrastruct. Eng. (IF 9.6) Pub Date : 2024-04-05 Arman Malekloo, Xiaoyue Cathy Liu, David Sacharny
Maintaining airport runways is crucial for safety and efficiency, yet traditional monitoring relies on manual inspections, which are time‐consuming and prone to inaccuracy. This study pioneers the utilization of low‐cost dashcam imagery for the detection and geolocation of airport runway pavement distresses, employing novel deep‐learning frameworks. A significant contribution of our work is the creation of
-
Shared Manifold Regularized Joint Feature Selection for Joint Classification and Regression in Alzheimer’s Disease Diagnosis IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-04 Zhi Chen, Yongguo Liu, Yun Zhang, Jiajing Zhu, Qiaoqin Li, Xindong Wu
In Alzheimer’s disease (AD) diagnosis, joint feature selection for predicting disease labels (classification) and estimating cognitive scores (regression) with neuroimaging data has received increasing attention. In this paper, we propose a model named Shared Manifold regularized Joint Feature Selection (SMJFS) that performs classification and regression in a unified framework for AD diagnosis. For
-
Parallel heterogeneous data‐fusion convolutional neural networks for improved rail bridge strike detection Comput. Aided Civ. Infrastruct. Eng. (IF 9.6) Pub Date : 2024-04-04 Hussam Khresat, Jase D. Sitton, Brett A. Story
Low clearance rail bridges provide vital crossings for freight and passenger trains and are susceptible to frequent strikes from overheight vehicles or equipment. Impact detection systems can help ensure the safety of railroad bridges and their users; such systems streamline monitoring efforts by providing near real‐time strike notifications to rail managers responsible for assessing a bridge after
-
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-02
Motivated by biological evolution, this paper explains the rationality of Vision Transformer by analogy with the proven practical evolutionary algorithm (EA) and derives that both have consistent mathematical formulation. Then inspired by effective EA variants, we propose a novel pyramid EATFormer backbone that only contains the proposed EA-based transformer (EAT) block, which consists of
-
MMoT: Mixture-of-Modality-Tokens Transformer for Composed Multimodal Conditional Image Synthesis Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-02 Jianbin Zheng, Daqing Liu, Chaoyue Wang, Minghui Hu, Zuopeng Yang, Changxing Ding, Dacheng Tao
Existing multimodal conditional image synthesis (MCIS) methods generate images conditioned on any combination of various modalities but require all of them to be exactly conformed, hindering the synthesis controllability and leaving the potential of cross-modality under-exploited. To this end, we propose to generate images conditioned on the compositions of multimodal control signals, where modalities
-
Orthogonal Spatial Binary Coding Method for High-Speed 3D Measurement IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-01 Haitao Wu, Yiping Cao, Yongbo Dai, Zhimi Wei
Temporal phase unwrapping based on a single auxiliary binary coded pattern has been proven to be effective for high-speed 3D measurement. However, traditional spatial binary coding often leads to an imbalance between the number of periodic divisions and codewords. To meet this challenge, a large codewords orthogonal spatial binary coding method is proposed in this paper. By expanding spatial multiplexing
-
Hierarchical Perceptual Noise Injection for Social Media Fingerprint Privacy Protection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-01 Simin Li, Huangxinxin Xu, Jiakai Wang, Ruixiao Xu, Aishan Liu, Fazhi He, Xianglong Liu, Dacheng Tao
Billions of people share images from their daily lives on social media every day. However, their biometric information (e.g., fingerprints) could be easily stolen from these images. The threat of fingerprint leakage from social media has created a strong desire to anonymize shared images while maintaining image quality, since fingerprints act as a lifelong individual biometric password. To guard the
-
A response‐compatible ground motion generation method using physics‐guided neural networks Comput. Aided Civ. Infrastruct. Eng. (IF 9.6) Pub Date : 2024-04-01 Youshui Miao, Hao Kang, Wei Hou, Yang Liu, Yixin Zhang, Cheng Wang
Selecting or generating ground motions (GMs) that elicit seismic responses matching specific standards or expected benchmarks for nonlinear time‐history analysis (NLTHA) is crucial for ensuring the rationality of structural seismic design and analysis. Typical GM inputs for NLTHA, either natural or artificial, are normally spectrum‐compatible, which may produce significant variations in analysis results
-
Bilateral Context Modeling for Residual Coding in Lossless 3D Medical Image Compression IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-25 Xiangrui Liu, Meng Wang, Shiqi Wang, Sam Kwong
Residual coding has gained prevalence in lossless compression, where a lossy layer is initially employed and the reconstruction errors (i.e., residues) are then losslessly compressed. The underlying principle of the residual coding revolves around the exploration of priors based on context modeling. Herein, we propose a residual coding framework for 3D medical images, involving the off-the-shelf video
-
Anomaly Detection for Medical Images Using Heterogeneous Auto-Encoder IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-29 Shuai Lu, Weihang Zhang, He Zhao, Hanruo Liu, Ningli Wang, Huiqi Li
Anomaly detection is an important task for medical image analysis, which can alleviate the reliance of supervised methods on large labelled datasets. Most existing methods use a pixel-wise self-reconstruction framework for anomaly detection. However, these studies face two challenges: 1) they tend to overfit learning an identity mapping between the input and output, which leads to failure in
-
Region Aware Video Object Segmentation With Deep Motion Modeling IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-29 Bo Miao, Mohammed Bennamoun, Yongsheng Gao, Ajmal Mian
Current semi-supervised video object segmentation (VOS) methods often employ the entire features of one frame to predict object masks and update memory. This introduces significant redundant computations. To reduce redundancy, we introduce a Region Aware Video Object Segmentation (RAVOS) approach, which predicts regions of interest (ROIs) for efficient object segmentation and memory storage. RAVOS
-
Automated building damage assessment and large‐scale mapping by integrating satellite imagery, GIS, and deep learning Comput. Aided Civ. Infrastruct. Eng. (IF 9.6) Pub Date : 2024-03-29 Abdullah M. Braik, Maria Koliou
Efficient and accurate building damage assessment is crucial for effective emergency response and resource allocation following natural hazards. However, traditional methods are often time consuming and labor intensive. Recent advancements in remote sensing and artificial intelligence (AI) have made it possible to automate the damage assessment process, and previous studies have made notable progress
-
Knowledge-Augmented Visual Question Answering With Natural Language Explanation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-28 Jiayuan Xie, Yi Cai, Jiali Chen, Ruohang Xu, Jiexin Wang, Qing Li
Visual question answering with natural language explanation (VQA-NLE) is a challenging task that requires models to not only generate accurate answers but also to provide explanations that justify the relevant decision-making processes. This task is accomplished by generating natural language sentences based on the given question-image pair. However, existing methods often struggle to ensure consistency
-
Robust Fine-Grained Visual Recognition With Neighbor-Attention Label Correction IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-28 Shunan Mao, Shiliang Zhang
Existing deep learning methods for fine-grained visual recognition often rely on large-scale, well-annotated training data. Obtaining fine-grained annotations in the wild typically requires concentration and expertise, such as fine category annotation for species recognition, instance annotation for person re-identification (re-id) and dense annotation for segmentation, which inevitably leads to label
-
Assessing user performance in augmented reality assembly guidance for industry 4.0 operators Comput. Ind. (IF 10.0) Pub Date : 2024-03-28 Emanuele Marino, Loris Barbieri, Fabio Bruno, Maurizio Muzzupappa
In the realm of smart manufacturing, Augmented Reality (AR) technology has gained increasing attention among researchers and manufacturers due to its practicality and adaptability. For this reason, it has been widely embraced in various industrial fields, especially for helping operators assemble products. Despite its widespread adoption, there is a debate in the research community about how effective
-
Multi‐network coordinated charging infrastructure planning for the self‐sufficient renewable power highway Comput. Aided Civ. Infrastruct. Eng. (IF 9.6) Pub Date : 2024-03-28 Tian‐Yu Zhang, En‐Jian Yao, Yang Yang, Hong‐Ming Yang, David Z. W. Wang
Developing a self‐sufficient renewable power (RP) road transport (SRPRT) system is an important future direction for transport–energy integration. More well‐developed studies must be conducted on the coordinated planning of transport, power supply, and power generation networks. This paper carries out the joint operation and planning of highway charging networks with the wind‐photovoltaic‐energy storage
-
SegViT v2: Exploring Efficient and Continual Semantic Segmentation with Plain Vision Transformers Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-01 Bowen Zhang, Liyang Liu, Minh Hieu Phan, Zhi Tian, Chunhua Shen, Yifan Liu
-
Label-Aware Calibration and Relation-Preserving in Visual Intention Understanding IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 QingHongYa Shi, Mang Ye, Wenke Huang, Weijian Ruan, Bo Du
Visual intention understanding is a challenging task that explores the hidden intention behind the images of publishers in social media. Visual intention represents implicit semantics, whose ambiguous definition inevitably leads to label shifting and label blemish. The former indicates that the same image delivers intention discrepancies under different data augmentations, while the latter represents
-
Weakly-Supervised Contrastive Learning for Unsupervised Object Discovery IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Yunqiu Lv, Jing Zhang, Nick Barnes, Yuchao Dai
Unsupervised object discovery (UOD) refers to the task of discriminating the whole region of objects from the background within a scene without relying on labeled datasets, which benefits the task of bounding-box-level localization and pixel-level segmentation. This task is promising due to its ability to discover objects in a generic manner. We roughly categorize existing techniques into two main
-
Temporal Feature Fusion for 3D Detection in Monocular Video IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Haoran Cheng, Liang Peng, Zheng Yang, Binbin Lin, Xiaofei He, Boxi Wu
Previous monocular 3D detection works focus on the single frame input in both training and inference. In real-world applications, temporal and motion information naturally exists in monocular video. It is valuable for 3D detection but under-explored in monocular works. In this paper, we propose a straightforward and effective method for temporal feature fusion, which exhibits low computation cost and
-
Instance-Specific Semantic Augmentation for Long-Tailed Image Classification IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Jiahao Chen, Bing Su
Recent long-tailed classification methods generally adopt the two-stage pipeline and focus on learning the classifier to tackle the imbalanced data in the second stage via re-sampling or re-weighting, but the classifier is easily prone to overconfidence in head classes. Data augmentation is a natural way to tackle this issue. Existing augmentation methods either perform low-level transformations or
-
BadCM: Invisible Backdoor Attack Against Cross-Modal Learning IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Zheng Zhang, Xu Yuan, Lei Zhu, Jingkuan Song, Liqiang Nie
Despite remarkable successes in unimodal learning tasks, backdoor attacks against cross-modal learning are still underexplored due to the limited generalization and inferior stealthiness when involving multiple modalities. Notably, since works in this area mainly inherit ideas from unimodal visual attacks, they struggle with dealing with diverse cross-modal attack circumstances and manipulating imperceptible
-
Toward Accurate Human Parsing Through Edge Guided Diffusion IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Ting Liu, Hongkun Zhu, Yunchao Wei, Shikui Wei, Yao Zhao, Yanning Zhang
Existing human parsing frameworks commonly employ joint learning of semantic edge detection and human parsing to facilitate the localization around boundary regions. Nevertheless, the parsing prediction within the interior of the part contour may still exhibit inconsistencies due to the inherent ambiguity of fine-grained semantics. In contrast, binary edge detection does not suffer from such fine-grained
-
In Defense of Clip-Based Video Relation Detection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Meng Wei, Long Chen, Wei Ji, Xiaoyu Yue, Roger Zimmermann
Video Visual Relation Detection (VidVRD) aims to detect visual relationship triplets in videos using spatial bounding boxes and temporal boundaries. Existing VidVRD methods can be broadly categorized into bottom-up and top-down paradigms, depending on their approach to classifying relations. Bottom-up methods follow a clip-based approach where they classify relations of short clip tubelet pairs and
-
Cross-Layer Contrastive Learning of Latent Semantics for Facial Expression Recognition IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Weicheng Xie, Zhibin Peng, Linlin Shen, Wenya Lu, Yang Zhang, Siyang Song
Convolutional neural networks (CNNs) have achieved significant improvement for the task of facial expression recognition. However, current training still suffers from the inconsistent learning intensities among different layers, i.e., the feature representations in the shallow layers are not sufficiently learned compared with those in deep layers. To this end, this work proposes a contrastive learning
-
Single-Image-Based Deep Learning for Segmentation of Early Esophageal Cancer Lesions IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-26 Haipeng Li, Dingrui Liu, Yu Zeng, Shuaicheng Liu, Tao Gan, Nini Rao, Jinlin Yang, Bing Zeng