-
S5: Sketch-to-image Synthesis via Scene and Size Sensing IEEE Multimed. (IF 3.2) Pub Date : 2024-03-11 Samah S. Baraheem, Tam V. Nguyen
-
Multimodal Temporal Fusion Transformers Are Good Product Demand Forecasters IEEE Multimed. (IF 3.2) Pub Date : 2024-03-07 Maarten Sukel, Stevan Rudinac, Marcel Worring
-
CNN Ensemble for Video Source Camera Forensics IEEE Multimed. (IF 3.2) Pub Date : 2024-03-05 Maryna Veksler, Ramazan Aygun, Kemal Akkaya, Sitharama Iyengar
-
Image-relevant Entities Knowledge aware News Image Captioning IEEE Multimed. (IF 3.2) Pub Date : 2024-02-07 Sonali Ajankar, Tanima Dutta
-
YOLO’s multiple-strategy PCB defect detection model IEEE Multimed. (IF 3.2) Pub Date : 2024-01-30 Xin Wang, Hongyan Zhang, Qianhe Liu, Wei Gong, Song Bai, Hezhen You
-
Cryptanalyzing an Image Encryption Algorithm Underpinned by 2D Lag-Complex Logistic Map IEEE Multimed. (IF 3.2) Pub Date : 2024-01-22 Chengqing Li, Xianhui Shen, Sheng Liu
-
Perceptual Hashing with Deep and Texture Features IEEE Multimed. (IF 3.2) Pub Date : 2024-01-17 Mengzhu Yu, Zhenjun Tang, Xiaoping Liang, Xianquan Zhang, Xinpeng Zhang
-
A Real-Time Image Encryption Algorithm for a Distributed Energy System Based on the 13-D Complex Chaotic Sequence IEEE Multimed. (IF 3.2) Pub Date : 2023-11-09 Zhe Huang, Fangfang Zhang, Lei Kou, Quande Yuan, Yuanhong Liu
In distributed energy systems (DESs), information security plays an essential role. Real-time encryption techniques can guarantee transmission rate and confidentiality simultaneously. We propose a novel real-time encryption algorithm for DESs. First, we construct a 13-D chaotic system based on quaternion, Chua’s circuit, and complex variables, and discuss the existence of the attractors and other
-
Taking a “Deep” Look at Multimedia Streaming IEEE Multimed. (IF 3.2) Pub Date : 2023-10-06 Balakrishnan Prabhakaran
Streaming multimedia content has become an integral part of our lives influencing the way we consume daily news, communicate with friends, family and in office, and entertain ourselves. Quality of multimedia content has been improving by leaps and bounds with advances in camera and other sensing technologies. In parallel, advances in multimedia display technologies have been equally amazing providing
-
Artistic Line Drawing Rendering With Priors of Depth and Edge Density IEEE Multimed. (IF 3.2) Pub Date : 2023-09-25 Qianwen Lu, Jinho Lee, Yuki Endo, Shunsuke Kamijo
Line drawing is a form of painting that uses lines as expressive elements and often employs the combination of abstraction and figuration (CAF) technique to enhance artistic expression. However, existing methods tend to focus on generating semantically accurate line drawings, resulting in lackluster images. We observe that by varying the details of lines in the drawing, it is possible to represent
-
DSMGN: Dual-Supervised Mask Generation Network for Infrared and Visible Image Fusion IEEE Multimed. (IF 3.2) Pub Date : 2023-09-06 Yong Yang, Yukun Xia, Shuying Huang, Weiguo Wan, Xuemei Sun
Due to the lack of reference images for the training of infrared and visible image fusion (IVIF) network, the deep learning models cannot fuse the modal features of different source images well, resulting in fusion results that are biased toward one modality. This study proposes an IVIF method based on a dual-supervised mask generation network (DSMGN) that includes three parts: an encoder–decoder-based
-
Short-Long-Term Propagation-Based Video Inpainting IEEE Multimed. (IF 3.2) Pub Date : 2023-08-28 Shibo Li, Shuyuan Zhu, Yuzhou Huang, Shuaicheng Liu, Bing Zeng, Muhammad Ali Imran, Qammer H. Abbasi, Jonathan Cooper
In this article, we propose a new method to inpaint videos with removed regions. Our method was developed based on combining both short-term propagation-based inpainting (STPI) and long-term propagation-based inpainting (LTPI) modules. The STPI module is designed to in-fill an image from a single frame with local reference information, while the LTPI module uses multiple STPI modules to inpaint the
-
A Novel Learning Dictionary for Sparse Coding-Based Key Point Detection IEEE Multimed. (IF 3.2) Pub Date : 2023-08-24 Phuoc-Thanh Hong, Ling Guan
Recently, a sparse coding-based key point detector (SCK) was proposed. An SCK shows very impressive performance compared with state-of-the-art key point detection methods on different challenging conditions, such as variations in scale, rotation, context, and nonuniform lighting. The rotational-invariant dictionary in the SCK is, however, manually generated using a time-consuming process of selecting
-
Encoding of Media Value Chain Processes Through Blockchains and MPEG-21 Smart Contracts for Media IEEE Multimed. (IF 3.2) Pub Date : 2023-08-11 Mirko Zichichi, Víctor Rodríguez-Doncel
This article describes the combination of the current set of MPEG-21 multimedia framework standards with distributed ledger technologies and smart contracts. Their gathering shapes the smart contracts for media, a specification that can be used to encode the terms and conditions of a contract for media-related delivery and consumption. We provide the implementation of a system based on the smart contract
-
Deep Blind Chest X-Ray Image Quality Assessment With Region-of-Interest-Guided Attention IEEE Multimed. (IF 3.2) Pub Date : 2023-08-02 Jianqiu Jin, Xiaoming Chen, Yu Meng, Xiangyang Gong, Bailin Yang, Yuk Ying Chung
The image quality assessment (IQA) for medical images has been challenging due to their usage for diagnostic purposes. Traditional convolutional-neural-network-based IQA models usually assess the image quality on a global scale, which is unsuitable for inferring the local quality of medical images from the diagnostic attention perspective. To alleviate this problem, we design and introduce a region-of-interest
-
eCubeLand: An Intelligent Multiview Video Data Modeling IEEE Multimed. (IF 3.2) Pub Date : 2023-07-19 Tanveer Hussain, Samee Ullah Khan, Waseem Ullah, Ijaz Ul Haq, Min Je Kim, Mi Young Lee, Sung Wook Baik
The extensive use of surveillance systems, particularly those installed in Internet of Things environments, leads to the continuous harvesting of tremendous amounts of video data. The effective analysis and management of these data are challenging tasks for surveillance experts due to unstructured storage and variability. We propose an intelligent modeling framework, offering a convenient representation
-
Recent Advances in Immersive Multimedia IEEE Multimed. (IF 3.2) Pub Date : 2023-07-06 Maha Abdallah, Balakrishnan Prabhakaran, Wei Cai, Cheng-Hsin Hsu
This special issue focuses on the recent advances in immersive multimedia, and presents eight articles that tackle a wide spectrum of challenges related to the adoption of immersive multimedia in novel applications spreading across various domains. Immersive multimedia has become a reality due to the active development of extended reality, which is an umbrella term for emulating a physical world in
-
Content-Aware Latent Semantic Direction Fusion for Multi-Attribute Editing IEEE Multimed. (IF 3.2) Pub Date : 2023-06-14 Xiwen Wei, Yihan Tang, Si Wu
For facial attribute editing, significant progress has been made in discovering semantic directions in the latent space of StyleGAN, and the manipulation is performed by mapping an input image to a latent code and then moving along a direction associated with a target attribute. In this case, multi-attribute editing typically needs a sequential transformation process, which may cause ineffective manipulation
-
Perceptual Authentication Hashing for Digital Images With Contrastive Unsupervised Learning IEEE Multimed. (IF 3.2) Pub Date : 2023-05-26 Guopeng Gao, Chuan Qin, Yaodong Fang, Yuanding Zhou
In recent years, many perceptual image hashing schemes for content authentication have been proposed based on classical methods and deep learning. However, most existing schemes target specific and limited content-preserving manipulations and cannot provide satisfactory robustness to unknown manipulations. In this work, we propose a new perceptual authentication hashing model for digital images based
-
Optimizing Multidimensional Perceptual Quality in Online Interactive Multimedia IEEE Multimed. (IF 3.2) Pub Date : 2023-05-18 Benjamin W. Wah, Jingxi X. Xu
Network latencies and losses in online interactive multimedia applications may lead to a degraded perception of quality, such as lower interactivity or sluggish responses. We can measure these degradations in perceptual quality by the just-noticeable difference, awareness, or probability of noticeability ($p_{\text{note}}$); the latter measures the likelihood that subjects can notice a change
-
Welcome to the New Team Members IEEE Multimed. (IF 3.2) Pub Date : 2023-05-11 Balakrishnan Prabhakaran
It is my great honor to take over as the editor-in-chief (EIC) of IEEE MultiMedia. As a pioneer in disseminating breakthrough advances in multimedia research, technology, and the associated standards, IEEE MultiMedia continues to be a highly valuable source to multimedia researchers and other experts in academia and industry. This has been possible due to the tireless work of professor Shu-Ching Chen
-
The JPEG AI Standard: Providing Efficient Human and Machine Visual Data Consumption IEEE Multimed. (IF 3.2) Pub Date : 2023-05-11 João Ascenso, Elena Alshina, Touradj Ebrahimi
The Joint Photographic Experts Group (JPEG) AI learning-based image coding system is an ongoing joint standardization effort between International Organization for Standardization (ISO), International Electrotechnical Commission (IEC), and International Telecommunication Union - Telecommunication Sector (ITU-T) for the development of the first image coding standard based on machine learning (a subset
-
Interpretability of Machine Learning: Recent Advances and Future Prospects IEEE Multimed. (IF 3.2) Pub Date : 2023-05-02 Lei Gao, Ling Guan
The proliferation of machine learning (ML) has drawn unprecedented interest in the study of various multimedia contents such as text, image, audio, and video, among others. Consequently, understanding and learning ML-based representations have taken center stage in knowledge discovery in intelligent multimedia research and applications. Nevertheless, the black-box nature of contemporary ML, especially
-
Reviving Standard-Dynamic-Range Videos for High-Dynamic-Range Devices: A Learning Paradigm With Hybrid Attention Mechanisms IEEE Multimed. (IF 3.2) Pub Date : 2023-04-27 Peilin Chen, Wenhan Yang, Shiqi Wang
With the prevalence of high-dynamic-range (HDR) display devices, the demand to convert existing standard-dynamic-range television (SDRTV) video content to its corresponding HDR television (HDRTV) counterpart is growing exponentially. Herein, we propose a two-stage learning paradigm with hybrid attention mechanisms to fully exploit spatial, channelwise, and regional correlations for faithfully driving
-
PP8K: A New Dataset for 8K UHD Video Compression and Processing IEEE Multimed. (IF 3.2) Pub Date : 2023-04-21 Wei Gao, Hang Yuan, Guibiao Liao, Zixuan Guo, Jianing Chen
In the new era of ultra-high definition (UHD) videos, 8K is becoming more popular in diversified applications to boost the human visual experience and the performances of related vision tasks. However, researchers still suffer from the lack of 8K video sources to develop better processing algorithms for the compression, saliency detection, quality assessment, and vision analysis tasks. To ameliorate
-
Learning From Coding Features: High Efficiency Rate Control for AOMedia Video 1 IEEE Multimed. (IF 3.2) Pub Date : 2023-04-10 Yi Chen, Yunhao Mao, Shiqi Wang, Xianguo Zhang, Sam Kwong
Rate control, which typically includes bit allocation and quantization parameter (QP) determination, plays an important role in real-world video coding applications. In this article, we propose a novel rate control scheme for AOMedia Video 1 (AV1) that provides adaptive bit allocation and effective QP determination. In particular, two supporting vector regression models are learned for the hierarchical
-
VR2Gather: A Collaborative, Social Virtual Reality System for Adaptive, Multiparty Real-Time Communication IEEE Multimed. (IF 3.2) Pub Date : 2023-04-03 Irene Viola, Jack Jansen, Shishir Subramanyam, Ignacio Reimat, Pablo Cesar
Virtual reality telecommunication systems promise to overcome the limitations of current real-time teleconferencing solutions by enabling a better sense of immersion and fostering more natural interpersonal interactions. Many solutions that currently enable immersive teleconferencing employ synthetic avatars to represent their users. However, photorealistic reconstructions have been shown to increase
-
Specular Detection and Rendering for Immersive Multimedia IEEE Multimed. (IF 3.2) Pub Date : 2023-03-28 The Van Le, Yongho Choi, Jin Young Lee
Immersive multimedia has received a lot of attention because of its huge impact on user experience. To realize high immersion in virtual environments, many virtual views should be generated at arbitrary viewpoints with advanced display devices. However, specular regions, which affect user experience, have not been fully investigated in an immersive multimedia field. In this article, we propose specular
-
Bandwidth-Aware High-Efficiency Video Coding Design Scheme on a Multiprocessor System on Chip IEEE Multimed. (IF 3.2) Pub Date : 2023-03-08 Jui-Hung Hsieh, Zhi-Yu Zhang, Jing-Cheng Syu, Mao-Cheng Hsieh
H.265/high-efficiency video coding (HEVC) provides a multitude of video data compression to minimize data storage and data transmission while preserving video coding quality and ameliorating coding bit rates. However, HEVC encoder chips are frequently integrated into mobile multiprocessor system-on-chip (MPSoC) systems that adopt intelligent thermal and power management techniques for heat- and power-dissipation
-
A New Fog-Based Transmission Scheduler on the Internet of Multimedia Things Using a Fuzzy-Based Quantum Genetic Algorithm IEEE Multimed. (IF 3.2) Pub Date : 2023-02-28 Kouros Zanbouri, Hamza Mohammed Ridha Al-Khafaji, Nima Jafari Navimipour, Şenay Yalçın
The Internet of Multimedia Things (IoMT) has recently experienced a considerable surge in multimedia-based services. Due to the fast proliferation and transfer of massive data, the IoMT has service quality challenges. This article proposes a novel fog-based multimedia transmission scheme for the IoMT using the Sugeno inference system with a quantum genetic optimization algorithm. The fuzzy system
-
Edge Intelligence-Empowered Immersive Media IEEE Multimed. (IF 3.2) Pub Date : 2023-02-22 Zhi Wang, Jiangchuan Liu, Wenwu Zhu
Recent years have witnessed many immersive media services and applications, ranging from 360° video streaming to augmented and virtual reality (VR) and the recent metaverse experiences. These new applications usually have common features, including high fidelity, immersive interaction, and open data exchange between people and the environment. As an emerging paradigm, edge computing has become increasingly
-
Blockchain-Empowered Privacy-Preserving Digital Object Trading in the Metaverse IEEE Multimed. (IF 3.2) Pub Date : 2023-02-22 Yao Xiao, Lei Xu, Can Zhang, Liehuang Zhu, Yan Zhang
The metaverse is an advanced digital world where users can have interactive and immersive experiences. Users enter the metaverse through digital objects created by extended reality and digital twin technologies. The ownership issue regarding these digital objects can be solved by the blockchain-based nonfungible token (NFT), which is of vital importance for the economics of the metaverse. Users can
-
Background Music for Studying: A Naturalistic Experiment on Music Characteristics and User Perception IEEE Multimed. (IF 3.2) Pub Date : 2023-02-14 Fanjie Li, Xiao Hu
Despite the advances in context-aware background music (BM) recommendation, automated BM selection for studying-related contexts is still challenging in that the BM has to not only increase users’ activation and task engagement but also avoid distraction. This study investigated how characteristics of BM are linked to users’ perceptions of task engagement and distraction. In a one-week naturalistic user
-
A Cross-Domain Multimodal Supervised Latent Topic Model for Item Tagging and Cold-Start Recommendation IEEE Multimed. (IF 3.2) Pub Date : 2023-02-06 Rui Tang, Cheng Yang, Yuxuan Wang
Cross-domain data analysis is playing an increasingly important role in media convergence and can be adopted for many applications. Most existing methods consider the domain discrimination as the multimodal representation difference or the imbalanced item classification distribution, ignoring the different tag semantics among domains. To this end, we propose an explainable cross-domain multimodal supervised
-
Reversible Modal Conversion Model for Thermal Infrared Tracking IEEE Multimed. (IF 3.2) Pub Date : 2023-01-30 Yufei Zha, Fan Li, Huanyu Li, Peng Zhang, Wei Huang
Learning a powerful CNN representation of the target is a key issue for thermal infrared (TIR) tracking. The lack of massive TIR training data is one of the obstacles to training the network end to end from scratch. Compared to the time-consuming and labor-intensive method of heavily relabeling data, we obtain trainable TIR images by leveraging the massive annotated RGB images in this article
-
Passthrough Mixed Reality With Oculus Quest 2: A Case Study on Learning Piano IEEE Multimed. (IF 3.2) Pub Date : 2023-01-25 Mariano Banquiero, Gracia Valdeolivas, Sergio Trincado, Natasha García, M.-Carmen Juan
Mixed reality (MR) in standalone headsets has many advantages over other types of devices. With the recent appearance of the Passthrough of Oculus Quest 2, new possibilities open up. This work details the features of the current Passthrough and how its potential was harnessed and its drawbacks minimized for developing a satisfying MR experience. It has been applied to learning to play the piano as
-
Edge Distraction-aware Salient Object Detection IEEE Multimed. (IF 3.2) Pub Date : 2023-01-10 Sucheng Ren, Wenxi Liu, Jianbo Jiao, Guoqiang Han, Shengfeng He
Integrating low-level edge features has been proven to be effective in preserving clear boundaries of salient objects. However, the locality of edge features makes it difficult to capture globally salient edges, leading to distraction in the final predictions. To address this problem, we propose to produce distraction-free edge features by incorporating cross-scale holistic interdependencies between
-
The Next Frontier For MPEG-5 LCEVC: From HDR and Immersive Video to the Metaverse IEEE Multimed. (IF 3.2) Pub Date : 2023-01-05 Simone Ferrara, Lorenzo Ciccarelli, Amaya Jiménez Moreno, Shiruo Zhao, Yetish Joshi, Guido Meardi, Stefano Battista
In 2021, the newest MPEG standard was published as MPEG-5 low complexity enhancement video coding (LCEVC). Contrary to typical video codecs, LCEVC is an enhancement codec, meaning it works in combination with other codecs, to produce a more efficiently compressed video. Thanks to its simplified architecture, it is designed to be deployed as a software enhancer, which uses hardware blocks more efficiently
-
Computing in Science & Engineering IEEE Multimed. (IF 3.2) Pub Date : 2023-01-05
Presents information about the content of the publication.
-
The Metaverse From a Multimedia Communications Perspective IEEE Multimed. (IF 3.2) Pub Date : 2023-01-05 Haiwei Dong, Jeannie S. A. Lee
eXtended reality (XR) technologies such as virtual reality and 360° stereoscopic streaming enable the concept of the Metaverse, an immersive virtual space for collaboration and interaction. To ensure a high-fidelity display of immersive media, the bandwidth, latency and network traffic patterns will need to be considered to ensure a user's Quality of Experience (QoE). In this article, examples and
-
Edge-Assisted Virtual Viewpoint Generation for Immersive Light Field IEEE Multimed. (IF 3.2) Pub Date : 2022-12-29 Xinjue Hu, Chenchen Wang, Lin Zhang, Guo Chen, Shervin Shirmohammadi
Light field (LF), which describes the light rays that emanate at each point in a scene, can be used as a six-degrees-of-freedom (6DOF) immersive media. Similar to the traditional multiview video, LF is also captured by an array of cameras, leading to a large data volume that needs to be streamed from a server to users. When a user wishes to watch the scene from a viewpoint that no camera has captured
-
Multiview Language Bias Reduction for Visual Question Answering IEEE Multimed. (IF 3.2) Pub Date : 2022-10-26 Pengju Li, Zhiyi Tan, Bing-Kun Bao
Current visual question answering models overly rely on language bias and fail to understand visual information sufficiently. Many recent works concentrate on mitigating the intraquestion type bias (bias in the distribution of answers to a question type) without taking the interquestion type bias (bias in distribution between question types) into consideration, causing the model to ignore the tail
-
Could Head Motions Affect Quality When Viewing 360° Videos? IEEE Multimed. (IF 3.2) Pub Date : 2022-10-17 Burak Kara, Mehmet N. Akcay, Ali C. Begen, Saba Ahsan, Igor D.D. Curcio, Emre B. Aksu
Measuring quality accurately and quickly (preferably in real time) when streaming 360° videos is essential to enhance the user experience. Most quality-of-experience metrics have primarily used viewport quality as a simple surrogate for such experiences at a given time. While this baseline approach has been later augmented by some researchers using pupil and gaze tracking, head tracking has
-
Transferring Deep Gaussian Denoiser for Compressed Sensing MRI Reconstruction IEEE Multimed. (IF 3.2) Pub Date : 2022-10-17 Zhonghua Xie, Lingjun Liu
Deep neural networks have achieved the most outstanding performance in compressed sensing magnetic resonance imaging (CS-MRI) reconstruction by learning the potential structures of images from a large number of training samples. However, the required data comprising hundreds of subjects are usually rare. In this article, we remedy this problem by transferring the easy-to-get deep Gaussian denoisers
-
CADW: CGAN-Based Attack on Deep Robust Image Watermarking IEEE Multimed. (IF 3.2) Pub Date : 2022-10-10 Chuan Qin, Shengyan Gao, Xinpeng Zhang, Guorui Feng
Robust watermarking plays an essential role in copyright protection for digital images. Meanwhile, the studies of robust image watermarking and corresponding attack methods promote and complement each other. Because traditional attack methods are weak against deep watermarking, in this work we design an effective watermarking attack method based on the conditional generative adversarial
-
DiOS—An Extended Reality Operating System for the Metaverse IEEE Multimed. (IF 3.2) Pub Date : 2022-10-03 Tristan Braud, Lik-Hang Lee, Ahmad Alhilal, Carlos Bermejo Fernández, Pan Hui
Driven by the recent improvements in device and network capabilities, extended reality (XR) is becoming more pervasive; industry and academia alike envision ambitious projects, such as the metaverse. However, XR is still limited by the current architecture of mobile systems. This article makes the case for an XR-specific operating system (XROS). An XROS integrates hardware-support, computer vision
-
Front Cover IEEE Multimed. (IF 3.2) Pub Date : 2022-09-21
Presents the front cover for this issue of the publication.