
-
Perceptual Image Compression with Block-Level Just Noticeable Difference Prediction ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2021-01-28 Tao Tian; Hanli Wang; Sam Kwong; C.-C. Jay Kuo
A block-level perceptual image compression framework is proposed in this work, including a block-level just noticeable difference (JND) prediction model and a preprocessing scheme. Specifically, block-level JND values are first deduced by utilizing the Otsu method based on the variation of block-level structural similarity values between two adjacent picture-level JND values in the MCL-JCI
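As a rough illustration of the Otsu step mentioned above, the following sketch picks the threshold that maximizes between-class variance over a list of scalar values. It is not the authors' implementation: the function name `otsu_threshold`, the bin count, and the idea of feeding it per-block SSIM variations are assumptions made for this example, since the truncated abstract does not give those details.

```python
def otsu_threshold(values, bins=64):
    """Return the threshold that maximizes between-class variance (Otsu's method).

    `values` is a hypothetical list of per-block measurements, for example the
    change in block-level SSIM between two adjacent picture-level JND points.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # all values identical: no meaningful split exists
    width = (hi - lo) / bins

    # Build a histogram over equally sized bins.
    hist = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        hist[idx] += 1

    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(c * h for c, h in zip(centers, hist))

    best_t, best_var = lo, -1.0
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]                 # number of samples in the lower class
        sum0 += centers[i] * hist[i]  # weighted sum of the lower class
        w1 = total - w0               # number of samples in the upper class
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var = var_between
            best_t = lo + (i + 1) * width  # upper edge of bin i
    return best_t


# Hypothetical SSIM variations: four nearly unchanged blocks, four strongly changed ones.
ssim_deltas = [0.01, 0.02, 0.015, 0.02, 0.2, 0.22, 0.25, 0.21]
threshold = otsu_threshold(ssim_deltas)
changed_blocks = [d > threshold for d in ssim_deltas]
```

In this toy input, the threshold falls between the two clusters, so only the last four blocks are flagged as changed. How the framework maps such a split to a block-level JND value is not stated in the text above.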
-
Table of Contents: Online Supplement Volume 16, Number 3s ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2021-01-20 Suraj Sharma
No abstract available.
-
Eye-based Recognition for User Identification on Mobile Devices ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 Huiru Shao; Jing Li; Jia Zhang; Hui Yu; Jiande Sun
User identification is becoming more and more important for apps on mobile devices. However, identity recognition based on eyes, e.g., iris recognition, is rarely used on mobile devices compared with recognition based on face and fingerprint, owing to its extra hardware cost and complicated operations during recognition. In this article, an eye-based recognition method is designed for identity recognition
-
A Novel (t,s,k,n)-Threshold Visual Secret Sharing Scheme Based on Access Structure Partition ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 Zuquan Liu; Guopu Zhu; Yuan-Gen Wang; Jianquan Yang; Sam Kwong
Visual secret sharing (VSS) is a new technique for sharing a binary image into multiple shadows. For VSS, the original image can be reconstructed from the shadows in any qualified set, but cannot be reconstructed from those in any forbidden set. In most traditional VSS schemes, the shadows held by participants have the same importance. However, in practice, a certain number of shadows are given a higher
-
Am I Done? Predicting Action Progress in Videos ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 Federico Becattini; Tiberio Uricchio; Lorenzo Seidenari; Lamberto Ballan; Alberto Del Bimbo
In this article, we deal with the problem of predicting action progress in videos. We argue that this is an extremely important task, since it can be valuable for a wide range of interaction applications. To this end, we introduce a novel approach, named ProgressNet, capable of predicting when an action takes place in a video, where it is located within the frames, and how far it has progressed during
-
Correlation Discrepancy Insight Network for Video Re-identification ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 Weijian Ruan; Chao Liang; Yi Yu; Zheng Wang; Wu Liu; Jun Chen; Jiayi Ma
Video-based person re-identification (ReID) aims at re-identifying a specified person sequence from videos that were captured by disjoint cameras. Most existing works on this task ignore the quality discrepancy across frames by using all video frames to develop a ReID method. Additionally, they adopt only the person self-characteristic as the representation, which cannot adapt to cross-camera variation
-
Smart Scribbles for Image Matting ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 Xin Yang; Yu Qiao; Shaozhe Chen; Shengfeng He; Baocai Yin; Qiang Zhang; Xiaopeng Wei; Rynson W. H. Lau
Image matting is an ill-posed problem that usually requires additional user input, such as trimaps or scribbles. Drawing a fine trimap requires a large amount of user effort, while using scribbles can hardly obtain satisfactory alpha mattes for non-professional users. Some recent deep learning–based matting networks rely on large-scale composite datasets for training to improve performance, resulting
-
Depth Image Denoising Using Nuclear Norm and Learning Graph Model ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 Chenggang Yan; Zhisheng Li; Yongbing Zhang; Yutao Liu; Xiangyang Ji; Yongdong Zhang
Depth image denoising is becoming a hot research topic, because depth images reflect the three-dimensional scene and can be applied in various fields of computer vision. However, the depth images obtained from depth cameras usually contain stains such as noise, which greatly impairs the performance of depth-related applications. In this article, considering that group-based image restoration
-
Motion-Aware Structured Matrix Factorization for Foreground Detection in Complex Scenes ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 Lin Zhu; Xiurong Jiang; Jianing Li; Yuanhong Hao; Yonghong Tian
Foreground detection is one of the key steps in computer vision applications. Many foreground and background models have been proposed and achieved promising performance in static scenes. However, due to challenges such as dynamic background, irregular movement, and noise, most algorithms degrade sharply in complex scenes. To address the problem, we propose a motion-aware structured matrix factorization
-
Controlling Neural Learning Network with Multiple Scales for Image Splicing Forgery Detection ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 Yang Wei; Zhuzhu Wang; Bin Xiao; Ximeng Liu; Zheng Yan; Jianfeng Ma
Social stability depends on many aspects of life, and image information security, as one of them, is being subjected to various malicious attacks. As a means of information attack, image splicing forgery refers to copying some areas of an image to another image to hide the traces of the original information, and it leads to grave consequences. Image splicing forgery is extremely complex
-
Vertical Retargeting for Stereoscopic Images via Stereo Seam Carving ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 Kun Zeng; Jiangchuan Hu; Yongyi Gong; Kanoksak Wattanachote; Runpeng Yu; Xiaonan Luo
Vertical retargeting for stereoscopic images using seam manipulation-based approaches has remained an open challenge over the years. Even though horizontal retargeting has attracted a huge amount of interest, its seam coupling strategies are not capable of constructing valid seam pairs for vertical retargeting. In this article, we propose two seam coupling strategies for vertical retargeting, namely
-
Make Full Use of Priors: Cross-View Optimized Filter for Multi-View Depth Enhancement ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 Xin He; Qiong Liu; You Yang
Multi-view video plus depth (MVD) is a promising and widely adopted data representation for future 3D visual applications and interactive media. However, compression distortions in depth videos impede the development of such applications, and filters are crucially needed for quality enhancement at the terminal side. Cross-view priors can intuitively be involved in filter design, but these priors
-
Adaptive Attention-based High-level Semantic Introduction for Image Caption ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 Xiaoxiao Liu; Qingyang Xu
There have been several attempts to integrate a spatial visual attention mechanism into an image caption model and introduce semantic concepts as the guidance of image caption generation. High-level semantic information consists of the abstractedness and generality indication of an image, which is beneficial to improve the model performance. However, the high-level information is always static representation
-
Evaluation of Information Comprehension in Concurrent Speech-based Designs ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 Muhammad Abu Ul Fazal; Sam Ferguson; Andrew Johnston
In human-computer interaction, particularly in multimedia delivery, information is communicated to users sequentially, whereas users are capable of receiving information from multiple sources concurrently. This mismatch indicates that a sequential mode of communication does not utilise human perception capabilities as efficiently as possible. This article reports an experiment that investigated various
-
Learning a Deep Agent to Predict Head Movement in 360-Degree Images ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 Yucheng Zhu; Guangtao Zhai; Xiongkuo Min; Jiantao Zhou
Virtual reality adequately stimulates the senses to trick users into accepting the virtual environment. To create a sense of immersion, high-resolution images are required to satisfy the human visual system, and low latency is essential for smooth operation, which places great demands on data processing and transmission. Actually, when exploring the virtual environment, viewers only perceive the content in
-
MMFN: Multimodal Information Fusion Networks for 3D Model Classification and Retrieval ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 Weizhi Nie; Qi Liang; Yixin Wang; Xing Wei; Yuting Su
In recent years, research into 3D shape recognition in the field of multimedia and computer vision has attracted wide attention. With the rapid development of deep learning, various deep models have achieved state-of-the-art performance based on different representations. There are many modalities for representing a 3D model, such as point cloud, multiview, and panorama view. Deep learning models based
-
GuessUNeed: Recommending Courses via Neural Attention Network and Course Prerequisite Relation Embeddings ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 Zhongying Zhao; Yonghao Yang; Chao Li; Liqiang Nie
Massive Open Online Courses (MOOCs), offering millions of high-quality courses from prestigious universities and prominent experts, are gaining popularity. Although users enrolling on MOOCs have free access to abundant knowledge, they may easily get overwhelmed by information overload. Therefore, there is a need for recommendation technology as a fundamental and well-accepted effective solution
-
Knowledge-driven Egocentric Multimodal Activity Recognition ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 Yi Huang; Xiaoshan Yang; Junyu Gao; Jitao Sang; Changsheng Xu
Recognizing activities from egocentric multimodal data collected by wearable cameras and sensors is gaining interest, as multimodal methods always benefit from the complementarity of different modalities. However, since high-dimensional videos contain rich high-level semantic information while low-dimensional sensor signals describe simple motion patterns of the wearer, the large modality gap between
-
Part-based Structured Representation Learning for Person Re-identification ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 Yaoyu Li; Hantao Yao; Tianzhu Zhang; Changsheng Xu
Person re-identification aims to match a person of interest across non-overlapping camera views. Therefore, how to generate a robust and discriminative representation is crucial for person re-identification. Mining local clues from human body parts to describe pedestrians has been extensively studied in existing methods. However, existing methods locate human body parts coarsely and do not consider the
-
An LSH-based Offloading Method for IoMT Services in Integrated Cloud-Edge Environment ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2021-01-08 Xiaolong Xu; Qihe Huang; Yiwen Zhang; Shancang Li; Lianyong Qi; Wanchun Dou
Benefiting from the massive available data provided by Internet of multimedia things (IoMT), enormous intelligent services requiring information of various types to make decisions are emerging. Generally, the IoMT devices are equipped with limited computing power, interfering with the process of computation-intensive services. Currently, to satisfy a wide range of service requirements, the novel computing
-
Differentially Private Tensor Train Deep Computation for Internet of Multimedia Things ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-31 Nicholaus J. Gati; Laurence T. Yang; Jun Feng; Yijun Mo; Mamoun Alazab
The significant growth of the Internet of Things (IoT) plays a key and active role in healthcare, smart homes, smart manufacturing, and wearable gadgets. Owing to the complexity and difficulty of processing multimedia data, an IoT-based scheme, namely the Internet of Multimedia Things (IoMT), exists that is specialized for services and applications based on multimedia data. However, IoMT generated data are
-
MV2Flow: Learning Motion Representation for Fast Compressed Video Action Recognition ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-31 Hezhen Hu; Wengang Zhou; Xingze Li; Ning Yan; Houqiang Li
In video action recognition, motion is a very crucial clue, which is usually represented by optical flow. However, optical flow is computationally expensive to obtain, which becomes the bottleneck for the efficiency of traditional action recognition algorithms. In this article, we propose a network called MV2Flow to learn motion representation efficiently from the signals in the compressed domain.
-
Social-sensed Image Aesthetics Assessment ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-31 Chaoran Cui; Peiguang Lin; Xiushan Nie; Muwei Jian; Yilong Yin
Image aesthetics assessment aims to endow computers with the ability to judge the aesthetic values of images, and its potential has been recognized in a variety of applications. Most previous studies perform aesthetics assessment purely based on image content. However, given the fact that aesthetic perceiving is a human cognitive activity, it is necessary to consider users’ perception of an image when
-
Fog-based Secure Service Discovery for Internet of Multimedia Things: A Cross-blockchain Approach ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 Haoran Liang; Jun Wu; Xi Zheng; Mengshi Zhang; Jianhua Li; Alireza Jolfaei
The Internet of Multimedia Things (IoMT) has become the backbone of innumerable multimedia applications in various fields. The wide application of IoMT not only makes our life convenient but also brings challenges to service discovery. Service discovery aims to leverage location information and trust evidence scattered in a variety of multimedia applications to find trusted IoMT devices that can provide
-
Analysis of the Security of Internet of Multimedia Things ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 Zhihan Lv; Liang Qiao; Houbing Song
To study the security performance of the Internet of multimedia things on the privacy protection of user identity, behavior trajectory, and preference under the new information technology industry wave, in this study, aiming at the problems of the sharing of Internet of things perception data and the exposure of users’ privacy information, the Anonymous Batch Authentication Scheme (ABAH) for privacy
-
SDN-Assisted DDoS Defense Framework for the Internet of Multimedia Things ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 Kshira Sagar Sahoo; Deepak Puthal
The Internet of Things is visualized as a fundamental networking model that bridges the gap between the cyber and real-world entity. Uniting the real-world object with virtualization technology is opening further opportunities for innovation in nearly every individual’s life. Moreover, the usage of smart heterogeneous multimedia devices is growing extensively. These multimedia devices that communicate
-
Securing Multimedia by Using DNA-Based Encryption in the Cloud Computing Environment ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 Suyel Namasudra; Rupak Chakraborty; Abhishek Majumder; Nageswara Rao Moparthi
Today, the size of a multimedia file is increasing day by day from gigabytes to terabytes or even petabytes, mainly because of the evolution of a large amount of real-time data. As most of the multimedia files are transmitted through the internet, hackers and attackers try to access the users’ personal and confidential data without any authorization. Thus, maintaining a strong security technique has
-
Privacy Protection for Medical Data Sharing in Smart Healthcare ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 Liming Fang; Changchun Yin; Juncen Zhu; Chunpeng Ge; M. Tanveer; Alireza Jolfaei; Zehong Cao
In virtue of advances in smart networks and the cloud computing paradigm, smart healthcare is transforming. However, there are still challenges, such as storing sensitive data in untrusted and controlled infrastructure and ensuring the secure transmission of medical data, among others. The rapid development of watermarking provides opportunities for smart healthcare. In this article, we propose a new
-
Data Hiding: Current Trends, Innovation and Potential Challenges ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-12-16 A. K. Singh
With the widespread growth of digital information and improved internet technologies, the demand for improved information security techniques has significantly increased due to privacy leakage, identity theft, illegal copying, and data distribution. Because of this, data hiding approaches have received much attention in several application areas. However, those approaches are unable to solve many issues
-
Constrained LSTM and Residual Attention for Image Captioning ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-07-05 Liang Yang; Haifeng Hu; Songlong Xing; Xinlong Lu
Visual structure and syntactic structure are essential in images and texts, respectively. Visual structure depicts both entities in an image and their interactions, whereas syntactic structure in texts can reflect the part-of-speech constraints between adjacent words. Most existing methods either use visual global representation to guide the language model or generate captions without considering the
-
Deep Triplet Neural Networks with Cluster-CCA for Audio-Visual Cross-Modal Retrieval ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-07-14 Donghuo Zeng; Yi Yu; Keizo Oyama
Cross-modal retrieval aims to retrieve data in one modality by a query in another modality, which has been a very interesting research issue in the fields of multimedia, information retrieval, computer vision, and databases. Most existing works focus on cross-modal retrieval between text-image, text-video, and lyrics-audio. Little research addresses cross-modal retrieval between audio and video due
-
Multi-View Graph Matching for 3D Model Retrieval ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-07-05 Yu-Ting Su; Wen-Hui Li; Wei-Zhi Nie; An-An Liu
3D model retrieval has been widely utilized in numerous domains, such as computer-aided design, digital entertainment, and virtual reality. Recently, many graph-based methods have been proposed to address this task by using multi-view information of 3D models. However, these methods are always constrained by many-to-many graph matching for the similarity measure between pairwise models. In this article
-
Recurrent Attention Network with Reinforced Generator for Visual Dialog ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-07-05 Hehe Fan; Linchao Zhu; Yi Yang; Fei Wu
In Visual Dialog, an agent has to parse temporal context in the dialog history and spatial context in the image to hold a meaningful dialog with humans. For example, to answer “what is the man on her left wearing?” the agent needs to (1) analyze the temporal context in the dialog history to infer who is being referred to as “her,” (2) parse the image to attend “her,” and (3) uncover the spatial context
-
Attention-Based Modality-Gated Networks for Image-Text Sentiment Analysis ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-07-05 Feiran Huang; Kaimin Wei; Jian Weng; Zhoujun Li
Sentiment analysis of social multimedia data has attracted extensive research interest and has been applied to many tasks, such as election prediction and products evaluation. Sentiment analysis of one modality (e.g., text or image) has been broadly studied. However, not much attention has been paid to the sentiment analysis of multimodal data. Different modalities usually have information that is
-
Posed and Spontaneous Expression Distinction Using Latent Regression Bayesian Networks ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-07-14 Shangfei Wang; Longfei Hao; Qiang Ji
Facial spatial patterns can help distinguish between posed and spontaneous expressions, but this information has not been thoroughly leveraged by current studies. We present several latent regression Bayesian networks (LRBNs) to capture the patterns existing in facial landmark points and to use those points to differentiate posed from spontaneous expressions. The visible nodes of the LRBN represent
-
Upgrading the Newsroom: An Automated Image Selection System for News Articles ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-07-05 Fangyu Liu; Rémi Lebret; Didier Orel; Philippe Sordet; Karl Aberer
We propose an automated image selection system to assist photo editors in selecting suitable images for news articles. The system fuses multiple textual sources extracted from news articles and accepts multilingual inputs. It is equipped with char-level word embeddings to help both modeling morphologically rich languages, e.g., German, and transferring knowledge across nearby languages. The text encoder
-
3D Facial Similarity Measurement and Its Application in Facial Organization ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-07-05 Chenlei Lv; Zhongke Wu; Xingce Wang; Mingquan Zhou
We propose a novel framework for 3D facial similarity measurement and its application in facial organization. The construction of the framework is based on Kendall shape space theory. Kendall shape space is a quotient space that is constructed by shape features. In Kendall shape space, the shape features can be measured and are robust to similarity transformations. In our framework, a 3D face is represented
-
Image Captioning with a Joint Attention Mechanism by Visual Concept Samples ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-07-05 Jin Yuan; Lei Zhang; Songrui Guo; Yi Xiao; Zhiyong Li
The attention mechanism has been established as an effective method for generating caption words in image captioning; it explores one noticed subregion in an image to predict a related caption word. However, even though the attention mechanism could offer accurate subregions to train a model, the learned captioner may predict incorrectly, especially for visual concept words, which are the most important
-
Improving Multiperson Pose Estimation by Mask-aware Deep Reinforcement Learning ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-07-05 Xun Wang; Yan Tian; Xuran Zhao; Tao Yang; Judith Gelernter; Jialei Wang; Guohua Cheng; Wei Hu
Research on single-person pose estimation based on deep neural networks has recently witnessed progress in both accuracy and execution efficiency. However, multiperson pose estimation is still a challenging topic, partially because the object regions are selected greedily from proposals via class-agnostic nonmaximum suppression (NMS), and the misalignment in the redundant detection yields inaccurate
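For readers unfamiliar with the greedy selection step criticized above, here is a minimal sketch of class-agnostic non-maximum suppression. It illustrates the standard baseline procedure only, not the mask-aware method proposed in the article; the helper names `iou` and `greedy_nms`, the `(x1, y1, x2, y2)` box format, and the 0.5 threshold are assumptions chosen for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first.

    A proposal is kept only if it does not overlap any already kept
    proposal by more than `iou_threshold`. Class labels are ignored.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two heavily overlapping person proposals and one distant proposal.
proposals = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
confidences = [0.9, 0.8, 0.7]
kept = greedy_nms(proposals, confidences)
```

Here the second proposal is suppressed because its IoU with the top-scoring box is about 0.68. In a crowded scene, that same rule can discard a box that belongs to a different, partially occluded person, which is the kind of greedy-selection weakness the abstract points to.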
-
Learning Joint Structure for Human Pose Estimation ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-07-05 Shenming Feng; Haifeng Hu
Recently, tremendous progress has been achieved on human pose estimation with the development of convolutional neural networks (CNNs). However, current methods still suffer from severe occlusion, back view, and large pose variation due to the lack of consideration of the spatial relationship between different joints, which can provide strong cues for localizing the hidden keypoints. In this work, we
-
Single-stage Instance Segmentation ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-07-05 Feng Lin; Bin Li; Wengang Zhou; Houqiang Li; Yan Lu
Although the highest accuracy in object detection is generally achieved by multi-stage detectors, such as R-CNN and its extensions, single-stage object detectors also achieve remarkable performance with faster execution and higher scalability. Inspired by this, we propose a single-stage framework to tackle the instance segmentation task. Building on a single-stage object detection network
-
Few-shot Food Recognition via Multi-view Representation Learning ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-07-14 Shuqiang Jiang; Weiqing Min; Yongqiang Lyu; Linhu Liu
This article considers the problem of few-shot learning for food recognition. Automatic food recognition can support various applications, e.g., dietary assessment and food journaling. Most existing works focus on food recognition with large numbers of labelled samples, and fail to recognize food categories with few samples. To address this problem, we propose a Multi-View Few-Shot Learning (MVFSL)
-
Sketch-guided Deep Portrait Generation ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-07-05 Trang-Thi Ho; John Jethro Virtusio; Yung-Yao Chen; Chih-Ming Hsu; Kai-Lung Hua
Generating a realistic human class image from a sketch is a unique and challenging problem considering that the human body has a complex structure that must be preserved. Additionally, input sketches often lack important details that are crucial in the generation process, hence making the problem more complicated. In this article, we present an effective method for synthesizing realistic images from
-
Design, Analysis, and Implementation of Efficient Framework for Image Annotation ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-07-05 Gargi Srivastava; Rajeev Srivastava
In this article, a general framework of image annotation is proposed by involving salient object detection (SOD), feature extraction, feature selection, and multi-label classification. For SOD, Augmented-Gradient Vector Flow (A-GVF) is proposed, which fuses benefits of GVF and Minimum Directional Contrast. The article also proposes to control the background information to be included for annotation
-
Kernel Attention Network for Single Image Super-Resolution ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-07-05 Dongyang Zhang; Jie Shao; Heng Tao Shen
Recently, attention mechanisms have shown a developing tendency toward convolutional neural network (CNN), and some representative attention mechanisms, i.e., channel attention (CA) and spatial attention (SA) have been fully applied to single image super-resolution (SISR) tasks. However, the existing architectures directly apply these attention mechanisms to SISR without much consideration of the nature
-
Blind Image Quality Assessment by Natural Scene Statistics and Perceptual Characteristics ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-08-25 Yutao Liu; Ke Gu; Xiu Li; Yongbing Zhang
Opinion-unaware blind image quality assessment (OU BIQA) refers to establishing a blind quality prediction model without using the expensive subjective quality scores, which is a highly promising direction in the BIQA research. In this article, we focus on OU BIQA and propose a novel OU BIQA method. Specifically, in our proposed method, we deeply investigate the natural scene statistics (NSS) and the
-
A Unified Tensor Framework for Clustering and Simultaneous Reconstruction of Incomplete Imaging Data ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-08-25 Jobin Francis; Baburaj M; Sudhish N George
Incomplete observations in the data are always troublesome to data clustering algorithms. In fact, most of the well-received techniques are not designed to encounter such imperative scenarios. Hence, clustering of images under incomplete samples is an inquisitive yet unaddressed area of research. Therefore, the aim of this article is to design a single-stage optimization procedure for clustering as
-
Introduction to the Special Issue on Smart Communications and Networking for Future Video Surveillance ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-06-20 Honghao Gao; Yudong Zhang
No abstract available.
-
Smart Diagnosis: A Multiple-Source Transfer TSK Fuzzy System for EEG Seizure Identification ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-06-20 Yizhang Jiang; Xiaoqing Gu; Dingcheng Ji; Pengjiang Qian; Jing Xue; Yuanpeng Zhang; Jiaqi Zhu; Kaijian Xia; Shitong Wang
To effectively identify electroencephalogram (EEG) signals in multiple-source domains, a multiple-source transfer learning-based Takagi–Sugeno–Kang (TSK) fuzzy system (FS), called MST-TSK, is proposed, which combines multiple-source transfer learning and manifold regularization (MR) learning mechanisms together into the TSK-FS framework. Specifically, the advantages of MST-TSK include the following:
-
DenseNet-201-Based Deep Neural Network with Composite Learning Factor and Precomputation for Multiple Sclerosis Classification ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-06-20 Shui-Hua Wang; Yu-Dong Zhang
(Aim) Multiple sclerosis is a neurological condition that may cause neurologic disability. Convolutional neural networks (CNNs) can achieve good results, but tuning the hyperparameters of a CNN requires expert knowledge and is difficult and time-consuming. To identify multiple sclerosis more accurately, this article proposes a new transfer-learning-based approach. (Method) DenseNet-121, DenseNet-169, and DenseNet-201
-
Cross-Domain Brain CT Image Smart Segmentation via Shared Hidden Space Transfer FCM Clustering ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-06-20 Kaijian Xia; Hongsheng Yin; Yong Jin; Shi Qiu; Hongru Zhao
Clustering is an important issue in brain medical image segmentation. Original medical images used for clinical diagnosis are often insufficient for clustering in the current domain. As there are sufficient medical images in the related domains, transfer clustering can improve the clustering performance of the current domain by transferring knowledge across the related domains. In this article, we
-
Spatio-Temporal Deep Residual Network with Hierarchical Attentions for Video Event Recognition ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-06-20 Yonggang Li; Chunping Liu; Yi Ji; Shengrong Gong; Haibao Xu
Event recognition in surveillance video has gained extensive attention from the computer vision community. This process still faces enormous challenges due to the tiny inter-class variations that are caused by various facets, such as severe occlusion, cluttered backgrounds, and so forth. To address these issues, we propose a spatio-temporal deep residual network with hierarchical attentions (STDRN-HA)
-
Modeling Long-Term Dependencies from Videos Using Deep Multiplicative Neural Networks ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-07-14 Wen Si; Cong Liu; Zhongqin Bi; Meijing Shan
Understanding temporal dependencies of videos is fundamental for vision problems, but deep learning–based models are still insufficient in this field. In this article, we propose a novel deep multiplicative neural network (DMNN) for learning hierarchical long-term representations from video. The DMNN is built upon the multiplicative block that remembers the pairwise transformations between consecutive
-
Proposal Complementary Action Detection ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-06-20 Suguo Zhu; Xiaoxian Yang; Jun Yu; Zhenying Fang; Meng Wang; Qingming Huang
Temporal action detection not only requires correct classification but also needs to detect the start and end times of each action accurately. However, traditional approaches always employ sliding windows or actionness to predict the actions, and it is difficult to train a model with sliding windows or actionness in an end-to-end manner. In this article, we attempt a different idea to detect the actions
-
A New Transfer Function for Volume Visualization of Aortic Stent and Its Application to Virtual Endoscopy ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-06-20 Chenxi Huang; Yisha Lan; Guokai Zhang; Gaowei Xu; Landu Jiang; Nianyin Zeng; Jenhong Tan; E. Y. K. Ng; Yongqiang Cheng; Ningzhi Han; Rongrong Ji; Yonghong Peng
Aortic stent has been widely used in restoring vascular stenosis and assisting patients with cardiovascular disease. The effective visualization of aortic stent is considered to be critical to ensure the effectiveness and functions of the aortic stent in clinical practice. Volume rendering with ray casting has been used as an effective approach to enable the effective visualization of aortic stent
-
Introduction to the Best Papers from the ACM Multimedia Systems (MMSys) 2019 and Co-Located Workshops ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-06-20 Michael Zink; Laura Toni; Ali C. Begen
No abstract available.
-
A Practical Learning-based Approach for Viewer Scheduling in the Crowdsourced Live Streaming ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-06-20 Rui-Xiao Zhang; Ming Ma; Tianchi Huang; Haitian Pang; Xin Yao; Chenglei Wu; Lifeng Sun
Scheduling viewers effectively among different Content Delivery Network (CDN) providers is challenging owing to the extreme diversity in the crowdsourced live streaming (CLS) scenarios. Abundant algorithms have been proposed in recent years, which, however, suffer from a critical limitation: Due to their inaccurate feature engineering or naive rules, they cannot optimally schedule viewers. To address
-
QoE-Fair DASH Video Streaming Using Server-side Reinforcement Learning ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-06-20 Sa’di Altamimi; Shervin Shirmohammadi
To design an optimal adaptive video streaming method, video service providers need to consider both the efficiency and the fairness of the Quality of Experience (QoE) of their users. In Reference [8], we proposed a server-side QoE-fair rate adaptation method that considers both efficiency and fairness of the QoE. The server uses Reinforcement Learning (RL) to select a bitrate for each client sharing
-
Performance Analysis of ACTE: A Bandwidth Prediction Method for Low-latency Chunked Streaming ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-06-20 Abdelhak Bentaleb; Christian Timmerer; Ali C. Begen; Roger Zimmermann
HTTP adaptive streaming with chunked transfer encoding can offer low-latency streaming without sacrificing the coding efficiency. This allows media segments to be delivered while still being packaged. However, conventional schemes often make widely inaccurate bandwidth measurements due to the presence of idle periods between the chunks, which causes sub-optimal adaptation decisions. To
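The effect of idle periods on bandwidth measurement can be shown with a small arithmetic sketch. This is an illustration of the problem only, not the ACTE method; the chunk sizes, timestamps, and function names below are hypothetical values invented for the example.

```python
def segment_level_throughput_bps(chunks):
    """Conventional estimate: total bits divided by wall-clock segment download time.

    Each chunk is (size_bytes, start_time_s, end_time_s). The elapsed time
    includes the idle gaps in which the client waits for the next chunk.
    """
    total_bits = sum(size * 8 for size, _, _ in chunks)
    elapsed = chunks[-1][2] - chunks[0][1]
    return total_bits / elapsed


def active_time_throughput_bps(chunks):
    """Estimate that counts only the time during which bytes were actually arriving."""
    total_bits = sum(size * 8 for size, _, _ in chunks)
    active = sum(end - start for _, start, end in chunks)
    return total_bits / active


# Hypothetical segment: four 250 kB chunks, each received in 50 ms,
# with the encoder releasing a new chunk every 500 ms.
chunks = [
    (250_000, 0.0, 0.05),
    (250_000, 0.5, 0.55),
    (250_000, 1.0, 1.05),
    (250_000, 1.5, 1.55),
]
naive = segment_level_throughput_bps(chunks)
active = active_time_throughput_bps(chunks)
```

With these numbers the segment-level estimate is roughly 5.2 Mbit/s, while the link delivered each chunk at about 40 Mbit/s. A player relying on the first figure would pick a far lower bitrate than the network could sustain.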
-
Evaluation of Shared Resource Allocation Using SAND for ABR Streaming ACM Trans. Multimed. Comput. Commun. Appl. (IF 3.275) Pub Date : 2020-07-10 Stefan Pham; Patrick Heeren; Calvin Schmidt; Daniel Silhavy; Stefan Arbanowski
Adaptive bitrate media streaming clients adjust the quality of media content depending on the current network conditions. The shared resource allocation (SRA) feature defined in MPEG-SAND (server and network assisted DASH) allows servers to allocate bandwidth to streaming clients. This enables coordination and prioritization of clients that are connected to the same network bottleneck (e.g., to maximize