Current journal: Multimedia Systems
  • Towards a real-time image/video cryptosystem: problems, analysis and recommendations
    Multimedia Syst. (IF 1.956) Pub Date : 2020-01-20
    Tasnime Omrani, Rabei Becheikh, Rhouma Rhouma

    Abstract With the massive use of images over the Internet, the security of some of them has become critical. One of the most used encryption algorithms is AES. Although this algorithm provides high security for text data, it is not suitable for images in view of their characteristics including correlation and redundancy. To have a more suitable and efficient algorithm for the encryption of images, we have deeply studied the features of this kind of data as well as the weaknesses of the well-known cryptosystems designed for images. In addition, we have examined the common building blocks used in the design of a “good” image cipher. Based on this analysis, we propose some recommendations to construct a well-secured and efficient cryptosystem for images that can be easily extended or adapted to video streams.

    Updated: 2020-01-21
  • Combining CNN streams of dynamic image and depth data for action recognition
    Multimedia Syst. (IF 1.956) Pub Date : 2020-01-14
    Roshan Singh, Rajat Khurana, Alok Kumar Singh Kushwaha, Rajeev Srivastava

    Abstract RGB-D sensors have been in great demand due to their capability of producing large amounts of multimodal data, such as RGB images and depth maps, useful for better training of deep learning models. In this paper, a deep learning model for recognizing human activities in a video sequence by combining multiple CNN streams is proposed. The proposed work uses dynamic images generated from RGB images and from the depth map for three different dimensions. The proposed model is trained using these four streams on VGG Net for action recognition. Further, it is evaluated and compared with other state-of-the-art methods available in the literature on three challenging datasets, namely MSR daily Activity, UTD MHAD and CAD 60, in terms of accuracy, error, recall, specificity, precision and f-score. The results show that the proposed method outperforms the other methods.

    Updated: 2020-01-15
  • Con(dif)fused voice to convey secret: a dual-domain approach
    Multimedia Syst. (IF 1.956) Pub Date : 2020-01-10
    C. Lakshmi, Vasudharini Moranam Ravi, K. Thenmozhi, John Bosco Balaguru Rayappan, Rengarajan Amirtharajan

    Abstract Owing to the rapid development and advancements in the field of networks and communication, sharing of multimedia contents over insecure networks has become vital. The confidentiality of audio signals is predominantly needed in military and intelligence bureau applications. The proposed algorithm addresses this issue by encrypting the audio signal using chaotic maps in the spatial and transform domains. Discrete Fourier transform (DFT), discrete cosine transform (DCT) and integer wavelet transform (IWT) approaches are considered for the experiment. The algorithm involves three-layer security of confusion and diffusion in the spatial domain, and confusion in the transform domain. The confusion in the transform domain is equivalent to diffusion in the spatial domain. Different sizes of audio samples are considered to validate the effectiveness of the proposed scheme. Experimental results show that the DFT-assisted encryption scheme is more efficient than the DCT- and IWT-based methods because the DFT scheme employs effective diffusion through reversible phase coding. Effectiveness of the proposed method is substantiated using various metrics. Correlation coefficients are close to zero, the number of samples change rate (NSCR) is 100% and the scrambling degree is close to 1. Besides, the proposed scheme has a keyspace larger than 2^128. Thus, the proposed algorithm can withstand statistical, differential and brute force attacks.

    Updated: 2020-01-11
  • A context-aware mobile application framework using audio watermarking
    Multimedia Syst. (IF 1.956) Pub Date : 2020-01-09
    Yusuf Yaslan, Bilge Gunsel

    In this paper, we propose a proximity-based indoor positioning system which is capable of monitoring a mobile device user’s indoor location where the commonly used GPS signal is unavailable or weak. The designed system is intended to be integrated into a context-aware communications system to prevent transmission of irrelevant content to all users while easing delivery of location-based information. Similar to the beacon technology that assigns a code to each targeted position in an indoor location, our system labels the locations with audio watermark codes, and the user’s mobile device monitors and receives the watermarked audio. The proposed encoder performs code-division multiplexing that allows insertion of several location indexes into the same audio file. The watermark, embedded through spread spectrum, improves robustness to noise and guarantees satisfactory performance even though the mobile device has a low-band microphone. The designed decoder establishes synchronization between the mobile device and the watermarked audio emitter in real time, and extracts the embedded watermark code words assigned to specific indoor locations. This invokes the context-aware content delivery module and the delivery is initiated. Position displacements of the mobile users are estimated by the time-of-flight technique and the users moving within the coverage range of the emitters are continuously monitored. Decoding is achieved in real time, which enables the mobile users to reach content delivered from different emitters within their coverage range. Performance tests demonstrate that the developed system can estimate the user position within 7 m of the emitter while remaining inaudible. We reached 2-m spatial resolution in discriminating between different emitters. The proposed framework can be considered a promising alternative to the latest technologies, i.e., Wi-Fi-based fingerprinting systems or beacons.

    Updated: 2020-01-09
  • Keyframe extraction using Pearson correlation coefficient and color moments
    Multimedia Syst. (IF 1.956) Pub Date : 2019-12-18
    Reddy Mounika Bommisetty, Om Prakash, Ashish Khare

    Abstract Keyframe extraction plays a significant role in wide variety of real-time video processing applications such as video summarization, video management and retrieval, etc. A keyframe captures the whole content of its shot and does not contain any redundant information. The keyframe extraction algorithms are facing challenges due to different visual characteristics in videos of different categories. Therefore, a single feature is not enough to capture visual characteristics of a variety of videos. In order to tackle this problem, we propose an approach of keyframe extraction that uses hybridization of features. In the present article, we propose a novel shot detection-based keyframe extraction algorithm based on combination of two features: one is Pearson correlation coefficient (PCC) and other is color moments (CM). The linear transformation invariance property of PCC facilitates the proposed algorithm to work well under varying lighting conditions. On the other hand, the scale and rotation invariance properties of color moments are beneficial for representation of complex objects that may be present in different poses and orientations. These sustained reasons support the combination of these two features, which brings significant benefits for keyframe extraction in the proposed method. The proposed method detects shot boundaries by employing combo feature set (PCC and CM). From each shot, the frame with highest mean and standard deviation is selected as keyframe. Furthermore, another important contribution is that we developed a new dataset by collecting the videos of different categories such as movies, news, serials, animations and personal interviews and made it available online. The proposed method is experimented on three datasets: two publicly available datasets and one dataset developed by us. 
    The performance of the proposed method on these datasets has been evaluated using several evaluation parameters: figure of merit, detection percentage, accuracy, and missing factor. The principal advantage of the proposed work is that it can detect both abrupt and gradual shot transitions. In real-time videos, it is common to have abrupt and small transitions. The experimental results show the superior performance of the proposed method over the other state-of-the-art methods.

    Updated: 2020-01-04
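
    The keyframe extraction abstract above relies on the Pearson correlation coefficient (PCC) being invariant to linear transformations of pixel intensity. The following is a minimal sketch of that idea only, not the authors' algorithm: frames are assumed to be flat lists of grayscale values, the function names and the threshold of 0.5 are hypothetical, and the color-moment feature is left out.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # constant frame: correlation is undefined, treat as 0
    return cov / math.sqrt(var_a * var_b)


def shot_boundaries(frames, threshold=0.5):
    """Indices i where frame i is weakly correlated with frame i - 1.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return [
        i
        for i in range(1, len(frames))
        if pearson(frames[i - 1], frames[i]) < threshold
    ]
```

    A brightness or contrast change of the form 2 * x + 10 leaves the PCC at 1, so under this sketch a lighting change alone would not be flagged as a shot boundary.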
  • Lifetime-aware solid-state disk (SSD) cache management for video servers
    Multimedia Syst. (IF 1.956) Pub Date : 2019-05-27
    Jungwoo Lee, Hwangje Han, Sungjin Lee, Minseok Song

    Abstract Solid-state disks (SSDs) are now being used as enterprise storage servers owing to their technical merits such as low power consumption, shock resistance, and excellent random read performance. To handle the large storage requirements for video data, they can be used as a cache for hard disk drives (HDDs) in video servers, but this poses several questions such as (1) which video segments can be cached on SSD, (2) how to guarantee the lifetime of SSD, and (3) how to make combined use of dynamic random-access memory (DRAM) and SSD for caching. We start by introducing the concept of caching gain to express the amount of disk bandwidth saved by caching, and go on to propose three algorithms: (1) a dynamic programming algorithm that allows for segment popularity in determining which videos should have initial segments (prefixes) stored on the SSD; (2) a throttling algorithm, which limits the number of cache replacements to guarantee the specified lifetime while maximizing caching gain using a parametric search technique; (3) an algorithm that determines the intervals between pairs of consecutive requests to be stored on the DRAM. We quantitatively explore the effect of this caching scheme through simulations, which show that: (1) prefix caching is quite effective for disk bandwidth saving, (2) our throttling algorithm guarantees the lifetime of the SSD, and (3) DRAM caching can be effectively combined with SSD caching with the aim of maximizing overall caching gain.

    Updated: 2020-01-04
  • Toward efficient indexing structure for scalable content-based music retrieval
    Multimedia Syst. (IF 1.956) Pub Date : 2019-04-27
    Jialie Shen, Mei Tao, Qiang Qu, Dacheng Tao, Yong Rui

    With advancement of various information processing and storage techniques, the scale of digital music collections has been growing at very fast speed during recent decades. To support high-quality content-based retrieval over such a large volume of music data, how to develop indexing structure with good effectiveness, efficiency and scalability becomes an important research issue. However, existing techniques mainly focus on improving query efficiency. Very few approaches have been proposed to address issues related to scalability and accuracy. In this study, we address the problem via introducing a novel indexing technique called effective music indexing framework (EMIF) to facilitate scalable and accurate music retrieval. It is designed based on a “classification-and-indexing” principle and consists of two main functionality modules: (1) music classification—a novel semantic-sensitive classification to identify an input song’s category and (2) indexing module—multiple local indexing structures, one for each semantic category to reduce query response time significantly. In particular, the classification model combining linear discriminative mixture model (LDMM) and advanced score fusion scheme has been applied to estimate category of music accurately. Layered architecture enables EMIF to enjoy superior scalability and efficiency. To evaluate the approach, a set of experimental studies has been carried out using two large music test collections and the results demonstrate various advantages of EMIF over state-of-the-art approaches including efficiency, scalability and effectiveness.

    Updated: 2020-01-04
  • A novel depth perception prediction metric for advanced multimedia applications
    Multimedia Syst. (IF 1.956) Pub Date : 2019-06-28
    Gokce Nur Yilmaz

    Abstract Ubiquitous multimedia applications diffuse our everyday life activities which appreciate their significance about improving our experiences. Therefore, proliferation of the multimedia applications enhancing these experiences needs critical attention of the researchers. Considering this motivation, to overcome the possible barrier of the proliferation of the 3D video-related multimedia applications providing enhanced quality of experience (QoE) to the end users, an objective metric is proposed in this study. The proposed metric tackles the depth perception prediction part reflecting the most important aspect of the 3D video QoE from the user point of view. Considering that the no reference metric type is the most effective one compared to its counterparts, the proposed metric is developed based on this type. In the light of the envision that human visual system-related cues have critical importance on developing accurate metrics, the focus of the proposed metric is directed on the association of the z-direction motion and stereopsis depth cues in the metric development. These cues are derived from the depth map contents having stressed significant depth levels. In addition, the analysis results of the conducted subjective experiments which are currently the “gold standards” for the reliable depth perception prediction are incorporated with the proposed metric. Considering the effective correlation coefficient and root mean square error performance assessment results taken using the proposed metric in comparison to the widely exploited quality assessment metrics in literature, it can be clearly stated that the development of the improved 3D video multimedia applications can be accelerated using it.

    Updated: 2020-01-04
  • Multi-guiding long short-term memory for video captioning
    Multimedia Syst. (IF 1.956) Pub Date : 2018-11-10
    Ning Xu, An-An Liu, Weizhi Nie, Yuting Su

    Recently, research interest has turned to using a recurrent neural network (RNN) as the decoder in the video captioning task. However, the generated sentence seems to “lose track” of the video content due to the fixed language rule. Though existing methods try to “guide” the decoder and keep it “on track”, they mainly rely on a single-modal feature that does not fit the multi-modal (visual and semantic) and complementary (local and global) nature of the video captioning task. To this end, we propose the multi-guiding long short-term memory (mg-LSTM), an extension of the LSTM network for video captioning. We add global information (i.e., detected attributes) and local information (i.e., appearance features) extracted from the video as extra input to each cell of the LSTM, with the aim of collaboratively guiding the model towards solutions that are more tightly coupled to the video content. In particular, the appearance and attribute features are first used to produce local and global guiders, respectively. We propose a novel cell-wise ensemble, where the weight matrix of each cell of the LSTM is extended to be a set of attribute-dependent and attention-dependent weight matrices, by which the guiders induce each cell optimization over time. Extensive experiments on three benchmark datasets (i.e., MSVD, MSR-VTT, and MPII-MD) show that our method can achieve competitive results against the state of the art. Additional ablation studies are conducted on variants of the proposed mg-LSTM.

    Updated: 2020-01-04
  • Panorama based on multi-channel-attention CNN for 3D model recognition
    Multimedia Syst. (IF 1.956) Pub Date : 2019-02-07
    Weizhi Nie, Kun Wang, Qi Liang, Roubing He

    With the development of 3D model reconstruction, manufacturing, and 3D model vision technologies, 3D model recognition has attracted much attention recently. To handle the 3D model recognition problem, in this paper, we propose a panorama-based multi-channel-attention (MCA) CNN for the representation of the 3D model. The proposed method is composed of three parts: extracting views, transform function learning, and generating the 3D model descriptor. Concretely, we first extract the 2D panoramic views for each 3D model, and we use the multi-channel-attention neural network to extract the descriptor for each 3D model. Here, the attention model is used to find the unequal weights of each panorama view to generate a more robust 3D model descriptor. Finally, the fusion feature is used to handle the 3D model classification and retrieval problem. The popular data sets ModelNet and ShapeNet are used to demonstrate the performance of our approach. The experiments also demonstrate the superiority of our proposed method over the state-of-the-art methods.

    Updated: 2020-01-04
  • Deep learning-based automatic downbeat tracking: a brief review
    Multimedia Syst. (IF 1.956) Pub Date : 2019-03-12
    Bijue Jia, Jiancheng Lv, Dayiheng Liu

    Abstract As an important form of multimedia, music is part of almost everyone’s life. Automatic analysis of music is a significant step toward satisfying people’s need for effortless music retrieval and music recommendation. In particular, downbeat tracking has been a fundamental and persistent problem in the Music Information Retrieval (MIR) area. Despite significant research efforts, downbeat tracking remains a challenge. Previous studies either focus on feature engineering (extracting certain features by signal processing, which yields semi-automatic solutions) or have limitations: they can only model music audio recordings within limited time signatures and tempo ranges. Recently, deep learning has surpassed traditional machine learning methods and has become the primary approach to feature learning; the combination of traditional and deep learning methods has also improved performance. In this paper, we begin with a background introduction to the downbeat tracking problem. Then, we give detailed discussions of the following topics: system architecture, feature extraction, deep neural network algorithms, data sets, and evaluation strategy. In addition, we take a look at the results from the annual benchmark evaluation, the Music Information Retrieval Evaluation eXchange, as well as the developments in software implementations. Although much has been achieved in the area of automatic downbeat tracking, some problems remain. We point out these problems and conclude with possible directions and challenges for future research.

    Updated: 2020-01-04
  • Crowdsourced subjective 3D video quality assessment
    Multimedia Syst. (IF 1.956) Pub Date : 2019-05-22
    Emil Dumic, Kresimir Sakic, Luis A. da Silva Cruz

    Abstract This article proposes a new method for subjective 3D video quality assessment based on crowdsourced workers—Crowd3D. The limitations of traditional laboratory-based grade collection procedures are outlined, and their solution through the use of a crowd-based approach is described. Several conceptual and technical requirements of crowd-based 3D video quality assessment methods are identified and the solutions adopted described in detail. The system built takes the form of a web-based platform that supports 3D video monitors, and orchestrates the entire process of observer validation, content presentation and quality, depth, and comfort grade recording in a remote database. The crowdsourced subjective 3D quality assessment system uses as source contents a set of 3D video and grades database assembled earlier in a laboratory setting. To evaluate the validity of the crowd-based approach, the grades gathered using the crowdsourced system were analysed and compared to a set of grades obtained in laboratory settings using the same data set. Results show that it is possible to obtain Pearson’s and Spearman’s correlation up to 0.95 for quality Difference Mean Opinion Score and 0.96 for quality Mean Opinion Score, when comparing with laboratory grades. Apart from the present study, the 3D video quality assessment platform proposed can be used with advantage for further related research activities, reducing the time and cost compared to the traditional laboratory-based quality assessments.

    Updated: 2020-01-04
  • Analyzing autostereoscopic environment configurations for the design of videogames
    Multimedia Syst. (IF 1.956) Pub Date : 2019-05-28
    José Martínez Sotoca, Miguel Chover, Inmaculada Remolar, Ricardo Loreto

    Stereoscopic devices are becoming more popular every day. The 3D visualization that these displays offer is being used by videogame designers to enhance the user’s game experience. Autostereoscopic monitors offer the possibility of obtaining this 3D visualization without the need for an extra device. This makes them more attractive to videogame developers. However, the configuration of the cameras that make it possible to obtain an immersive 3D visualization inside the game is still an open problem. In this paper, some system configurations that create autostereoscopic visualization in a 3D game engine were evaluated to obtain a good accommodation of the user experience with the game. To achieve this, user tests that take into account the movement of the player were carried out to evaluate different camera configurations, namely, dynamic and static converging optical axis and parallel optical axis. The purpose of these tests is to evaluate the user experience regarding visual discomfort resulting from the movement of the objects, in order to assess the preference for one configuration or another. The results show that users tend to prefer the parallel optical axis configuration. This configuration seems to be optimal because the area where the moving objects are focused is deeper than in the other configurations.

    Updated: 2020-01-04
  • Fashion clothes matching scheme based on Siamese Network and AutoEncoder
    Multimedia Syst. (IF 1.956) Pub Date : 2019-05-16
    Guangyu Gao, Liling Liu, Li Wang, Yihang Zhang

    Owing to the rise of living standard, people attach greater importance to personal appearance, especially clothes matching. With image processing and machine learning technology, we can analyze the pattern of clothes matching for recommendation on clothes images. However, we still face great challenges. To be more specific, there exist excessive complicated factors influencing relation among clothes items, such as color or material, and we also struggle against the problem about how to extract efficient and accurate features. Thus, with the purpose of dealing with such challenges, this paper proposes an efficient clothes matching scheme with Siamese Network and AutoEncoder based on both labeled data from dataset FashionVC and unlabeled data from MicroBlog. More specifically, at first, except for clothes suiting with text from FashionVC, the gallery data also include matching clothes outfits recommended by fashionista in MicroBlog (MbFashion). Meanwhile, a semi-supervised clustering based on assembling was also proposed to generate negative samples to form a comprehensive dataset. Secondly, with consideration of matching patterns from MbFashion, we promoted the Siamese Network properly to more efficiently extract vision features on the constructed training dataset. After that, the traditional features are also extracted, while the Triple AutoEncoder and Bayesian Personalized Ranking are used to map the three kinds of features into the same latent space to learn the compatibility between tops and bottoms. Finally, we conducted a series of experiments and evaluated our results to demonstrate the usefulness and effectiveness of the whole scheme on FashionVC and MbFashion.

    Updated: 2020-01-04
  • An app usage recommender system: improving prediction accuracy for both warm and cold start users
    Multimedia Syst. (IF 1.956) Pub Date : 2019-01-10
    Di Han, Jianqing Li, Wenting Li, Ruibin Liu, Hai Chen

    Abstract It is becoming increasingly difficult to find a particular app on a smartphone due to the increasing number of apps installed. Consequently, it is important to be able to quickly and accurately predict the next app to be used. Two problems arise in predicting next-app usage from the app usage history. One is that some algorithms do not consider the increasing amount of training data available over time, which causes the prediction accuracy to decrease over time. The other is that although some algorithms do consider the aggregation of training data over time, they rebuild their models using all historical data once the amount of new data has reached a certain limit, thus greatly increasing the remodeling time. To reduce the remodeling time, we utilize a modified incremental k-nearest neighbours (IkNN) algorithm to implement a recommender system called Predictor. When the IkNN is used for predicting next-app usage, a new problem arises: when modeling the training data, the classification accuracy decreases as the number of app features increases. After studying the relationships among the contextual features of apps, we design a cluster effective value (CEV), which can compensate for the error induced by multidimensional features, to improve the classification accuracy. It is shown that the IkNN algorithm with the CEV achieves a higher and more stable prediction accuracy than the algorithm without the CEV. Furthermore, we propose a Cold Start strategy: an efficient dynamic collaborative filtering fusion algorithm that provides app Cold Start prediction. Large-scale experiments show that Predictor offers a reduced remodeling time and an improved prediction accuracy.

    Updated: 2020-01-04
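
    The point of an incremental k-nearest-neighbours model in the abstract above is that new usage records are added without rebuilding the model from all historical data. The following is a minimal sketch of a plain incremental kNN classifier under that reading. It is not the authors' modified IkNN and does not include the cluster effective value (CEV); the class name, the feature encoding (hour of day, a Wi-Fi flag) and the app labels are all hypothetical.

```python
from collections import Counter


class IncrementalKNN:
    """Plain kNN whose training set grows one sample at a time."""

    def __init__(self, k=3):
        self.k = k
        self.samples = []  # list of (feature tuple, label)

    def add(self, features, label):
        # Adding a sample is an append; nothing is retrained.
        self.samples.append((tuple(features), label))

    def predict(self, features):
        if not self.samples:
            return None

        def squared_distance(sample):
            return sum((a - b) ** 2 for a, b in zip(sample[0], features))

        nearest = sorted(self.samples, key=squared_distance)[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]
```

    Prediction here scans every stored sample, so the cost moves from remodeling time to query time; a real system would need an index or a bounded history.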
  • Reconstruction of occluded ROI in multi-person gait based on numerical methods
    Multimedia Syst. (IF 1.956) Pub Date : 2019-11-20
    Jasvinder Pal Singh, Sanjeev Jain, Sakshi Arora, Uday Pratap Singh

    Abstract Occlusion is an important factor in the analysis of human gait recognition in real-time scenarios. In multi-person gait (MPG), or dynamic occlusion, gait recognition is affected by occluded body parts known as regions of interest (ROIs). The aim of this article is to reconstruct the occluded ROIs and measure the errors associated with the reconstruction methods. The contribution of this article is threefold: first, we segment five dynamic ROIs; second, we reconstruct the ROIs using Lagrange, piecewise cubic Hermite (PCH) and cubic spline interpolation; and third, we compare these methods in the MPG scenario. We divide the human body into two parts, i.e., lower and upper body. In the lower body, we consider the ankle and knee, while in the upper body, the wrist, elbow, and shoulder are considered. The dataset used in this study consists of dynamic occlusion scenarios. The quantitative assessment of the above methods is based on four parameters: mean square error, root mean square error, mean absolute error and mean absolute percentage error. Results show that PCH consistently outperforms the other methods in the reconstruction of occluded ROIs in the MPG scenario.

    Updated: 2020-01-04
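
    The gait abstract above reconstructs occluded joint positions by interpolation and scores the result with four error measures. The following is a minimal sketch of one of the named methods, Lagrange interpolation, together with those four measures. The function names and the toy trajectory are hypothetical, and nothing here reproduces the paper's data or its PCH and cubic spline variants.

```python
import math


def lagrange(xs, ys, x):
    """Evaluate the Lagrange polynomial through (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def reconstruction_errors(true, pred):
    """MSE, RMSE, MAE and MAPE (in percent) of pred against true.

    MAPE assumes no true value is zero.
    """
    n = len(true)
    diffs = [p - t for t, p in zip(true, pred)]
    mse = sum(d * d for d in diffs) / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum(abs(d) for d in diffs) / n,
        "mape": 100.0 * sum(abs(d / t) for d, t in zip(diffs, true)) / n,
    }
```

    For example, with visible samples at frames 0, 1 and 3 of a joint coordinate following x^2 + 1, lagrange([0, 1, 3], [1, 2, 10], 2) recovers the occluded value at frame 2.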
  • Digital audio signals encryption by Mobius transformation and Hénon map
    Multimedia Syst. (IF 1.956) Pub Date : 2019-11-18
    Dawood Shah, Tariq Shah, Sajjad Shoukat Jamal

    Abstract In data encryption, chaos-based schemes have recently become common owing to their sensitivity to initial conditions and control parameters and their pseudo-randomness. Chaotic crypto-algorithms are appropriate for extensive data encryption of images, audio or videos. In this paper, a novel audio encryption algorithm based on a substitution and permutation network is proposed. In this study, the Mobius transformation is used as a source to generate strong S-boxes for the substitution network, and the Hénon chaotic map, which performs pixel-wise permutation, is employed for the permutation network. The proposed algorithm is tested several times with different sizes of audio files; the experimental outcomes show that the proposed algorithm has effective complexity and is suitable for audio encryption and hence for secure audio communication.

    Updated: 2020-01-04
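
    The abstract above uses the Hénon map for the permutation stage but does not say how the permutation is derived. The following is a minimal sketch of one common construction, assumed here for illustration: iterate the map x' = 1 - a*x^2 + y, y' = b*x with the classic parameters a = 1.4 and b = 0.3, then sort the positions by the generated x values. The function names and the initial state (0.0, 0.0) are hypothetical, the S-box substitution stage is omitted, and this is not a secure cipher.

```python
def henon_permutation(n, x0=0.0, y0=0.0, a=1.4, b=0.3):
    """Permutation of range(n) obtained by ranking a Henon map orbit."""
    x, y = x0, y0
    orbit = []
    for _ in range(n):
        x, y = 1.0 - a * x * x + y, b * x
        orbit.append(x)
    # Position i moves according to the rank of the i-th orbit value.
    return sorted(range(n), key=lambda i: orbit[i])


def scramble(samples, perm):
    """Reorder samples so that output[k] = samples[perm[k]]."""
    return [samples[p] for p in perm]


def unscramble(scrambled, perm):
    """Invert scramble() given the same permutation."""
    out = [None] * len(perm)
    for p, value in zip(perm, scrambled):
        out[p] = value
    return out
```

    In this sketch the initial state and the map parameters play the role of the key: the receiver regenerates the same permutation from them and calls unscramble.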
  • Automatic news-roundup generation using clustering, extraction, and presentation
    Multimedia Syst. (IF 1.956) Pub Date : 2019-11-09
    Vincent Utomo, Jenq-Shiou Leu

    Abstract Along with the growth of the internet, the amount of information published has increased exponentially. This huge flow of information causes a problem called “information overload”, which makes it harder for internet users to find the key information they need. To solve this, this paper proposes an application that helps users easily find trending news for their query or interest. The challenges include how to determine the trending subtopic, how to extract only the content of each webpage, and how to present the data to the user. Therefore, three core modules are used in this study: clustering, extraction, and presentation. Several methods are tested in this study, including naïve, manual thresholding, and heuristic clustering methods. The results show that hierarchical clustering using tf–idf word weighting, with cosine similarity as the distance measure and heuristic termination using elbow point analysis, achieves the best result at 50.84% Acc and 61.96% NMI. One challenge commonly faced by extraction algorithms is the tendency to become less effective over time. In this paper, an extraction algorithm that uses a prior-known subject/keyword to help the content extraction process is used. A second stage of noise removal is also introduced to further remove noise that exists within the content block. The evaluation result shows an improved score of 7.48%. Respondents rated the final application 4.18 out of 5 for helpfulness and 4.35 out of 5 for effectiveness, showing that the proposed application could help users find information and help to solve the information overload problem.

    Updated: 2020-01-04
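
    The best-performing clustering in the abstract above combines tf–idf word weighting with cosine similarity. The following is a minimal sketch of those two ingredients only, assuming whitespace tokenization and the basic tf * log(N / df) weighting. The function names and example headlines are hypothetical, and the hierarchical clustering and elbow point termination are not shown.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """One sparse tf-idf vector (dict term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {
                term: (count / len(tokens)) * math.log(n_docs / df[term])
                for term, count in tf.items()
            }
        )
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zero."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

    A clustering step would then treat 1 - cosine(u, v) as the distance between two articles; terms that occur in every document get weight zero under this weighting and do not contribute.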
  • A reliable and efficient micro-protocol for data transmission over an RTP-based covert channel
    Multimedia Syst. (IF 1.956) Pub Date : 2019-11-06
    Maryam Azadmanesh, Mojtaba Mahdavi, Behrouz Shahgholi Ghahfarokhi

    Abstract As the VoIP steganographic methods provide a low capacity covert channel for data transmission, an efficient and real-time data transmission protocol over this channel is required which provides reliability with minimum bandwidth usage. This paper proposes a micro-protocol for data embedding over covert storage channels or covert hybrid channels developed by steganographic methods where real-time transport protocol (RTP) is their underlying protocol. This micro-protocol applies an improved Go-Back-N mechanism which exploits some RTP header fields and error correction codes to retain maximum covert channel capacity while providing reliability. The bandwidth usage and the performance of the proposed micro-protocol are analyzed. The analyses indicate that the performance depends on the network conditions, the underlying steganographic method, the error correction code and the adjustable parameters of the micro-protocol. Therefore, a genetic algorithm is devised to obtain the optimal values of the adjustable micro-protocol parameters. The impact of network conditions, the underlying steganographic method and the error correction code on the performance are assessed through simulations. The performance of this micro-protocol is compared to an existing method named ReLACK where this micro-protocol outperforms its counterpart.

    Updated: 2020-01-04
  • Understanding minority costumes: a computer vision perspective
    Multimedia Syst. (IF 1.956) Pub Date : 2019-10-28
    Qian Zhang, Yu-cheng Yang, Shi-qin Yue, Ding-qin Shao, Lin Wang

    Understanding minority costumes is an interesting problem for both the computer vision and ethnology communities. This work explores some crucial clues for understanding minority costumes via computer vision technology. As is well known, the complicated and subtle structural differences between minority costumes make them hard to recognize, for computer vision systems and even for people. In this paper, an intelligent framework is proposed for understanding minority costumes from a computer vision perspective. First, the images are converted into grayscale as part of the digital image processing pipeline; then, the grayscale images are segmented with the help of the structured forests algorithm; after that, a new Revised Histogram of Oriented Gradient is proposed to compute the features of each segmented grayscale minority costume image. Finally, the random forests method is used as the classifier for this intelligent minority costume understanding system. Because no widely accepted minority costume image data sets exist, we evaluated the performance of the proposed method on a self-constructed data set, and the experimental results are presented.

    Updated: 2020-01-04
  • Recent evolution of modern datasets for human activity recognition: a deep survey
    Multimedia Syst. (IF 1.956) Pub Date : 2019-10-14
    Roshan Singh, Ankur Sonawane, Rajeev Srivastava

    Human activity recognition has been a significant goal of computer vision since its inception and has developed considerably in the last years. Recent approaches to this problem increasingly favour the use of data-driven deep learning methods. To facilitate the comparison of these methods, several datasets pertaining to labelled human activity have been created, having great variation in content and methodology. As the field has developed, the datasets used have undergone considerable evolution as well. In this paper, we attempt to classify and describe a variety of datasets for researchers to choose the most suitable benchmark for their domain. For this, we propose a set of characteristics by which datasets may be compared. We also describe the progress in recent years that sets modern datasets apart from those used in the past.

    Updated: 2020-01-04
Contents have been reproduced by permission of the publishers.