• J. Supercomput. (IF 2.157) Pub Date : 2020-01-22
Lu Xiong, Guanrong Tang, Yeh-Cheng Chen, Yu-Xi Hu, Ruey-Shun Chen

Abstract Aiming at the problems of complex image backgrounds and the difficulty of subsequent image segmentation, an image segmentation algorithm based on the chaotic particle swarm algorithm and fuzzy clustering is proposed. First, the image is converted from the RGB color space to the HSI color space. Then, a hybrid algorithm consisting of chaotic particle swarm optimization and fuzzy clustering is introduced. Each color component is processed by the algorithm, and the corresponding partition graph is obtained. Finally, the result is converted back to the RGB color space to obtain the segmented image. Experimental results show that the new algorithm segments images more accurately and is robust to noise.
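The RGB-to-HSI conversion step can be sketched as follows. This is the standard textbook conversion for a single pixel with components in [0, 1], not code from the paper; the paper processes each HSI component separately before converting back to RGB.

```python
import math

def rgb_to_hsi(r, g, b):
    """Convert one RGB pixel (components in [0, 1]) to (H, S, I)."""
    i = (r + g + b) / 3.0
    if i == 0:
        return 0.0, 0.0, 0.0           # black: hue/saturation undefined
    s = 1.0 - min(r, g, b) / i         # saturation
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0:
        h = 0.0                        # achromatic pixel: hue undefined
    else:
        theta = math.acos(max(-1.0, min(1.0, num / den)))
        h = theta if b <= g else 2 * math.pi - theta
    return h, s, i
```

For example, pure red maps to hue 0 with full saturation, while a gray pixel has saturation 0.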

Updated: 2020-01-23
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-21
Meysam Ghahramani, Reza Javidan, Mohammad Shojafar

Abstract Smart city is an important concept in urban development. The use of information and communication technology to promote quality of life and the management of natural resources is one of the main goals in smart cities. At the same time, thousands of mobile users send a variety of information over the network at any moment, and handling this is a main challenge in smart cities. To overcome this challenge and collect data from roaming users, the global mobility network (GLOMONET) is a good approach for information transfer; consequently, designing a secure protocol for GLOMONET is essential. The main intention of this paper is to provide a secure protocol for GLOMONET in smart cities. To do this, we design a protocol based on Li et al.'s protocol, which is not safe against our proposed attacks. Our protocol inherits all the benefits of the previous one; it is entirely secure and does not impose any additional communication overhead. We formally analyze the protocol using BAN logic and compare it with similar protocols in terms of performance and security, which shows its efficiency. Our protocol enables mobile users and foreign agents to share a secret key in 6.1 ms with 428 bytes of communication overhead, improving the running time of the previous protocol by 53%.

Updated: 2020-01-22
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-20
Yukihiro Nomura, Issei Sato, Toshihiro Hanawa, Shouhei Hanaoka, Takahiro Nakao, Tomomi Takenaga, Tetsuya Hoshino, Yuji Sekiya, Soichiro Miki, Takeharu Yoshikawa, Naoto Hayashi, Osamu Abe

Abstract Recently, deep learning has been exploited in the field of medical image analysis. However, training deep learning models with medical images is time-consuming, since most medical image data are three-dimensional volumes or high-resolution two-dimensional images. Moreover, the optimization of numerous hyperparameters strongly affects the performance of deep learning. A framework for training deep learning models with hyperparameter optimization on a supercomputer system would therefore be expected to accelerate training with medical images. In this study, we describe our novel environment for training deep learning models with medical images on our institute's supercomputer system (the Reedbush-H supercomputer system), based on asynchronous parallel Bayesian optimization. We trained two types of automated lesion-detection applications in the constructed environment, which enabled us to train deep learning models with hyperparameter tuning in a short time.
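The asynchronous-parallel idea can be illustrated with a toy hyperparameter search. This sketch substitutes random proposals and a synthetic objective for the paper's Bayesian optimizer and actual training runs; the names `validation_loss` and `async_search` are illustrative only.

```python
import random
from concurrent.futures import ThreadPoolExecutor, as_completed

def validation_loss(lr, batch_size):
    # Toy stand-in for a full training run; a real system would
    # train the network and return a validation metric.
    return (lr - 0.1) ** 2 + ((batch_size - 32) / 100.0) ** 2

def async_search(n_trials=32, n_workers=4, seed=0):
    """Evaluate hyperparameter trials concurrently and keep the best."""
    rng = random.Random(seed)
    trials = [(rng.uniform(0.0, 0.5), rng.choice([8, 16, 32, 64, 128]))
              for _ in range(n_trials)]
    best = None
    # Results are consumed as they finish, so a slow trial never
    # blocks the others -- the "asynchronous" part of the scheme.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = {pool.submit(validation_loss, lr, bs): (lr, bs)
                   for lr, bs in trials}
        for fut in as_completed(futures):
            loss, params = fut.result(), futures[fut]
            if best is None or loss < best[0]:
                best = (loss, params)
    return best
```

A Bayesian optimizer would replace the random proposal list with a surrogate model that suggests new trials as workers free up.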

Updated: 2020-01-21
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-20
Shubhra Dwivedi, Manu Vardhan, Sarsij Tripathi

Abstract Due to the rapid growth of internet services, the demand for protecting and securing networks against sophisticated attacks is continuously increasing. In network security, an intrusion detection system (IDS) plays an important role in detecting intrusive activity. With the purpose of reducing search dimensionality and enhancing the classification performance of IDS models, several hybrid evolutionary algorithms have been investigated in the literature to tackle anomaly detection, but they have drawbacks such as poor diversity, a high false negative rate, and stagnation. To resolve these limitations, this study introduces a new hybrid evolutionary algorithm, called GOSA, combining the grasshopper optimization algorithm (GOA) and simulated annealing (SA), which extracts the most noteworthy features and eliminates irrelevant ones from the original IDS datasets. In the proposed method, SA is integrated into GOA and used to increase solution quality after each GOA iteration. A support vector machine is used as the fitness function to select relevant features that help classify attacks accurately. The performance of the proposed method is evaluated on two IDS datasets, NSL-KDD and UNSW-NB15. Experimental results show that the proposed method outperforms existing state-of-the-art methods, attaining a detection rate of 99.86%, an accuracy of 99.89%, and a false alarm rate of 0.009 on NSL-KDD, and a detection rate of 98.85%, an accuracy of 98.96%, and a false alarm rate of 0.084 on UNSW-NB15.
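A minimal sketch of the SA refinement over binary feature subsets, with a synthetic linear fitness standing in for SVM accuracy and the GOA outer loop omitted; all names here are illustrative, not from the paper.

```python
import math
import random

def sa_feature_select(score, n_features, iters=500, t0=1.0, seed=1):
    """Simulated-annealing bit-flip search over feature subsets.

    `score` maps a tuple of 0/1 flags to a fitness to maximize; the
    paper uses SVM classification accuracy and nests this refinement
    inside GOA iterations.
    """
    rng = random.Random(seed)
    cur = tuple(rng.randint(0, 1) for _ in range(n_features))
    cur_s = score(cur)
    best, best_s = cur, cur_s
    for k in range(iters):
        t = t0 * (1 - k / iters) + 1e-9               # linear cooling
        j = rng.randrange(n_features)
        cand = cur[:j] + (1 - cur[j],) + cur[j + 1:]  # flip one bit
        cand_s = score(cand)
        # Accept improvements always, and worse moves with a
        # temperature-dependent probability (the SA rule).
        if cand_s >= cur_s or rng.random() < math.exp((cand_s - cur_s) / t):
            cur, cur_s = cand, cand_s
            if cur_s > best_s:
                best, best_s = cur, cur_s
    return best, best_s
```

With a separable fitness, greedy bit flips alone reach the optimum; SA's occasional downhill moves matter for the correlated fitness landscapes real feature selection produces.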

Updated: 2020-01-21
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-20
Rahul Priyadarshi, Bharat Gupta, Amulya Anurag

Abstract Wireless sensor networks (WSNs) have been one of the most active research areas in recent years because of their vital role in numerous applications. To process the extracted data and transmit it to various locations, a large number of nodes must be deployed properly, as deployment is one of the major issues in WSNs. Hence, deploying the minimum number of nodes that attains full coverage is of enormous significance for research. The prime agenda of this paper is to categorize coverage techniques into four major classes: computational geometry-based, force-based, grid-based, and metaheuristic-based techniques. Additionally, several comparisons among these schemes are provided in view of their benefits and drawbacks. Our discussion weighs the classification of coverage, practical challenges in WSN deployment, sensing models, and research issues in WSNs. Moreover, a detailed analysis of performance metrics and a comparison of various WSN simulators are given. In conclusion, open research issues along with potential work directions are discussed.
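The grid-based family of techniques can be illustrated with a short sketch that rates a deployment by the fraction of grid points inside at least one sensor's disk (the binary disk sensing model mentioned among sensing models); the function is illustrative, not taken from any surveyed paper.

```python
import math

def coverage_ratio(sensors, radius, width, height, step=1.0):
    """Fraction of grid points covered by at least one sensor
    under the binary disk sensing model.

    sensors: list of (x, y) positions; radius: sensing range.
    """
    covered = total = 0
    y = 0.0
    while y <= height:
        x = 0.0
        while x <= width:
            total += 1
            if any(math.hypot(x - sx, y - sy) <= radius
                   for sx, sy in sensors):
                covered += 1
            x += step
        y += step
    return covered / total
```

A deployment algorithm would then add or move sensors until this ratio reaches 1.0 (full coverage) with as few nodes as possible.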

Updated: 2020-01-21
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-18
K. Vidhya, R. Shanmugalakshmi

Abstract Big Data (BD) has become a significant research field owing to the vast quantity of data generated by sources such as the Internet of things (IoT), social media, and multimedia applications. BD plays an imperative part in numerous decision-making and forecasting domains, for instance health care, recommendation systems, web display advertising, transportation, clinical practice, business analysis, fraud detection, and tourism marketing. Health care in particular is influenced by BD, since the data sources involved in healthcare organizations are known for their volume, heterogeneous complexity, and high dynamism. Although BD analytical techniques, platforms, and tools are established in various domains, their effect on healthcare organizations and possible healthcare applications points to promising research directions. This paper concentrates on the analysis of multiple diseases using a modified adaptive neuro-fuzzy inference system (M-ANFIS). Initially, the healthcare BD undergoes pre-processing, in which the data format is identified and the healthcare BD dataset is integrated. Next, features are extracted from the preprocessed dataset and the count of closed frequent itemsets (CFIs) is found. Then, the entropy of the CFI count is determined. Finally, the analysis of multiple diseases is executed with the aid of M-ANFIS, in which k-medoid clustering is used to cluster the CFI entropy of the healthcare BD. The proposed method's performance is assessed by comparing it with other existing techniques.

Updated: 2020-01-21
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-18
Bin Liang, Xiaoshe Dong, Yufei Wang, Xingjun Zhang

Abstract As a new type of computing, cloud computing has led to a major computational change. Among the many technologies in cloud computing, task scheduling has long been studied as a core issue by industry and academia. Existing research mainly targets completion time or load balancing; however, as cluster sizes grow, energy consumption becomes a problem that must be faced. In this paper, a first-of-maximum-loss scheduling algorithm is proposed. It is a low-power algorithm that can greatly reduce the energy consumption of cloud computing clusters through a loss-comparison rule, and its effect becomes more pronounced as the cluster size and the number of tasks increase. Simulation results show that the proposed method is significantly better than the Max–Min, Min–Min, Sufferage, and E-HEFT algorithms: it reduces average completion time by 16%, 12%, 8%, and 14% compared with Min–Min, Max–Min, Sufferage, and E-HEFT, respectively. At the same time, its load-balancing effect is also better than that of the Min–Min and Sufferage algorithms.
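The Min–Min baseline that the paper compares against can be sketched as follows; the proposed maximum-loss rule itself is not specified in the abstract, so it is not reproduced here.

```python
def min_min(etc):
    """Min-Min scheduling on an ETC (expected time to compute) matrix.

    etc[t][m] is the runtime of task t on machine m.  Repeatedly pick
    the task whose earliest possible completion time is smallest and
    assign it to the machine achieving that time.
    """
    n_tasks, n_machines = len(etc), len(etc[0])
    ready = [0.0] * n_machines           # machine available times
    unassigned = set(range(n_tasks))
    schedule = {}
    while unassigned:
        best = None                      # (completion, task, machine)
        for t in unassigned:
            for m in range(n_machines):
                c = ready[m] + etc[t][m]
                if best is None or c < best[0]:
                    best = (c, t, m)
        c, t, m = best
        ready[m] = c
        schedule[t] = m
        unassigned.remove(t)
    return schedule, max(ready)          # assignment and makespan
```

Max–Min differs only in the outer selection rule (pick the task whose best completion time is largest), which is why the two are natural baselines for any new heuristic.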

Updated: 2020-01-21
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-18
Kamalesh Karmakar, Rajib K. Das, Sunirmal Khatua

Abstract High-performance computing in a cloud environment may require massive data transfer among some of the virtual machines (VMs). These VMs are deployed in physical machines (hosts) of a data center. The data transfer among the communicating VMs may use the same shared communication links of the data center. Hence, it is important to have efficient bandwidth allocation policies for different data transfer requests (DTRs) which result in better utilization of bandwidth and fair allocation among the DTRs. In this paper, a few bandwidth allocation policies are proposed and their performances are analyzed. While designing these policies, the objective is the maximization of throughput and bandwidth utilization while minimizing the service time and turnaround time. Some of the policies are based on integer linear programming (ILP) which runs in exponential time while others are based on polynomial-time heuristics. Experimental results show that the performances of heuristic-based policies are comparable to those given by ILP-based exponential time policies.
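One simple polynomial-time policy of the kind studied here is max-min fair sharing of a single bottleneck link; this sketch is illustrative and not necessarily one of the paper's policies.

```python
def max_min_fair(link_capacity, demands):
    """Max-min fair bandwidth shares for data-transfer requests (DTRs)
    on one shared link: repeatedly satisfy the smallest demand and
    split the leftover capacity evenly among the rest."""
    alloc = [0.0] * len(demands)
    remaining = sorted(range(len(demands)), key=lambda i: demands[i])
    capacity = float(link_capacity)
    while remaining:
        share = capacity / len(remaining)
        i = remaining[0]
        if demands[i] <= share:
            alloc[i] = demands[i]        # smallest demand fully satisfied
            capacity -= demands[i]
            remaining.pop(0)
        else:
            for i in remaining:          # capacity is the bottleneck:
                alloc[i] = share         # everyone left gets an equal share
            return alloc
    return alloc
```

An ILP-based policy would instead optimize throughput or turnaround time directly over all links at once, at exponential cost; heuristics like this trade optimality for speed, which is the comparison the paper makes.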

Updated: 2020-01-21
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-18
Maria Pantoja, Maxence Weyrich, Gerardo Fernández-Escribano

Abstract Magnetic resonance imaging (MRI) of the brain is a safe and painless test that uses a magnetic field and radio waves to produce detailed images of the brain. FreeSurfer is a tool neuroscientists use to create models of structures in the brain. An average MRI analysis using FreeSurfer takes around 7 h on a central processing unit with 4 cores. Since execution time is so high, researchers are working on different ways to parallelize the software. Most efforts concentrate on multicore parallelization, specifically with OpenMP (an implementation of multithreading), reducing execution time by around 20%. In this paper, we further accelerate FreeSurfer analysis using manycore processors: specialist multicore processors designed for a high degree of parallel processing, containing numerous simpler independent cores (from a few tens to thousands or more), which are used extensively in embedded computing and high-performance computing. Specifically, we use a graphics processing unit (GPU), a manycore device with thousands of simpler cores. Multicore and GPU acceleration are not mutually exclusive, and we present an implementation that uses both. Results show that using both accelerations reduces the analysis time by 70%.

Updated: 2020-01-21
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-16
Minseo Kang, Jae-Gil Lee

Spark is one of the most widely used systems for the distributed processing of big data. Its performance bottlenecks are mainly due to network I/O, disk I/O, and garbage collection. Previous studies quantitatively analyzed the performance impact of these bottlenecks but did not focus on iterative algorithms. In an iterative algorithm, garbage collection has a greater performance impact than in other workloads because the algorithm repeatedly loads and deletes data in main memory across multiple iterations. Spark provides three caching mechanisms, "disk cache," "memory cache," and "no cache," to keep unchanged data across iterations. In this paper, we provide an in-depth experimental analysis of the effect of garbage collection on overall performance depending on the caching mechanism, with various combinations of algorithms and datasets. The experimental results show that garbage collection accounts for 16–47% of the total elapsed time of running iterative algorithms on Spark and that the memory cache is no less advantageous in terms of garbage collection than the disk cache. We expect these results to serve as a guide for tuning garbage collection when running iterative algorithms on Spark.

Updated: 2020-01-17
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-16
Ali Javed, Khalid Mahmood Malik, Aun Irtaza, Hafiz Malik

Abstract Automated approaches to analyzing sports video content have been heavily explored in the last few decades to develop more informative and effective solutions for replay detection, shot classification, key-event detection, and summarization. Shot transition detection and classification are commonly applied to perform temporal segmentation for video content analysis, and accurate shot classification is an indispensable requirement for precisely detecting key events and generating more informative summaries of sports videos. The current state of the art has several limitations: inflexible game-specific rule-based approaches, high computational cost, and dependency on editing effects, game structure, and camera variations. In this paper, we propose an effective decision tree architecture for shot classification of field-sports videos to address these issues. For this purpose, we employ a combination of low-, mid-, and high-level features to develop an interpretable and computationally efficient decision tree framework. Rule-based induction is applied to create rules from the decision tree that classify video shots into long, medium, close-up, and out-of-field shots. One of the significant contributions of the proposed work is finding the most reliable, least unpredictable rules for shot classification. The proposed method is robust to variations in camera, illumination conditions, game structure, video length, sports genre, broadcasters, etc. Its performance is evaluated on a YouTube dataset covering three different genres of sports, diverse in terms of length, quantity, broadcasters, camera variations, editing effects, and illumination conditions. The proposed method provides superior shot classification performance, achieving an average improvement of 6.9% in precision and 9.1% in recall compared with contemporary methods subject to the above-mentioned limitations.
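The flavor of the induced rules can be illustrated with a toy two-feature decision tree; the features (`grass_ratio`, `face_area_ratio`) and thresholds are invented for illustration, not taken from the paper, which induces its rules from labeled data over low-, mid-, and high-level features.

```python
def classify_shot(grass_ratio, face_area_ratio):
    """Toy rule-based shot classifier in the spirit of the paper's
    decision tree, mapping frame features to the four shot classes."""
    if grass_ratio < 0.1:
        return "out-of-field"            # little playfield visible
    if face_area_ratio > 0.15:
        return "close-up"                # a face dominates the frame
    if grass_ratio > 0.6:
        return "long"                    # wide view of the field
    return "medium"
```

The interpretability claim comes from exactly this structure: each prediction is a short chain of human-readable threshold tests rather than an opaque learned mapping.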

Updated: 2020-01-17
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-16

The MapReduce framework is an effective method for parallel big-data processing. Enhancing the performance of MapReduce clusters, along with reducing their job execution time, is a fundamental challenge to this approach. In fact, one faces two challenges here: maximizing the execution overlap between jobs and creating an optimal job schedule. Accordingly, one of the most critical prerequisites for achieving these goals is a precise model for estimating job execution time, given the large number and high volume of submitted jobs, limited consumable resources, and the need for proper Hadoop configuration. This paper presents a model based on the MapReduce phases for predicting the execution time of jobs in a heterogeneous cluster. Moreover, a novel heuristic method is designed that significantly reduces the makespan of the jobs. In this method, a job-profiling tool first obtains the execution details of the MapReduce phases through log analysis. Then, using machine learning methods and statistical analysis, we build a model to predict runtime. Finally, another tool, for job submission and monitoring, is used to calculate makespan. Experiments were conducted on the benchmarks under identical conditions for all jobs. The results show that the average makespan speedup for the proposed method was higher than in the unoptimized case.
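The per-phase regression idea can be sketched with ordinary least squares on profiled phase times; the phase breakdown and the single input-size feature here are assumptions for illustration, not the paper's actual model.

```python
def fit_linear(xs, ys):
    """Ordinary least squares for y = a*x + b, as a stand-in for a
    per-phase runtime model (map, shuffle, and reduce phases would
    each get their own fit from the job-profile logs)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

def predict_job_time(phase_models, phase_inputs):
    # Total predicted job time = sum of the predicted phase times.
    return sum(a * x + b for (a, b), x in zip(phase_models, phase_inputs))
```

A scheduler can then order jobs by these predictions (e.g., shortest predicted time first) to shrink the overall makespan.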

Updated: 2020-01-17
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-16
Xingquan Li, Cong Cao, Tao Zhang

Abstract Clustering, or partitioning, is fundamental work for graphs and networks. Detecting communities is a typical clustering task, dividing a network into several parts according to modularity. Community detection is a critical challenge for designing scalable, adaptive, and survivable trust management protocols for a community-of-interest-based social IoT system. Most existing methods for community detection suffer from a common issue: the number of communities must be decided in advance. This urges us to estimate the number of communities from the data. This paper concurrently considers estimating the number of communities and detecting the communities themselves based on a block-diagonally dominant adjacency matrix. To construct such a matrix for the input network, node numbers are first reordered by the breadth-first search algorithm. For a block-diagonally dominant adjacency matrix, this paper shows that the node numbers within a community are contiguous; thus, it suffices to insert breakpoints into the node-number sequence to decide both the number of communities and the nodes in each community. In addition, a dynamic programming algorithm is designed to achieve an optimal community detection result. Experimental results on a number of real-world networks show the effectiveness of the dynamic programming approach for the community detection problem.
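The BFS renumbering step can be sketched as follows; the breakpoint-insertion dynamic program itself is not reproduced here.

```python
from collections import deque

def bfs_order(adj, start=0):
    """Renumber nodes by breadth-first search so that nodes in the
    same community tend to receive adjacent numbers -- the reordering
    the paper applies before inserting breakpoints.

    adj: mapping from node to its neighbor list, nodes 0..n-1.
    """
    order, seen, q = [], {start}, deque([start])
    while q:
        u = q.popleft()
        order.append(u)
        for v in sorted(adj[u]):     # sorted for a deterministic order
            if v not in seen:
                seen.add(v)
                q.append(v)
    # Nodes unreachable from `start` are appended at the end.
    order += [v for v in range(len(adj)) if v not in seen]
    return order
```

Position i of the returned list gives the node receiving new number i; permuting the adjacency matrix by this order concentrates intra-community edges near the diagonal, so communities become contiguous runs split by breakpoints.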

Updated: 2020-01-17
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-16
Fatemeh Safara, Alireza Souri, Thar Baker, Ismaeel Al Ridhawi, Moayad Aloqaily

Abstract Internet of Things (IoT) devices gather a plethora of data by sensing and monitoring the surrounding environment. Transmission of the collected data from IoT devices to the cloud through relay nodes is one of the many challenges arising in IoT systems. Fault tolerance, security, energy consumption, and load balancing are all examples of issues revolving around data transmission. This paper focuses on energy consumption, proposing a priority-based and energy-efficient routing (PriNergy) method. The method is based on the routing protocol for low-power and lossy networks (RPL) model, which determines routing through contents. Each network slot uses timing patterns when sending data to the destination, while accounting for network traffic and audio and image data. This technique increases the robustness of the routing protocol and ultimately prevents congestion. Experimental results demonstrate that the proposed PriNergy method reduces mesh overhead, end-to-end delay, and energy consumption. Moreover, it outperforms one of the most successful routing methods in an IoT environment, namely quality-of-service RPL (QRPL).

Updated: 2020-01-17
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-16
Ray-I Chang, Yu-Hsuan Chiu, Jeng-Wei Lin

Tuberculosis (TB) has been one of the top 10 leading causes of death, so a computer-aided diagnosis system to accelerate TB diagnosis is crucial. In this paper, we apply convolutional neural networks and deep learning to classify images of the TB culture test, the gold standard of TB diagnostic tests. Since the dataset is small and imbalanced, a transfer learning approach is applied. Moreover, as the recall of the non-negative class is an important metric for this application, we propose a two-stage classification method to boost the results. Experimental results on a real dataset of TB culture tests (1727 samples with 16,503 images from Tao-Yuan General Hospital, Taiwan) show that the proposed method achieves 99% precision and 98% recall on the non-negative class.

Updated: 2020-01-16
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-16
Heithem Abbes, Thouraya Louati, Christophe Cérin

Abstract Infrastructure-as-a-service container-based virtualization is gaining interest as a platform for running distributed applications. With the increasing scale of cloud architectures, faults are becoming a frequent occurrence, which makes availability a true challenge. Replication, whether of checkpoints, containers, or data, is a method to survive failures and increase availability. Following a node failure, fault-tolerant cloud systems restart failed containers on a new node from distributed images of containers (or checkpoints). With a high failure rate, some replicas may be lost, so it can pay to increase the replication factor, finding the trade-off between being able to restart all failed containers and storage overhead. This paper addresses the issue of adapting the replication factor and contributes a novel replication-factor modeling approach that predicts the right replication factor using prediction techniques. These techniques are based on experimental modeling, analyzing data collected from different executions; we use regression to find the relation between availability and the number of replicas. Experiments on the Grid'5000 testbed demonstrate the benefits of our proposal in satisfying the availability requirement, using a real fault-tolerant cloud system.
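Under the simplifying assumption of independent node failures with per-node availability p, the availability of r replicas is 1 − (1 − p)^r, and the smallest sufficient r has a direct form. This closed-form sketch is what a regression-based predictor like the paper's would approximate from real traces, where failures are not independent.

```python
def min_replicas(node_availability, target, r_max=20):
    """Smallest replication factor r with 1 - (1 - p)**r >= target,
    assuming independent node failures with per-node availability p.
    A stand-in for the trace-driven regression model in the paper."""
    p = node_availability
    for r in range(1, r_max + 1):
        if 1 - (1 - p) ** r >= target:
            return r
    return r_max  # cap: beyond this, storage overhead dominates
```

For example, with 90%-available nodes, three replicas already push availability past 99.5%, which shows why small increases in the replication factor buy large availability gains at the cost of storage.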

Updated: 2020-01-16
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-15
Jung-hyun Seo, HyeongOk Lee

Network cost is equal to degree × diameter and is one of the important measurements for evaluating graphs. Torus and hypercube are well-known graphs. When these graphs expand, a torus has the advantage that its degree does not increase, while a hypercube has a shorter diameter than other graphs because each expansion increases the diameter by only 1. Hypercube Qn has 2^n nodes, and its diameter is n. We propose the rotational binary graph (RBG), which has the advantages of both the hypercube and the torus. RBGn has 2^n nodes and degree 4, and its diameter is 1.5n + 1. In this paper, we first examine the topological properties of RBG. Second, we construct a binary spanning tree in RBG. Third, we compare RBG with other graphs, focusing specifically on network cost. Fourth, we suggest a broadcast algorithm with a time complexity of 2n − 2. Finally, we prove that RBGn embedded into hypercube Qn results in dilation n, expansion 1, and congestion 7.
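The network-cost comparison stated in the abstract (hypercube Q_n: degree n, diameter n; RBG_n: degree 4, diameter 1.5n + 1) can be checked directly:

```python
def network_cost(degree, diameter):
    # Network cost = degree x diameter, the measure used in the paper.
    return degree * diameter

def costs(n):
    """Compare network cost for 2**n-node graphs, using the degree
    and diameter values given in the abstract."""
    return {
        "hypercube": network_cost(n, n),          # degree n, diameter n
        "RBG": network_cost(4, 1.5 * n + 1),      # degree 4, diameter 1.5n+1
    }
```

For n = 10 (1024 nodes) the hypercube's cost is 100 while RBG's is 64, and the gap widens with n because RBG's degree stays constant.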

Updated: 2020-01-15
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-15
S. Kumaresan, Vijayaragavan Shanmugam

Cloud computing has become more sophisticated, providing different services at different levels of user access. Even though services are accessed at different levels, securing the data being accessed is highly challenging. A number of encryption approaches address the problem of cloud security, but they fail to achieve the required level of security. The previous ABFD (attribute-based flexible delegation) algorithm encrypts data using a set of policies with specific keys mentioned in the policy; however, leakage of the encryption policy leads to poor security, which can be overcome by adopting multitype encryption standards in different time windows. Accordingly, an efficient time-variant attribute-based multitype encryption algorithm (TAM) is presented in this paper. TAM maintains a taxonomy of attributes and related keys to be used for encryption and decryption, and the corresponding keys are used to generate the ciphertext. The content of the taxonomy changes dynamically in each time window, which differentiates its integrity management and security performance from previous algorithms. TAM achieves security performance of up to 89.6%, reduces time complexity to 21 s, and increases throughput to 96%.
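The time-window intuition can be sketched with a hash-based key rotation; this illustrates the idea only and is not the paper's construction, whose attribute taxonomy and key management are more elaborate.

```python
import hashlib

def window_key(master_secret: bytes, attribute: str, window: int) -> bytes:
    """Derive the key for one attribute in one time window.

    Because the derivation input changes every window, a key (or
    policy) leaked in one window is useless in later windows -- the
    intuition behind TAM's time-variant taxonomy.
    """
    material = master_secret + attribute.encode() + window.to_bytes(8, "big")
    return hashlib.sha256(material).digest()
```

The derivation is deterministic, so authorized parties holding the master secret can recompute any window's key without storing it.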

Updated: 2020-01-15
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-15
Zne-Jung Lee, Chou-Yuan Lee

A student dropping out of university means that he or she quits the university early. More and more students are dropping out of university, for varying reasons. Predicting in advance which students may drop out is an important issue for universities: such information would allow them to devise strategies to help students and prevent dropout. Compared with the whole student body, dropping out is a relatively rare event, which makes this an imbalanced-data problem: the majority classes have more instances than the minority classes, and conventional algorithms tend to classify minority-class instances into the majority classes, effectively ignoring the minority classes. When data grow and remain imbalanced, conventional algorithms struggle with these problems. An algorithm is proposed to predict students dropping out of a university. In this algorithm, a parallel framework based on Apache Spark with three approaches processes the dropout data in parallel. Then, improved bacterial foraging optimization (BFO) and an ensemble method are used to improve classification. The technique is applied to a real scenario from a university in Taiwan, and a dataset from the UCI machine learning repository is also used to verify the correctness of the introduced parallel intelligent algorithm. The error rate for predicting dropout is 7.65%, showing that the proposed algorithm surpasses the compared techniques. Its outcomes will provide useful information for decision making.

Updated: 2020-01-15
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-14
Omar A. Alzubi, Jafar A. Alzubi, Osama Dorgham, Mohammad Alsayyed

Abstract The ultimate goal of modern cryptography is to protect the information resource and make it absolutely unbreakable and beyond compromise. However, throughout the history of cryptography, thousands of cryptosystems emerged and were believed to be invincible, yet attackers were able to break them and compromise their security. The main objective of this paper is to design a robust cryptosystem suitable for implementation in the Internet of Things. The proposed cryptosystem is based on algebraic geometric curves, more specifically Hermitian curves, and the design is called the Hermitian-based cryptosystem (HBC). During the development of HBC, Kerckhoffs's principle was the main guidance, satisfied by choosing Hermitian curves as the core of the design. HBC inherits the advantageous characteristics of Hermitian curves: a large number of points satisfying the curve and high genus. These characteristics play a crucial role in generating a large encryption key for HBC and determine the block size of the plaintext. Because HBC uses algebraic geometric codes over a Hermitian curve, it can perform error correction in addition to data encryption; this error correction is another advantage of HBC over many existing cryptosystems, such as the McEliece cryptosystem. The number of errors HBC can correct is larger (a higher data rate) than for other algebraic geometric codes, such as those from elliptic and hyperelliptic curves. HBC also uses a non-binary representation, which increases its attack resistance. In this paper, HBC is mathematically compared with the elliptic curve cryptosystem; the results show that HBC has many advantages in terms of the number of points and the genus of the curve.
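The "large number of points" property can be checked by brute force for the smallest convenient case. The Hermitian curve y^q + y = x^(q+1) over GF(q^2) has q^3 affine points (q^3 + 1 projective). Taking q = 3, we build GF(9) as F_3[i] with i^2 = -1, which works because -1 is a non-square mod 3; this is a verification sketch, not part of HBC itself.

```python
def count_hermitian_points(q=3):
    """Count affine points of the Hermitian curve y^q + y = x^(q+1)
    over GF(q^2), with GF(q^2) built as F_q[i]/(i^2 + 1).

    Valid for q with -1 a non-square mod q (e.g. q = 3).  Theory
    predicts q**3 affine points -- the point count HBC exploits
    when generating large encryption keys.
    """
    p = q
    elems = [(a, b) for a in range(p) for b in range(p)]  # a + b*i

    def mul(u, v):
        (a, b), (c, d) = u, v
        return ((a * c - b * d) % p, (a * d + b * c) % p)

    def add(u, v):
        return ((u[0] + v[0]) % p, (u[1] + v[1]) % p)

    def power(u, e):
        r = (1, 0)
        for _ in range(e):
            r = mul(r, u)
        return r

    return sum(1 for x in elems for y in elems
               if add(power(y, q), y) == power(x, q + 1))
```

An elliptic curve over the same field GF(9) has at most about q^2 + 2q + 1 points (Hasse bound), versus 27 affine points here already for q = 3; the gap grows cubically, which is the comparison the paper draws.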

Updated: 2020-01-15
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-14
Beibei Pang, Fei Hao, Yixuan Yang, Doo-Soon Park

The increasing social problems concerning population, resources, and the environment have made the interaction between nature and humanity one of the most active research fields in the world. In this paper, we propose a novel framework for a human–land sustainable computational system, which advances the development of our society utilizing cloud computing and big-data analysis technologies. In particular, the study of land-management quality has attracted much attention. Within the proposed framework, a multi-user multi-cloud environment (MUMCE) is first presented, and the evaluation of land quality is treated as a set of services, such as soil acidity and alkalinity, soil thickness, soil texture, smoothness, and field layout. The paper then formulates the problem of formal concept analysis-based multi-cloud composition recommendation for multiple users. To address this problem, collaborative filtering is first adopted to obtain the service requests of the target user; then service–provider concept lattices are constructed; finally, the best multi-cloud composition is selected and recommended to the target user. The corresponding algorithm is also devised, and a case study is conducted to evaluate the feasibility and effectiveness of the proposed approach.

Updated: 2020-01-14
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-14
Aditya Khamparia, Deepak Gupta, Victor Hugo C. de Albuquerque, Arun Kumar Sangaiah, Rutvij H. Jhaveri

Cervical cancer is one of the fastest-growing global health problems and a leading cause of mortality among women in developing countries. Automated Pap smear cell recognition and classification at an early stage of cell development is crucial for effective disease diagnosis and immediate treatment. Thus, in this article, we propose a novel Internet of health things (IoHT)-driven deep learning framework for detection and classification of cervical cancer in Pap smear images using transfer learning. Following transfer learning, a convolutional neural network (CNN) was combined with different conventional machine learning techniques: K-nearest neighbor, naïve Bayes, logistic regression, random forest, and support vector machines. In the proposed framework, feature extraction from cervical images is performed using pre-trained CNN models such as InceptionV3, VGG19, SqueezeNet, and ResNet50, whose outputs are fed into dense and flattened layers to classify normal and abnormal cervical cells. The performance of the proposed IoHT framework is evaluated on the standard Pap smear Herlev dataset and validated in terms of precision, recall, F1-score, training and testing time, and support parameters. The results show that the pre-trained ResNet50 model with a random forest classifier achieved the highest classification rate, 97.89%, for effective and reliable disease detection and classification. The minimum training and testing times were 0.032 s and 0.006 s, respectively.

Updated: 2020-01-14
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-13
Reza Ramezani

Updated: 2020-01-14
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-11
Smarika Acharya, Abeer Alsadoon, P. W. C. Prasad, Salma Abdullah, Anand Deva

The accurate classification of histopathological images for breast cancer diagnosis faces a huge challenge due to the complexity of pathology images. Currently, computer-aided diagnosis is used to obtain a sound, error-free diagnosis of this lethal disease. However, classification accuracy and processing time can be further improved. This study was designed to control diagnosis error by enhancing image accuracy and reducing processing time through several algorithms: deep learning, K-means and autoencoders in clustering, and an enhanced loss function (ELF) in classification. Histopathological images were obtained from five datasets and pre-processed using stain normalisation and a linear transformation filter. These images were divided into patches of sizes 512 × 512 and 128 × 128, which preserve the tissue and cell levels and thus the important information in the images. The patches were further pre-trained by ResNet50-128 and ResNet512. Meanwhile, the 128 × 128 patches were clustered; an autoencoder was employed with K-means, using the latent features of the images to obtain better clustering results. For classification, the proposed system uses ELF, achieved by combining the SVM loss function with an optimisation problem. The current study has shown that the deep learning algorithm increased the accuracy of breast cancer classification to 97%, compared with 95% for the state-of-the-art model, and reduced the processing time to between 30 and 40 s. This work has also enhanced system performance by improving clustering, employing K-means with an autoencoder for the nonlinear transformation of histopathological images.
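
The clustering step can be sketched as follows; here PCA stands in for the paper's nonlinear autoencoder as the latent mapping, and the patch descriptors are synthetic:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Synthetic 50-dimensional patch descriptors for two tissue types.
a = rng.normal(0.0, 1.0, size=(150, 50)) + 4.0
b = rng.normal(0.0, 1.0, size=(150, 50)) - 4.0
X = np.vstack([a, b])

# PCA is a linear stand-in for the autoencoder's latent mapping; the
# paper clusters on the (nonlinear) autoencoder latent features.
latent = PCA(n_components=8, random_state=0).fit_transform(X)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(latent)
```

Clustering in the latent space rather than on raw pixels is the design choice the paper credits for the improved clustering result.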

Updated: 2020-01-13
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-11
Yuntao Duan, Jiangdai Li, Gautam Srivastava, Jyh-Haw Yeh

In the present era, secure data storage for any Internet of Things (IoT) platform is plagued by the poor performance of secure read and write operations, which limits the use of secure data storage on IoT platforms. Therefore, in this paper, a data storage security method suitable for any IoT platform, based on double secret key encryption and Hadoop, is proposed. First, the Hadoop deep learning architecture and implementation process are analyzed, and the process of client Kerberos identity authentication in the Hadoop framework is discussed. From this, the current shortcomings of data storage security based on the Hadoop framework are analyzed, and the elements of data storage security are determined. Furthermore, a novel double secret key encryption method is introduced to improve the security of the stored data itself. Simultaneously, hash computing is used to improve the read and write performance of data after secure storage. Experimental results clearly show that our proposed method can effectively improve the read and write performance of data, and that the performance of data security operations improves over current standard implementations.

Updated: 2020-01-13
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-11
Yung-Ting Chuang, Feng-Wei Li

As the number of people using the internet has surged over the past few years, more and more people are choosing to share and retrieve information online. Several decentralized retrieval applications provide file-sharing platforms for exactly this purpose. However, these applications cannot guarantee churn resilience, trustworthiness, or low retrieval cost. Therefore, in this paper, we present a trustworthy and churn-resilient academic distribution and retrieval system in P2P networks, or TCR, which: (1) ensures that information will not be centralized by central network administrators; (2) utilizes LSH to classify nodes with similar research topics into a local subnetwork, and applies routing algorithms with trust score equations to determine the next trustworthy node to forward a message, ensuring that each node can accurately and efficiently find its trustworthy nodes within only a few hops; (3) provides a trust management system ensuring that, even when there is a large proportion of malicious nodes, the system can still detect and punish misbehaving nodes; (4) guarantees that nodes can still retrieve the desired files even in high-churn networks. We finally demonstrate that TCR entails low message costs, provides high match rates, detects malicious nodes, and ensures churn resilience and search efficiency when compared to other P2P retrieval systems.
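
A minimal sketch of the trust-score bookkeeping that points (2) and (3) rely on, assuming a simple exponential-moving-average score (the paper's actual trust equations are not reproduced here):

```python
def update_trust(trust, node, forwarded, alpha=0.2):
    """Exponential moving average of observed forwarding behaviour
    (an illustrative trust-score equation, not the paper's)."""
    obs = 1.0 if forwarded else 0.0
    trust[node] = (1 - alpha) * trust[node] + alpha * obs
    return trust[node]

def next_hop(trust, candidates, threshold=0.3):
    # Forward to the most trusted neighbour; nodes whose score fell
    # below the threshold are treated as misbehaving and skipped.
    ok = [n for n in candidates if trust[n] >= threshold]
    return max(ok, key=lambda n: trust[n]) if ok else None

trust = {"a": 0.5, "b": 0.5}       # neighbours start at neutral trust
for _ in range(10):
    update_trust(trust, "a", True)   # a keeps forwarding messages
    update_trust(trust, "b", False)  # b silently drops them
```

Repeated misbehaviour drives a node's score below the threshold, at which point it is excluded from routing, which is the "detect and punish" behaviour described above.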

Updated: 2020-01-13
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-10
Mario A. Gomez-Rodriguez, Victor J. Sosa-Sosa, Jesus Carretero, Jose Luis Gonzalez

Abstract A complex and important task in cloud resource management is the efficient allocation of virtual machines (VMs), or containers, to physical machines (PMs). The evaluation of VM placement techniques in real-world clouds can be tedious, complex and time-consuming. This situation has motivated an increasing use of cloud simulators that facilitate this type of evaluation. However, most of the reported VM placement techniques based on simulations have been evaluated taking into account one specific cloud resource (e.g., CPU), whereas often unrealistic values are assumed for other resources (e.g., RAM, waiting times, application workloads, etc.). This generates uncertainty, discouraging their implementation in real-world clouds. This paper introduces CloudBench, a methodology to facilitate the evaluation and deployment of VM placement strategies in private clouds. CloudBench integrates a cloud simulator with a real-world private cloud. Two main tools were developed to support this methodology: a specialized multi-resource cloud simulator (CloudBalanSim), which is in charge of evaluating VM placement techniques, and a distributed resource manager (Balancer), which deploys and tests in a real-world private cloud the best VM placement configurations that satisfy user requirements defined in the simulator. Both tools generate feedback information from the evaluation scenarios and their results, which is used as a learning asset to carry out intelligent and faster evaluations. The experiments implemented with the CloudBench methodology showed encouraging results as a new strategy to evaluate and deploy VM placement algorithms in the cloud.

Updated: 2020-01-11
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-10
Kobra Mabodi, Mehdi Yusefi, Shahram Zandiyan, Leili Irankhah, Reza Fotohi

Abstract The internet of things (IoT) envisions linked, universal, and smart nodes that interact autonomously while providing services. Because of their wide openness, relatively high processing power, and wide distribution, IoT things are ideal targets for gray hole attacks. In a gray hole attack, the attacker fakes itself as the shortest path to the destination, which here is a thing; this prevents routing packets from reaching the destination. The proposed method, named MTISS-IoT, is based on the AODV routing protocol and aims to mitigate gray hole attacks using check node information. In this paper, a hybrid approach based on cryptographic authentication is proposed. It consists of four phases: verifying node trust in the IoT, testing the routes, gray hole attack discovery, and the malicious attack elimination process in MTISS-IoT. The method is evaluated via extensive simulations carried out in the NS-3 environment. The experimental results of four scenarios demonstrated that the MTISS-IoT method can achieve a false positive rate of 14.104%, a false negative rate of 17.49%, and a detection rate of 94.5% when a gray hole attack is launched.

Updated: 2020-01-11
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-10
N. Mohan

Wireless communication, which has ushered in many advancements in today’s world, uses several protocols. Code Division Multiple Access (CDMA) is a channel access protocol used by different radio communication technologies to allow several transmitters to simultaneously send information over a single communication channel. The current research proposes a Multi-Bearer Coordinate Grouping-Based CDMA (MBCG-based CDMA) framework for periodical data sensing and data gathering. The experimental results show that the proposed method provides equalization of node energy levels and extends the network’s lifetime. In fact, it outperformed the well-known fuzzy logic algorithm and gave a throughput of 98.56%.

Updated: 2020-01-11
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-10
Juan Fang, Mengxuan Wang, Zelin Wei

Multiple CPUs and GPUs integrated on the same chip share memory, and access requests from different cores interfere with one another. Memory requests from the GPU seriously degrade CPU memory access performance. Requests from multiple CPUs are also intertwined when accessing memory, which greatly affects performance, and the difference in access latency between GPU cores increases the average memory access latency. To solve these problems in the shared memory of heterogeneous multi-core systems, we propose a step-by-step memory scheduling strategy that improves system performance. When the memory controller receives a memory request, the strategy first creates a new memory request queue based on the request source and isolates CPU requests from GPU requests, thereby preventing GPU requests from interfering with CPU requests. Then, for the CPU request queue, a dynamic bank partitioning strategy is implemented, which dynamically maps applications to different bank sets according to their memory characteristics and eliminates memory request interference among multiple CPU applications without affecting bank-level parallelism. Finally, for the GPU request queue, criticality is introduced to measure the difference in memory access latency between cores. Building on the first-ready, first-come-first-served (FR-FCFS) strategy, we implement criticality-aware memory scheduling to balance the locality and criticality of application accesses.
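
The first step of the strategy, splitting the request stream by source, together with an illustrative criticality-aware pick from the GPU queue, might look like this (field names such as `core_latency` are invented for the sketch):

```python
from collections import deque

def enqueue(request, cpu_q, gpu_q):
    # Step one of the strategy: split requests by source so bursty GPU
    # traffic cannot delay latency-sensitive CPU requests.
    (cpu_q if request["source"] == "cpu" else gpu_q).append(request)

def next_gpu_request(gpu_q):
    # Criticality-aware selection (illustrative): among queued GPU
    # requests, serve the one from the core currently suffering the
    # highest average memory latency.
    r = max(gpu_q, key=lambda req: req["core_latency"])
    gpu_q.remove(r)
    return r

cpu_q, gpu_q = deque(), deque()
for req in [{"source": "cpu", "addr": 0x10},
            {"source": "gpu", "addr": 0x20, "core_latency": 40},
            {"source": "gpu", "addr": 0x30, "core_latency": 95}]:
    enqueue(req, cpu_q, gpu_q)
```

A hardware scheduler would of course combine this with row-buffer locality (FR-FCFS); the sketch only shows the queue separation and the criticality tie-breaker.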

Updated: 2020-01-11
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-09
Chunlin Li, Chengyi Wang, Youlong Luo

Abstract The development of the Internet of Things has led to an increase in edge devices, and the traditional cloud is unable to meet the low-latency demands of numerous devices at the network edge. Meanwhile, media delivery requires high-quality solutions to meet ever-increasing user demands. The edge cloud paradigm has been put forward to address these issues, enabling edge devices to acquire resources dynamically and rapidly from nearby places. To complete as many tasks as possible in a limited time to meet user needs, and to complete consistency maintenance in as short a time as possible, a two-level scheduling optimization scheme in an edge cloud environment is proposed. In the first-level scheduling, using our proposed artificial fish swarm-based job scheduling method, most jobs are scheduled to edge data centers; if an edge data center does not have enough resources, the job is scheduled to a centralized cloud data center. Subsequently, each job is divided into same-sized tasks. In the second-level scheduling, which considers the load balance of nodes, edge cloud task scheduling is proposed to decrease completion time, while centralized cloud task scheduling is presented to reduce total cost. The experimental results show that our proposed scheme performs better in terms of minimizing latency and completion time and cutting down total cost.
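
A toy sketch of the first-level decision and the division into same-sized tasks (the paper's artificial fish swarm heuristic is replaced by a plain greedy pass; job names and capacities are invented):

```python
def first_level_schedule(jobs, edge_capacity):
    """Greedy stand-in for the first-level decision: a job runs at the
    edge while capacity lasts, otherwise it goes to the centralized
    cloud (the paper selects jobs with an artificial fish swarm
    heuristic instead of this greedy pass)."""
    placement, free = {}, edge_capacity
    for job, demand in jobs.items():
        if demand <= free:
            placement[job], free = "edge", free - demand
        else:
            placement[job] = "cloud"
    return placement

def split_job(demand, task_size):
    # Before second-level scheduling, a job is divided into
    # same-sized tasks (plus one remainder task if needed).
    full, rest = divmod(demand, task_size)
    return [task_size] * full + ([rest] if rest else [])

place = first_level_schedule({"j1": 4, "j2": 3, "j3": 5}, edge_capacity=8)
```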

Updated: 2020-01-11
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-09
Tansel Dokeroglu, Selen Pehlivan, Bilgin Avenoglu

This study proposes a set of new robust parallel hybrid metaheuristic algorithms based on artificial bee colony (ABC) and teaching–learning-based optimization (TLBO) for multi-dimensional numerical problems. The best practices of ABC and TLBO are combined to provide robust algorithms in a distributed-memory computation environment using MPI libraries. Island-parallel versions of the proposed hybrid algorithm are observed to obtain much better results than the sequential versions. Parallel pseudorandom number generators are used to provide diverse solution candidates and prevent stagnation in local optima. The performance of the proposed hybrid algorithms is compared with eight different metaheuristic algorithms: particle swarm optimization, differential evolution variants, ABC variants and an evolutionary algorithm. The empirical results show that the new hybrid parallel algorithms are scalable and perform best when compared to state-of-the-art metaheuristics.
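
For concreteness, the teacher phase of TLBO, one of the two ingredients hybridized here, can be sketched as follows on the sphere function; this is a textbook TLBO step, not the authors' hybrid:

```python
import numpy as np

def tlbo_teacher_phase(pop, fitness, rng):
    # Teacher phase of TLBO: pull every learner toward the current best
    # solution (the "teacher") and away from the population mean; TF is
    # the teaching factor, randomly 1 or 2.
    teacher = pop[np.argmin([fitness(x) for x in pop])]
    mean = pop.mean(axis=0)
    tf = rng.integers(1, 3)
    new = pop + rng.random(pop.shape) * (teacher - tf * mean)
    # Greedy selection: a learner only accepts a move that improves it.
    keep = np.array([fitness(n) < fitness(o) for n, o in zip(new, pop)])
    pop[keep] = new[keep]
    return pop

sphere = lambda x: float(np.sum(x * x))   # toy multi-dimensional objective
rng = np.random.default_rng(0)
pop = rng.uniform(-5.0, 5.0, size=(20, 4))
best0 = min(sphere(x) for x in pop)
for _ in range(30):
    pop = tlbo_teacher_phase(pop, sphere, rng)
best = min(sphere(x) for x in pop)
```

In an island-parallel version, each MPI rank would run such a loop on its own subpopulation (with its own PRNG stream) and periodically exchange its best individuals with neighbouring islands.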

Updated: 2020-01-11
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-09
Reza Shahbazian, Francesca Guerriero

Tremendous amounts of data are generated by sensors and connected devices with high velocity, in a variety of forms and in large volumes. These characteristics, which define big data, require new models and methods for near real-time processing. The nature of decentralized large-scale data sources requires distributed algorithms in which the data sources are assumed capable of processing their own data and collaborating with neighboring sources. The network objective is to make an optimal decision while the data are processed in a distributed manner. New technologies, such as the next generation of wireless communication and 5G, introduce practical issues, such as imperfect communication, that should be addressed. In this paper, we study a generalized form of distributed algorithms for decision-making over decentralized data sources. We propose an optimal algorithm that uses optimal weighting to combine the resources of neighbors. We define an optimization problem and find the solution by applying the proposed algorithm. We evaluate the performance of the developed algorithm using both mathematical methods and computer simulations. We introduce the conditions under which the convergence of the proposed algorithm is guaranteed and prove that the network error decreases considerably in comparison with some known modern methods.
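
One classical instance of such optimal weighting is the inverse-variance combination of independent unbiased estimates, sketched below as an assumption standing in for the paper's weighting rule (the variances and network size are invented):

```python
import numpy as np

rng = np.random.default_rng(0)
truth = 5.0
# Ten distributed sources observe the same quantity with different,
# known noise variances (an invented stand-in for the network model).
var = rng.uniform(0.5, 4.0, size=10)

# Inverse-variance weights: the optimal linear rule for combining
# independent unbiased estimates.
w = (1.0 / var) / np.sum(1.0 / var)

# Error variances of the fused estimate vs. a plain (naive) average.
mse_fused = 1.0 / np.sum(1.0 / var)
mse_naive = float(np.sum(var)) / len(var) ** 2

# One fused decision from a single round of noisy readings.
est = truth + rng.normal(0.0, np.sqrt(var))
fused = float(np.dot(w, est))
```

Weighting reliable neighbours more heavily always yields an error variance no larger than equal weighting, which is the kind of network-error reduction the paper proves for its algorithm.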

Updated: 2020-01-09
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-09
Amir Javadpour, Guojun Wang, Samira Rezaei, Kuan-Ching Li

Straggler task detection is one of the main challenges in applying MapReduce to parallelize and distribute large-scale data processing; a straggler is a task running on a weak node. Considering the two stages of the Map phase (copy, combine) and the three stages of the Reduce phase (shuffle, sort and reduce), the total execution time is the sum of the execution times of these five stages. Correctly estimating the execution time of each stage, and thereby the total execution time, is the primary purpose of this paper. The proposed method applies a backpropagation neural network on Hadoop to detect straggler tasks by estimating the remaining execution time of tasks, which is crucial for straggler detection. The results have been compared with popular algorithms in this domain, such as LATE and ESAMR, as well as the real remaining time, on the WordCount and Sort benchmarks; the method is shown to detect straggler tasks and estimate execution times accurately. In addition, it helps accelerate task execution.
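
A much simpler per-stage remaining-time heuristic than the paper's neural-network estimator conveys the five-stage accounting; the fallback rule for unstarted stages is invented for illustration:

```python
def remaining_time(stage_progress, stage_elapsed):
    """Estimate remaining task time from per-stage progress.

    The five stages are copy and combine (Map) plus shuffle, sort and
    reduce (Reduce). For a started stage, remaining = elapsed*(1-p)/p;
    for an unstarted stage we project the mean full-stage time of the
    stages seen so far (an illustrative fallback, not the paper's
    neural-network estimate).
    """
    rates = [e / p for p, e in zip(stage_progress, stage_elapsed) if p > 0]
    mean_full = sum(rates) / len(rates)
    rem = 0.0
    for p, e in zip(stage_progress, stage_elapsed):
        rem += e * (1 - p) / p if p > 0 else mean_full
    return rem
```

A scheduler would flag as stragglers those tasks whose estimated remaining time is far above the median of their peers, then speculatively re-execute them elsewhere.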

Updated: 2020-01-09
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-09
Chunlin Li, Jingpan Bai, Youlong Luo

Abstract With the rapid development of information technology, edge computing has grown rapidly by pushing large amounts of computing to the edge of the network. However, due to the rapid growth of edge access devices and limited edge storage space, the edge cloud faces many challenges in handling its workloads. In this paper, a cost-optimized resource scaling strategy based on load fluctuation is proposed. First, a load prediction model is built based on a DBN with supervised learning to predict the workloads of the edge cloud. Then, a cost-optimized resource scaling strategy is presented that comprehensively considers reservation planning and on-demand planning. In the reservation phase, the long-term resource reservation problem is formulated as a two-stage stochastic programming problem, which is transformed into a deterministic integer programming problem. In the on-demand phase, the on-demand resource scaling problem is formulated and solved as an integer programming problem. Finally, extensive experiments are conducted to evaluate the performance of the proposed strategy.

Updated: 2020-01-09
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-08
Rajesh Devaraj

Real-time embedded systems are increasingly being implemented on heterogeneous multiprocessor platforms in which the same piece of software may require different amounts of time to execute on different processors. Computation of optimal schedules for such systems is non-trivial. Recently, Zhang et al. proposed linear and dynamic programming algorithms for real-time task scheduling for heterogeneous platforms. The authors have formulated a linear programming problem which is then iteratively solved by the linear programming algorithm (LPA) to produce a feasible schedule. Further, they compared the performance of LPA against their proposed dynamic programming algorithm (DPA) and claimed that LPA is superior to DPA, in terms of scalability. In this paper, we show that their linear programming problem does not correctly capture the execution requirement of real-time tasks on heterogeneous platforms. Consequently, LPA fails to produce valid execution schedules for most task sets presented to it. We first illustrate this flaw and strengthen our claim theoretically using a counterexample. Then, we present necessary modifications to their linear programming formulation to address the identified flaw. Finally, we show that our proposed algorithm can be used to find a feasible schedule for real-time task sets, using a real-world case study and experiments.
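
The flavor of such a linear programming formulation, with fractional task splitting and an explicit makespan variable, can be sketched with `scipy.optimize.linprog`; the task set and cost matrix are invented, and this is not Zhang et al.'s exact formulation:

```python
import numpy as np
from scipy.optimize import linprog

# c[i][j]: execution time of task i on heterogeneous processor j
# (an invented instance, not Zhang et al.'s benchmark).
c = np.array([[4.0, 8.0],
              [6.0, 3.0],
              [5.0, 5.0]])
n, m = c.shape
nv = n * m + 1                     # variables: x_ij fractions, then makespan T
obj = np.zeros(nv); obj[-1] = 1.0  # minimize T

A_eq = np.zeros((n, nv)); b_eq = np.ones(n)   # each task fully assigned
for i in range(n):
    A_eq[i, i * m:(i + 1) * m] = 1.0

A_ub = np.zeros((m, nv)); b_ub = np.zeros(m)  # load of machine j <= T
for j in range(m):
    A_ub[j, j:n * m:m] = c[:, j]
    A_ub[j, -1] = -1.0

res = linprog(obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, 1)] * (n * m) + [(0, None)])
makespan = float(res.x[-1])
```

The flaw the paper identifies concerns how the execution requirement constraints are written; the point of the sketch is only that each task must be fully assigned while no machine's load may exceed the makespan variable being minimized.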

Updated: 2020-01-08
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-08
Behnam Seyedi, Reza Fotohi

Abstract The Internet of Things (IoT) is a new concept in computer science that connects resource-limited objects to the unreliable internet through different technologies. The fundamental components of the IoT (e.g., wireless sensor networks and the internet) have an unsecured foundation that leads to different vulnerabilities, such as vulnerability to blackhole attacks. In a blackhole attack, the attacker fakes itself as the shortest path to the destination, which here is a node; this prevents routing packets from reaching the destination. In this study, we offer a novel intelligent agent-based strategy using the hello packet table (NIASHPT) to deal with these problems by discovering blackhole attacks. The proposed NIASHPT method provides an intrusion detection system scheme to defend against blackhole attacks and reduce or eliminate such attacks. The method consists of three phases. In the first phase, each node listens to its adjacent nodes and then applies a pre-routing process; during adjacent node listening and pre-routing, we attempt to find blackhole attacks. In the second phase, the malicious nodes are detected and separated from the IoT network to prevent attacks emerging along the route from the source to the destination. In the third phase, the selected route from the source to the destination is checked. The method is evaluated via extensive simulations carried out in the NS-3 environment. The experimental results of four scenarios demonstrated that the NIASHPT method can achieve a false positive rate of 19.453%, a false negative rate of 22.19%, a detection rate of 80.5%, a PDR of 89.56%, and a packet loss rate of 10.04% when a blackhole attack is launched.

Updated: 2020-01-08
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-08
Ling Li, Sida Dai, Zhiwei Cao, Jinghui Hong, Shu Jiang, Kunmeng Yang

Abstract In this study, we analyse two mobile phone activity datasets to predict the future traffic of mobile base stations in urban areas. The predicted time series can be used to reflect the trend of human activity flow. Although common methods such as recurrent neural networks and long short-term memory (LSTM) networks often achieve high precision, they have the drawback of being time-consuming. We therefore present an improved gradient-boosted decision tree algorithm based on a Kalman filter (GBDT-KF), which addresses the noise in the original time series, since a decrease in GBDT performance is usually caused by overfitting the noise in the signal. According to our experiments, although the RMSE between the predictions of GBDT-KF and the ground truth is only 12–14% worse than that of the LSTM model, the proposed GBDT-KF algorithm trades precision against time complexity and achieves an over 100-fold reduction in training time compared with the LSTM model. By applying our work, service providers could predict where and when network congestion will happen and take action ahead of time. Such applications are especially useful in the era of 5G.
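
The Kalman-filtering step that precedes GBDT training can be sketched with a one-dimensional constant-level model; the noise variances `q` and `r` and the synthetic traffic series are illustrative assumptions:

```python
import numpy as np

def kalman_smooth(z, q=0.05, r=0.5):
    """One-dimensional Kalman filter with a constant-level model.

    q is the process-noise variance and r the measurement-noise
    variance (both illustrative); the filtered series is what the
    GBDT would be trained on in GBDT-KF.
    """
    x, p = z[0], 1.0
    out = []
    for zi in z:
        p = p + q                 # predict step inflates uncertainty
        k = p / (p + r)           # Kalman gain
        x = x + k * (zi - x)      # correct with the new measurement
        p = (1.0 - k) * p
        out.append(x)
    return np.array(out)

rng = np.random.default_rng(0)
t = np.linspace(0.0, 4 * np.pi, 400)
clean = 10.0 + 3.0 * np.sin(t)              # underlying traffic trend
noisy = clean + rng.normal(0.0, 1.0, t.size)
smooth = kalman_smooth(noisy)
```

Training the boosted trees on `smooth` rather than `noisy` is exactly the mechanism by which GBDT-KF avoids overfitting measurement noise.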

Updated: 2020-01-08
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-06
Shuai Zhou, Xianfu Meng

Abstract With the advanced development of wireless communication techniques and the increasing adoption of smart devices, mobile P2P ad hoc networks (P2P MANETs) are attracting more attention. P2P MANETs can be applied in environments where the communication infrastructure is down due to natural disasters or political tensions. In such networks, improving resource search efficiency has been an important research focus. Most existing research emphasizes location-based peer clustering but pays less attention to the time factor when implementing resource search approaches, resulting in low search efficiency. This paper first proposes a novel location-based peer clustering mechanism and a time-aware partner selection scheme. Then, we present a resource search algorithm employing both pull and push approaches, based on the finding that peer movements often repeat on a day-to-day basis, to cope with the peer mobility issue in P2P MANETs. The simulation results show that our resource search scheme both improves the successful search rate and reduces the number of propagated messages.

Updated: 2020-01-06
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-06
José A. Moríñigo, Pablo García-Muller, Antonio J. Rubio-Montero, Antonio Gómez-Iglesias, Norbert Meyer, Rafael Mayo-García

Abstract This work summarizes the results of a set of executions completed on three fat-tree network supercomputers: Stampede at TACC (USA), Helios at IFERC (Japan) and Eagle at PSNC (Poland). Three MPI-based, communication-intensive scientific applications compiled for CPUs have been executed under weak-scaling tests: the molecular dynamics solver LAMMPS; the finite element-based mini-kernel miniFE of NERSC (USA); and the three-dimensional fast Fourier transform mini-kernel bigFFT of LLNL (USA). The design of the experiments focuses on the sensitivity of the applications to rather different patterns of task location, to assess the impact on cluster performance. The accomplished weak-scaling tests stress the effect of the MPI-based application mappings (concentrated vs. distributed patterns of MPI tasks over the nodes) on the cluster. Results reveal that highly distributed task patterns may imply a much larger execution time at scale, when several hundreds or thousands of MPI tasks are involved in the experiments. Such a characterization helps users carry out further, more efficient executions. Researchers may also use these experiments to improve their scalability simulators. In addition, these results are useful from the cluster administration standpoint, since task mapping has an impact on cluster throughput.

Updated: 2020-01-06
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-03
C. A. S. Deiva Preetha, Subburaj Ramasamy

Abstract Software manufacturers need to minimize the number of software failures in their production environments, so software reliability becomes a critical factor for them to focus on. Software Reliability Growth Models (SRGMs) are used as indicators of the number of failures that may be faced after the shipping of the software, and thus of the software's readiness for shipping. SRGMs that handle varying operational profiles have been proposed by researchers earlier. However, as it is difficult to predict the nature of a project in advance, the reliability engineer has to try out each model one at a time before zeroing in on the model to be used in the project. We have derived a combination model, called dynamically weighted infinite NHPP combination, using the existing models for determining the release time. The nonparametric dynamically weighted combination model that we propose was validated and found to be effective.
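
The idea of a weighted combination of SRGM mean value functions can be sketched as follows, using the Goel–Okumoto model as a stand-in component; the weights here are static and illustrative, unlike the paper's dynamically recomputed weights:

```python
import math

def goel_okumoto(t, a, b):
    # Goel-Okumoto NHPP mean value function: expected cumulative
    # failures by test time t (a = total expected faults, b = rate).
    return a * (1.0 - math.exp(-b * t))

def combined_mvf(t, models, weights):
    # Weighted combination (sketch): the combined expected-failure
    # curve is a weighted sum of candidate SRGM curves; in the paper
    # the weights are updated dynamically as failure data accrue.
    return sum(w * m(t) for m, w in zip(models, weights))

# Two hypothetical candidate models with invented parameters.
models = [lambda t: goel_okumoto(t, 100, 0.10),
          lambda t: goel_okumoto(t, 120, 0.05)]
weights = [0.6, 0.4]
```

Release time is then chosen as the point where the combined curve's growth rate (the failure intensity) drops below an acceptable threshold.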

Updated: 2020-01-06
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-02
Misun Yu, Yu-Seung Ma, Doo-Hwan Bae

The Acknowledgements section contains an error. The correct wording is given below.

Updated: 2020-01-06
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-02
R. Mythili, Revathi Venkataraman, T. Sai Raj

Cloud file storage systems are the current trend among enterprises as well as individual users. Due to malicious or unauthorized users, file sharing among cloud users has become one of the biggest challenges in recent times. Attribute-based signcryption (ABSC) is a versatile cryptographic primitive that achieves fine-grained access control over robust cloud storage. ABSC combines attribute-based encryption (ABE) and attribute-based signatures to achieve privacy-oriented confidentiality along with authenticity. Unfortunately, most present ABE and ABSC schemes incur heavy computational overheads owing to the key length, the ciphertext size or the expressive access structures used. In this paper, a new ABSC scheme is devised to reduce computational overheads, particularly at the cloud, by introducing a more expressive hypergraph access structure, the Attribute HyperGraph. The proposed system outperforms the schemes of Deng et al. (IEEE Access 6:39473–39486, 2018. https://doi.org/10.1109/ACCESS.2018.2843778), Liu et al. (Future Gener Comput Syst 52:67–76, 2015. https://doi.org/10.1016/j.future.2014.10.014) and Li et al. (IEEE Trans Parallel Distrib Syst 25(8):2201–2210, 2014. https://doi.org/10.1109/TPDS.2013.271), requiring less computation time at the cloud and avoiding exponentiations. Moreover, unlike Deng et al. (2018), the system does not incur any cryptographic computations associated with designcryption at the cloud.

Updated: 2020-01-06
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-02
F. Orts, G. Ortega, A. M. Puertas, I. García, E. M. Garzón

Modern computational platforms are characterized by the heterogeneity of their processing elements. Additionally, many algorithms can be structured as a set of procedures or tasks with different computational costs. Balancing the computational load among the available processing elements is one of the keys to the optimal exploitation of such heterogeneous platforms. When the processing time of any procedure executed on any of the available processing elements is known, this workload-balancing problem can be modeled as the well-known problem of scheduling on unrelated parallel machines. Solving this type of problem is a big challenge due to the high heterogeneity of both the tasks and the machines. In this paper, the balancing problem is formally defined as a global optimization problem that minimizes the makespan (parallel runtime), and a heuristic based on a genetic algorithm, called Genetic Scheduler (GenS), is developed to solve it. To analyze the behavior of GenS on several heterogeneous clusters, an example taken from the field of statistical mechanics is considered as a case study: an active microrheology model. Given this type of problem and a heterogeneous cluster, we seek to minimize the total runtime to extend and analyze the case study in depth. In this context, a task consists of the simulation of a tracer particle pulled into a cubic box with smaller bath particles, and the computational load depends on the total number of bath particles. Moreover, GenS has been compared to other dynamic and static scheduling approaches. The experimental results of this comparison show that GenS outperforms the other tested alternatives, achieving a better distribution of the computational workload on a heterogeneous cluster. The scheduling strategy developed in this paper is therefore of potential interest for any application that requires the execution of many tasks of different, a priori known, duration on a heterogeneous cluster.
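
A minimal genetic algorithm for the unrelated-parallel-machines makespan problem, in the spirit of (but much simpler than) GenS, might look like this; population sizes, rates and the cost matrix are invented:

```python
import random

def makespan(assign, cost):
    # Parallel runtime of a schedule: the load of the busiest machine.
    loads = [0.0] * len(cost[0])
    for task, machine in enumerate(assign):
        loads[machine] += cost[task][machine]
    return max(loads)

def genetic_schedule(cost, pop_size=30, gens=60, seed=0):
    # Chromosome: one machine index per task. Tournament selection,
    # one-point crossover, point mutation, and one-elite survival.
    rng = random.Random(seed)
    n, m = len(cost), len(cost[0])
    pop = [[rng.randrange(m) for _ in range(n)] for _ in range(pop_size)]
    fit = lambda s: makespan(s, cost)
    for _ in range(gens):
        nxt = [min(pop, key=fit)]                  # elitism
        while len(nxt) < pop_size:
            p1 = min(rng.sample(pop, 2), key=fit)  # tournament of two
            p2 = min(rng.sample(pop, 2), key=fit)
            cut = rng.randrange(1, n)
            child = p1[:cut] + p2[cut:]
            if rng.random() < 0.2:                 # point mutation
                child[rng.randrange(n)] = rng.randrange(m)
            nxt.append(child)
        pop = nxt
    return min(pop, key=fit)

cost = [[4, 8], [6, 3], [5, 5], [2, 7]]  # invented task-by-machine runtimes
best = genetic_schedule(cost)
```

For this tiny instance the optimal makespan is 8 (tasks 1 and 3 on machine 0, tasks 2 and 4 on machine 1); in the microrheology case study the cost of each task would instead be derived from its number of bath particles.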

Updated: 2020-01-06
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-01
YoungGiu Jung, Chang-Min Jeong

Abstract The protocol reverse engineering technique can be used to extract the specification of an unknown protocol. However, there is no standardized method, and in most cases the extraction process is executed manually or semiautomatically. Since only frequently seen values are extracted as fields from the messages of a protocol, it is difficult to understand the complete specification of the protocol. Therefore, if information about the structure of an unknown protocol could be acquired in advance, it would be easier to conduct reverse engineering. As such, one of the most important techniques for classifying unknown protocols is a feature extraction algorithm. In this paper, we propose a new feature extraction algorithm based on average histograms for the classification of unknown protocols, and we design an unknown protocol classifier using deep belief networks, a deep learning algorithm. To verify the performance of the proposed system, we trained on eight open protocols and evaluated the performance on unknown data. Experimental results show that the proposed technique gives significantly more reliable results, with about 99% classification performance regardless of the strength of the modification of the protocol.
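
An average byte-histogram feature of the kind proposed can be sketched as follows (the toy messages are invented; the paper feeds such fixed-length vectors to a deep belief network classifier):

```python
def byte_histogram(messages):
    """Average normalized byte-value histogram over a protocol's
    messages: each message contributes a 256-bin distribution, and the
    per-protocol average is a fixed-length feature vector (the kind of
    input a deep belief network classifier would receive)."""
    avg = [0.0] * 256
    for msg in messages:
        for b in msg:
            avg[b] += 1.0 / len(msg)
    return [v / len(messages) for v in avg]

# Toy messages standing in for captured traffic of one protocol.
http_like = [b"GET / HTTP/1.1", b"POST /x HTTP/1.1"]
feat = byte_histogram(http_like)
```

Because the histogram averages over many messages, modest modifications to individual fields perturb the feature only slightly, which is consistent with the robustness to protocol modification reported above.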

Updated: 2020-01-06
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-01
Jun Hou, Qianmu Li, Shicheng Cui, Shunmei Meng, Sainan Zhang, Zhen Ni, Ye Tian

Due to the increasing intelligence of data acquisition and analysis in cyber physical systems (CPSs) and the emergence of various transmission vulnerabilities, this paper proposes a differential privacy protection method for frequent pattern mining, in view of the application-level privacy protection requirements of industrial interconnected systems. The method designs a low-cohesion algorithm to realize differential privacy protection. In its implementation, a Top-k frequent pattern method is introduced, which combines the index mechanism with the low-cohesion weight of each pattern, and the original support of each selected pattern is perturbed by Laplacian noise. The method achieves a balance between privacy protection and utility, guarantees the trust of all parties in the CPS and provides an effective solution to the problem of privacy protection in industrial Internet systems.
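
The support-perturbation step can be sketched as follows, assuming sensitivity 1 and inverse-CDF Laplace sampling; pattern selection here is plain Top-k, omitting the paper's index-mechanism and low-cohesion weighting:

```python
import math
import random

def private_topk(supports, k, epsilon, seed=0):
    """Select the Top-k patterns and perturb their true supports with
    Laplace noise of scale sensitivity/epsilon (sensitivity 1: one
    transaction changes any support count by at most 1). Pattern
    selection here is plain Top-k for brevity."""
    rng = random.Random(seed)
    top = sorted(supports, key=supports.get, reverse=True)[:k]
    scale = 1.0 / epsilon
    noisy = {}
    for p in top:
        u = rng.random() - 0.5                       # u ~ U(-0.5, 0.5)
        sign = 1.0 if u >= 0 else -1.0
        noisy[p] = supports[p] - scale * sign * math.log(1.0 - 2.0 * abs(u))
    return noisy

supports = {"a": 50, "b": 40, "c": 30, "d": 5}
noisy = private_topk(supports, 2, epsilon=1.0)
```

Smaller `epsilon` means larger noise and stronger privacy at the cost of utility, which is the privacy-utility balance the paper tunes.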

Updated: 2020-01-06
• J. Supercomput. (IF 2.157) Pub Date : 2020-01-01
Feng-Cheng Lin, Huu-Huy Ngo, Chyi-Ren Dow

Face video retrieval is an attractive research topic in computer vision. However, challenges remain because of the significant variation in pose, illumination conditions, occlusions, and facial expressions. Face recognition plays a vital role in video content analysis. Moreover, deep neural networks are being actively studied, and deep learning models have been widely used for object detection, especially for face recognition. Therefore, this study proposes a cloud-based face video retrieval system built on deep learning. First, a dataset is collected and pre-processed: to produce a useful dataset for the CNN models, blurry images are removed and face alignment is applied to the remaining images. The final dataset is then used to pre-train the CNN models (VGGFace, ArcFace, and FaceNet) for face recognition. We compare the results of these three models and choose the most efficient one to build the system. To issue a query, users type in the name of a person; if the system detects a new person, it enrolls that person. Finally, the result is a list of images together with the timestamps associated with them. In addition, a system prototype is implemented to verify the feasibility of the proposed system. Experimental results demonstrate that the system performs well in terms of recognition accuracy and computational time.

Updated: 2020-01-06
• J. Supercomput. (IF 2.157) Pub Date : 2019-12-23
J. Raja, M. Ramakrishnan

Abstract Managing and using industrial Big Data is a major challenge for every industrial enterprise manager. By using cloud technology, enterprises can hand over the task of heavy data management to reliable hands and focus on their main business. Although cloud technology has numerous advantages, it involves several privacy and security issues. One way cloud providers respond to this issue is with their key management service, where encryption keys are used to protect sensitive data stored in the cloud. This paper discusses a hierarchy-based key management technique, called Privacy-Preserving Based on Characteristic Encryption, for privacy preservation in the cloud environment.

Updated: 2020-01-06
• J. Supercomput. (IF 2.157) Pub Date : 2019-12-20
Lingjun Zhao, Chunhua Su, Zeyang Dai, Huakun Huang, Shuxue Ding, Xinyi Huang, Zhaoyang Han

Abstract With the increasing demand for indoor location-based services, such as tracking targets in a smart building, the device-free localization technique has attracted great attention because it can locate targets without any attached devices. Due to the limited space and complexity of the indoor environment, challenges remain in achieving both high localization accuracy and high efficiency. In this paper, to address these issues, we first convert received signal strength (RSS) signals into image pixels; the localization problem is then formulated as an image classification problem. To handle the varying RSS images, a deep convolutional neural network is structured for classification. Finally, to validate the proposed scheme, two real testbeds were built in indoor environments: a living room and a corridor of an apartment. Experimental results show that the proposed scheme achieves good localization performance; for example, the localization accuracy reaches up to 100% in the living room scenario and 97.6% in the corridor. Moreover, the proposed approach outperforms the K-nearest-neighbor and support vector machine methods in both noiseless and noisy environments.
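The RSS-to-pixel conversion is not detailed in the abstract; one plausible sketch, assuming a linear rescaling of dBm values into grayscale and a zero-padded square reshape (both choices are assumptions, not the paper's exact mapping):

```python
import numpy as np

def rss_to_image(rss, side=8, rss_min=-100.0, rss_max=0.0):
    """Map an RSS vector (dBm) to a square grayscale image.

    Each link's RSS is linearly rescaled to [0, 255], and the vector
    is zero-padded and reshaped into a side x side pixel grid so a
    CNN image classifier can consume it directly.
    """
    scaled = (np.clip(rss, rss_min, rss_max) - rss_min) / (rss_max - rss_min)
    pixels = np.zeros(side * side)
    pixels[: len(scaled)] = scaled
    return (pixels * 255).astype(np.uint8).reshape(side, side)

img = rss_to_image(np.array([-50.0, -75.0, -100.0, 0.0]))
print(img.shape)  # (8, 8)
```

Once measurements are framed as images, each target location becomes a class label and any off-the-shelf CNN classifier applies unchanged.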

Updated: 2020-01-06
• J. Supercomput. (IF 2.157) Pub Date : 2019-12-18
Nakhoon Baek, Kwan-Hee Yoo

In the field of large-scale data visualization, graphics rendering speed is one of the most important factors for application development. Since large-scale data visualization usually requires three-dimensional representations, three-dimensional graphics libraries such as OpenGL and DirectX have been widely used. In this paper, we suggest a new way of accelerating rendering by directly using direct rendering manager (DRM) packets. Current three-dimensional graphics features focus on the efficiency of general-purpose rendering pipelines. In contrast, we concentrate on speeding up a special-purpose rendering pipeline for point cloud rendering. Our results show that we achieved this purpose effectively.

Updated: 2020-01-06
• J. Supercomput. (IF 2.157) Pub Date : 2019-12-18
S. Malathy, Ravi Rastogi, R. Maheswar, G. R. Kanagachidambaresan, T. V. P. Sundararajan, D. Vigneswaran

Increasing the availability of a wireless body sensor network is essential for monitoring patients inside and outside the hospital environment. A major portion of the energy dissipated in a sensor node comes from the numerous switching transitions in the transceiving unit; hence, it is essential to minimize switching transitions to prolong the lifetime of the network. The packet holding cost, in terms of energy, also strongly influences the network lifetime. The novel energy-efficient framework (NEEF) approach provides a solution to both the lifetime and availability problems in the network. The subject is modeled as a finite state machine with normal, abnormal and above-normal states. A packet in the cluster head is transmitted once the buffer reaches the threshold level. Critical packets are hopped through high-energy nodes to ensure safe data communication. The framework is evaluated with the Fail Safe Fault Tolerant algorithm. Lifetime enhancement is achieved by reducing the switching loss of the transceiver circuit. NEEF provides extended lifetime and throughput, and also promotes an improved half-life period; the time until the first node dies is longer under NEEF. Overall, NEEF ensures high availability and an extended lifetime.
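The state machine and threshold-based transmission rule above can be sketched as follows. The abstract only names the three states; the numeric thresholds, the reading semantics, and the "transmit critical packets immediately" rule are illustrative assumptions.

```python
# Hypothetical three-state patient monitor; threshold values are illustrative.
NORMAL, ABNORMAL, ABOVE_NORMAL = "normal", "abnormal", "above_normal"

def classify(reading, low=60, high=100):
    """Map a vital-sign reading to one of the three NEEF-style states."""
    if reading < low:
        return ABNORMAL
    if reading > high:
        return ABOVE_NORMAL
    return NORMAL

def should_transmit(buffered, state, threshold=10):
    """The cluster head transmits once the buffer reaches the threshold,
    or immediately for critical (non-normal) packets, so the radio's
    costly switching transitions are batched whenever it is safe."""
    return state != NORMAL or buffered >= threshold

print(classify(72), should_transmit(3, classify(72)))    # normal False
print(classify(140), should_transmit(1, classify(140)))  # above_normal True
```

Batching normal-state packets is what reduces transceiver switching transitions, while the immediate path for abnormal states preserves safety.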

Updated: 2020-01-06
• J. Supercomput. (IF 2.157) Pub Date : 2019-12-17
Elmira Hatami, Bahman Arasteh

Software evolution is a natural phenomenon driven by changing requirements. Understanding the program structure is a significant and complicated factor in maintaining and evolving software when it lacks appropriate design documents. Clustering software modules, as a reverse engineering method, can be used to create an abstract structural model of the software. Software module clustering decomposes the modules of a software system into several clusters (subsystems) using the module dependency graph. Finding the best clustering for the modules of a software system is an NP-complete problem. The main purpose of this study is to develop a method for optimal clustering of software modules such that dependent modules are grouped within the same cluster. The software module clustering problem was formulated as a hybrid/discrete optimization problem. In this paper, using the ant colony optimization algorithm, we attempt to find a good clustering of software systems. Producing high-quality clusters of software modules, generating more stable results than previous heuristic methods and attaining faster convergence are the main merits of the proposed method over previous ones.
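ACO details aside, such a search needs a fitness that rewards grouping dependent modules together. A sketch assuming a TurboMQ-style modularization quality measure (the abstract does not specify the exact objective):

```python
from collections import defaultdict

def modularization_quality(edges, assignment):
    """TurboMQ-style fitness over a module dependency graph: each
    cluster scores CF = 2*intra / (2*intra + inter), and MQ is the
    sum of cluster factors, so high cohesion and low coupling win."""
    intra = defaultdict(int)
    inter = defaultdict(int)
    for a, b in edges:
        ca, cb = assignment[a], assignment[b]
        if ca == cb:
            intra[ca] += 1
        else:
            inter[ca] += 1
            inter[cb] += 1
    mq = 0.0
    for c in set(assignment.values()):
        if intra[c] or inter[c]:
            mq += 2 * intra[c] / (2 * intra[c] + inter[c])
    return mq

deps = [("a", "b"), ("b", "a"), ("c", "d"), ("b", "c")]
good = {"a": 0, "b": 0, "c": 1, "d": 1}   # dependent modules grouped
bad = {"a": 0, "b": 1, "c": 0, "d": 1}    # every dependency crosses clusters
print(modularization_quality(deps, good) > modularization_quality(deps, bad))  # True
```

An ACO (or any metaheuristic) would repeatedly propose assignments and keep those that raise this score.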

Updated: 2020-01-06
• J. Supercomput. (IF 2.157) Pub Date : 2019-12-17
Hamid Arabnejad, João Bispo, João M. P. Cardoso, Jorge G. Barbosa

Abstract Directive-driven programming models, such as OpenMP, are one solution for exploiting the potential parallelism when targeting multicore architectures. Although these approaches significantly help developers, code parallelization is still a non-trivial and time-consuming process, requiring parallel programming skills. Thus, many efforts have been made toward automatic parallelization of existing sequential code. This article presents AutoPar-Clava, an OpenMP-based automatic parallelization compiler which: (1) statically detects parallelizable loops in C applications; (2) classifies variables used inside the target loop based on their access pattern; (3) supports reduction clauses on scalar and array variables whenever applicable; and (4) generates a C OpenMP parallel code from the input sequential version. The effectiveness of AutoPar-Clava is evaluated using the NAS and Polyhedral Benchmark suites on an x86-based computing platform. The achieved results are very promising and compare favorably with closely related auto-parallelization compilers, such as Intel C/C++ Compiler (icc), ROSE, TRACO and CETUS.

Updated: 2020-01-06
• J. Supercomput. (IF 2.157) Pub Date : 2019-12-16

The wording of Sasan Hossein Alizadeh’s name was incorrect. The correct wording is given here. The original article has been corrected.

Updated: 2020-01-06
• J. Supercomput. (IF 2.157) Pub Date : 2019-12-16
Amanpreet Singh, Maninder Kaur

Abstract The subject of content-based cybercrime has received substantial coverage in the recent past. Web-based social media providers urgently need the capability to detect abusive content both accurately and efficiently to protect their users. The support vector machine (SVM) is widely acknowledged as an efficient supervised learning model for various classification problems. Nevertheless, the success of an SVM model relies on the ideal selection of its parameters as well as the structure of the data. Thus, this research work aims to concurrently optimize the parameters and the feature selection in order to improve the quality of the SVM. This paper proposes a novel hybrid model, an integration of cuckoo search and SVM, for feature selection and parameter optimization to efficiently solve the problem of content-based cybercrime detection. The proposed model is tested on four different datasets obtained from Twitter, ASKfm and FormSpring to identify bullying terms, using the Scikit-Learn library and LIBSVM in Python. The results of the proposed model demonstrate significant improvement in classification performance on all the datasets compared with recent existing models. The success rate of the SVM classifier, with an excellent recall of 0.971 via tenfold cross-validation, demonstrates the high efficiency and effectiveness of the proposed model.
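The abstract does not give the solution encoding; a sketch of the two pieces such a cuckoo-search/SVM hybrid needs, assuming a continuous position vector whose first two dimensions encode C and gamma (log-scaled, with assumed ranges) and whose remaining dimensions threshold into a feature mask, plus the Levy-flight step that cuckoo search uses to propose new nests:

```python
import numpy as np

rng = np.random.default_rng(42)

def decode(x, n_features):
    """Decode a continuous cuckoo-search position into an SVM config:
    dims 0-1 map to log-scaled C and gamma; the rest threshold at 0.5
    to form a binary feature-selection mask."""
    C = 10 ** (4 * x[0] - 2)       # C in [1e-2, 1e2]
    gamma = 10 ** (4 * x[1] - 3)   # gamma in [1e-3, 1e1]
    mask = x[2: 2 + n_features] > 0.5
    return C, gamma, mask

def levy_step(dim, beta=1.5):
    """Mantegna's algorithm for a Levy-flight step, the heavy-tailed
    move rule cuckoo search uses to generate candidate nests."""
    from math import gamma as G, pi, sin
    sigma = (G(1 + beta) * sin(pi * beta / 2) /
             (G((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0, sigma, dim)
    v = rng.normal(0, 1, dim)
    return u / np.abs(v) ** (1 / beta)

x = rng.uniform(0, 1, 2 + 5)       # 2 SVM params + 5 feature bits
C, gamma, mask = decode(x, 5)
print(0.01 <= C <= 100, mask.shape)
```

The fitness of each decoded configuration would then be, e.g., cross-validated SVM accuracy on the masked features; the search keeps the best nests and replaces the worst via Levy-flight moves.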

Updated: 2020-01-06
• J. Supercomput. (IF 2.157) Pub Date : 2019-12-14
Fatih Özyurt

Convolutional neural networks (CNNs) have recently emerged as a popular topic in machine learning across various academic and industrial fields. Obtaining a dataset of appropriate size for CNN training is often an important problem, and the lack of training data in remote-sensing image research leads to poor performance due to overfitting. In addition, the back-propagation algorithm used in CNN training is usually very slow and requires tuning of various hyper-parameters. To overcome these drawbacks, this study proposes a new approach, fully based on machine learning, to learn useful CNN features from the Alexnet, VGG16, VGG19, GoogleNet, ResNet and SqueezeNet architectures. This method performs fast and accurate classification suitable for recognition systems. The pretrained architectures were used as feature extractors: the proposed method takes features from the last fully connected layers of each architecture and applies the ReliefF feature selection algorithm to obtain an efficient subset. The selected features are then given to a support vector machine classifier, in place of the FC layers of the CNN, to obtain excellent results. The effectiveness of the proposed method was tested on the UC-Merced dataset. Experimental results demonstrate that the proposed classification method achieved accuracy rates of 98.76% and 99.29% in the 50% and 80% training experiments, respectively.
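The selection step can be illustrated in isolation. ReliefF proper averages over k nearest neighbours per class; the compact sketch below implements the single-neighbour Relief core of the idea on toy features standing in for the CNN activations:

```python
import numpy as np

def relief_weights(X, y, n_iter=100, seed=0):
    """Relief feature weighting (single-neighbour core of ReliefF):
    for sampled instances, features that differ from the nearest miss
    gain weight and features that differ from the nearest hit lose it."""
    rng = np.random.default_rng(seed)
    X = (X - X.min(0)) / (X.max(0) - X.min(0) + 1e-12)  # scale diffs to [0, 1]
    w = np.zeros(X.shape[1])
    for i in rng.integers(0, len(X), n_iter):
        d = np.abs(X - X[i]).sum(1)
        d[i] = np.inf                      # never pick the instance itself
        same, diff = y == y[i], y != y[i]
        hit = np.where(same)[0][np.argmin(d[same])]
        miss = np.where(diff)[0][np.argmin(d[diff])]
        w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
    return w / n_iter

# Toy data: feature 0 separates the classes, feature 1 is pure noise.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 50)
X = np.column_stack([y + rng.normal(0, 0.1, 100), rng.uniform(0, 1, 100)])
w = relief_weights(X, y)
print(w[0] > w[1])  # True: the informative feature scores higher
```

In the paper's pipeline, the top-weighted CNN features would then be passed to the SVM classifier.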

Updated: 2020-01-06
• J. Supercomput. (IF 2.157) Pub Date : 2019-12-14
G. Prakash, Raja Krishnamoorthy, P. T. Kalaivaani

Abstract With vehicular ad hoc networks (VANETs) increasingly becoming synonymous with inter-vehicle communication, much research is being carried out along this line to provide enhanced services. This paper proposes an efficient routing algorithm, called Energy Sources Based Resource Key Distribution and Allocation (ESBRKD-A), to improve energy use, enhance security and maximize throughput in VANETs. The performance of ESBRKD-A is analyzed in the network simulator NS2. The experimental results show that it achieves 86% throughput and reduces the routing delay by up to 13% in comparison with two existing protocols, viz. cross-layer optimization for heterogeneous energy and optimal scheduling in energy harvesting. ESBRKD-A also increases the lifetime of the network.

Updated: 2020-01-06
• J. Supercomput. (IF 2.157) Pub Date : 2019-12-12
Somnath Mazumdar, Alberto Scionti

There is an increasing number of works addressing the design challenges of fast, scalable solutions for the growing number of new types of applications. Recently, many solutions have aimed at improving processing element capabilities to speed up execution in the machine learning application domain. However, only a few works have focused on the interconnection subsystem as a potential source of performance improvement. Wrapping many cores together offers excellent parallelism, but it brings other challenges (e.g., adequate interconnections). Scalable, power-aware interconnects are required to support such a growing number of processing elements, as well as modern applications. In this paper, we propose a scalable and energy-efficient network-on-chip architecture that fuses the advantages of rings with those of the 2D mesh, without using any bridge router, to provide high performance. A dynamic adaptation mechanism allows the network to better adapt to application requirements. Simulation results show efficient power consumption (up to 141.3% saving for connecting 1024 cores) and 2× (on average) throughput growth with better scalability (up to 1024 processing elements) compared to the popular 2D mesh, tested across multiple statistical traffic pattern scenarios.

Updated: 2020-01-06
• J. Supercomput. (IF 2.157) Pub Date : 2019-12-11

Nowadays, prediction of abnormality plays a vital role in healthcare applications for deciding on and guiding proper treatment on time. The amniotic fluid is the water of the womb, and it is a strong indicator of congenital fetal anomaly. Automatic calculation of the amniotic fluid index (AFI) and of shape features across varying gestational periods is useful for predicting the perinatal outcome of high-risk maternity patients. Perinatal outcomes include estimated fetal weight, head circumference and the need for a newborn ICU, which decide the mode of delivery; predicting them helps increase the live birth rate and reduce the risk of premature delivery. The aim of this work is to identify abnormal AFI in expectant mothers in order to alert clinicians, as computer-aided diagnosis supports clinicians in the decision-making process. In the proposed work, shape templates are developed from a training set of ultrasound images using deformable methods, and contour points along the edges help determine the AFI. Features are then extracted, and a fuzzy logic algorithm classifies the given image into one of four categories, oligohydramnios, borderline, normal or hydramnios, and assesses the impact on fetal growth. The outcome of the proposed approach is measured in two ways: first, the calculated AFI is compared with the value obtained by the radiologist/clinicians; second, the AFI together with shape features, contour points and gestational age is used for classification into normal, borderline, oligohydramnios or hydramnios, and the classified results are compared with the expert's opinion. The outcomes are represented quantitatively. The results show that the AFI calculated by the proposed work matched the expert opinion in 94% of cases, and classification of a test image into one of the four categories achieved an average prediction accuracy of up to 92.5%.
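The final fuzzy classification step can be sketched with trapezoidal membership functions over the AFI. The break-points below follow commonly cited clinical cut-offs (oligohydramnios below 5 cm, borderline 5-8, normal 8-24) but are assumptions; the paper's actual membership functions also incorporate shape features and gestational age.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership function: rises a->b, flat b->c, falls c->d."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

# Membership break-points over AFI in cm (illustrative clinical cut-offs).
CATEGORIES = {
    "oligohydramnios": (-1.0, 0.0, 4.0, 6.0),
    "borderline":      (4.0, 6.0, 7.0, 9.0),
    "normal":          (7.0, 9.0, 22.0, 25.0),
    "hydramnios":      (22.0, 25.0, 40.0, 41.0),
}

def classify_afi(afi):
    """Defuzzify by returning the category with the highest membership."""
    return max(CATEGORIES, key=lambda c: trapezoid(afi, *CATEGORIES[c]))

print(classify_afi(3.0), classify_afi(6.5), classify_afi(14.0), classify_afi(30.0))
# oligohydramnios borderline normal hydramnios
```

The overlapping trapezoid edges let borderline readings carry partial membership in two categories, which is the point of using fuzzy rather than crisp thresholds.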

Updated: 2020-01-06
• J. Supercomput. (IF 2.157) Pub Date : 2019-12-11
Chuanlong Yin, Yuefei Zhu, Shengli Liu, Jinlong Fei, Hetong Zhang

Abstract The performance of classifiers has a direct impact on the effectiveness of an intrusion detection system; thus, most researchers aim to improve the detection performance of classifiers. However, classifiers can obtain only limited useful information from the limited number of labeled training samples, which usually hurts their generalization. To enhance network intrusion detection classifiers, we resort to adversarial training, and a novel supervised learning framework using a generative adversarial network to improve classifier performance is proposed in this paper. The generative model in our framework continuously generates complementary labeled samples for adversarial training and assists the classifier, while the classifier identifies the different categories. Meanwhile, the loss function is re-derived, and several empirical training strategies are proposed to improve the stability of the supervised learning framework. Experimental results show that the adversarially trained classifier improves the performance indicators of intrusion detection. The proposed framework provides a feasible method to enhance the performance and generalization of the classifier.

Updated: 2020-01-06
Contents have been reproduced by permission of the publishers.
