-
A Real Time Deep Learning Based Approach for Detecting Network Attacks Big Data Res. (IF 3.3) Pub Date : 2024-02-27 Christian Callegari, Stefano Giordano, Michele Pagano
Anomaly-based Intrusion Detection is a key research topic in network security due to its ability to face unknown attacks and new security threats. For this reason, many works on the topic have been proposed in the last decade. Nonetheless, an ultimate solution, able to provide a high detection rate with an acceptable false alarm rate, has still to be identified. In the last years big research efforts
-
An Integration visual navigation algorithm for urban air mobility Big Data Res. (IF 3.3) Pub Date : 2024-02-23 Yandong Li, Bo Jiang, Long Zeng, Chenglong Li
This paper presents an integration visual navigation algorithm called PnP-ORBSLAM for UAV position estimation in Urban Air Mobility (UAM). ORBSLAM is a popular benchmark algorithm for vision-based navigation applications. The proposed method improves the performance of ORBSLAM by adding a post-processing marker recognition phase to the model. Based on the features extracted from the markers, PnP
-
Investigating Influence of Google-Play Application Titles on Success Big Data Res. (IF 3.3) Pub Date : 2024-02-21 Ahmad Bilal, Hamid Turab Mirza, Ibrar Hussain, Adnan Ahmad
The title (name) is the primary information related to a mobile (smartphone) application, as it describes its functions and services. An eye-catching title can entice customers to choose a certain application over others. Application development companies are well aware of this phenomenon and invest significant efforts in crafting their application titles with compelling keywords, phrases and topics
-
Scholar's Career Switch from Academia to Industry: Mining and Analysis from AMiner Big Data Res. (IF 3.3) Pub Date : 2024-02-19 Zhou Shao, Sha Yuan, Yinyu Jin, Yongli Wang
The phenomenon of scholars switching their careers from academia to industry has become more prevalent nowadays. This paper proposes a combination approach of bibliometrics analysis and data mining to study the phenomenon from the perspective of Science of Science (SciSci). Based on the proposed methods, this paper first provides an overview of frequent companies and frequent universities as well as
-
Interactive big data visualization and analytics Big Data Res. (IF 3.3) Pub Date : 2024-02-14 David Auber, Nikos Bikakis, Panos K. Chrysanthis, George Papastefanatos, Mohamed Sharaf
-
A big data driven vegetation disease and pest region identification method based on self supervised convolutional neural networks and parallel extreme learning machines Big Data Res. (IF 3.3) Pub Date : 2024-02-13 Bo Jiang, Hao Wang, Hanxu Ma
A self supervised convolutional neural network-parallel extreme learning machine classification model based on big data is proposed to address the subjectivity and inaccuracy of traditional methods for identifying vegetation pests and diseases that rely on manual observation and empirical judgment. This model is constructed using convolutional neural networks and parallel extreme learning machines
-
Knowledge Distillation via Token-Level Relationship Graph Based on the Big Data Technologies Big Data Res. (IF 3.3) Pub Date : 2024-02-12 Shuoxi Zhang, Hanpeng Liu, Kun He
In the big data era, characterized by vast volumes of complex data, the efficiency of machine learning models is of utmost importance, particularly in the context of intelligent agriculture. Knowledge distillation (KD), a technique aimed at both model compression and performance enhancement, serves as a pivotal solution by distilling the knowledge from an elaborate model (teacher) to a lightweight
-
Attentive Implicit Relation Embedding for Event Recommendation in Event-Based Social Network Big Data Res. (IF 3.3) Pub Date : 2024-02-05 Yuan Liang
The Event-Based Social Network (EBSN) is a new type of social network that combines online and offline networks, and its primary goal is to recommend appropriate events to users. Most studies do not model event recommendations on the EBSN platform as graph representation learning, nor do they consider the implicit relationship between events, resulting in recommendations that are not accepted by users
-
Chlorophyll-a concentration variations in Bohai sea: Impacts of environmental complexity and human activities based on remote sensing technologies Big Data Res. (IF 3.3) Pub Date : 2024-02-03 Yong Du, Xiaoyu Zhang, Shuchang Ma, Nan Yao
This study extensively explores the intricate dynamics of the Bohai Sea ecosystem, a semi-closed marginal sea in China, influenced by both environmental complexity and human activities. By utilizing chlorophyll-a as an indicator, we closely examine how phytoplankton responds to coastal environmental conditions and stressors. The temporal analysis conducted over the 23-year period from 1998 to 2020
-
Tropical cyclone trajectory based on satellite remote sensing prediction and time attention mechanism ConvLSTM model Big Data Res. (IF 3.3) Pub Date : 2024-02-03 Tongfei Li, Mingzheng Lai, Shixian Nie, Haifeng Liu, Zhiyao Liang, Wei Lv
The accurate and timely prediction of tropical cyclones is of paramount importance in mitigating the impact of these catastrophic meteorological events. Presently, methods for predicting tropical cyclones based on satellite remote sensing images encounter notable challenges, including the inadequate extraction of three-dimensional spatial features and limitations in long-term forecasting. As a response
-
Graph Spatial-Temporal Transformer Network for Traffic Prediction Big Data Res. (IF 3.3) Pub Date : 2024-01-26 Zhenzhen Zhao, Guojiang Shen, Lei Wang, Xiangjie Kong
Traffic information can reflect the operating status of a city, and accurate traffic forecasting is critical in intelligent transportation systems (ITS) and urban planning. However, traffic information has complex nonlinearity and dynamic spatial-temporal dependencies due to human mobility, bringing new traffic forecasting challenges. This paper proposed a graph spatial-temporal transformer network
-
Airspace situation analysis of terminal area traffic flow prediction based on big data and machine learning methods Big Data Res. (IF 3.3) Pub Date : 2024-01-18 Yandong Li, Bo Jiang, Weilong Liu, Chenglong Li, Yunfan Zhou
Real-time and accurate prediction of terminal area arrival traffic flow is a key issue for terminal area traffic management. In this paper, we first study the advantages and disadvantages of traditional dynamics-based and time-series-based prediction methods. Taking advantage of both types of methods, a terminal area arrival flow prediction framework based on airspace
-
The Predictability of Stock Price: Empirical Study on Tick Data in Chinese Stock Market Big Data Res. (IF 3.3) Pub Date : 2023-11-17 Yueshan Chen, Xingyu Xu, Tian Lan, Sihai Zhang
Whether or not stocks are predictable has been a topic of concern for decades. The efficient market hypothesis (EMH) says that it is difficult for investors to make extra profits by predicting stock prices, but this may not be true, especially for the Chinese stock market. Therefore, we explore the predictability of the Chinese stock market based on tick data, a widely studied high-frequency data.
-
Cost optimization model design of fresh food cold chain system in the context of big data Big Data Res. (IF 3.3) Pub Date : 2023-11-11 Lei Wang, Guangjun Liu, Ibrar Ahmad
The assessment of cold chain logistics for fresh products can be more precise with high-dimensional information data, providing valuable insights for the optimization of associated costs. Nonetheless, traditional data processing techniques fail to meet the processing efficiency required for such high-dimensional cold chain logistics data. Therefore, this paper proposes a spectral clustering algorithm
-
A methodology to assess and evaluate sites with high potential for stormwater harvesting in Dehradun, India Big Data Res. (IF 3.3) Pub Date : 2023-11-10 Shray Pathak, Shreya Sharma, Abhishek Banerjee, Sanjeev Kumar
The urgency to protect natural water resources in a sustainable manner has risen as water scarcity and global climate change continue to worsen. Among various methods of collecting water, stormwater harvesting (SWH) is regarded as the most environmentally friendly approach to alleviating the strain on freshwater resources. The study introduces a robust approach to evaluating the potential for SWH,
-
Wetland identification through remote sensing: Insights into wetness, greenness, turbidity, temperature, and changing landscapes Big Data Res. (IF 3.3) Pub Date : 2023-11-09 Rana Waqar Aslam, Hong Shu, Kanwal Javid, Shazia Pervaiz, Farhan Mustafa, Danish Raza, Bilal Ahmed, Abdul Quddoos, Saad Al-Ahmadi, Wesam Atef Hatamleh
-
ML-aVAT: A Novel 2-Stage Machine-Learning Approach for Automatic Clustering Tendency Assessment Big Data Res. (IF 3.3) Pub Date : 2023-10-31 Harshal Mittal, Jagarlamudi Sai Laxman, Dheeraj Kumar
Clustering tendency assessment, which aims to deduce if a dataset contains any cluster structure, and, if it does, how many clusters it has, is a critical problem in exploratory data analysis. The VAT family of algorithms provides a “visual” means to assess the clustering tendency for various datasets. The VAT algorithm operates by reordering the pairwise distance matrix of the input data. When viewed
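The reordering step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the classic VAT ordering (a Prim-like nearest-neighbour traversal of the pairwise distance matrix), not the ML-aVAT method of the paper itself; the function names and the toy points are hypothetical.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix as a list of lists."""
    return [[math.dist(p, q) for q in points] for p in points]


def vat_order(dist):
    """Return a VAT-style ordering of the indices of `dist`.

    Start at one endpoint of the largest dissimilarity, then repeatedly
    append the unvisited point closest to any already-ordered point.
    """
    n = len(dist)
    start = max(range(n), key=lambda i: max(dist[i]))
    order = [start]
    remaining = set(range(n)) - {start}
    # nearest[j]: distance from j to its closest already-ordered point
    nearest = {j: dist[start][j] for j in remaining}
    while remaining:
        nxt = min(remaining, key=lambda j: (nearest[j], j))
        order.append(nxt)
        remaining.remove(nxt)
        for j in remaining:
            nearest[j] = min(nearest[j], dist[nxt][j])
    return order


def reorder(dist, order):
    """Apply the same permutation to rows and columns."""
    return [[dist[i][j] for j in order] for i in order]


# Hypothetical toy data: two well-separated groups, interleaved on purpose.
points = [(0.0, 0.0), (10.0, 10.0), (0.1, 0.2),
          (10.2, 9.9), (0.2, 0.1), (9.9, 10.1)]
dist = pairwise_distances(points)
order = vat_order(dist)
reordered = reorder(dist, order)

for row in reordered:
    print(" ".join(f"{value:5.1f}" for value in row))
```

In the printed matrix, small values gather in blocks along the diagonal. Those blocks are what a VAT image shows as dark squares, one per apparent cluster.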
-
Early Pathogen Prediction in Crops Using Nano Biosensors and Neural Network-Based Feature Extraction and Classification Big Data Res. (IF 3.3) Pub Date : 2023-09-17 Mohammad Khalid Imam Rahmani, Hayder M.A. Ghanimi, Syeda Fizzah Jilani, Muhammad Aslam, Meshal Alharbi, Roobaea Alroobaea, Sudhakar Sengan
The most prevalent microbe-caused issues that reduce agricultural output globally are viral and bacterial infections. It is currently quite challenging to identify pathogens due to the current living situation. Biosensors have become the standard for monitoring microbial and viral macromolecules. Disease diagnosis is improved by following the nanoparticles released by infections. Since the sensors'
-
End-PolarT: Polar Representation for End-to-End Scene Text Detection Big Data Res. (IF 3.3) Pub Date : 2023-09-15 Yirui Wu, Qiran Kong, Cheng Qian, Michele Nappi, Shaohua Wan
Deep learning has achieved great success in text detection, where recent methods adopt inspirations from segmentation to detect scene texts. However, most segmentation-based methods have high computation cost in pixel-level classification and post refinements. Moreover, they still face challenges like arbitrary directions, curved texts, illumination and so on. Aiming to improve detection accuracy and
-
Study on the Temporal and Spatial Evolution Characteristics of Chinese Public's Cognition and Attitude to “Double Reduction” Policy Based on Big Data Big Data Res. (IF 3.3) Pub Date : 2023-09-11 Jiahui Liu, Wei Liu, Chun Yan, Xinhong Liu
The “double reduction” policy is a policy innovation of China's comprehensive education reform to build a high-quality education system. The public's cognition and attitude toward it are of great significance to its actual implementation. A total of 98396 texts related to “double reduction” collected from Sina-Weibo by web crawler technology are investigated to explore the public's cognition and attitude
-
Classifier-Based Nonuniform Time Slicing Method for Local Community Evolution Analysis Big Data Res. (IF 3.3) Pub Date : 2023-09-09 Xiangyu Luo, Tian Wang, Gang Xin, Yan Lu, Ke Yan, Ying Liu
With the rapid expansion of the scale of dynamic networks, local community evolution analysis attracts much attention because of its efficiency and accuracy. It concentrates on a particular community of interest rather than considering all communities together. A fundamental problem is how to divide time into slices so that a dynamic network is represented as a sequence of snapshots which accurately
-
An Improved CycleGAN for Data Augmentation in Person Re-Identification Big Data Res. (IF 3.3) Pub Date : 2023-09-09 Zhenzhen Yang, Jing Shao, Yongpeng Yang
Person re-identification (ReID), which aims to retrieve persons of interest across multiple non-overlapping cameras, has attracted increasing attention. Matching the same person between different camera styles has always been an enormous challenge. In the existing work, cross-camera style images generated by the cycle-consistent generative adversarial network (CycleGAN) only transfer the camera resolution
-
A Large Comparison of Normalization Methods on Time Series Big Data Res. (IF 3.3) Pub Date : 2023-08-22 Felipe Tomazelli Lima, Vinicius M.A. Souza
Normalization is a mandatory preprocessing step in time series problems to guarantee similarity comparisons invariant to unexpected distortions in amplitude and offset. Such distortions are usual for most time series data. A typical example is gait recognition by motion collected on subjects with varying body height and width. To rescale the data for the same range of values, the vast majority of researchers
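The invariance to amplitude and offset described in this abstract can be shown with a short sketch. It assumes z-normalization (subtract the mean, divide by the standard deviation), which is one common choice and not necessarily the method the paper recommends; the series and the scaling constants are made up for illustration.

```python
import math
import statistics


def z_normalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(series)
    std = statistics.pstdev(series)
    if std == 0:
        # A constant series carries no shape information.
        return [0.0 for _ in series]
    return [(x - mean) / std for x in series]


def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical data: the same shape, distorted in amplitude (x3) and offset (+5).
base = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
distorted = [3.0 * x + 5.0 for x in base]

raw_distance = euclidean(base, distorted)
normalized_distance = euclidean(z_normalize(base), z_normalize(distorted))

print(f"distance before normalization: {raw_distance:.3f}")
print(f"distance after normalization:  {normalized_distance:.3f}")
```

Before normalization the two series look far apart even though they have the same shape. After z-normalization the distance collapses to (numerically) zero.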
-
Parallel Framework for Memory-Efficient Computation of Image Descriptors for Megapixel Images Big Data Res. (IF 3.3) Pub Date : 2023-06-29 Amr M. Abdeltif, Khalid M. Hosny, Mohamed M. Darwish, Ahmad Salah, Kenli Li
Image moments are image descriptors widely utilized in several image processing, pattern recognition, computer vision, and multimedia security applications. In the era of big data, the computation of image moments yields a huge memory demand, especially for large moment order and/or high-resolution images (i.e., megapixel images). The state-of-the-art moment computation methods successfully accelerate
-
A Multi-View Filter for Relation-Free Knowledge Graph Completion Big Data Res. (IF 3.3) Pub Date : 2023-05-29 Juan Li, Wen Zhang, Hongtao Yu
As knowledge graphs are often incomplete, knowledge graph completion methods have been widely proposed to infer missing facts by predicting the missing element of a triple given the other two elements. However, the assumption that the two elements have to be correlated is strong. Thus in this paper, we investigate relation-free knowledge graph completion to predict relation-tail(r-t) pairs given a
-
A Big Data Framework to Address Building Sum Insured Misestimation Big Data Res. (IF 3.3) Pub Date : 2023-05-24 Callum Roberts, Adrian Gepp, James Todd
In the insurance industry, the accumulation of complex problems and volume of data creates a large scope for actuaries to apply big data techniques to investigate and provide unique solutions for millions of policyholders. With much of the actuarial focus on traditional problems like price optimisation or improving claims management, there is an opportunity to tackle other known product inefficiencies
-
Botnet DGA Domain Name Classification Using Transformer Network with Hybrid Embedding Big Data Res. (IF 3.3) Pub Date : 2023-05-12 Ling Ding, Peng Du, Haiwei Hou, Jian Zhang, Di Jin, Shifei Ding
One of the most severe threats to cyber security is the botnet, which typically uses domain names generated by Domain Generation Algorithms (DGAs) to communicate with its Command and Control (C&C) infrastructure. DGA detection and classification play an important role in helping cyber security researchers detect botnet C&C servers. However, many of the existing DGA detection models only focus on single
-
Meta-Learning Based Dynamic Adaptive Relation Learning for Few-Shot Knowledge Graph Completion Big Data Res. (IF 3.3) Pub Date : 2023-05-05 Linqin Cai, Lingjun Wang, Rongdi Yuan, Tingjie Lai
As artificial intelligence gradually steps into cognitive intelligence stage, knowledge graphs (KGs) play an increasingly important role in many natural language processing tasks. Due to the prevalence of long-tail relations in KGs, few-shot knowledge graph completion (KGC) for link prediction of long-tail relations has gradually become a hot research topic. Current few-shot KGC methods mainly focus
-
Spatio-Temporal Characteristics of Influenza Burden and Its Influence Factors in Japan in the Past Three Decades: An Influenza Disease Burden Data-Based Modeling Study Big Data Res. (IF 3.3) Pub Date : 2023-04-18 Junru Wang, Shixin Zhang, Anbang Dai
Introduction: Influenza still poses a great threat to humans. Knowledge of the systematic disease burden of influenza in Japan is limited. The study aimed to investigate the spatio-temporal characteristics of the influenza burden and its influence factors in the past three decades. Methods: Data on annual death, years lived with disability (YLDs), years of life lost (YLLs) and disability adjusted
-
Task-Oriented Collaborative Graph Embedding Using Explicit High-Order Proximity for Recommendation Big Data Res. (IF 3.3) Pub Date : 2023-04-18 Mintae Kim, Wooju Kim
-
Random Manifold Sampling and Joint Sparse Regularization for Multi-Label Feature Selection Big Data Res. (IF 3.3) Pub Date : 2023-04-04 Haibao Li, Hongzhi Zhai
Multi-label learning is usually used to mine the correlation between features and labels, and feature selection can retain as much information as possible through a small number of features. The ℓ2,1 regularization method can produce a sparse coefficient matrix, but it cannot solve the multicollinearity problem effectively. The model proposed in this paper can obtain the most relevant few features by solving the
-
MLPQ: A Dataset for Path Question Answering over Multilingual Knowledge Graphs Big Data Res. (IF 3.3) Pub Date : 2023-03-27 Yiming Tan, Yongrui Chen, Guilin Qi, Weizhuo Li, Meng Wang
Knowledge Graph-based Multilingual Question Answering (KG-MLQA), as one of the essential subtasks in Knowledge Graph-based Question Answering (KGQA), emphasizes that questions on the KGQA task can be expressed in different languages to solve the lexical gap between questions and knowledge graph(s). However, the existing KG-MLQA works mainly focus on the semantic parsing of multilingual questions but
-
What Is a Multi-Modal Knowledge Graph: A Survey Big Data Res. (IF 3.3) Pub Date : 2023-03-14 Jinghui Peng, Xinyu Hu, Wenbo Huang, Jian Yang
With the explosive growth of multi-modal information on the Internet, the multi-modal knowledge graph (MMKG) has become an important research topic in knowledge graphs to meet the needs of data management and application. Most research on MMKG has taken image-text data as the research object and used the multi-modal deep learning approach to process multi-modal data. In comparison, the structure of
-
Heterogeneous Graph Convolutional Network Based on Correlation Matrix Big Data Res. (IF 3.3) Pub Date : 2023-03-08 Liqing Qiu, Jingcheng Zhou, Caixia Jing, Yuying Liu
Heterogeneous graph embedding maps a high-dimensional graph that has different sorts of nodes and edges to a low-dimensional space, making it perform well in downstream tasks. The existing models mainly use two approaches to explore and embed heterogeneous graph information. One is to use meta-paths to mine heterogeneous information; the other is to use special modules designed by researchers to explore
-
A Parallel Fusion Graph Convolutional Network for Aspect-Level Sentiment Analysis Big Data Res. (IF 3.3) Pub Date : 2023-02-01 Yuxin Wu, Guofeng Deng
Sentiment analysis has always been an important basic task in the NLP field. Recently, graph convolutional networks (GCNs) have been widely used in aspect-level sentiment analysis. Because GCNs have good aggregation effects, every node can contain neighboring node information. However, in previous studies, most models used only a single GCN to learn contextual information. The GCN relies on the construction
-
Spatiotemporal Prediction Based on Feature Classification for Multivariate Floating-Point Time Series Lossy Compression Big Data Res. (IF 3.3) Pub Date : 2023-01-31 Huimin Feng, Ruizhe Ma, Li Yan, Zongmin Ma
A large amount of time series is produced because of the frequent use of IoT devices and sensors. Time series compression is widely adopted to reduce storage overhead and transport costs. At present, most state-of-the-art approaches focus on univariate time series. Therefore, the task of compressing multivariate time series (MTS) is still an important but challenging problem. Traditional MTS compression
-
GeoYCSB: A Benchmark Framework for the Performance and Scalability Evaluation of Geospatial NoSQL Databases Big Data Res. (IF 3.3) Pub Date : 2023-01-11 Suneuy Kim, Yvonne Hoang, Tsz Ting Yu, Yuvraj Singh Kanwar
The proliferation of geospatial applications has tremendously increased the variety, velocity, and volume of spatial data that data stores have to manage. Traditional relational databases reveal limitations in handling such big geospatial data, mainly due to their rigid schema requirements and limited scalability. Numerous NoSQL databases have emerged and actively serve as alternative data stores for
-
Efficiently Mining Colocation Patterns for Range Query Big Data Res. (IF 3.3) Pub Date : 2023-01-13 Srikanth Baride, Anuj S. Saxena, Vikram Goyal
Colocation pattern mining finds a set of features whose instances frequently appear nearby in the same geographical space. Most of the existing algorithms for colocation patterns find nearby objects by a user-provided single-distance threshold. The value of the distance threshold is data specific and choosing a suitable distance for a user is not easy. In most real-world scenarios, it is rather meant
-
Predicting Household Electric Power Consumption Using Multi-step Time Series with Convolutional LSTM Big Data Res. (IF 3.3) Pub Date : 2022-11-25 Lucia Cascone, Saima Sadiq, Saleem Ullah, Seyedali Mirjalili, Hafeez Ur Rehman Siddiqui, Muhammad Umer
Energy consumption prediction has become an integral part of a smart and sustainable environment. With future demand forecasts, energy production and distribution can be optimized to meet the needs of the growing population. However, forecasting the demand of individual households is a challenging task due to the diversity of energy consumption patterns. Recently, it has become popular with artificial
-
Data-Efficient Performance Modeling for Configurable Big Data Frameworks by Reducing Information Overlap Between Training Examples Big Data Res. (IF 3.3) Pub Date : 2022-11-11 Zhiqiang Liu, Xuanhua Shi, Hai Jin
To support the various analysis applications of big data, big data processing frameworks are designed to be highly configurable. However, for common users, it is difficult to tailor the configurable frameworks to achieve optimal performance for every application. Recently, many automatic tuning methods have been proposed to configure these frameworks. In detail, these methods first build a performance prediction
-
Evaluating Standard Feature Sets Towards Increased Generalisability and Explainability of ML-Based Network Intrusion Detection Big Data Res. (IF 3.3) Pub Date : 2022-11-08 Mohanad Sarhan, Siamak Layeghy, Marius Portmann
Machine Learning (ML)-based network intrusion detection systems bring many benefits for enhancing the cybersecurity posture of an organisation. Many systems have been designed and developed in the research community, often achieving a close to perfect detection rate when evaluated using synthetic datasets. However, there are ongoing challenges with the development and evaluation of ML-based NIDSs;
-
Data Stream Classification Based on Extreme Learning Machine: A Review Big Data Res. (IF 3.3) Pub Date : 2022-11-08 Xiulin Zheng, Peipei Li, Xindong Wu
Many daily applications, such as medical data, click streams, internet records and banking transactions, are generating massive amounts of data in the form of streams at an ever higher speed. In contrast to traditional static data, data streams have some inherent properties, such as infinite length, concept drift, multiple labels and concept evolution. Among all the data mining tasks
-
Accelerating Columnar Storage Based on Asynchronous Skipping Strategy Big Data Res. (IF 3.3) Pub Date : 2022-11-04 Wenhai Li, Zheng Yang, Lingfeng Deng, Zhiling Cheng, Weidong Wen, Yanxiang He
Many database applications, such as OnLine Analytical Processing (OLAP), web-based information extraction or scientific computation, need to select a subset of fields based on several user-defined filters. Developers of these applications require effective assembly methods for on-demand filtering and aggregation, which raises new challenges in deploying parallel computing components on top of columnar
-
Augmented Functional Analysis of Variance (A-fANOVA): Theory and Application to Google Trends for Detecting Differences in Abortion Drugs Queries Big Data Res. (IF 3.3) Pub Date : 2022-10-19 Fabrizio Maturo, Annamaria Porreca
The World Wide Web (WWW) has become a popular and readily accessible big data source in recent decades. The information in the WWW is offered in many different types, e.g. Google Trends, which provides deep insights into people's search queries in the Google Search engine. Analysing this kind of data is not straightforward because they usually take the form of high-dimensional data, given that the
-
A Facial Expression Recognition Approach for Social IoT Frameworks Big Data Res. (IF 3.3) Pub Date : 2022-10-19 Silvio Barra, Sanoar Hossain, Chiara Pero, Saiyed Umer
Social IoT has become a sensitive topic in the last years, mainly due to the attraction of social networks and the related digital activities amongst the population. These techniques are gaining even more importance in the current period, in which digital tools are the only ones allowed to maintain social distancing due to the COVID-19 restrictions. In order to aid patients and elderly people in-home
-
Linked Open Government Data to Predict and Explain House Prices: The Case of Scottish Statistics Portal Big Data Res. (IF 3.3) Pub Date : 2022-10-14 Areti Karamanou, Evangelos Kalampokis, Konstantinos Tarabanis
Accurately estimating the prices of houses is important for various stakeholders including house owners, real estate agencies, government agencies, and policy-makers. Towards this end, traditional statistics and, only recently, advanced machine learning and artificial intelligence models are used. Open Government Data (OGD) have a huge potential especially when combined with AI technologies. OGD are
-
A Twig-Based Algorithm for Top-k Subgraph Matching in Large-Scale Graph Data Big Data Res. (IF 3.3) Pub Date : 2022-10-04 Haiwei Zhang, Qijie Bai, Yining Lian, Yanlong Wen
Subgraph matching aims to find similar substructures in a single graph according to a given query graph and is known as a basic query for graph data management. There exist many categories of subgraph matching solutions. Subgraph isomorphism, which is considered an NP-complete problem, is an initial solution for the subgraph matching task. To speed up the procedure, graph simulation has been presented
-
Special Issue on Real-Time Intelligent Systems Big Data Res. (IF 3.3) Pub Date : 2022-09-26 Richard Chbeir, Yannis Manolopoulos, Jolanta Mizera-Pietraszko, Spyros Sioutas
Abstract not available
-
An Embedding Model for Knowledge Graph Completion Based on Graph Sub-Hop Convolutional Network Big Data Res. (IF 3.3) Pub Date : 2022-09-30 Haitao He, Haoran Niu, Jianzhou Feng, Junlan Nie, Yangsen Zhang, Jiadong Ren
The research on knowledge graph completion based on representation learning is increasingly dependent on the structural features of nodes in the graph. However, a large number of nodes have few immediate neighbors, so their features cannot be fully expressed. Hence, multi-hop structure features are crucial to the representation learning of nodes. GCN (Graph Convolutional Network) is a graph
-
Neural Topic Modeling with Deep Mutual Information Estimation Big Data Res. (IF 3.3) Pub Date : 2022-09-13 Kang Xu, Xiaoqiu Lu, Yuan-fang Li, Tongtong Wu, Guilin Qi, Ning Ye, Dong Wang, Zheng Zhou
The emerging neural topic models make topic modeling more easily adaptable and extendable in unsupervised text mining. However, the existing neural topic models struggle to retain representative information of the documents within the learnt topic representation. Fortunately, Deep Mutual Information Estimation (DMIE), which maximizes the mutual information between input data and the hidden representations
-
Properties and Performance of the ABCDe Random Graph Model with Community Structure Big Data Res. (IF 3.3) Pub Date : 2022-09-14 Bogumił Kamiński, Tomasz Olczak, Bartosz Pankratz, Paweł Prałat, François Théberge
In this paper, we investigate properties and performance of synthetic random graph models with a built-in community structure. Such models are important for evaluating and tuning community detection algorithms that are unsupervised by nature. We propose ABCDe—a multi-threaded implementation of the ABCD (Artificial Benchmark for Community Detection) graph generator. We discuss the implementation details
-
A Multi-Objective Clustering for Better Data Management in Connected Environment Big Data Res. (IF 3.3) Pub Date : 2022-09-05 Sabri Allani, Richard Chbeir, Khouloud Salameh, Elio Mansour, Philippe Arnould
Over the past decade, the rapid increase in connected devices has enabled the emergence of new digital ecosystems to provide new opportunities for monitoring and managing systems to optimize overall performance. With these connected environments, data collection and management become increasingly challenging. A significant number of works in the literature have addressed data collection and management
-
Correlation Expert Tuning System for Performance Acceleration Big Data Res. (IF 3.3) Pub Date : 2022-09-02 Yanfeng Chai, Jiake Ge, Qiang Zhang, Yunpeng Chai, Xin Wang, Qingpeng Zhang
One configuration cannot fit all workloads and diverse resource limitations in modern databases. Auto-tuning methods based on reinforcement learning (RL) normally depend on an exhaustive offline training process with a huge amount of performance measurements, which includes many inefficient knob combinations under a trial-and-error method. The most time-consuming part of the process is not the
-
Automatic Prediction of T2/T3 Staging of Rectal Cancer Based on Radiomics and Machine Learning Big Data Res. (IF 3.3) Pub Date : 2022-09-02 Xinhong Zhang, Boyan Zhang, Binjie Wang, Fan Zhang
The staging of rectal cancer is very important for determining treatment plans. This study investigated the relationship between the imaging features and the rectal cancer staging, so that the staging of rectal cancer can be automatically predicted based on the imaging features. A total of 81 patients with T2 or T3 stage rectal cancer from April 2018 to March 2019 were included. Firstly
-
Detecting Seasonal Dependencies in Production Lines for Forecast Optimization Big Data Res. (IF 3.3) Pub Date : 2022-07-26 Gerold Hoelzl, Sebastian Soller, Matthias Kranz
Huge amounts of data are produced inside an industrial production plant every minute. This data is becoming more accessible thanks to higher network and computing capabilities. This poses an opportunity to apply methods in real time to support the reliability of production machines. In theory every time series that is currently monitored for a breach of thresholds can be extended with a forecast method