-
DECIDE: An Agile event-and-data driven design methodology for decisional Big Data projects Data Knowl. Eng. (IF 1.476) Pub Date : 2020-09-28 Lilia Sfaxi; Mohamed Mehdi Ben Aissa
Decision making is the lifeblood of the enterprise, from the mundane to the strategically critical. However, the increasing deluge of data makes it more important than ever to understand and use it effectively in every context. Being “data driven” is more aspiration than reality in most organizations due to the complexity, volume, variability and velocity of data streams from every customer and employee
-
An analysis of the collaboration network of the International Conference on Conceptual Modeling at the Age of 40 Data Knowl. Eng. (IF 1.476) Pub Date : 2020-10-14 Lucas Henrique C. Lima; Alberto H.F. Laender; Mirella M. Moro; José Palazzo M. de Oliveira
The International Conference on Conceptual Modeling celebrated 40 years of existence at its 38th edition held in Salvador, Brazil, on 4–7 November 2019. As one of the most traditional and well-known conferences in the database area, it has its origins in the Entity-Relationship Model proposed by Peter P. Chen in 1975. To celebrate such an accomplishment, this article goes over the ER history from distinct
-
Handling data imperfection—False data inputs in applications for Alzheimer’s patients Data Knowl. Eng. (IF 1.476) Pub Date : 2020-11-02 Fatma Ghorbel; Fayçal Hamdi; Nassira Achich; Elisabeth Metais
Handling data imperfection is a crucial issue in many application domains. This is particularly true when handling imperfect data inputs in applications for Alzheimer’s patients. In this paper we first propose a typology of imperfection for data entered by Alzheimer’s patients or their caregivers in the context of these applications (mainly due to the memory discordance caused by the disease). This
-
Mining arguments in scientific abstracts with discourse-level embeddings Data Knowl. Eng. (IF 1.476) Pub Date : 2020-08-01 Pablo Accuosto; Horacio Saggion
Argument mining consists in the automatic identification of argumentative structures in texts. In this work we leverage existing discourse-level annotations to facilitate the identification of argumentative components and relations in scientific texts, which has been recognized as a particularly challenging task. We propose a new annotation schema and use it to augment a corpus of computational linguistics
-
Natural logic knowledge bases and their graph form Data Knowl. Eng. (IF 1.476) Pub Date : 2020-08-18 Troels Andreasen; Henrik Bulskov; Per Anker Jensen; Jørgen Fischer Nilsson
This paper describes how knowledge bases can be represented and reasoned with in natural logic. Natural logic is a regimented fragment of natural language possessing a well-defined logical semantics. As such, natural logic may be considered an attractive alternative among the various knowledge representation logics such as description logics. Our version of natural logic expands formal ontologies
-
An analytical model for information gathering and propagation in social networks using random graphs Data Knowl. Eng. (IF 1.476) Pub Date : 2020-09-08 Samant Saurabh; Sanjay Madria; Anirban Mondal; Ashok Singh Sairam; Saurabh Mishra
-
Scalable distributed reachability query processing in multi-labeled networks Data Knowl. Eng. (IF 1.476) Pub Date : 2020-09-08 Amina Gacem; Apostolos N. Papadopoulos; Kamel Boukhalfa
Testing reachability in a graph has gained substantial interest as an important operation in network analysis and graph mining. In its simplest form, a reachability query is defined by a pair of nodes (u, v) and a graph G, and asks whether there is a path from u to v. This paper addresses a specific case of reachability on multi-labeled distributed graphs, where the query is parameterized by a set of source
-
Interpretable Anomaly Prediction: Predicting anomalous behavior in industry 4.0 settings via regularized logistic regression tools Data Knowl. Eng. (IF 1.476) Pub Date : 2020-08-21 Rocco Langone; Alfredo Cuzzocrea; Nikolaos Skantzos
Prediction of anomalous behavior in industrial assets based on sensor readings represents a key focus in modern business practice. As a matter of fact, forecasting forthcoming faults is crucial to implement predictive maintenance, i.e. maintenance decision making based on real-time information from components and systems, which makes it possible, among other benefits, to reduce maintenance cost, minimize downtime
-
A linear programming-based framework for handling missing data in multi-granular data warehouses Data Knowl. Eng. (IF 1.476) Pub Date : 2020-06-06 Sandro Bimonte; Libo Ren; Nestor Koueya
Data Warehouse (DW) and OLAP systems are first-class citizens of Business Intelligence tools. They are widely used in the academic and industrial communities for many fields of application. Despite the maturity of DW and OLAP systems, with the advent of Big Data, more and more sources of data are available, and warehousing this data can lead to important quality issues. In this work, we focus
-
PRESS: A personalised approach for mining top-k groups of objects with subspace similarity Data Knowl. Eng. (IF 1.476) Pub Date : 2020-06-05 Tahrima Hashem; Lida Rashidi; Lars Kulik; James Bailey
Personalised analytics is a powerful technology that can be used to improve the career, lifestyle, and health of individuals by providing them with an in-depth analysis of their characteristics as compared to other people. Existing research has often focused on mining general patterns or clusters, but without the facility for customisation to an individual’s needs. It is challenging to adapt such approaches
-
Design and implementation of ETL processes using BPMN and relational algebra Data Knowl. Eng. (IF 1.476) Pub Date : 2020-06-13 Judith Awiti; Alejandro A. Vaisman; Esteban Zimányi
Extraction, transformation, and loading (ETL) processes are used to extract data from internal and external sources of an organization, transform these data, and load them into a data warehouse. The Business Process Model and Notation (BPMN) has been proposed for expressing ETL processes at a conceptual level. A different approach is studied in this paper, where relational algebra (RA), extended
-
Mo.Re.Farming: A hybrid architecture for tactical and strategic precision agriculture Data Knowl. Eng. (IF 1.476) Pub Date : 2020-06-12 Enrico Gallinucci; Matteo Golfarelli; Stefano Rizzi
In this paper we propose an innovative architecture, called Mo.Re.Farming, for handling agricultural data in an integrated fashion and supporting decision making in the precision agriculture domain. This architecture is oriented to data analysis and is inspired by Business Intelligence 2.0 approaches. It is hybrid in that it couples traditional and big data technologies to integrate heterogeneous data
-
Natural language processing-enhanced extraction of SBVR business vocabularies and business rules from UML use case diagrams Data Knowl. Eng. (IF 1.476) Pub Date : 2020-05-06 Paulius Danenas; Tomas Skersys; Rimantas Butleris
Discovery, specification and proper representation of various aspects of business knowledge play a crucial part in model-driven information systems engineering, especially when it comes to the early stages of systems development. Being among the most applicable and advanced features of model-driven development, model transformation could help improve one of the most time- and resource-consuming efforts
-
Search-by-example over SQL repositories using structural and intent-driven similarity Data Knowl. Eng. (IF 1.476) Pub Date : 2020-03-16 Gregory Borodin; Yaron Kanza
Searching the query log of a database system has a variety of applications. In a complex database, relevant queries in the log can serve as an initial example for query formulation, or may elucidate how to query the data in an optimized manner. Searching for queries that may cause a security or a privacy breach could be used to detect leaks of sensitive data. In general, queries in the query log can
-
Incremental clustering techniques for multi-party Privacy-Preserving Record Linkage Data Knowl. Eng. (IF 1.476) Pub Date : 2020-03-16 Dinusha Vatsalan; Peter Christen; Erhard Rahm
Privacy-Preserving Record Linkage (PPRL) supports the integration of sensitive information from multiple datasets, in particular the privacy-preserving matching of records referring to the same entity. PPRL has gained much attention in many application areas, with the most prominent ones in the healthcare domain. PPRL techniques tackle this problem by conducting linkage on masked (encoded) values.
-
Computational model for generating interactions in conversational recommender system based on product functional requirements Data Knowl. Eng. (IF 1.476) Pub Date : 2020-03-13 Z.K.A. Baizal; Dwi H. Widyantoro; Nur Ulfa Maulidevi
A conversational recommender system is a tool that helps customers decide which products to buy through a conversational mechanism. With this mechanism, the system is able to imitate a natural conversation between a customer and professional sales support in order to elicit customer preferences. However, many customers are not familiar with the technical features of multi-function and multi-feature products
-
Content-based Node2Vec for representation of papers in the scientific literature Data Knowl. Eng. (IF 1.476) Pub Date : 2020-02-14 B. Kazemi; A. Abhari
Lower-dimensional representation of scientific text has attracted much attention among researchers due to its impact on many data mining and recommendation tasks. This paper studies two main research streams in scientific literature representation. First, both local and distributed representation viewpoints are reviewed and their advantages and disadvantages in lower dimensional representation are
-
Top-k user-specified preferred answers in massive graph databases Data Knowl. Eng. (IF 1.476) Pub Date : 2020-02-14 Noseong Park; Andrea Pugliese; Edoardo Serra; V.S. Subrahmanian
There are numerous applications where users wish to identify subsets of vertices in a social network or graph database that are of interest to them. They may specify sets of patterns and vertex properties, and each of these confers a score to a subgraph. The users want to find the subgraphs with top-k highest scores. Examples in the real world where such subgraphs involve custom scoring methods include:
-
A framework for multidimensional skyline queries over streaming data Data Knowl. Eng. (IF 1.476) Pub Date : 2020-02-12 Karim Alami; Sofian Maabout
Skyline queries have attracted a great deal of interest in recent years because of their ability to help decision makers when multi-criteria objectives are to be handled. Several authors have pointed out the interest of multidimensional skylines, i.e., the set of criteria becomes a parameter of the query. In order to efficiently evaluate these queries, index structures have been proposed. In this paper, we
-
Semi-automated development of conceptual models from natural language text Data Knowl. Eng. (IF 1.476) Pub Date : 2020-02-12 Mussa Omar; George Baryannis
The process of converting natural language specifications into conceptual models requires detailed analysis of natural language text, and designers frequently make mistakes when undertaking this transformation manually. Although many approaches have been used to partly automate this process, one of the main limitations is the lack of a domain-independent ontology that can be used as a repository for
-
A multi-view similarity measure framework for trouble ticket mining Data Knowl. Eng. (IF 1.476) Pub Date : 2020-02-12 Jian Xu; Jiapeng Mu; Gaorong Chen
Text similarity measures play a very important role in several text mining applications. Although there is an extensive literature on measuring the similarity between long texts, there is less work on measuring the similarity between short texts, and most of that work is based on adaptations of long-text similarity methods. Unfortunately, the description of
-
Hierarchy construction and classification of heterogeneous information networks based on RSDAEf Data Knowl. Eng. (IF 1.476) Pub Date : 2020-01-10 Jinli Zhang; Zongli Jiang; Yongping Du; Tong Li; Yida Wang; Xiaohua Hu
Heterogeneous information networks (HINs), composed of multiple types of nodes and links, play increasingly important roles in real-life applications. Classification of the related data is an essential task in network analysis. Existing methods can effectively solve these classification tasks when they are applied to homogeneous information networks and simple data, but not for the noisy and sparse
-
Integrating Cuckoo search-Grey wolf optimization and Correlative Naive Bayes classifier with Map Reduce model for big data classification Data Knowl. Eng. (IF 1.476) Pub Date : 2019-12-27 Chitrakant Banchhor; N. Srinivasu
Big data is progressively being used in various areas, such as industry, financial dealing, medicine, and so on, as it can handle the challenges in processing large amounts of data. One of the data mining techniques used widely and effectively to classify big data is the MapReduce model. In this paper, an approach for the classification of big data is developed using Cuckoo–Grey wolf based Correlative
Contents have been reproduced by permission of the publishers.