  • CryptDICE: Distributed data protection system for secure cloud data storage and computation
    Inform. Syst. (IF 2.466) Pub Date : 2020-10-30
    Ansar Rafique; Dimitri Van Landuyt; Emad Heydari Beni; Bert Lagaisse; Wouter Joosen

    Cloud storage allows organizations to store data at remote sites of service providers. Although cloud storage services offer numerous benefits, they also involve new risks and challenges with respect to data security and privacy aspects. To preserve confidentiality, data must be encrypted before outsourcing to the cloud. Although this approach protects the security and privacy aspects of data, it also

  • Conformance checking of mixed-paradigm process models
    Inform. Syst. (IF 2.466) Pub Date : 2020-11-26
    Boudewijn F. van Dongen; Johannes De Smedt; Claudio Di Ciccio; Jan Mendling

    Mixed-paradigm process models integrate strengths of procedural and declarative representations like Petri nets and Declare. They are specifically interesting for process mining because they allow capturing complex behaviour in a compact way. A key research challenge for the proliferation of mixed-paradigm models for process mining is the lack of corresponding conformance checking techniques. In this

  • ER-index: A referential index for encrypted genomic databases
    Inform. Syst. (IF 2.466) Pub Date : 2020-11-10
    Ferdinando Montecuollo; Giovanni Schmid

    Huge DBMSs storing genomic information are being created and engineered for large-scale, comprehensive and in-depth analysis of human beings and their diseases. This paves the way for significant new approaches in medicine, but also poses major challenges for storing, processing and transmitting such large amounts of data in compliance with recent regulations concerning user privacy. We designed

  • Model-based trace variant analysis of event logs
    Inform. Syst. (IF 2.466) Pub Date : 2020-11-14
    Mathilde Boltenhagen; Thomas Chatain; Josep Carmona

    The comparison of trace variants of business processes opens the door for a fine-grained analysis of the distinctive features inherent in the executions of a process in an organization. The current approaches for trace variant analysis do not consider the situation where a process model is present, and therefore, it can guide the derivation of the trace variants by considering high-level structures

  • D2IA: User-defined interval analytics on distributed streams
    Inform. Syst. (IF 2.466) Pub Date : 2020-11-13
    Ahmed Awad; Riccardo Tommasini; Samuele Langhi; Mahmoud Kamel; Emanuele Della Valle; Sherif Sakr

    Nowadays, modern Big Stream Processing Solutions (e.g. Spark, Flink) are working towards being the ultimate framework for streaming analytics. In order to achieve this goal, they started to offer extensions of SQL that incorporate stream-oriented primitives such as windowing and Complex Event Processing (CEP). The former enables stateful computation on infinite sequences of data items while the latter

  • Novel predictive model to improve the accuracy of collaborative filtering recommender systems
    Inform. Syst. (IF 2.466) Pub Date : 2020-11-03
    Bushra Alhijawi; Ghazi Al-Naymat; Nadim Obeid; Arafat Awajan

    The recommendation problem involves the prediction of a set of items that maximize the utility for users. Numerous factors, such as the filtering method and similarity measure, affect the prediction accuracy. We propose a novel prediction mechanism that can be applied to collaborative filtering recommender systems. This prediction mechanism consists of a novel adaptable predictive model, called inheritance-based

  • ProDB: A memory-secure database using hardware enclave and practical oblivious RAM
    Inform. Syst. (IF 2.466) Pub Date : 2020-11-10
    Ziyang Han; Haibo Hu

    One key challenge for data owners to host their databases in the cloud is data privacy. In this paper, we first demonstrate that even with the most recent hardware-based security technology such as Intel SGX, a hypervisor can still sniff key database operations running in its guest virtual machine (VM) such as the frequency and type of SQL queries, by monitoring the access pattern of this VM’s main

  • Requirements Engineering for Cyber Physical Production Systems: The e-CORE approach and its application
    Inform. Syst. (IF 2.466) Pub Date : 2020-11-10
    Pericles Loucopoulos; Evangelia Kavakli; Julien Mascolo

    Traditional manufacturing and production systems are in the throes of a digital transformation. By blending the real and virtual production worlds, it is now possible to connect all parts of the production process: devices, products, processes, systems and people, in an informational ecosystem. This paper examines the underpinning issues that characterise the challenges for transforming traditional

  • Topical affinity in short text microblogs
    Inform. Syst. (IF 2.466) Pub Date : 2020-10-24
    Herman Masindano Wandabwa; M. Asif Naeem; Farhaan Mirza; Russel Pears

    Knowledge-based applications like recommender systems in social networks are powered by a complex network of social discussions and user connections. Short text microblog platforms like Twitter are powerful in this respect due to their real-time content dissemination and their complex mesh of user connections. For example, users on Twitter tend to consume certain content to a greater or less

  • Orientation and conformance: A HMM-based approach to online conformance checking
    Inform. Syst. (IF 2.466) Pub Date : 2020-11-07
    Wai Lam Jonathan Lee; Andrea Burattin; Jorge Munoz-Gama; Marcos Sepúlveda

    Online conformance checking comes with new challenges, especially in terms of time and space constraints. One fundamental challenge of explaining the conformance of a running case is in balancing between making sense at the process level as the case reaches completion and putting emphasis on the current information at the same time. In this paper, we propose an online conformance checking framework

  • Sampling and approximation techniques for efficient process conformance checking
    Inform. Syst. (IF 2.466) Pub Date : 2020-10-26
    Martin Bauer; Han van der Aa; Matthias Weidlich

    Conformance checking enables organizations to automatically assess whether their business processes are executed according to their specification. State-of-the-art conformance checking algorithms perform this task by establishing alignments between behaviour recorded by IT systems and a process model capturing desired behaviour. While such alignments clearly highlight conformance issues, a major downside

  • Querying APIs with SPARQL
    Inform. Syst. (IF 2.466) Pub Date : 2020-10-26
    Matthieu Mosser; Fernando Pieressa; Juan L. Reutter; Adrián Soto; Domagoj Vrgoč

    Although the amount of RDF data has been steadily increasing over the years, the majority of information on the Web is still residing in other formats, and is often not accessible to Semantic Web services. A lot of this data is available through APIs serving JSON documents. In this work we propose a way of extending SPARQL with the option to consume JSON APIs and integrate this information into SPARQL

  • Scalable and data-aware SQL query recommendations
    Inform. Syst. (IF 2.466) Pub Date : 2020-09-18
    Natalia Arzamasova; Klemens Böhm

    SQL query recommendation suggests an SQL statement to a user, based on his submitted requests and on queries of other users stored in a log. Such methods need to be scalable and data-aware. Data awareness means that the filtering condition, the most crucial element of the recommendation, contains actual values. Otherwise, the query is not directly executable. Existing approaches do not satisfy the

  • A large reproducible benchmark of ontology-based methods and word embeddings for word similarity
    Inform. Syst. (IF 2.466) Pub Date : 2020-09-30
    Juan J. Lastra-Díaz; Josu Goikoetxea; Mohamed Ali Hadj Taieb; Ana Garcia-Serrano; Mohamed Ben Aouicha; Eneko Agirre; David Sánchez

    This work is a companion reproducibility paper of the experiments and results reported in Lastra-Diaz et al. (2019a), which is based on the evaluation of a companion reproducibility dataset with the HESML V1R4 library and the long-term reproducibility tool called Reprozip. Human similarity and relatedness judgements between concepts underlie most of cognitive capabilities, such as categorization, memory

  • A general framework for privacy-preserving of data publication based on randomized response techniques
    Inform. Syst. (IF 2.466) Pub Date : 2020-09-29
    Chaobin Liu; Shixi Chen; Shuigeng Zhou; Jihong Guan; Yao Ma

    Privacy preservation is a paramount concern in publishing datasets that contain sensitive information. Preventing privacy disclosure and providing useful information to legitimate users for data analysis and mining are conflicting goals. Randomized response is a class of techniques that perturbs each sensitive value in a certain way, so that personal privacy is protected while the large-scale trend of the entire

  • Feature-oriented engineering of declarative artifact-centric process models
    Inform. Syst. (IF 2.466) Pub Date : 2020-09-10
    Rik Eshuis

    Declarative artifact-centric process models are suitable for specifying knowledge-intensive processes. Currently, such models need to be designed from scratch, even though existing model fragments could be reused to gain efficiency in designing and maintaining declarative artifact-centric process models. To address this problem, this paper proposes an approach for composing model fragments, abstracted

  • A knowledge-intensive adaptive business process management framework
    Inform. Syst. (IF 2.466) Pub Date : 2020-09-10
    Huseyin Kir; Nadia Erdogan

    Business process management has been the driving force of optimization and operational efficiency for companies until now, but the digitalization era we have been experiencing requires businesses to be agile and responsive as well. In order to be a part of this digital transformation, delivering new levels of automation-fueled agility through digitalization of BPM itself is required. However, the automation

  • Cause vs. effect in context-sensitive prediction of business process instances
    Inform. Syst. (IF 2.466) Pub Date : 2020-09-14
    Jens Brunk; Matthias Stierle; Leon Papke; Kate Revoredo; Martin Matzner; Jörg Becker

    Predicting undesirable events during the execution of a business process instance provides the process participants with an opportunity to intervene and keep the process aligned with its goals. Few approaches for tackling this challenge consider a multi-perspective view, where the flow perspective of the process is combined with its surrounding context. Given the many sources of data in today’s world

  • Detection of batch activities from event logs
    Inform. Syst. (IF 2.466) Pub Date : 2020-09-10
    Niels Martin; Luise Pufahl; Felix Mannhardt

    Organizations carry out a variety of business processes in order to serve their clients. Usually supported by information technology and systems, process execution data is logged in an event log. Process mining uses this event log to discover the process’ control-flow, its performance, information about the resources, etc. A common assumption is that the cases are executed independently of each other

  • On the appropriateness of Platt scaling in classifier calibration
    Inform. Syst. (IF 2.466) Pub Date : 2020-09-10
    Björn Böken

    Many applications using data mining and machine learning techniques require posterior probability estimates in addition to highly accurate predictions. Classifier calibration is a separate branch of machine learning that aims at transforming classifier predictions into posterior class probabilities and is thus a useful extension in the respective applications. Among the existing state-of-the-art

  • Controlled flexibility in blockchain-based collaborative business processes
    Inform. Syst. (IF 2.466) Pub Date : 2020-08-29
    Orlenys López-Pintado; Marlon Dumas; Luciano García-Bañuelos; Ingo Weber

    Blockchain technology enables the execution of collaborative business processes involving mutually untrusted parties. Existing tools allow such processes to be modeled using high-level notations and compiled into smart contracts that can be deployed on blockchain platforms. However, these tools do not provide mechanisms to cope with the flexibility requirements inherent to open and dynamic collaboration

  • Towards holistic Entity Linking: Survey and directions
    Inform. Syst. (IF 2.466) Pub Date : 2020-08-24
    Italo L. Oliveira; Renato Fileto; René Speck; Luís P.F. Garcia; Diego Moussallem; Jens Lehmann

    Entity Linking (EL) empowers Natural Language Processing applications by linking relevant mentions found in raw textual data to precise information about what they supposedly stand for. However, EL approaches have mostly focused on particular kinds of inputs and frequently fail to properly handle texts from specific sources (e.g., microblogs) that have particularities such as grammatical errors, slangs

  • Collaborative filtering over evolution provenance data for interactive visual data exploration
    Inform. Syst. (IF 2.466) Pub Date : 2020-08-18
    Houssem Ben Lahmar; Melanie Herschel

    In interactive visual data exploration, users rely on recommendations on what data to explore next. EVLIN is a system that recommends queries to retrieve these data for the next exploration step, paired with suited visualizations. This paper extends EVLIN by combining its content-based recommendations with recommendations leveraging collaborative filtering to improve the effectiveness of recommendation-based

  • OILog: An online incremental log keyword extraction approach based on MDP-LSTM neural network
    Inform. Syst. (IF 2.466) Pub Date : 2020-08-14
    Xiaoyu Duan; Shi Ying; Hailong Cheng; Wanli Yuan; Xiang Yin

    Log keyword extraction is an indispensable part of log anomaly detection. There are two main challenges in keyword extraction: first, logs are inherently unstructured, and different vendors usually define different log formats; second, most traditional methods cannot update the log keywords incrementally to match the newly generated log data, so the extraction accuracy

  • In-situ visual exploration over big raw data
    Inform. Syst. (IF 2.466) Pub Date : 2020-08-07
    Nikos Bikakis; Stavros Maroulis; George Papastefanatos; Panos Vassiliadis

    Data exploration and visual analytics systems are of great importance in Open Science scenarios, where less tech-savvy researchers wish to access and visually explore big raw data files (e.g., json, csv) generated by scientific experiments using commodity hardware and without being overwhelmed in the tedious processes of data loading, indexing and query optimization. In this paper, we present our work

  • Every apprentice needs a master: Feedback-based effectiveness improvements for process model matching
    Inform. Syst. (IF 2.466) Pub Date : 2020-08-04
    Christopher Klinkmüller; Ingo Weber

    Process models are a central element of modern business process management technology. When adopting such technology, organizations inevitably establish process model collections which, depending on the degree of adoption, can reach sizes of thousands of models. Process model matching techniques are intended to assist experts in the management of such large collections, e.g., in querying the collections

  • Knowledge-guided unsupervised rhetorical parsing for text summarization
    Inform. Syst. (IF 2.466) Pub Date : 2020-08-03
    Shengluan Hou; Ruqian Lu

    Automatic text summarization (ATS) has achieved impressive performance thanks to recent advances in deep learning and the availability of large-scale corpora. However, there is still no guarantee that the generated summaries are grammatical, concise, and convey all the salient information of the original documents. To make the summarization results more faithful, this paper presents an unsupervised

  • XChange: A semantic diff approach for XML documents
    Inform. Syst. (IF 2.466) Pub Date : 2020-08-01
    Alessandreia Oliveira; Troy Kohwalter; Marcos Kalinowski; Leonardo Murta; Vanessa Braganholo

    XML documents are extensively used in several applications and evolve over time. Identifying the semantics of these changes becomes a fundamental process to understand their evolution. Existing approaches related to understanding changes (diff) in XML documents focus only on syntactic changes. These approaches compare XML documents based on their structure, without considering the associated semantics

  • Privacy-aware data cleaning-as-a-service
    Inform. Syst. (IF 2.466) Pub Date : 2020-07-31
    Yu Huang; Mostafa Milani; Fei Chiang

    Data cleaning is a pervasive problem for organizations as they try to reap value from their data. Recent advances in networking and cloud computing technology have fueled a new computing paradigm called Database-as-a-Service, where data management tasks are outsourced to large service providers. In this paper, we consider a Data Cleaning-as-a-Service model that allows a client to interact with a data

  • Exploiting semantic relationships for unsupervised expansion of sentiment lexicons
    Inform. Syst. (IF 2.466) Pub Date : 2020-07-29
    Felipe Viegas; Mário S. Alvim; Sérgio Canuto; Thierson Rosa; Marcos André Gonçalves; Leonardo Rocha

    The literature in sentiment analysis has widely assumed that semantic relationships between words cannot be effectively exploited to produce satisfactory sentiment lexicon expansions. This assumption stems from the fact that words considered to be “close” in a semantic space (e.g., word embeddings) may present completely opposite polarities, which might suggest that sentiment information in such spaces

  • DimensionSlice: A main-memory data layout for fast scans of multidimensional data
    Inform. Syst. (IF 2.466) Pub Date : 2020-07-25
    Ilhyun Suh; Yon Dohn Chung

    Multidimensional data are exploited in many application areas such as scientific data analysis, business intelligence, and geographic information systems. One of the most frequent operations applied to such multidimensional data is the selection of a subspace of the given multidimensional space, which involves predicate evaluation on multiple dimensions. Existing main-memory data layouts optimized

  • Fragments of bag relational algebra: Expressiveness and certain answers
    Inform. Syst. (IF 2.466) Pub Date : 2020-07-22
    Marco Console; Paolo Guagliardo; Leonid Libkin

    While all relational database systems are based on the bag data model, much of theoretical research still views relations as sets. Recent attempts to provide theoretical foundations for modern data management problems under the bag semantics concentrated on applications that need to deal with incomplete relations, i.e., relations populated by constants and nulls. Our goal is to provide a complete characterization

  • Relevance- and interface-driven clustering for visual information retrieval
    Inform. Syst. (IF 2.466) Pub Date : 2020-07-13
    Mohamed Reda Bouadjenek; Scott Sanner; Yihao Du

    Search results of spatio-temporal data are often displayed on a map, but when the number of matching search results is large, it can be time-consuming to individually examine all results, even when using methods such as filtered search to narrow the content focus. This suggests the need to aggregate results via a clustering method. However, standard unsupervised clustering algorithms like K-means (i)

  • Decentralized data access control over consortium blockchains
    Inform. Syst. (IF 2.466) Pub Date : 2020-07-09
    Yaoliang Chen; Shi Chen; Jiao Liang; Lance Warren Feagan; Weili Han; Sheng Huang; X. Sean Wang

    Blockchain is an emerging data management technology that enables people in a collaborative network to establish trusted connections with the other participants. Recently consortium blockchains have raised interest in a broader blockchain technology discussion. Instead of a fully public, autonomous network, consortium blockchain supports a network where participants can be limited to a subset of users

  • Providing accurate answers to OLAP queries based on standardized moments of data cubes
    Inform. Syst. (IF 2.466) Pub Date : 2020-07-08
    Elaheh Pourabbas

    In this paper, we focus on the problem of providing accurate estimates to a target data cube from sets of source data cubes, which share the same summary measures. We investigate the acyclic and cyclic schemas of data sources and show that the more accurate target data cube can be computed on the basis of third and fourth standardized moments (i.e., skewness and kurtosis, respectively) of the source

  • Processing tweets for cybersecurity threat awareness
    Inform. Syst. (IF 2.466) Pub Date : 2020-07-04
    Fernando Alves; Aurélien Bettini; Pedro M. Ferreira; Alysson Bessani

    Receiving timely and relevant security information is crucial for maintaining a high-security level on an IT infrastructure. This information can be extracted from Open Source Intelligence published daily by users, security organisations, and researchers. In particular, Twitter has become an information hub for obtaining cutting-edge information about many subjects, including cybersecurity. This work

  • Hate speech detection is not as easy as you may think: A closer look at model validation (extended version)
    Inform. Syst. (IF 2.466) Pub Date : 2020-06-30
    Aymé Arango; Jorge Pérez; Barbara Poblete

    Hate speech is an important problem that is seriously affecting the dynamics and usefulness of online social communities. Large scale social platforms are currently investing important resources into automatically detecting and classifying hateful content, without much success. On the other hand, the results reported by state-of-the-art systems indicate that supervised approaches achieve almost perfect

  • A review of topic modeling methods
    Inform. Syst. (IF 2.466) Pub Date : 2020-06-18
    Ike Vayansky; Sathish A.P. Kumar

    Topic modeling is a popular analytical tool for evaluating data. Numerous methods of topic modeling have been developed which consider many kinds of relationships and restrictions within datasets; however, these methods are not frequently employed. Instead, many researchers gravitate to Latent Dirichlet Allocation, which, although flexible and adaptive, is not always suited for modeling more complex data

  • Using a modelling language to describe the quality of life goals of people living with dementia
    Inform. Syst. (IF 2.466) Pub Date : 2020-06-15
    James Lockerbie; Neil Maiden

    Although now well established, our information systems engineering theories and methods are applied only rarely in disciplines beyond systems development. This paper reports the application of the i* goal modelling language to describe the types of and relationships between quality of life goals of people living with dementia. Published social care frameworks to manage and improve the lives of people

  • Comprehending 3D and 4D ontology-driven conceptual models: An empirical study
    Inform. Syst. (IF 2.466) Pub Date : 2020-06-03
    Michaël Verdonck; Frederik Gailly; Sergio de Cesare

    This paper presents an empirical study that investigates the extent to which the pragmatic quality of ontology-driven models is influenced by the choice of a particular ontology, given a certain understanding of that ontology. To this end, we analyzed previous research efforts and distilled three hypotheses based on different metaphysical characteristics. An experiment based on two foundational ontologies

  • Continuous outlier mining of streaming data in flink
    Inform. Syst. (IF 2.466) Pub Date : 2020-05-29
    Theodoros Toliopoulos; Anastasios Gounaris; Kostas Tsichlas; Apostolos Papadopoulos; Sandra Sampaio

    In this work, we focus on distance-based outliers in a metric space, where the status of an entity as to whether it is an outlier is based on the number of other entities in its neighborhood. In recent years, several solutions have tackled the problem of distance-based outliers in data streams, where outliers must be mined continuously as new elements become available. An interesting research problem

  • Scenario-based process querying for compliance, reuse, and standardization
    Inform. Syst. (IF 2.466) Pub Date : 2020-05-27
    Artem Polyvyanyy; Anastasiia Pika; Arthur H.M. ter Hofstede

    Process models constitute valuable artifacts for organizations. A process model formally captures the way an organization works internally and interacts with its customers and partners. Over time, more models may be created as business practices evolve (leading to different versions of models) or an organization expands, e.g., through mergers or acquisitions. It is not uncommon for large organizations

  • Three-dimensional Entity Resolution with JedAI
    Inform. Syst. (IF 2.466) Pub Date : 2020-05-27
    George Papadakis; George Mandilaras; Luca Gagliardelli; Giovanni Simonini; Emmanouil Thanos; George Giannakopoulos; Sonia Bergamaschi; Themis Palpanas; Manolis Koubarakis

    Entity Resolution (ER) is the task of detecting different entity profiles that describe the same real-world objects. To facilitate its execution, we have developed JedAI, an open-source system that puts together a series of state-of-the-art ER techniques that have been proposed and examined independently, targeting parts of the ER end-to-end pipeline. This is a unique approach, as no other ER tool

  • Sports analytics — Evaluation of basketball players and team performance
    Inform. Syst. (IF 2.466) Pub Date : 2020-05-23
    Vangelis Sarlis; Christos Tjortjis

    Given the recent trend in Data Science (DS) and Sports Analytics, an opportunity has arisen for utilizing Machine Learning (ML) and Data Mining (DM) techniques in sports. This paper reviews background and advanced basketball metrics used in National Basketball Association (NBA) and Euroleague games. The purpose of this paper is to benchmark existing performance analytics used in the literature for

  • Scalable alignment of process models and event logs: An approach based on automata and S-components
    Inform. Syst. (IF 2.466) Pub Date : 2020-05-22
    Daniel Reißner; Abel Armas-Cervantes; Raffaele Conforti; Marlon Dumas; Dirk Fahland; Marcello La Rosa

    Given a model of the expected behavior of a business process and given an event log recording its observed behavior, the problem of business process conformance checking is that of identifying and describing the differences between the process model and the event log. A desirable feature of a conformance checking technique is that it should identify a minimal yet complete set of differences. Existing

  • Eras: Improving the quality control in the annotation process for Natural Language Processing tasks
    Inform. Syst. (IF 2.466) Pub Date : 2020-05-21
    Jonatas S. Grosman; Pedro H.T. Furtado; Ariane M.B. Rodrigues; Guilherme G. Schardong; Simone D.J. Barbosa; Hélio C.V. Lopes

    The increasing amount of valuable, unstructured textual information poses a major challenge to extract value from those texts. We need to use NLP (Natural Language Processing) techniques, most of which rely on manually annotating a large corpus of text for its development and evaluation. Creating a large annotated corpus is laborious and requires suitable computational support. There are many annotation

  • DMAKit: A user-friendly web platform for bringing state-of-the-art data analysis techniques to non-specific users
    Inform. Syst. (IF 2.466) Pub Date : 2020-05-16
    David Medina-Ortiz; Sebastián Contreras; Cristofer Quiroz; Juan A. Asenjo; Álvaro Olivera-Nappa

  • Vadalog: A modern architecture for automated reasoning with large knowledge graphs
    Inform. Syst. (IF 2.466) Pub Date : 2020-05-11
    Luigi Bellomarini; Davide Benedetto; Georg Gottlob; Emanuel Sallinger

    The introduction of novel Datalog +/- fragments with good theoretical properties, together with the growing use of enterprise knowledge graphs motivated the development of Vadalog, a knowledge graph management system developed at the University of Oxford. It adopts Warded Datalog +/- as the core of its language for knowledge representation and reasoning, which exhibits a very good tradeoff between

  • Process discovery with context-aware process trees
    Inform. Syst. (IF 2.466) Pub Date : 2020-05-08
    Roee Shraga; Avigdor Gal; Dafna Schumacher; Arik Senderovich; Matthias Weidlich

    Discovery plays a key role in data-driven analysis of business processes. The vast majority of contemporary discovery algorithms aim at the identification of control-flow constructs. The increase in data richness, however, enables discovery that incorporates the context of process execution beyond the control-flow perspective. A “control-flow first” approach, where context data serves for refinement

  • SchemaDecrypt++: Parallel on-line Versioned Schema Inference for Large Semantic Web Data sources
    Inform. Syst. (IF 2.466) Pub Date : 2020-05-06
    Kenza Kellou-Menouer; Zoubida Kedad

    A growing number of linked data sources are published on the Web. They form a single huge data space referred to as the Web of data. These data sources contain both the data and the schema describing them, but the data is not constrained by this schema. Indeed, two instances of the same class may be described by different properties. This flexibility for describing the data eases their evolution, but

  • Explaining data with descriptions
    Inform. Syst. (IF 2.466) Pub Date : 2020-05-04
    Matteo Paganelli; Paolo Sottovia; Antonio Maccioni; Matteo Interlandi; Francesco Guerra

    With the advent of Big Data, it is impossible for a human user to properly inspect and understand data at a glance. In this paper, we introduce the problem of generating data descriptions: a set of compact, readable and insightful formulas of boolean predicates that represents a set of data records. Unfortunately, finding the best description for a dataset is both NP-hard and task-specific. Therefore

  • Recommender systems for smart cities
    Inform. Syst. (IF 2.466) Pub Date : 2020-04-28
    Lara Quijano-Sánchez; Iván Cantador; María E. Cortés-Cediel; Olga Gil

    Among other conceptualizations, smart cities have been defined as functional urban areas articulated by the use of Information and Communication Technologies (ICT) and modern infrastructures to face city problems in efficient and sustainable ways. Within ICT, recommender systems are strong tools that filter relevant information, upgrading the relations between stakeholders in the polity and civil society

  • Evaluation of factors contributing to the failure of information systems in public universities: The case of Iran
    Inform. Syst. (IF 2.466) Pub Date : 2020-04-24
    Siamak Kheybari; Fariba Mahdi Rezaie; S. Ali Naji; Mahsa Javdanmehr; Jafar Rezaei

    In this paper, we evaluate the reasons for the failure of information systems in public universities. To that end, we start by presenting a hierarchical structure of criteria after reviewing related studies, and dividing the criteria into the categories of project management, organizational management, human-related, organizational and technical. To assess the weight of the criteria in the proposed

  • Automatic latent street type discovery from web open data
    Inform. Syst. (IF 2.466) Pub Date : 2020-04-24
    Yihong Zhang; Panote Siriaraya; Yukiko Kawai; Adam Jatowt

    Street categorization is an important topic in urban planning and in various applications such as routing and environment monitoring. Typically streets are classified as commercial, residential, and industrial. However, such broad categorization is insufficient to capture the rich properties a street may possess, and often cannot be used for specific applications. Previous works have proposed several

  • A survey on graph-based methods for similarity searches in metric spaces
    Inform. Syst. (IF 2.466) Pub Date : 2020-02-25
    Larissa C. Shimomura; Rafael Seidi Oyamada; Marcos R. Vieira; Daniel S. Kaster

    Technology development has accelerated the volume growth of complex data, such as images, videos, time series, and georeferenced data. Similarity search is a widely used approach to retrieve complex data, which aims at retrieving similar data according to intrinsic characteristics of the data. Therefore, to facilitate the retrieval of complex data using similarity searches, one needs to organize large

  • Using agile methodologies for adopting COBIT
    Inform. Syst. (IF 2.466) Pub Date : 2020-02-19
    Ana Cláudia Amorim; Miguel Mira da Silva; Rúben Pereira; Margarida Gonçalves

    COBIT 5 is a widely used framework for implementing sound governance of enterprise IT (GEIT). Currently, ISACA’s official implementation solution follows a sequentially ordered process, raising several issues related to lack of commitment from top management and misaligned solutions. Nevertheless, new project life-cycle strategies have emerged along with the agile paradigm for project management

  • Re-ranking via local embeddings: A use case with permutation-based indexing and the nSimplex projection
    Inform. Syst. (IF 2.466) Pub Date : 2020-02-13
    Lucia Vadicamo; Claudio Gennaro; Fabrizio Falchi; Edgar Chávez; Richard Connor; Giuseppe Amato

    Approximate Nearest Neighbor (ANN) search is a prevalent paradigm for searching intrinsically high dimensional objects in large-scale data sets. Recently, the permutation-based approach for ANN has attracted a lot of interest due to its versatility in being used in the more general class of metric spaces. In this approach, the entire database is ranked by a permutation distance to the query. Typically

  • Bitpart: Exact metric search in high(er) dimensions
    Inform. Syst. (IF 2.466) Pub Date : 2020-02-04
    Alan Dearle; Richard Connor

    We define BitPart (Bitwise representations of binary Partitions), a novel exact search mechanism intended for use in high-dimensional spaces. In outline, a fixed set of reference objects is used to define a large set of regions within the original space, and each data item is characterised according to its containment within these regions. In contrast with other mechanisms only a subset of this information

  • A comprehensive analysis of delayed insertions in metric access methods
    Inform. Syst. (IF 2.466) Pub Date : 2020-01-11
    Humberto Razente; Maria Camila N. Barioni; Regis M. Santos Sousa

    Similarity queries are fundamental operations for applications that deal with complex data. This paper presents MIA (Metric Indexing Assisted by auxiliary memory with limited capacity), a new delayed insertion approach that can be employed to create enhanced dynamic metric access methods through short-term memories. We present a comprehensive evaluation of delayed insertion methods for metric access

  • CoPModL: Construction Process Modeling Language and Satisfiability Checking
    Inform. Syst. (IF 2.466) Pub Date : 2019-11-27
    Elisa Marengo; Werner Nutt; Matthias Perktold

    Process modeling has been widely investigated in the literature and several general purpose approaches have been introduced, addressing a variety of domains. However, generality goes to the detriment of the possibility to model details and peculiarities of a particular application domain. As acknowledged by the literature, known approaches predominantly focus on one aspect between control flow and

