Current journal: ACM Transactions on Database Systems
  • Flexible Skylines: Dominance for Arbitrary Sets of Monotone Functions
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2020-12-10
    Paolo Ciaccia; Davide Martinenghi

    Skyline and ranking queries are two popular, alternative ways of discovering interesting data in large datasets. Skyline queries are simple to specify, as they just return the set of all non-dominated tuples, thereby providing an overall view of potentially interesting results. However, they are not equipped with any means to accommodate user preferences or to control the cardinality of the result

    Updated: 2020-12-10
  • Incremental and Approximate Computations for Accelerating Deep CNN Inference
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2020-12-06
    Supun Nakandala; Kabir Nagrecha; Arun Kumar; Yannis Papakonstantinou

    Deep learning now offers state-of-the-art accuracy for many prediction tasks. A form of deep learning called deep convolutional neural networks (CNNs) are especially popular on image, video, and time series data. Due to its high computational cost, CNN inference is often a bottleneck in analytics tasks on such data. Thus, a lot of work in the computer architecture, systems, and compilers communities

    Updated: 2020-12-07
  • Functional Aggregate Queries with Additive Inequalities
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2020-12-06
    Mahmoud Abo Khamis; Ryan R. Curtin; Benjamin Moseley; Hung Q. Ngo; Xuanlong Nguyen; Dan Olteanu; Maximilian Schleich

    Motivated by fundamental applications in databases and relational machine learning, we formulate and study the problem of answering functional aggregate queries (FAQ) in which some of the input factors are defined by a collection of additive inequalities between variables. We refer to these queries as FAQ-AI for short. To answer FAQ-AI in the Boolean semiring, we define relaxed tree decompositions

    Updated: 2020-12-07
  • MobilityDB: A Mobility Database Based on PostgreSQL and PostGIS
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2020-12-06
    Esteban Zimányi; Mahmoud Sakr; Arthur Lesuisse

    Despite two decades of research in moving object databases and a few research prototypes that have been proposed, there is not yet a mainstream system targeted for industrial use. In this article, we present MobilityDB, a moving object database that extends the type system of PostgreSQL and PostGIS with abstract data types for representing moving object data. The types are fully integrated into the

    Updated: 2020-12-07
  • Editorial
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2020-09-11
    Chris Jermaine

    No abstract available.

    Updated: 2020-09-12
  • Discovering Graph Functional Dependencies
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2020-09-11
    Wenfei Fan; Chunming Hu; Xueli Liu; Ping Lu

    This article studies discovery of Graph Functional Dependencies (GFDs), a class of functional dependencies defined on graphs. We investigate the fixed-parameter tractability of three fundamental problems related to GFD discovery. We show that the implication and satisfiability problems are fixed-parameter tractable, but the validation problem is co-W[1]-hard in general. We introduce notions of reduced

    Updated: 2020-09-12
  • Maintaining Triangle Queries under Updates
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2020-08-26
    Ahmet Kara; Hung Q. Ngo; Milos Nikolic; Dan Olteanu; Haozhe Zhang

    We consider the problem of incrementally maintaining the triangle queries with arbitrary free variables under single-tuple updates to the input relations. We introduce an approach called IVM$^{\epsilon}$ that exhibits a trade-off between the update time, the space, and the delay for the enumeration of the query result, such that the update time ranges from the square root to linear in the database size while the

    Updated: 2020-08-26
  • Synthesis of Incremental Linear Algebra Programs
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2020-08-26
    Amir Shaikhha; Mohammed Elseidy; Stephan Mihaila; Daniel Espino; Christoph Koch

    This article targets the Incremental View Maintenance (IVM) of sophisticated analytics (such as statistical models, machine learning programs, and graph algorithms) expressed as linear algebra programs. We present LAGO, a unified framework for linear algebra that automatically synthesizes efficient incremental trigger programs, thereby freeing the user from error-prone manual derivations, performance

    Updated: 2020-08-26
  • Efficient Discovery of Matching Dependencies
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2020-08-26
    Philipp Schirmer; Thorsten Papenbrock; Ioannis Koumarelas; Felix Naumann

    Matching dependencies (MDs) are data profiling results that are often used for data integration, data cleaning, and entity matching. They are a generalization of functional dependencies (FDs) matching similar rather than same elements. As their discovery is very difficult, existing profiling algorithms find either only small subsets of all MDs or their scope is limited to only small datasets. We focus

    Updated: 2020-08-26
  • Packing R-trees with Space-filling Curves: Theoretical Optimality, Empirical Efficiency, and Bulk-loading Parallelizability
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2020-08-26
    Jianzhong Qi; Yufei Tao; Yanchuan Chang; Rui Zhang

    The massive amount of data and large variety of data distributions in the big data era call for access methods that are efficient in both query processing and index management, and over both practical and worst-case workloads. To address this need, we revisit two classic multidimensional access methods—the R-tree and the space-filling curve. We propose a novel R-tree packing strategy based on space-filling

    Updated: 2020-08-26
  • Succinct Range Filters
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2020-06-21
    Huanchen Zhang; Hyeontaek Lim; Viktor Leis; David G. Andersen; Michael Kaminsky; Kimberly Keeton; Andrew Pavlo

    We present the Succinct Range Filter (SuRF), a fast and compact data structure for approximate membership tests. Unlike traditional Bloom filters, SuRF supports both single-key lookups and common range queries: open-range queries, closed-range queries, and range counts. SuRF is based on a new data structure called the Fast Succinct Trie (FST) that matches the point and range query performance of state-of-the-art

    Updated: 2020-08-18
  • Adaptive Asynchronous Parallelization of Graph Algorithms
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2020-07-05
    Wenfei Fan; Ping Lu; Wenyuan Yu; Jingbo Xu; Qiang Yin; Xiaojian Luo; Jingren Zhou; Ruochun Jin

    This article proposes an Adaptive Asynchronous Parallel (AAP) model for graph computations. As opposed to Bulk Synchronous Parallel (BSP) and Asynchronous Parallel (AP) models, AAP reduces both stragglers and stale computations by dynamically adjusting relative progress of workers. We show that BSP, AP, and Stale Synchronous Parallel model (SSP) are special cases of AAP. Better yet, AAP optimizes parallel

    Updated: 2020-08-18
  • Learning Models over Relational Data Using Sparse Tensors and Functional Dependencies
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2020-06-27
    Mahmoud Abo Khamis; Hung Q. Ngo; Xuanlong Nguyen; Dan Olteanu; Maximilian Schleich

    Integrated solutions for analytics over relational databases are of great practical importance as they avoid the costly repeated loop data scientists have to deal with on a daily basis: select features from data residing in relational databases using feature extraction queries involving joins, projections, and aggregations; export the training dataset defined by such queries; convert this dataset into

    Updated: 2020-08-18
  • On the Language of Nested Tuple Generating Dependencies
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2020-07-13
    Phokion G. Kolaitis; Reinhard Pichler; Emanuel Sallinger; Vadim Savenkov

    During the past 15 years, schema mappings have been extensively used in formalizing and studying such critical data interoperability tasks as data exchange and data integration. Much of the work has focused on GLAV mappings, i.e., schema mappings specified by source-to-target tuple-generating dependencies (s-t tgds), and on schema mappings specified by second-order tgds (SO tgds), which constitute

    Updated: 2020-08-18
  • Catching Numeric Inconsistencies in Graphs
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2020-06-27
    Wenfei Fan; Xueli Liu; Ping Lu; Chao Tian

    Numeric inconsistencies are common in real-life knowledge bases and social networks. To catch such errors, we extend graph functional dependencies with linear arithmetic expressions and built-in comparison predicates, referred to as numeric graph dependencies (NGDs). We study fundamental problems for NGDs. We show that their satisfiability, implication, and validation problems are $\Sigma_2^p$-complete, $\Pi_2^p$-complete

    Updated: 2020-08-18
  • Editorial
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2020-02-17
    Christian S. Jensen

    No abstract available.

    Updated: 2020-02-17
  • Computing Optimal Repairs for Functional Dependencies
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2020-02-17
    Ester Livshits; Benny Kimelfeld; Sudeepa Roy

    We investigate the complexity of computing an optimal repair of an inconsistent database, in the case where integrity constraints are Functional Dependencies (FDs). We focus on two types of repairs: an optimal subset repair (optimal S-repair), which is obtained by a minimum number of tuple deletions, and an optimal update repair (optimal U-repair), which is obtained by a minimum number of value (cell)

    Updated: 2020-02-17
  • A Game-theoretic Approach to Data Interaction
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2020-02-08
    Ben McCamish; Vahid Ghadakchi; Arash Termehchy; Behrouz Touri; Eduardo Cotilla-Sanchez; Liang Huang; Soravit Changpinyo

    As most users do not precisely know the structure and/or the content of databases, their queries do not exactly reflect their information needs. The database management system (DBMS) may interact with users and use their feedback on the returned results to learn the information needs behind their queries. Current query interfaces assume that users do not learn and modify the way they express their

    Updated: 2020-02-08
  • ϵKTELO
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2020-02-08
    Dan Zhang; Ryan McKenna; Ios Kotsogiannis; George Bissias; Michael Hay; Ashwin Machanavajjhala; Gerome Miklau

    The adoption of differential privacy is growing, but the complexity of designing private, efficient, and accurate algorithms is still high. We propose a novel programming framework and system, ϵKTELO, for implementing both existing and new privacy algorithms. For the task of answering linear counting queries, we show that nearly all existing algorithms can be composed from operators, each conforming

    Updated: 2020-02-08
  • Efficient Enumeration Algorithms for Regular Document Spanners
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2020-02-08
    Fernando Florenzano; Cristian Riveros; Martín Ugarte; Stijn Vansummeren; Domagoj Vrgoč

    Regular expressions and automata models with capture variables are core tools in rule-based information extraction. These formalisms, also called regular document spanners, use regular languages to locate the data that a user wants to extract from a text document and then store this data into variables. Since document spanners can easily generate large outputs, it is important to have efficient evaluation

    Updated: 2020-02-08
  • Dichotomies for Evaluating Simple Regular Path Queries
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2019-12-17
    Wim Martens; Tina Trautner

    Regular path queries (RPQs) are a central component of graph databases. We investigate decision and enumeration problems concerning the evaluation of RPQs under several semantics that have recently been considered: arbitrary paths, shortest paths, paths without node repetitions (simple paths), and paths without edge repetitions (trails). Whereas arbitrary and shortest paths can be dealt with efficiently

    Updated: 2019-12-17
  • General Temporally Biased Sampling Schemes for Online Model Management
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2019-12-17
    Brian Hentschel; Peter J. Haas; Yuanyuan Tian

    To maintain the accuracy of supervised learning models in the presence of evolving data streams, we provide temporally biased sampling schemes that weight recent data most heavily, with inclusion probabilities for a given data item decaying over time according to a specified “decay function.” We then periodically retrain the models on the current sample. This approach speeds up the training process

    Updated: 2019-12-17
  • On the Expressive Power of Query Languages for Matrices
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2019-12-17
    Robert Brijder; Floris Geerts; Jan Van den Bussche; Timmy Weerwag

    We investigate the expressive power of MATLANG, a formal language for matrix manipulation based on common matrix operations and linear algebra. The language can be extended with the operation inv for inverting a matrix. In MATLANG + inv, we can compute the transitive closure of directed graphs, whereas we show that this is not possible without inversion. Indeed, we show that the basic language can

    Updated: 2019-12-17
  • ChronicleDB
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2019-12-17
    Marc Seidemann; Nikolaus Glombiewski; Michael Körber; Bernhard Seeger

    Reactive security monitoring, self-driving cars, the Internet of Things (IoT), and many other novel applications require systems for both writing events arriving at very high and fluctuating rates to persistent storage as well as supporting analytical ad hoc queries. As standard database systems are not capable of delivering the required write performance, log-based systems, key-value stores, and other

    Updated: 2019-12-17
  • Design and Evaluation of an RDMA-aware Data Shuffling Operator for Parallel Database Systems
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2019-12-17
    Feilong Liu; Lingyan Yin; Spyros Blanas

    The commoditization of high-performance networking has sparked research interest in the RDMA capability of this hardware. One-sided RDMA primitives, in particular, have generated substantial excitement due to the ability to directly access remote memory from within an application without involving the TCP/IP stack or the remote CPU. This article considers how to leverage RDMA to improve the analytical

    Updated: 2019-12-17
  • Efficient Algorithms for Approximate Single-Source Personalized PageRank Queries
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2019-12-17
    Sibo Wang; Renchi Yang; Runhui Wang; Xiaokui Xiao; Zhewei Wei; Wenqing Lin; Yin Yang; Nan Tang

    Given a graph G, a source node s, and a target node t, the personalized PageRank (PPR) of t with respect to s is the probability that a random walk starting from s terminates at t. An important variant of the PPR query is single-source PPR (SSPPR), which enumerates all nodes in G and returns the top-k nodes with the highest PPR values with respect to a given source s. PPR in general and SSPPR in particular

    Updated: 2019-12-17
  • From a Comprehensive Experimental Survey to a Cost-based Selection Strategy for Lightweight Integer Compression Algorithms
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2019-06-19
    Patrick Damme; Annett Ungethüm; Juliana Hildebrandt; Dirk Habich; Wolfgang Lehner

    Lightweight integer compression algorithms are frequently applied in in-memory database systems to tackle the growing gap between processor speed and main memory bandwidth. In recent years, the vectorization of basic techniques such as delta coding and null suppression has considerably enlarged the corpus of available algorithms. As a result, today there is a large number of algorithms to choose from

    Updated: 2019-06-19
  • Verification of Hierarchical Artifact Systems
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2019-06-19
    Alin Deutsch; Yuliang Li; Victor Vianu

    Data-driven workflows, of which IBM’s Business Artifacts are a prime exponent, have been successfully deployed in practice, adopted in industrial standards, and have spawned a rich body of research in academia, focused primarily on static analysis. The present work represents a significant advance on the problem of artifact verification by considering a much richer and more realistic model than in

    Updated: 2019-06-19
  • Interactive Mapping Specification with Exemplar Tuples
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2019-06-19
    Angela Bonifati; Ugo Comignani; Emmanuel Coquery; Romuald Thion

    While schema mapping specification is a cumbersome task for data curation specialists, it becomes unfeasible for non-expert users, who are unacquainted with the semantics and languages of the involved transformations. In this article, we present an interactive framework for schema mapping specification suited for non-expert users. The underlying key intuition is to leverage a few exemplar tuples to

    Updated: 2019-06-19
  • A Unified Framework for Frequent Sequence Mining with Subsequence Constraints
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2019-06-19
    Kaustubh Beedkar; Rainer Gemulla; Wim Martens

    Frequent sequence mining methods often make use of constraints to control which subsequences should be mined. A variety of such subsequence constraints has been studied in the literature, including length, gap, span, regular-expression, and hierarchy constraints. In this article, we show that many subsequence constraints—including and beyond those considered in the literature—can be unified in a single

    Updated: 2019-06-19
  • Output-Optimal Massively Parallel Algorithms for Similarity Joins
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2019-04-08
    Xiao Hu; Ke Yi; Yufei Tao

    Parallel join algorithms have received much attention in recent years due to the rapid development of massively parallel systems such as MapReduce and Spark. In the database theory community, most efforts have been focused on studying worst-case optimal algorithms. However, the worst-case optimality of these join algorithms relies on the hard instances having very large output sizes. In the case of

    Updated: 2019-04-08
  • A Survey of Spatial Crowdsourcing
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2019-04-08
    Srinivasa Raghavendra Bhuvan Gummidi; Xike Xie; Torben Bach Pedersen

    Widespread use of advanced mobile devices has led to the emergence of a new class of crowdsourcing called spatial crowdsourcing. Spatial crowdsourcing advances the potential of a crowd to perform tasks related to real-world scenarios involving physical locations, which were not feasible with conventional crowdsourcing methods. The main feature of spatial crowdsourcing is the presence of spatial tasks

    Updated: 2019-04-08
  • Inferring Insertion Times and Optimizing Error Penalties in Time-decaying Bloom Filters
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2019-04-08
    Jonathan L. Dautrich; Chinya V. Ravishankar

    Current Bloom Filters tend to ignore Bayesian priors as well as a great deal of useful information they hold, compromising the accuracy of their responses. Incorrect responses cause users to incur penalties that are both application- and item-specific, but current Bloom Filters are typically tuned only for static penalties. Such shortcomings are problematic for all Bloom Filter variants, but especially

    Updated: 2019-04-08
  • Dependencies for Graphs
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2019-04-08
    Wenfei Fan; Ping Lu

    This article proposes a class of dependencies for graphs, referred to as graph entity dependencies (GEDs). A GED is defined as a combination of a graph pattern and an attribute dependency. In a uniform format, GEDs can express graph functional dependencies with constant literals to catch inconsistencies, and keys carrying id literals to identify entities (vertices) in a graph. We revise the chase for

    Updated: 2019-04-08
  • Representations and Optimizations for Embedded Parallel Dataflow Languages
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2019-01-29
    Alexander Alexandrov; Georgi Krastev; Volker Markl

    Parallel dataflow engines such as Apache Hadoop, Apache Spark, and Apache Flink are an established alternative to relational databases for modern data analysis applications. A characteristic of these systems is a scalable programming model based on distributed collections and parallel transformations expressed by means of second-order functions such as map and reduce. Notable examples are Flink’s DataSet

    Updated: 2019-01-29
  • Wander Join and XDB
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2019-01-29
    Feifei Li; Bin Wu; Ke Yi; Zhuoyue Zhao

    Joins are expensive, and online aggregation over joins was proposed to mitigate the cost, which offers users a nice and flexible tradeoff between query efficiency and accuracy in a continuous, online fashion. However, the state-of-the-art approach, in both internal and external memory, is based on ripple join, which is still very expensive and even needs unrealistic assumptions (e.g., tuples in a table

    Updated: 2019-01-29
  • Historic Moments Discovery in Sequence Data
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2019-01-29
    Ran Bai; Wing Kai Hon; Eric Lo; Zhian He; Kenny Zhu

    Many emerging applications are based on finding interesting subsequences from sequence data. Finding “prominent streaks,” a set of the longest contiguous subsequences with values all above (or below) a certain threshold, from sequence data is one of that kind that receives much attention. Motivated from real applications, we observe that prominent streaks alone are not insightful enough but require

    Updated: 2019-01-29
  • Scalable Analytics on Fast Data
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2019-01-29
    Andreas Kipf; Varun Pandey; Jan Böttcher; Lucas Braun; Thomas Neumann; Alfons Kemper

    Today’s streaming applications demand increasingly high event throughput rates and are often subject to strict latency constraints. To allow for more complex workloads, such as window-based aggregations, streaming systems need to support stateful event processing. This introduces new challenges for streaming engines as the state needs to be maintained in a consistent and durable manner and simultaneously

    Updated: 2019-01-29
  • Parallelizing Sequential Graph Computations
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2018-12-16
    Wenfei Fan; Wenyuan Yu; Jingbo Xu; Jingren Zhou; Xiaojian Luo; Qiang Yin; Ping Lu; Yang Cao; Ruiqi Xu

    This article presents GRAPE, a parallel GRAPh Engine for graph computations. GRAPE differs from prior systems in its ability to parallelize existing sequential graph algorithms as a whole, without the need for recasting the entire algorithm into a new model. Underlying GRAPE are a simple programming model and a principled approach based on fixpoint computation that starts with partial evaluation and

    Updated: 2018-12-16
  • MacroBase
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2018-12-16
    Firas Abuzaid; Peter Bailis; Jialin Ding; Edward Gan; Samuel Madden; Deepak Narayanan; Kexin Rong; Sahaana Suri

    As data volumes continue to rise, manual inspection is becoming increasingly untenable. In response, we present MacroBase, a data analytics engine that prioritizes end-user attention in high-volume fast data streams. MacroBase enables efficient, accurate, and modular analyses that highlight and aggregate important and unusual behavior, acting as a search engine for fast data. MacroBase is able to deliver

    Updated: 2018-12-16
  • Learning From Query-Answers
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2018-12-16
    Niccolò Meneghetti; Oliver Kennedy; Wolfgang Gatterbauer

    Tuple-independent and disjoint-independent probabilistic databases (TI- and DI-PDBs) represent uncertain data in a factorized form as a product of independent random variables that represent either tuples (TI-PDBs) or sets of tuples (DI-PDBs). When the user submits a query, the database derives the marginal probabilities of each output-tuple, exploiting the underlying assumptions of statistical independence

    Updated: 2018-12-16
  • Optimal Bloom Filters and Adaptive Merging for LSM-Trees
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2018-12-16
    Niv Dayan; Manos Athanassoulis; Stratos Idreos

    In this article, we show that key-value stores backed by a log-structured merge-tree (LSM-tree) exhibit an intrinsic tradeoff between lookup cost, update cost, and main memory footprint, yet all existing designs expose a suboptimal and difficult to tune tradeoff among these metrics. We pinpoint the problem to the fact that modern key-value stores suboptimally co-tune the merge policy, the buffer size

    Updated: 2018-12-16
  • Dynamic Complexity under Definable Changes
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2018-11-26
    Thomas Schwentick; Nils Vortmeier; Thomas Zeume

    In the setting of dynamic complexity, the goal of a dynamic program is to maintain the result of a fixed query for an input database that is subject to changes, possibly using additional auxiliary relations. In other words, a dynamic program updates a materialized view whenever a base relation is changed. The update of query result and auxiliary relations is specified using first-order logic or, equivalently

    Updated: 2018-11-26
  • Distributed Joins and Data Placement for Minimal Network Traffic
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2018-11-26
    Orestis Polychroniou; Wangda Zhang; Kenneth A. Ross

    Network communication is the slowest component of many operators in distributed parallel databases deployed for large-scale analytics. Whereas considerable work has focused on speeding up databases on modern hardware, communication reduction has received less attention. Existing parallel DBMSs rely on algorithms designed for disks with minor modifications for networks. A more complicated algorithm

    Updated: 2018-11-26
  • A Relational Framework for Classifier Engineering
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2018-11-26
    Benny Kimelfeld; Christopher Ré

    In the design of analytical procedures and machine learning solutions, a critical and time-consuming task is that of feature engineering, for which various recipes and tooling approaches have been developed. In this article, we embark on the establishment of database foundations for feature engineering. We propose a formal framework for classification in the context of a relational database. The goal

    Updated: 2018-11-26
  • Expressive Languages for Querying the Semantic Web
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2018-11-26
    Marcelo Arenas; Georg Gottlob; Andreas Pieris

    The problem of querying RDF data is a central issue for the development of the Semantic Web. The query language SPARQL has become the standard language for querying RDF since its W3C standardization in 2008. However, the 2008 version of this language missed some important functionalities: reasoning capabilities to deal with RDFS and OWL vocabularies, navigational capabilities to exploit the graph structure

    Updated: 2018-11-26
  • K-Regret Queries Using Multiplicative Utility Functions
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2018-09-05
    Jianzhong Qi; Fei Zuo; Hanan Samet; Jia Cheng Yao

    The k-regret query aims to return a size-k subset S of a database D such that, for any query user that selects a data object from this size-k subset S rather than from database D, her regret ratio is minimized. The regret ratio here is modeled by the relative difference in the optimality between the locally optimal object in S and the globally optimal object in D. The optimality of a data object in

    Updated: 2018-09-05
  • Answering FO+MOD Queries under Updates on Bounded Degree Databases
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2018-09-05
    Christoph Berkholz; Jens Keppeler; Nicole Schweikardt

    We investigate the query evaluation problem for fixed queries over fully dynamic databases, where tuples can be inserted or deleted. The task is to design a dynamic algorithm that immediately reports the new result of a fixed query after every database update. We consider queries in first-order logic (FO) and its extension with modulo-counting quantifiers (FO+MOD) and show that they can be efficiently

    Updated: 2018-09-05
  • Lightweight Monitoring of Distributed Streams
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2018-09-05
    Arnon Lazerson; Daniel Keren; Assaf Schuster

    As data becomes dynamic, large, and distributed, there is increasing demand for what have become known as distributed stream algorithms. Since continuously collecting the data to a central server and processing it there is infeasible, a common approach is to define local conditions at the distributed nodes, such that—as long as they are maintained—some desirable global condition holds. Previous methods

    Updated: 2018-09-05
  • Efficient Evaluation and Static Analysis for Well-Designed Pattern Trees with Projection
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2018-09-05
    Pablo Barceló; Markus Kröll; Reinhard Pichler; Sebastian Skritek

    Conjunctive queries (CQs) fail to provide an answer when the pattern described by the query does not exactly match the data. CQs might thus be too restrictive as a querying mechanism when data is semistructured or incomplete. The semantic web therefore provides a formalism—known as (projected) well-designed pattern trees (pWDPTs)—that tackles this problem: pWDPTs allow us to formulate queries that

    Updated: 2018-09-05
  • TriAL
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2018-04-11
    Leonid Libkin; Juan L. Reutter; Adrián Soto; Domagoj Vrgoč

    Navigational queries over RDF data are viewed as one of the main applications of graph query languages, and yet the standard model of graph databases—essentially labeled graphs—is different from the triples-based model of RDF. While encodings of RDF databases into graph data exist, we show that even the most natural ones are bound to lose some functionality when used in conjunction with graph query

    Updated: 2018-04-11
  • Building Efficient Query Engines in a High-Level Language
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2018-04-11
    Amir Shaikhha; Yannis Klonatos; Christoph Koch

    Abstraction without regret refers to the vision of using high-level programming languages for systems development without experiencing a negative impact on performance. A database system designed according to this vision offers both increased productivity and high performance instead of sacrificing the former for the latter as is the case with existing, monolithic implementations that are hard to maintain

    Updated: 2018-04-11
  • Estimating the Impact of Unknown Unknowns on Aggregate Query Results
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2018-04-11
    Yeounoh Chung; Michael Lind Mortensen; Carsten Binnig; Tim Kraska

    It is common practice for data scientists to acquire and integrate disparate data sources to achieve higher quality results. But even with a perfectly cleaned and merged data set, two fundamental questions remain: (1) Is the integrated data set complete? and (2) What is the impact of any unknown (i.e., unobserved) data on query results? In this work, we develop and analyze techniques to estimate the

    Updated: 2018-04-11
  • Bounded Query Rewriting Using Views
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2018-04-11
    Yang Cao; Wenfei Fan; Floris Geerts; Ping Lu

    A query Q in a language L has a bounded rewriting using a set of L-definable views if there exists a query Q′ in L such that given any dataset D, Q(D) can be computed by Q′ that accesses only cached views and a small fraction D_Q of D. We consider datasets D that satisfy a set of access constraints, which are a combination of simple cardinality constraints and associated indices, such that the size

    Updated: 2018-04-11
  • Practical Private Range Search in Depth
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2018-04-11
    Ioannis Demertzis; Stavros Papadopoulos; Odysseas Papapetrou; Antonios Deligiannakis; Minos Garofalakis; Charalampos Papamanthou

    We consider a data owner that outsources its dataset to an untrusted server. The owner wishes to enable the server to answer range queries on a single attribute, without compromising the privacy of the data and the queries. There are several schemes on “practical” private range search (mainly in database venues) that attempt to strike a trade-off between efficiency and security. Nevertheless, these

    Updated: 2018-04-11
  • Declarative Probabilistic Programming with Datalog
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2017-11-13
    Vince Bárány; Balder Ten Cate; Benny Kimelfeld; Dan Olteanu; Zografoula Vagena

    Probabilistic programming languages are used for developing statistical models. They typically consist of two components: a specification of a stochastic process (the prior) and a specification of observations that restrict the probability space to a conditional subspace (the posterior). Use cases of such formalisms include the development of algorithms in machine learning and artificial intelligence

    Updated: 2017-11-13
  • EmptyHeaded
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2017-11-13
    Christopher R. Aberger; Andrew Lamb; Susan Tu; Andres Nötzli; Kunle Olukotun; Christopher Ré

    There are two types of high-performance graph processing engines: low- and high-level engines. Low-level engines (Galois, PowerGraph, Snap) provide optimized data structures and computation models but require users to write low-level imperative code, hence ensuring that efficiency is the burden of the user. In high-level engines, users write in query languages like datalog (SociaLite) or SQL (Grail)

    Updated: 2017-11-13
  • Linear Time Membership in a Class of Regular Expressions with Counting, Interleaving, and Unordered Concatenation
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2017-11-13
    Dario Colazzo; Giorgio Ghelli; Carlo Sartiani

    Regular Expressions (REs) are ubiquitous in database and programming languages. While many applications make use of REs extended with interleaving (shuffle) and unordered concatenation operators, this extension badly affects the complexity of basic operations, and, especially, makes membership NP-hard, which is unacceptable in most practical scenarios. In this article, we study the problem of membership

    Updated: 2017-11-13
  • PrivBayes
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2017-11-13
    Jun Zhang; Graham Cormode; Cecilia M. Procopiuc; Divesh Srivastava; Xiaokui Xiao

    Privacy-preserving data publishing is an important problem that has been the focus of extensive study. The state-of-the-art solution for this problem is differential privacy, which offers a strong degree of privacy protection without making restrictive assumptions about the adversary. Existing techniques using differential privacy, however, cannot effectively handle the publication of high-dimensional

    Updated: 2017-11-13
  • Blazes
    ACM Trans. Database Syst. (IF 2.927) Pub Date : 2017-11-13
    Peter Alvaro; Neil Conway; Joseph M. Hellerstein; David Maier

    Distributed consistency is perhaps the most-discussed topic in distributed systems today. Coordination protocols can ensure consistency, but in practice they cause undesirable performance unless used judiciously. Scalable distributed architectures avoid coordination whenever possible, but under-coordinated systems can exhibit behavioral anomalies under fault, which are often extremely difficult to

    Updated: 2017-11-13
Contents have been reproduced by permission of the publishers.