Evaluation of Machine Learning Algorithms in Predicting the Next SQL Query from the Future, ACM Transactions on Database Systems

Evaluation of Machine Learning Algorithms in Predicting the Next SQL Query from the Future
ACM Transactions on Database Systems (IF 2.2), Pub Date: 2021-03-18, DOI: 10.1145/3442338
Venkata Vamsikrishna Meduri, Kanchan Chowdhury, Mohamed Sarwat

Predicting the next SQL query from a user, given her sequence of queries up to the current timestep in an ongoing interaction session with the database, can help in speculative query processing and increase interactivity. While existing machine learning (ML)-based approaches use recommender systems to suggest relevant queries to a user, there has been no exhaustive study on applying temporal predictors to predict the next user-issued query. In this work, we experimentally compare ML algorithms in predicting the immediately following query in an interaction workload, given the current user query or the sequence of queries in a user session thus far. As a part of this, we propose the adaptation of two powerful temporal predictors: (a) Recurrent Neural Networks (RNNs) and (b) a Reinforcement Learning approach called Q-Learning that uses Markov Decision Processes. We represent each query as a comprehensive set of fragment embeddings that captures not only the SQL operators, attributes, and relations but also the arithmetic comparison operators and constants that occur in the query. Our experiments on two real-world datasets show the effectiveness of temporal predictors against the baseline recommender systems in predicting the structural fragments in a query with respect to both quality and time. Besides showing that RNNs can be used to synthesize novel queries, we find that exact Q-Learning outperforms RNNs despite predicting the next query entirely from the historical query logs.
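The two ideas named in the abstract, encoding a query as a set of fragments and predicting the next query with Q-Learning over historical logs, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The fragment vocabulary `VOCAB`, the Jaccard-similarity reward, the assumption that choosing a query moves the session into that query's state, and the values of `alpha`, `gamma`, and `episodes` are all hypothetical choices made for illustration only.

```python
from collections import defaultdict

# Assumed toy vocabulary of query fragments (operators, tables, columns, comparators).
VOCAB = ["SELECT", "GROUP BY", "orders", "customers", "orders.total", ">", "="]

def encode(fragments):
    """Return a 0/1 tuple marking which vocabulary fragments a query contains."""
    present = set(fragments)
    return tuple(1 if f in present else 0 for f in VOCAB)

def jaccard(a, b):
    """Shared fragments divided by fragments present in either bit vector."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 1.0

def train_q_table(sessions, episodes=50, alpha=0.5, gamma=0.3):
    """Tabular Q-learning over logged sessions (illustrative assumptions).

    State  = current query vector.
    Action = candidate next query, drawn from queries seen in the logs.
    Reward = similarity between the candidate and the query logged next.
    Taking an action is assumed to move the session to that query's state.
    """
    actions = sorted({q for s in sessions for q in s})
    q = defaultdict(float)
    for _ in range(episodes):
        for session in sessions:
            for cur, nxt in zip(session, session[1:]):
                for a in actions:
                    reward = jaccard(a, nxt)
                    best_future = max(q[(a, a2)] for a2 in actions)
                    # Standard Q-learning update rule.
                    q[(cur, a)] += alpha * (reward + gamma * best_future - q[(cur, a)])
    return q, actions

def predict_next(q, actions, current):
    """Pick the logged query with the highest learned Q-value for this state."""
    return max(actions, key=lambda a: q[(current, a)])

# Hypothetical session log: three queries issued in order, observed twice.
q1 = encode(["SELECT", "orders"])
q2 = encode(["SELECT", "orders", "orders.total", ">"])
q3 = encode(["SELECT", "GROUP BY", "customers"])
table, candidates = train_q_table([[q1, q2, q3], [q1, q2, q3]])
print(predict_next(table, candidates, q1) == q2)
```

Because the candidates come only from logged queries, a predictor of this kind can never return a query it has not seen, which matches the abstract's remark that Q-Learning predicts entirely from the historical query logs while RNNs can synthesize novel queries.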

Updated: 2021-03-18
