Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic Parsing,arXiv - CS - Databases



Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic Parsing
arXiv - CS - Databases, Pub Date: 2020-12-23, DOI: arxiv-2012.12627
Xi Victoria Lin, Richard Socher, Caiming Xiong

We present BRIDGE, a powerful sequential architecture for modeling dependencies between natural language questions and relational databases in cross-DB semantic parsing. BRIDGE represents the question and DB schema in a tagged sequence where a subset of the fields are augmented with cell values mentioned in the question. The hybrid sequence is encoded by BERT with minimal subsequent layers and the text-DB contextualization is realized via the fine-tuned deep attention in BERT. Combined with a pointer-generator decoder with schema-consistency driven search space pruning, BRIDGE attained state-of-the-art performance on popular cross-DB text-to-SQL benchmarks, Spider (71.1% dev, 67.5% test with ensemble model) and WikiSQL (92.6% dev, 91.9% test). Our analysis shows that BRIDGE effectively captures the desired cross-modal dependencies and has the potential to generalize to more text-DB related tasks. Our implementation is available at https://github.com/salesforce/TabularSemanticParsing.
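To make the "tagged sequence" idea concrete, here is a minimal sketch of how a question and a schema might be flattened into one string, with matched cell values attached to their fields. Everything in it is an illustrative assumption rather than the authors' code: the tag tokens `[T]`, `[C]` and `[V]`, the helper names `find_mentioned_values` and `serialize`, the toy schema, and the plain case-insensitive substring match used to decide which cell values the question mentions. The released implementation may tokenize, tag and match differently.

```python
from typing import Dict, List


def find_mentioned_values(question: str, cell_values: List[str]) -> List[str]:
    """Return cell values that occur in the question (case-insensitive substring match).

    Assumption: a simple substring test stands in for whatever matching
    procedure the real system uses.
    """
    q = question.lower()
    return [v for v in cell_values if v.lower() in q]


def serialize(question: str, schema: Dict[str, Dict[str, List[str]]]) -> str:
    """Flatten a question and a schema into one tagged sequence.

    `schema` maps table name -> column name -> sample cell values.
    The tags [T], [C], [V] are hypothetical markers for table, column and value.
    """
    parts = ["[CLS]", question, "[SEP]"]
    for table, columns in schema.items():
        parts += ["[T]", table]
        for column, values in columns.items():
            parts += ["[C]", column]
            # Only fields whose values are mentioned in the question get augmented.
            for value in find_mentioned_values(question, values):
                parts += ["[V]", value]
    parts.append("[SEP]")
    return " ".join(parts)


# Toy database, invented for this example.
schema = {
    "singer": {
        "name": ["Joe Sharp", "Timbaland"],
        "country": ["France", "Netherlands"],
    },
    "concert": {
        "concert_name": ["Auditions", "Super bootcamp"],
        "year": ["2014", "2015"],
    },
}
question = "How many singers are from France?"
print(serialize(question, schema))
```

In this toy case only the `country` field is followed by a value tag, because "France" is the only stored cell value that appears in the question. A sequence of this shape is what an encoder such as BERT would then consume as a single input.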


Updated: 2020-12-24
