Predictive spreadsheet autocompletion with constraints
Machine Learning (IF 7.5), Pub Date: 2019-10-25, DOI: 10.1007/s10994-019-05841-y
Samuel Kolb, Stefano Teso, Anton Dries, Luc De Raedt

Spreadsheets are arguably the most accessible data-analysis tool and are used by millions of people. Although they lie at the core of most business practices, working with spreadsheets can be error-prone, using formulas requires training and, crucially, spreadsheet users do not have access to the state-of-the-art analysis techniques offered by machine learning. To tackle these issues, we introduce the novel task of predictive spreadsheet autocompletion, where the goal is to automatically predict the missing entries in a spreadsheet. This task is highly non-trivial: cells can hold heterogeneous data types and there might be unobserved relationships between their values, such as constraints or probabilistic dependencies. Critically, the exact prediction task itself is not given. We consider a simplified, yet non-trivial, setting and propose a principled probabilistic model to solve it. Our approach combines black-box predictive models specialized for different predictive tasks (e.g., classification, regression) with constraints and formulas detected by a constraint learner, and produces a maximally likely prediction for all target cells that is consistent with the constraints. Overall, our approach brings us one step closer to allowing end users to leverage machine learning in their workflows without writing a single line of code.
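
The idea of choosing the most likely prediction that still satisfies the constraints can be illustrated with a toy sketch. This is not the authors' implementation: the function name `most_likely_consistent`, the cell names, the probabilities, and the brute-force enumeration are all assumptions made for illustration, and the per-cell distributions stand in for the output of the black-box predictors.

```python
from itertools import product
from math import prod


def most_likely_consistent(cell_dists, constraints):
    """Return the highest-probability joint assignment that satisfies every constraint.

    cell_dists:  dict mapping a cell name to {candidate value: probability}
                 (hypothetical stand-in for black-box model predictions).
    constraints: list of functions taking an assignment dict and returning bool
                 (hypothetical stand-in for learned constraints and formulas).
    """
    cells = list(cell_dists)
    best, best_p = None, 0.0
    # Brute-force enumeration of all joint assignments; only viable for tiny examples.
    for values in product(*(cell_dists[c] for c in cells)):
        assignment = dict(zip(cells, values))
        if not all(check(assignment) for check in constraints):
            continue
        # Cells are treated as independent given the predictors (an assumption).
        p = prod(cell_dists[c][v] for c, v in assignment.items())
        if p > best_p:
            best, best_p = assignment, p
    return best, best_p


# Hypothetical data: two missing cells and an observed quantity of 3,
# linked by the formula total = price * quantity.
cell_dists = {
    "price": {10: 0.6, 12: 0.4},
    "total": {36: 0.7, 30: 0.3},
}
constraints = [lambda a: a["total"] == a["price"] * 3]

best, p = most_likely_consistent(cell_dists, constraints)
print(best, p)
```

In this made-up example, picking each cell's most probable value independently would give price 10 and total 36, which violates the formula; the joint search instead settles on the consistent pair with the highest combined probability.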

Updated: 2019-10-25