TypeWriter: Neural Type Prediction with Search-based Validation
arXiv - CS - Software Engineering Pub Date : 2019-12-08 , DOI: arxiv-1912.03768
Michael Pradel, Georgios Gousios, Jason Liu, Satish Chandra

Maintaining large code bases written in dynamically typed languages, such as JavaScript or Python, can be challenging due to the absence of type annotations: simple data compatibility errors proliferate, IDE support is limited, and APIs are hard to comprehend. Recent work attempts to address these issues through either static type inference or probabilistic type prediction. Unfortunately, static type inference for dynamic languages is inherently limited, while probabilistic approaches suffer from imprecision. This paper presents TypeWriter, the first combination of probabilistic type prediction with search-based refinement of predicted types. TypeWriter's predictor learns to infer the return and argument types of functions from partially annotated code bases by combining the natural language properties of code with programming language-level information. To validate predicted types, TypeWriter invokes a gradual type checker with different combinations of the predicted types, while navigating the space of possible type combinations in a feedback-directed manner. We implement the TypeWriter approach for Python and evaluate it on two code corpora: a multi-million line code base at Facebook and a collection of 1,137 popular open-source projects. We show that TypeWriter's type predictor achieves an F1 score of 0.64 (0.79) in the top-1 (top-5) predictions for return types, and 0.57 (0.80) for argument types, which clearly outperforms prior type prediction models. By combining predictions with search-based validation, TypeWriter can fully annotate between 14% and 44% of the files in a randomly selected corpus, while ensuring type correctness. A comparison with a static type inference tool shows that TypeWriter adds many more non-trivial types. TypeWriter currently suggests types to developers at Facebook, and several thousand types have already been accepted with minimal changes.
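The feedback-directed validation step described in the abstract can be illustrated with a minimal sketch. This is not TypeWriter's actual search algorithm, which the abstract does not specify. It is a simple greedy strategy under stated assumptions: each annotation slot has a ranked list of predicted types, and a hypothetical `count_errors` callback stands in for a gradual type checker by returning the number of type errors for a given assignment. The names `greedy_search`, `toy_checker`, and `TRUTH` are illustrative only.

```python
from typing import Callable, Dict, List, Optional

# A candidate assignment maps each annotation slot (e.g. "parse.return")
# to a type name, or to None when the slot is left unannotated.
Assignment = Dict[str, Optional[str]]


def greedy_search(
    predictions: Dict[str, List[str]],
    count_errors: Callable[[Assignment], int],
) -> Assignment:
    """Greedily add predicted types, keeping a candidate only if the
    checker reports no more errors than before (assumed strategy)."""
    current: Assignment = {slot: None for slot in predictions}
    best_errors = count_errors(current)
    for slot, candidates in predictions.items():
        for candidate in candidates:  # ranked, top-1 first
            trial = dict(current)
            trial[slot] = candidate
            errors = count_errors(trial)
            if errors <= best_errors:  # feedback from the checker
                current, best_errors = trial, errors
                break  # keep the highest-ranked type that checks
    return current


# Toy stand-in for a gradual type checker (hypothetical): it knows the
# "true" types and reports one error per wrongly annotated slot.
TRUTH = {"parse.arg0": "str", "parse.return": "int"}


def toy_checker(assignment: Assignment) -> int:
    return sum(
        1
        for slot, ty in assignment.items()
        if ty is not None and ty != TRUTH[slot]
    )


result = greedy_search(
    {"parse.arg0": ["bytes", "str"], "parse.return": ["int", "float"]},
    toy_checker,
)
print(result)
```

In this toy run, the top-1 prediction `bytes` for the argument is rejected because the checker reports an error, and the search falls back to the second-ranked `str`. A slot whose candidates all fail stays unannotated, which mirrors the abstract's goal of adding only types that keep the code type-correct.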

Updated: 2020-03-09