Using Large-Scale Anomaly Detection on Code to Improve Kotlin Compiler
arXiv - CS - Software Engineering. Pub Date: 2020-04-03, DOI: arxiv-2004.01618
Timofey Bryksin, Victor Petukhov, Ilya Alexin, Stanislav Prikhodko, Alexey Shpilman, Vladimir Kovalenko, Nikita Povarov

In this work, we apply anomaly detection to source code and bytecode to facilitate the development of a programming language and its compiler. We define an anomaly as a code fragment that differs from typical code written in a particular programming language. Identifying such fragments benefits both language developers and end users, since anomalies may indicate potential issues with the compiler or with runtime performance. Moreover, anomalies could correspond to problems in language design. For this study, we choose Kotlin as the target programming language. We outline and discuss approaches to obtaining vector representations of source code and bytecode and to detecting anomalies across vectorized code snippets. The paper presents a method that aims to detect two types of anomalies: syntax tree anomalies and so-called compiler-induced anomalies, which arise only in the compiled bytecode. We describe several experiments that employ different combinations of vectorization and anomaly detection techniques, and we discuss the types of detected anomalies and their usefulness for language developers. We demonstrate that the extracted anomalies and the underlying extraction technique provide additional value for language development.
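
The general idea of vectorizing code snippets and then flagging those that sit far from typical code can be illustrated with a minimal sketch. This is not the authors' method: the feature list, the regex tokenizer, the centroid-distance score, and the z-score threshold are all assumptions chosen to keep the example small, and the function names `vectorize`, `anomaly_scores`, and `find_anomalies` are hypothetical.

```python
import math
import re
from collections import Counter

# Assumed toy feature set: counts of a few Kotlin keywords and symbols.
FEATURES = ["fun", "val", "var", "if", "when", "for", "{", "("]


def vectorize(snippet):
    """Map a code snippet to a vector of feature counts (a crude stand-in
    for a real syntax-tree or bytecode representation)."""
    tokens = Counter(re.findall(r"[A-Za-z_]+|[{(]", snippet))
    return [float(tokens[f]) for f in FEATURES]


def anomaly_scores(vectors):
    """Score each vector by its Euclidean distance from the dataset centroid."""
    n = len(vectors)
    dim = len(vectors[0])
    centroid = [sum(v[i] for v in vectors) / n for i in range(dim)]
    return [math.dist(v, centroid) for v in vectors]


def find_anomalies(snippets, threshold=2.0):
    """Return indices of snippets whose score lies more than `threshold`
    standard deviations above the mean score (an assumed cutoff)."""
    scores = anomaly_scores([vectorize(s) for s in snippets])
    mean = sum(scores) / len(scores)
    std = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
    if std == 0:
        return []  # all snippets look alike; nothing stands out
    return [i for i, s in enumerate(scores) if (s - mean) / std > threshold]


if __name__ == "__main__":
    typical = ["fun f(x: Int) { val y = x }"] * 9
    # A function with 20 nested `if` blocks, unlike anything else in the set.
    nested = "fun g(x: Int) { " + "if (x > 0) { " * 20 + "}" * 21
    print(find_anomalies(typical + [nested]))
```

A realistic pipeline would replace the keyword counts with features extracted from parsed syntax trees or compiled bytecode, and the centroid distance with a dedicated outlier detector; the sketch only shows how the two stages fit together.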

Updated: 2020-04-06