Context-Aware Parse Trees
arXiv - CS - Programming Languages Pub Date : 2020-03-24 , DOI: arxiv-2003.11118
Fangke Ye, Shengtian Zhou, Anand Venkat, Ryan Marcus, Paul Petersen, Jesmin Jahan Tithi, Tim Mattson, Tim Kraska, Pradeep Dubey, Vivek Sarkar, Justin Gottschlich

The simplified parse tree (SPT) presented in Aroma, a state-of-the-art code recommendation system, is a tree-structured representation used to infer code semantics by capturing program structure rather than program syntax. This is a departure from the classical abstract syntax tree, which is principally driven by programming language syntax. While we believe a semantics-driven representation is desirable, the specifics of an SPT's construction can impact its performance. We analyze these nuances and present a new tree structure, heavily influenced by Aroma's SPT, called a context-aware parse tree (CAPT). CAPT enhances SPT by providing a richer level of semantic representation. Specifically, CAPT provides additional binding support for language-specific techniques for adding semantically-salient features, and language-agnostic techniques for removing syntactically-present but semantically-irrelevant features. Our research quantitatively demonstrates the value of our proposed semantically-salient features, enabling a specific CAPT configuration to be 39% more accurate than SPT across the 48,610 programs we analyzed.
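
As a rough illustration only, the idea of removing syntactically-present but semantically-irrelevant features can be sketched on a toy parse tree stored as nested Python lists. This is not the construction used by Aroma or CAPT: the token set `IRRELEVANT_TOKENS`, the function name `prune`, and the tree encoding are all assumptions made for this example.

```python
# Hypothetical set of tokens treated as semantically irrelevant in this toy
# example; the abstract does not specify which features CAPT removes.
IRRELEVANT_TOKENS = {"(", ")", "{", "}", ";"}


def prune(tree):
    """Return a copy of a nested-list parse tree without irrelevant leaf tokens."""
    if isinstance(tree, str):
        # Leaves are plain token strings and are returned unchanged.
        return tree
    pruned = []
    for child in tree:
        if isinstance(child, str) and child in IRRELEVANT_TOKENS:
            continue  # drop punctuation-like tokens
        result = prune(child)
        if result != []:
            # Keep only subtrees that still contain at least one token.
            pruned.append(result)
    return pruned


# Toy tree for: if (x > 0) { return x; }
toy_tree = ["if", ["(", ["x", ">", "0"], ")"], ["{", ["return", "x", ";"], "}"]]
print(prune(toy_tree))
```

The pruned tree keeps the nesting that reflects program structure while discarding the delimiters, which is the general flavor of a structure-oriented representation; which features are actually worth removing is the question the paper evaluates.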

Updated: 2020-03-26