CROKAGE: effective solution recommendation for programming tasks by leveraging crowd knowledge
Empirical Software Engineering (IF 3.5) Pub Date: 2020-09-02, DOI: 10.1007/s10664-020-09863-2
Rodrigo Fernandes Gomes da Silva, Chanchal K. Roy, Mohammad Masudur Rahman, Kevin A. Schneider, Klérisson Paixão, Carlos Eduardo de Carvalho Dantas, Marcelo de Almeida Maia

Developers often search the web for relevant code examples for their programming tasks. Unfortunately, they face three major problems. First, they frequently need to read and analyse multiple search engine results to obtain a satisfactory solution. Second, the search is impaired by a lexical gap between the query (task description) and the information associated with the solution (e.g., code example). Third, the retrieved solution may not be comprehensible, i.e., the code segment might lack a succinct explanation. To address these three problems, we propose CROKAGE (Crowd Knowledge Answer Generator), a tool that takes the description of a programming task (the query) as input and delivers a comprehensible solution for the task. Our solutions contain not only relevant code examples but also their succinct explanations written by human developers. The search for code examples is modeled as an Information Retrieval (IR) problem. We first leverage the crowd knowledge stored in Stack Overflow to retrieve the candidate answers for a programming task. For this, we use a fine-tuned IR technique, chosen after comparing 11 IR techniques in terms of performance. Then we use a multi-factor relevance mechanism to mitigate the lexical gap problem and select the top-quality answers related to the task. Finally, we perform natural language processing on the top-quality answers and, unlike earlier studies, deliver comprehensible solutions containing both code examples and code explanations. We evaluate and compare our approach against ten baselines, including the state-of-the-art. We show that CROKAGE outperforms the ten baselines in suggesting relevant solutions for 902 programming tasks (i.e., queries) of three popular programming languages: Java, Python, and PHP. Furthermore, we use 24 programming tasks (queries) to evaluate our solutions with 29 developers and confirm that CROKAGE outperforms the state-of-the-art tool in terms of the relevance of the suggested code examples, the benefit of the code explanations, and the overall solution quality (code + explanation).
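To make the retrieval-and-ranking idea in the abstract concrete, the sketch below (not the authors' implementation) ranks a toy corpus of Stack Overflow answers for a natural-language query by combining a lexical similarity score with a normalized answer-vote factor. The TF-IDF scoring, the factor weights, and the corpus are illustrative assumptions; CROKAGE's actual fine-tuned IR technique and multi-factor relevance mechanism are described in the paper.

```python
# Minimal sketch of IR-style answer ranking, assuming a TF-IDF lexical score
# and a vote-based quality factor; weights and fields are placeholders.
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase word tokens; a stand-in for real preprocessing."""
    return re.findall(r"[a-z0-9]+", text.lower())

def tfidf_cosine(query_tokens, doc_tokens, idf):
    """Cosine similarity between TF-IDF vectors of query and document."""
    q, d = Counter(query_tokens), Counter(doc_tokens)
    dot = sum(q[t] * d[t] * idf.get(t, 0.0) ** 2 for t in q)
    qn = math.sqrt(sum((q[t] * idf.get(t, 0.0)) ** 2 for t in q))
    dn = math.sqrt(sum((d[t] * idf.get(t, 0.0)) ** 2 for t in d))
    return dot / (qn * dn) if qn and dn else 0.0

def rank_answers(query, answers, w_lex=0.7, w_votes=0.3):
    """Rank answers by a weighted sum of lexical similarity and vote score.

    `answers` is a list of dicts with 'body' (text + code) and 'score'
    (Stack Overflow votes); the weights are arbitrary placeholders."""
    docs = [tokenize(a["body"]) for a in answers]
    n = len(docs)
    idf = {t: math.log(n / (1 + sum(t in d for d in docs))) + 1.0
           for d in docs for t in d}
    max_votes = max((a["score"] for a in answers), default=1) or 1
    q_tokens = tokenize(query)
    return sorted(
        answers,
        key=lambda a: (w_lex * tfidf_cosine(q_tokens, tokenize(a["body"]), idf)
                       + w_votes * a["score"] / max_votes),
        reverse=True,
    )

if __name__ == "__main__":
    corpus = [
        {"body": "Use SimpleDateFormat to parse a String into a Date in Java: "
                 "new SimpleDateFormat(\"dd/MM/yyyy\").parse(s)", "score": 42},
        {"body": "To read a file line by line use BufferedReader.readLine()",
         "score": 10},
    ]
    best = rank_answers("how to parse a date from string in java", corpus)[0]
    print(best["body"])
```

In CROKAGE the ranked answers are further processed (e.g., selecting code segments and their accompanying natural-language explanations) before being presented as solutions; this sketch only illustrates the candidate-ranking step.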

Updated: 2020-09-02