CROKAGE: effective solution recommendation for programming tasks by leveraging crowd knowledge

da Silva, Rodrigo Fernandes Gomes; Roy, Chanchal K.; Rahman, Mohammad Masudur; Schneider, Kevin A.; Paixão, Klérisson; Dantas, Carlos Eduardo de Carvalho; Maia, Marcelo de Almeida

doi:10.1007/s10664-020-09863-2

Published: 02 September 2020

Empirical Software Engineering, volume 25, pages 4707–4758 (2020)

Rodrigo Fernandes Gomes da Silva¹,
Chanchal K. Roy²,
Mohammad Masudur Rahman²,
Kevin A. Schneider²,
Klérisson Paixão¹,
Carlos Eduardo de Carvalho Dantas¹ &
Marcelo de Almeida Maia¹ (ORCID: orcid.org/0000-0003-3578-1380)

Abstract

Developers often search for relevant code examples on the web for their programming tasks. Unfortunately, they face three major problems. First, they frequently need to read and analyse multiple results from the search engines to obtain a satisfactory solution. Second, the search is impaired by a lexical gap between the query (task description) and the information associated with the solution (e.g., code example). Third, the retrieved solution may not be comprehensible, i.e., the code segment might lack a succinct explanation. To address these three problems, we propose CROKAGE (Crowd Knowledge Answer Generator), a tool that takes the description of a programming task (the query) as input and delivers a comprehensible solution for the task. Our solutions contain not only relevant code examples but also succinct explanations written by human developers. The search for code examples is modeled as an Information Retrieval (IR) problem. We first leverage the crowd knowledge stored in Stack Overflow to retrieve candidate answers for a programming task. For this, we use a fine-tuned IR technique, chosen after comparing the performance of 11 IR techniques. Then we use a multi-factor relevance mechanism to mitigate the lexical gap problem and select the top-quality answers related to the task. Finally, we perform natural language processing on the top-quality answers and, unlike earlier studies, deliver comprehensible solutions containing both code examples and code explanations. We evaluate and compare our approach against ten baselines, including the state of the art. We show that CROKAGE outperforms the ten baselines in suggesting relevant solutions for 902 programming tasks (i.e., queries) in three popular programming languages: Java, Python and PHP. Furthermore, we use 24 programming tasks (queries) to evaluate our solutions with 29 developers and confirm that CROKAGE outperforms the state-of-the-art tool in terms of relevance of the suggested code examples, benefit of the code explanations and the overall solution quality (code + explanation).
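
To make the idea of a multi-factor relevance mechanism more concrete, the sketch below re-ranks candidate answers by a weighted sum of min-max normalized factor scores. It is only an illustration under stated assumptions: the abstract does not list the factors or their weights, so the factor names ("lexical", "semantic"), the weights, and the helper names `Candidate`, `min_max_normalize` and `rerank` are all hypothetical and are not taken from CROKAGE.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    answer_id: int
    factors: dict  # factor name -> raw score (hypothetical factors)


def min_max_normalize(values):
    """Scale raw scores to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rerank(candidates, weights, top_k=3):
    """Return the top_k (score, answer_id) pairs by weighted factor sum."""
    if not candidates:
        return []
    names = list(weights)
    # Normalize each factor across all candidates so scales are comparable.
    normalized = {
        name: min_max_normalize([c.factors[name] for c in candidates])
        for name in names
    }
    scored = []
    for i, cand in enumerate(candidates):
        total = sum(weights[name] * normalized[name][i] for name in names)
        scored.append((total, cand.answer_id))
    # Highest score first; ties broken by answer id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:top_k]


# Illustrative data only: scores and weights are made up.
candidates = [
    Candidate(1, {"lexical": 10.0, "semantic": 0.2}),
    Candidate(2, {"lexical": 5.0, "semantic": 0.9}),
    Candidate(3, {"lexical": 0.0, "semantic": 0.1}),
]
ranking = rerank(candidates, {"lexical": 0.5, "semantic": 0.5})
```

Combining a lexical score with a second, non-lexical score is one common way to soften a lexical gap: an answer that shares few words with the query can still rank highly if the other factor supports it.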

Notes

https://data.stackexchange.com/stackoverflow/query (July 2019)
http://tiny.cc/xh3kbz
https://github.com/muldon/crokage-emse-replication-package
http://isel.ufu.br:9000/
https://archive.org/details/stackexchange - dump published in March 2019
https://bit.ly/1Nt4eMh
https://stackoverflow.com/questions/9416541
https://stackoverflow.com/questions/4917877
The complete list of words is available at https://bit.ly/2Hjv0tW
https://stackoverflow.com/questions/7189370
Although this limitation was explicitly stated on the CROKAGE web page (i.e., http://isel.ufu.br:9000/), a significant number of non-Java queries were found.
CROKAGE search requires the query to have a minimum of one character and a maximum of 70 characters to run.
We adopt the list provided by Stanford: https://bit.ly/1Nt4eMh
Although we use a semi-automatic process to filter out queries not related to Java, several such queries still remained.
We append “site:stackoverflow.com” to the query.
Herein we test the IR techniques using their default parameters. In the case of BM25, the default parameters are k = 1.2 and b = 0.75.
The simplest form of language model, which disregards all conditioning context and estimates each term independently.
Although the title of the Q&A pair alone could represent the query intent, our output is the answer; thus, we concatenate the title with the body text of the answer in order to match the query against the answers of the candidate pairs.
https://bit.ly/3gevm22
Our general conclusions are not statistically confirmed for the PHP language, although they are supported by the four adopted metrics.
Except BIKER, whose behaviour we do not change.
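
The note above on BM25 defaults (k = 1.2, b = 0.75) can be illustrated with a minimal scoring sketch. This is not the implementation used in the paper: the whitespace tokenizer, the function name `bm25_scores`, the sample documents, and the particular smoothed IDF formula, log(1 + (N - n + 0.5) / (n + 0.5)), are assumptions chosen to keep the example self-contained.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k=1.2, b=0.75):
    """Score each document against the query with Okapi BM25.

    k controls term-frequency saturation and b controls document-length
    normalization. Tokenization is naive lowercase whitespace splitting
    (an assumption made for this sketch).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    query_terms = query.lower().split()
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            n = doc_freq[term]
            idf = math.log(1 + (n_docs - n + 0.5) / (n + 0.5))
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf[term] * (k + 1) / (tf[term] + k * length_norm)
        scores.append(score)
    return scores


# Made-up answer texts, for illustration only.
docs = [
    "read a text file line by line",
    "sort a list of integers",
    "parse json string",
]
scores = bm25_scores("read file", docs)
best = max(range(len(docs)), key=scores.__getitem__)
```

With b = 0.75 longer documents are penalized fairly strongly, while k = 1.2 makes repeated occurrences of a term contribute with diminishing returns.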

Acknowledgments

We thank the authors of BIKER for sharing their tool. This research is supported in part by a Canada First Research Excellence Fund (CFREF) grant coordinated by the Global Institute for Food Security (GIFS). We also thank the Brazilian funding agencies CAPES, CNPq and FAPEMIG for supporting this research. Last but not least, we thank the participants who took part in the qualitative evaluation of this work.

Author information

Authors and Affiliations

Federal University of Uberlândia, Uberlândia (MG), Brazil
Rodrigo Fernandes Gomes da Silva, Klérisson Paixão, Carlos Eduardo de Carvalho Dantas & Marcelo de Almeida Maia
University of Saskatchewan, Department of Computer Science, 110 Science Place, S7N 5C9, Saskatoon, SK, Canada
Chanchal K. Roy, Mohammad Masudur Rahman & Kevin A. Schneider

Corresponding author

Correspondence to Marcelo de Almeida Maia.

Additional information

Communicated by: Tim Menzies

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

da Silva, R.F.G., Roy, C.K., Rahman, M.M. et al. CROKAGE: effective solution recommendation for programming tasks by leveraging crowd knowledge. Empir Software Eng 25, 4707–4758 (2020). https://doi.org/10.1007/s10664-020-09863-2

Published: 02 September 2020
Issue Date: November 2020
DOI: https://doi.org/10.1007/s10664-020-09863-2
