Characteristics of method extractions in Java: a large scale empirical study

Empirical Software Engineering

Abstract

Extract method is the “Swiss army knife” of refactorings: developers perform method extraction to introduce alternative signatures, decompose long code, and improve testability, among many other reasons. Although the rationales behind method extraction are well explored, little is known about its characteristics. Assessing them can provide a basis for better understanding this important refactoring operation and for improving refactoring tools and techniques based on the actual behavior of developers. In this paper, we assess characteristics of the extract method refactoring. We rely on a state-of-the-art technique to detect method extraction and analyze over 70K instances of this refactoring, mined from 124 software systems. We investigate five aspects of this operation: magnitude, content, transformation, size, and degree. We find that (i) extract method is among the most popular refactorings; (ii) extracted methods are overrepresented in operations related to creation, validation, and setup; (iii) methods that are targets of extractions are 2.2x longer than the average, and they are reduced by one statement after the extraction; and (iv) single method extraction represents most, but not all, of the cases. We conclude by proposing improvements to refactoring detection, suggestion, and automation tools and techniques to support both practitioners and researchers.

Notes

  1. https://refactoring.com/catalog/extractMethod.html

  2. https://martinfowler.com/articles/refactoringRubicon.html

  3. Data collected with the Stack Exchange API: https://data.stackexchange.com

  4. Question ID: 1155947

  5. Question ID: 10289461

  6. Question ID: 2619228

  7. Question ID: 26674797

  8. Question ID: 511211

  9. https://stackoverflow.com/questions/2470653

  10. Question ID: 29257032

  11. Question ID: 4930742

  12. Question ID: 1898645

  13. Question ID: 1247835

  14. Question ID: 19972611

  15. Example from Arduino project: https://goo.gl/aD8n1N

  16. Example from Arduino project: https://goo.gl/CaQWiB

  17. Example from Arduino project: https://goo.gl/yPvj5M

  18. https://github.com/iluwatar/java-design-patterns

  19. https://git-scm.com/docs/git-log#git-log

  20. https://bit.ly/2NsxgyB

  21. The authors excluded the rename operations in their analysis.

  22. Suffixes were much more widespread, with the top 10 prefixes covering only 7% of methods.

  23. We omit the “All Methods” column in Table 6 because it is already presented in Table 5.

  24. https://bit.ly/32lnFzd

  25. https://bit.ly/36EJn4k

  26. https://bit.ly/2pJsogM

  27. https://bit.ly/34vuMXh

  28. https://bit.ly/2oUebxa

  29. https://bit.ly/2rcsqOt

  30. https://bit.ly/2CcVCXY

  31. https://bit.ly/2PWN2Vu

  32. We only count the out-degree in the target methods with respect to the extracted methods, that is, the dashed lines in Fig. 5.

  33. https://goo.gl/qbTHRz

  34. https://goo.gl/otn7dC

  35. https://goo.gl/yeDPLa

  36. https://www.oreilly.com/library/view/refactoring-improving-the/9780134757681

  37. http://jczeus.com/refac_cpp.html

  38. https://refactoring.com/catalog/extractFunction.html

Author information

Corresponding author

Correspondence to Andre Hora.

Additional information

Communicated by: Christoph Treude

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Hora, A., Robbes, R. Characteristics of method extractions in Java: a large scale empirical study. Empir Software Eng 25, 1798–1833 (2020). https://doi.org/10.1007/s10664-020-09809-8

  • DOI: https://doi.org/10.1007/s10664-020-09809-8
