
Understanding and improving artifact sharing in software engineering research

Published in: Empirical Software Engineering

Abstract

In recent years, many software engineering researchers have begun to include artifacts alongside their research papers. Ideally, artifacts, including tools, benchmarks, and data, support the dissemination of ideas, provide evidence for research claims, and serve as a starting point for future research. However, in practice, artifacts suffer from a variety of issues that prevent the realization of their full potential. To help the software engineering community realize the full potential of artifacts, we seek to understand the challenges involved in the creation, sharing, and use of artifacts. To that end, we perform a mixed-methods study including a survey of artifacts in software engineering publications, and an online survey of 153 software engineering researchers. By analyzing the perspectives of artifact creators, users, and reviewers, we identify several high-level challenges that affect the quality of artifacts including mismatched expectations between these groups, and a lack of sufficient reward for both creators and reviewers. Using Diffusion of Innovations (DoI) as an analytical framework, we examine how these challenges relate to one another, and build an understanding of the factors that affect the sharing and success of artifacts. Finally, we make recommendations to improve the quality of artifacts based on our results and existing best practices.


Figs. 1–5 (available in the full article)


Notes

  1. https://2020.icse-conferences.org/info/awards [Date Accessed: March 13th 2021]

  2. https://dblp.uni-trier.de [Date Accessed: March 13th 2021]

  3. https://drive.google.com [Date Accessed: March 13th 2021]

  4. https://github.com [Date Accessed: March 13th 2021]

  5. https://dropbox.com [Date Accessed: March 12th 2021]

  6. https://travis-ci.org [Date Accessed: March 13th 2021]

  7. https://github.com/features/actions [Date Accessed: March 13th 2021]

  8. https://figshare.com [Date Accessed: March 13th 2021]

  9. https://www.gitlab.com [Date Accessed: March 13th 2021]

  10. https://www.bitbucket.org [Date Accessed: March 13th 2021]


Author information


Corresponding author

Correspondence to Christopher S. Timperley.

Ethics declarations

Compliance with Ethical Standards

Our study included an online survey of software engineering researchers. We received approval for our survey from Carnegie Mellon University’s Institutional Review Board. We have included our informed consent form as part of the additional materials for this study.

Additional information

Communicated by: Martin Monperrus

Availability of data and material

A replication package for this study, including the author survey instrument and results of the publication survey, is available at https://doi.org/10.5281/zenodo.4737346.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Timperley, C.S., Herckis, L., Le Goues, C. et al. Understanding and improving artifact sharing in software engineering research. Empir Software Eng 26, 67 (2021). https://doi.org/10.1007/s10664-021-09973-5



  • DOI: https://doi.org/10.1007/s10664-021-09973-5
