Understanding and improving the quality and reproducibility of Jupyter notebooks, Empirical Software Engineering



Empirical Software Engineering (IF 3.5), Pub Date: 2021-05-08, DOI: 10.1007/s10664-021-09961-9
João Felipe Pimentel, Leonardo Murta, Vanessa Braganholo, Juliana Freire

Jupyter Notebooks have been widely adopted by many different communities, both in science and industry. They support the creation of literate programming documents that combine code, text, and execution results with visualizations and other rich media. The self-documenting aspects and the ability to reproduce results have been touted as significant benefits of notebooks. At the same time, there has been growing criticism that the way in which notebooks are being used leads to unexpected behavior, encourages poor coding practices, and makes it hard to reproduce their results. To better understand good and bad practices used in the development of real notebooks, in prior work we studied 1.4 million notebooks from GitHub. We presented a detailed analysis of the characteristics that impact reproducibility, proposed best practices that can improve reproducibility, and discussed open challenges that require further research and development. In this paper, we extended the analysis in four different ways to validate the hypothesis uncovered in our original study. First, we separated a group of popular notebooks to check whether notebooks that get more attention are of higher quality and more reproducible. Second, we sampled notebooks from the full dataset for an in-depth qualitative analysis of what constitutes the dataset and which features the notebooks have. Third, we conducted a more detailed analysis by isolating library dependencies and testing different execution orders, and we report how these factors impact the reproducibility rates. Finally, we mined association rules from the notebooks and discuss the patterns we discovered, which provide additional insights into notebook reproducibility. Based on our findings and the best practices we proposed, we designed Julynter, a Jupyter Lab extension that identifies potential issues in notebooks and suggests modifications that improve their reproducibility. We evaluate Julynter with a remote user experiment with the goal of assessing Julynter recommendations and usability.
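
As a rough illustration of why execution order matters for reproducibility, the sketch below inspects the execution counters stored in a notebook file. It is not the authors' analysis and not how Julynter works; it assumes the standard nbformat 4 JSON layout (a top-level `cells` list whose code cells carry an `execution_count`), and the function name and the three reported signals are hypothetical choices made for this example.

```python
import json


def execution_order_issues(notebook_json: str) -> dict:
    """Report simple execution-order signals for an nbformat 4 notebook.

    Hypothetical helper for illustration only.
    """
    nb = json.loads(notebook_json)
    # Collect the execution counter of every code cell, top to bottom.
    counts = [
        cell.get("execution_count")
        for cell in nb.get("cells", [])
        if cell.get("cell_type") == "code"
    ]
    executed = [c for c in counts if c is not None]
    return {
        # Code cells that were never run in the saved session.
        "unexecuted_cells": len(counts) - len(executed),
        # A later cell has a smaller counter than an earlier one.
        "out_of_order": any(a > b for a, b in zip(executed, executed[1:])),
        # Some counter between 1 and the maximum is missing, which suggests
        # cells were re-run or deleted after execution.
        "has_gaps": bool(executed) and max(executed) != len(set(executed)),
    }


# A tiny in-memory notebook: the second code cell ran before the first.
sample = json.dumps({
    "nbformat": 4,
    "cells": [
        {"cell_type": "markdown", "source": "# Demo"},
        {"cell_type": "code", "execution_count": 2, "source": "print(x)"},
        {"cell_type": "code", "execution_count": 1, "source": "x = 1"},
        {"cell_type": "code", "execution_count": None, "source": "y = 2"},
    ],
})
report = execution_order_issues(sample)
print(report)
```

A notebook flagged as out of order here would fail with a `NameError` if run top to bottom, because `x` is used before it is defined. A real study or tool would check many more conditions than these three.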


Updated: 2021-05-08
