Creating optimal conditions for reproducible data analysis in R with ‘fertile’



Stat (IF 1.7), Pub Date: 2020-11-26, DOI: 10.1002/sta4.332
Audrey M. Bertin, Benjamin S. Baumer


The advancement of scientific knowledge increasingly depends on ensuring that data‐driven research is reproducible: that two people with the same data obtain the same results. However, while the necessity of reproducibility is clear, there are significant behavioral and technical challenges that impede its widespread implementation and no clear consensus on standards of what constitutes reproducibility in published research. We present fertile, an R package that focuses on a series of common mistakes programmers make while conducting data science projects in R, primarily through the RStudio integrated development environment. fertile operates in two modes: proactively, to prevent reproducibility mistakes from happening in the first place, and retroactively, analyzing code that is already written for potential problems. Furthermore, fertile is designed to educate users on why their mistakes are problematic and how to fix them.
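The retroactive mode described in the abstract, scanning code that is already written for potential problems, can be pictured with a small sketch. It is written in Python rather than R and does not use or reproduce fertile's API. The two rules it applies (flagging `setwd()` calls and absolute file paths) are assumptions chosen as typical portability mistakes, and the function name `find_portability_issues` is hypothetical.

```python
import re

# Hypothetical rules for illustration only; these are not fertile's actual checks.
# A quoted string that starts with "/", "~/", or a Windows drive letter such as "C:/".
ABSOLUTE_PATH = re.compile(r"""["'](?:/|~/|[A-Za-z]:[\\/])[^"']*["']""")
# A call to R's setwd(), which changes the working directory.
SETWD_CALL = re.compile(r"\bsetwd\s*\(")


def find_portability_issues(r_source: str) -> list[tuple[int, str]]:
    """Return (line_number, message) pairs for lines of R code that look non-portable."""
    issues = []
    for number, line in enumerate(r_source.splitlines(), start=1):
        # Crude comment stripping: drops everything after the first '#',
        # so a '#' inside a string literal is handled incorrectly.
        code = line.split("#", 1)[0]
        if SETWD_CALL.search(code):
            issues.append((number, "setwd() ties the script to one machine's directory layout"))
        if ABSOLUTE_PATH.search(code):
            issues.append((number, "absolute path; prefer a path relative to the project root"))
    return issues


example_script = '''
setwd("/Users/me/project")
data <- read.csv("data/raw.csv")
out <- read.csv("C:/Users/me/Desktop/out.csv")
'''

for number, message in find_portability_issues(example_script):
    print(f"line {number}: {message}")
```

Each message explains why the pattern is a problem and hints at a fix, echoing the educational aim stated in the abstract. A real tool would parse the R code rather than match lines with regular expressions.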


Updated: 2020-11-26
