Creating optimal conditions for reproducible data analysis in R with ‘fertile’
Stat (IF 1.7), Pub Date: 2020-11-26, DOI: 10.1002/sta4.332
Audrey M. Bertin, Benjamin S. Baumer

The advancement of scientific knowledge increasingly depends on ensuring that data‐driven research is reproducible: that two people with the same data obtain the same results. However, while the necessity of reproducibility is clear, there are significant behavioral and technical challenges that impede its widespread implementation and no clear consensus on standards of what constitutes reproducibility in published research. We present fertile, an R package that focuses on a series of common mistakes programmers make while conducting data science projects in R, primarily through the RStudio integrated development environment. fertile operates in two modes: proactively, to prevent reproducibility mistakes from happening in the first place, and retroactively, analyzing code that is already written for potential problems. Furthermore, fertile is designed to educate users on why their mistakes are problematic and how to fix them.

Updated: 2020-11-26