occCite: Tools for querying and managing large biodiversity occurrence datasets
Ecography (IF 5.4), Pub Date: 2021-06-21, DOI: 10.1111/ecog.05618
Hannah L. Owens (1, 2), Cory Merow (3, 4), Brian S. Maitner (5), Jamie M. Kass (6), Vijay Barve (2, 7), Robert P. Guralnick (2)

The amount of observational and specimen-based biodiversity data available to researchers is increasing exponentially, yet the ability to manage and cite large, complex biodiversity datasets lags behind. This management and citation gap impedes reproducibility for data users and the ability for data publishers to track use and accumulate use citations, ultimately harming the longer-term sustainability of the still-emerging enterprise of research data-sharing. Here we present an R package, occCite (v. 0.4.7), to aid researchers in querying large species occurrence data aggregators (specifically, the Global Biodiversity Information Facility, GBIF, and the Botanical Information and Ecology Network, BIEN), and store metadata such as primary data providers, database accession dates, DOIs, and the taxonomic source used for search terms. occCite also includes tools to summarize and visualize query results and generate citation lists of all data providers and software packages used during the query process. We provide examples of a basic occurrence search and citation workflow as well as an advanced workflow using features for custom optimized searches, visualization, and summary procedures. occCite improves upon existing R packages by uniting data from powerful API-based query packages (rgbif and BIEN) into a unified object-based framework, while maintaining metadata vital to best-practice recommendations for documenting biodiversity analysis workflows. occCite aims to efficiently close the gap in the citation cycle between primary data providers and final research products, allowing researchers to meet dataset documentation standards without sacrificing time and resources to the demands of providing increasing levels of detail on their datasets.

Updated: 2021-08-07