Predicting amphibian intraspecific diversity with machine learning: Challenges and prospects for integrating traits, geography, and genetic data
Molecular Ecology Resources (IF 7.7), Pub Date: 2020-12-14, DOI: 10.1111/1755-0998.13303
Lisa N Barrow, Emanuel Masiero da Fonseca, Coleen E P Thompson, Bryan C Carstens

The growing availability of genetic data sets, in combination with machine learning frameworks, offers great potential to answer long-standing questions in ecology and evolution. One such question has intrigued population geneticists, biogeographers, and conservation biologists: What factors determine intraspecific genetic diversity? This question is challenging to answer because many factors may influence genetic variation, including life history traits, historical influences, and geography, and the relative importance of these factors varies across taxonomic and geographic scales. Furthermore, interpreting the influence of numerous, potentially correlated variables is difficult with traditional statistical approaches. To address these challenges, we analysed repurposed data using machine learning and investigated predictors of genetic diversity, focusing on Nearctic amphibians as a case study. We aggregated species traits, range characteristics, and >42,000 genetic sequences for 299 species using open-access scripts and various databases. After identifying important predictors of nucleotide diversity with random forest regression, we conducted follow-up analyses to examine the roles of phylogenetic history, geography, and demographic processes on intraspecific diversity. Although life history traits were not important predictors for this data set, we found significant phylogenetic signal in genetic diversity within amphibians. We also found that salamander species at northern latitudes contained low genetic diversity. Data repurposing and machine learning provide valuable tools for detecting patterns with relevance for conservation, but concerted efforts are needed to compile meaningful data sets with greater utility for understanding global biodiversity.
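The response variable named in the abstract, nucleotide diversity, is commonly defined as the average proportion of differing sites across all pairs of aligned sequences sampled from a species. The sketch below illustrates that standard definition only; it is not the authors' pipeline, which the abstract does not describe in detail. The function name `nucleotide_diversity`, the choice to skip gaps and ambiguous bases pair by pair, and the toy sequences are all assumptions made for illustration.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean per-site pairwise difference among aligned DNA sequences.

    Hypothetical helper: sites where either sequence has a gap or an
    ambiguous base are ignored for that pair (an assumption of this sketch).
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same non-zero length")

    valid = set("ACGT")
    total = 0.0   # sum of per-pair difference proportions
    n_pairs = 0   # pairs that shared at least one comparable site

    for a, b in combinations(seqs, 2):
        compared = 0
        diffs = 0
        for x, y in zip(a.upper(), b.upper()):
            if x in valid and y in valid:
                compared += 1
                if x != y:
                    diffs += 1
        if compared:
            total += diffs / compared
            n_pairs += 1

    if n_pairs == 0:
        raise ValueError("no pair of sequences shares a comparable site")
    return total / n_pairs


# Toy alignment (made-up sequences, not data from the study).
toy = ["ACGTACGT", "ACGTACGA", "ACGTTCGA"]
pi = nucleotide_diversity(toy)
print(round(pi, 4))
```

A per-species value computed this way could then serve as the target in a regression on traits and range characteristics, which is the role nucleotide diversity plays in the study's random forest analysis.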

Updated: 2020-12-14