Evolutionarily informed machine learning enhances the power of predictive gene-to-phenotype relationships
Nature Communications (IF 14.7), Pub Date: 2021-09-24, DOI: 10.1038/s41467-021-25893-w
Chia-Yi Cheng 1, Ji Huang 1, Grace J. Kim 1, Justin Halim 1, Hung-Jui S. Shih 1, Grace Levinson 1, Seo Hyun Park 1, Ha Young Cho 1, Gloria M. Coruzzi 1, Ying Li 2, 3, Kranthi Varala 2, 3, Jessica Bubert 4, Jennifer Arp 4, Stephen P. Moose 4

Inferring phenotypic outcomes from genomic features is both a promise and a challenge for systems biology. Using gene expression data to predict phenotypic outcomes and functionally validating the genes with predictive power are the two challenges we address in this study. We applied an evolutionarily informed machine learning approach to predict phenotypes based on transcriptome responses shared both within and across species. Specifically, we exploited the phenotypic diversity in nitrogen use efficiency (NUE) and evolutionarily conserved transcriptome responses to nitrogen treatments across Arabidopsis accessions and maize varieties. We demonstrate that using evolutionarily conserved nitrogen-responsive genes is a biologically principled approach to reducing the feature dimensionality in machine learning, one that ultimately improved the predictive power of our gene-to-trait models. Further, we functionally validated seven candidate transcription factors with predictive power for NUE outcomes in Arabidopsis and one in maize. Moreover, applying our evolutionarily informed pipeline to other species, including rice and mouse models, underscores its potential to uncover genes affecting any physiological or clinical trait of interest across biology, agriculture, or medicine.
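
The dimensionality-reduction idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the gene identifiers, the ortholog map, the expression values, and the helper names `conserved_features` and `subset_expression` are all hypothetical, and the abstract does not state which learner the reduced matrix is passed to. The sketch only shows how restricting features to genes that respond to nitrogen in both species shrinks the feature set before any model is trained.

```python
# Illustrative sketch only. Gene IDs, values, and helper names are
# hypothetical and are not taken from the study.


def conserved_features(responsive_a, responsive_b, orthologs):
    """Return species-A genes that are nitrogen-responsive in species A
    and whose ortholog in species B is also nitrogen-responsive."""
    return sorted(
        gene for gene in responsive_a
        if orthologs.get(gene) in responsive_b
    )


def subset_expression(expression, genes):
    """Keep only the selected genes (columns) for each sample (row)."""
    return {
        sample: [values[gene] for gene in genes]
        for sample, values in expression.items()
    }


# Hypothetical nitrogen-responsive gene sets for two species.
arabidopsis_responsive = {"AT_gene1", "AT_gene2", "AT_gene3"}
maize_responsive = {"Zm_geneA", "Zm_geneC"}

# Hypothetical ortholog map from species-A genes to species-B genes.
orthologs = {
    "AT_gene1": "Zm_geneA",
    "AT_gene2": "Zm_geneB",
    "AT_gene4": "Zm_geneC",
}

# Hypothetical expression values: sample -> gene -> expression level.
expression = {
    "accession_1": {"AT_gene1": 5.2, "AT_gene2": 1.1, "AT_gene3": 0.4, "AT_gene4": 2.0},
    "accession_2": {"AT_gene1": 4.8, "AT_gene2": 0.9, "AT_gene3": 0.7, "AT_gene4": 2.3},
}

selected = conserved_features(arabidopsis_responsive, maize_responsive, orthologs)
reduced = subset_expression(expression, selected)

print(selected)
print(reduced)
```

In this toy input only one of four measured genes survives the filter, so each sample is described by one feature instead of four. The reduced matrix would then be the input to whatever gene-to-trait model is chosen.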




Updated: 2021-09-24