Identifying Longevity Associated Genes by Integrating Gene Expression and Curated Annotations
bioRxiv - Genomics. Pub Date: 2020-08-07, DOI: 10.1101/2020.01.31.929232
F. William Townes, Kareem Carr, Jeffrey W. Miller

Aging is a complex process with poorly understood genetic mechanisms. Recent studies have sought to classify genes as pro-longevity or anti-longevity using a variety of machine learning algorithms. However, it is not clear which types of features are best for optimizing classification performance and which algorithms are best suited to this task. Further, performance assessments based on held-out test data are lacking. We systematically compare five popular classification algorithms using gene ontology and gene expression datasets as features to predict the pro-longevity versus anti-longevity status of genes for two model organisms (C. elegans and S. cerevisiae) using the GenAge database as ground truth. We find that elastic net penalized logistic regression performs particularly well at this task. Using elastic net, we make novel predictions of pro- and anti-longevity genes that are not currently in the GenAge database.
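
To make the classification setup in the abstract concrete, the following is a minimal sketch of elastic net penalized logistic regression evaluated on held-out test data. It is not the authors' implementation: the data are synthetic stand-ins for gene features (no GenAge, gene ontology, or expression data are used), and all function names, the penalty settings (`lam`, `alpha`), and the proximal gradient solver are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    # Proximal operator of the L1 penalty: shrink v toward zero by t.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_elastic_net_logistic(X, y, lam=0.05, alpha=0.5, step=0.5, n_iter=300):
    """Proximal gradient descent on
    mean log-loss + lam * (alpha * ||w||_1 + (1 - alpha) / 2 * ||w||_2^2).
    The intercept b is not penalized."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b
        for j in range(p):
            # Gradient step on the smooth part (log-loss + ridge), then L1 prox.
            v = w[j] - step * (grad_w[j] + lam * (1.0 - alpha) * w[j])
            w[j] = soft_threshold(v, step * lam * alpha)
    return w, b


def predict(X, w, b):
    # Label 1 (hypothetically "pro-longevity") when predicted probability >= 0.5.
    return [
        1 if sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) >= 0.5 else 0
        for xi in X
    ]


def accuracy(y_true, y_pred):
    return sum(a == c for a, c in zip(y_true, y_pred)) / len(y_true)


def make_synthetic(n=200, p=10, seed=0):
    # Synthetic stand-in data: only the first two features carry signal.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        xi = [rng.gauss(0.0, 1.0) for _ in range(p)]
        score = 2.0 * xi[0] - 2.0 * xi[1] + rng.gauss(0.0, 0.5)
        X.append(xi)
        y.append(1 if score > 0 else 0)
    return X, y


X, y = make_synthetic()
# Hold out the last 50 examples; they are never seen during fitting.
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

w, b = fit_elastic_net_logistic(X_train, y_train)
test_acc = accuracy(y_test, predict(X_test, w, b))
print("held-out accuracy:", round(test_acc, 3))
print("weights:", [round(wj, 3) for wj in w])
```

The elastic net combines an L1 term, which can set uninformative coefficients exactly to zero, with an L2 term that stabilizes the fit when features are correlated. Reporting accuracy only on the held-out split mirrors the abstract's emphasis on test-set assessment.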

Updated: 2020-08-10