Mantis-ml: Disease-Agnostic Gene Prioritization from High-Throughput Genomic Screens by Stochastic Semi-supervised Learning.
American Journal of Human Genetics (IF 8.1), Pub Date: 2020-05-07, DOI: 10.1016/j.ajhg.2020.03.012
Dimitrios Vitsios 1 , Slavé Petrovski 1

Access to large-scale genomics datasets has increased the utility of hypothesis-free genome-wide analyses. However, gene signals are often insufficiently powered to reach experiment-wide significance, triggering a process of laborious triaging of genomic-association-study results. We introduce mantis-ml, a multi-dimensional, multi-step machine-learning framework that allows objective assessment of the biological relevance of genes to disease studies. Mantis-ml is an automated machine-learning framework that follows a multi-model approach of stochastic semi-supervised learning to rank disease-associated genes through iterative learning sessions on random balanced datasets across the protein-coding exome. When applied to a range of human diseases, including chronic kidney disease (CKD), epilepsy, and amyotrophic lateral sclerosis (ALS), mantis-ml achieved an average area under curve (AUC) prediction performance of 0.81-0.89. Critically, to prove its value as a tool that can be used to interpret exome-wide association studies, we overlapped mantis-ml predictions with data from published cohort-level association studies. We found a statistically significant enrichment of high mantis-ml predictions among the highest-ranked genes from hypothesis-free cohort-level statistics, indicating a substantial improvement over the performance of current state-of-the-art methods and pointing to the capture of true prioritization signals for disease-associated genes. Finally, we introduce a generic mantis-ml score (GMS) trained with over 1,200 features as a generic-disease-likelihood estimator, outperforming published gene-level scores. In addition to our tool, we provide a gene prioritization atlas that includes mantis-ml's predictions across ten disease areas and empowers researchers to interactively navigate through the gene-triaging framework. 
Mantis-ml is an intuitive tool that supports the objective triaging of large-scale genomic discovery studies and enhances our understanding of complex genotype-phenotype associations.
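
The abstract describes ranking genes through repeated learning sessions on random balanced datasets, where a small set of known disease genes serves as positives and the rest of the exome is unlabeled. The following is a minimal sketch of that general idea, not the mantis-ml implementation: the function name `rank_genes`, its parameters, and the toy nearest-centroid scorer are all assumptions made for illustration, and the real tool uses multiple machine-learning models and far more features.

```python
import random
from statistics import mean


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def score(x, pos_c, neg_c):
    """Toy stand-in classifier: value in [0, 1], higher when x is nearer the positive centroid."""
    d_pos, d_neg = sq_dist(x, pos_c), sq_dist(x, neg_c)
    total = d_pos + d_neg
    return d_neg / total if total > 0 else 0.5


def rank_genes(features, seed_genes, n_iterations=50, n_folds=5, rng=None):
    """Hypothetical helper: rank genes by their mean out-of-fold score.

    features   -- dict mapping gene name to a list of numeric features
    seed_genes -- set of known disease genes (the only labeled positives)
    Assumes there are at least as many unlabeled genes as seed genes.
    """
    rng = rng or random.Random(0)
    positives = [g for g in features if g in seed_genes]
    unlabeled = [g for g in features if g not in seed_genes]
    out_of_fold = {g: [] for g in features}

    for _ in range(n_iterations):
        # Random balanced dataset: all positives plus an equal-sized
        # random draw of unlabeled genes treated as provisional negatives.
        negatives = rng.sample(unlabeled, len(positives))
        labeled = [(g, 1) for g in positives] + [(g, 0) for g in negatives]
        rng.shuffle(labeled)

        for k in range(n_folds):
            test = labeled[k::n_folds]
            train = [item for i, item in enumerate(labeled) if i % n_folds != k]
            pos_rows = [features[g] for g, y in train if y == 1]
            neg_rows = [features[g] for g, y in train if y == 0]
            if not pos_rows or not neg_rows:
                continue  # skip a fold that lacks one of the two classes
            pos_c, neg_c = centroid(pos_rows), centroid(neg_rows)
            # Score only held-out genes, so no gene is scored by a model
            # that saw its own (provisional) label.
            for g, _ in test:
                out_of_fold[g].append(score(features[g], pos_c, neg_c))

    # Genes never drawn in any iteration have no score and are omitted.
    mean_scores = {g: mean(s) for g, s in out_of_fold.items() if s}
    return sorted(mean_scores.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `rank_genes(features, seed_genes)` returns `(gene, score)` pairs sorted from most to least disease-like. Because unlabeled genes are only ever scored while held out, an unlabeled gene that resembles the seed genes can still rise to the top of the ranking, which is the behavior the semi-supervised setup is meant to exploit.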

Updated: 2020-05-07