An enhanced machine learning tool for cis-eQTL mapping with regularization and confounder adjustments.
Genetic Epidemiology ( IF 1.7 ) Pub Date : 2020-07-22 , DOI: 10.1002/gepi.22341
Kang K Yan 1 , Hongyu Zhao 2 , Joseph T Wu 1 , Herbert Pang 1

Many expression quantitative trait loci (eQTL) studies have been conducted to investigate the biological effects of variants in gene regulation. However, these eQTL studies may suffer from low or moderate statistical power and overly conservative false-discovery rate control. In practice, most algorithms for eQTL identification do not model the joint effects of multiple genetic variants with weak or moderate influence. Here we present a novel machine-learning algorithm, lasso least-squares kernel machine (LSKM-LASSO), that models the association between multiple genetic variants and phenotypic traits simultaneously in the presence of nongenetic and genetic confounding. With a more general and flexible framework for the estimation of genetic confounding, LSKM-LASSO is able to provide a more accurate evaluation of the joint effects of multiple genetic variants. Our simulations demonstrate that our approach outperforms three state-of-the-art alternatives in terms of eQTL identification and phenotype prediction. We then apply our method to genotype and gene expression data from 11 tissues obtained from the Genotype-Tissue Expression project. Our algorithm identified more genes with eQTLs than the other algorithms. By incorporating a regularization term and combining it with a least-squares kernel machine, LSKM-LASSO provides a powerful tool for eQTL mapping and phenotype prediction.
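The abstract does not state the LSKM-LASSO objective, so the following is only a minimal sketch of the general idea it names: an L1 penalty on variant effects combined with a kernel term that absorbs confounding. It is not the authors' implementation. It assumes an additive model y = X beta + h(Z) + noise, an RBF kernel on hypothetical confounder covariates Z, the objective 0.5*||y - X beta - K alpha||^2 + 0.5*lam2*alpha'K alpha + lam1*||beta||_1, and simple alternating updates. All function names, the simulated data, and the penalty values are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(Z, gamma=0.5):
    """Gaussian (RBF) kernel matrix on the rows of Z (assumed kernel choice)."""
    sq = np.sum(Z**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_cd(X, r, lam, beta, n_sweeps=20):
    """Coordinate descent for 0.5*||r - X beta||^2 + lam*||beta||_1 (updates beta in place)."""
    col_sq = np.sum(X**2, axis=0)
    resid = r - X @ beta
    for _ in range(n_sweeps):
        for j in range(X.shape[1]):
            if col_sq[j] == 0.0:
                continue
            resid += X[:, j] * beta[j]          # remove variant j's contribution
            beta[j] = soft_threshold(X[:, j] @ resid, lam) / col_sq[j]
            resid -= X[:, j] * beta[j]          # add the updated contribution back
    return beta

def fit_kernel_lasso(X, K, y, lam1, lam2, n_iter=30):
    """Alternate a kernel ridge step for h = K alpha with a lasso step for beta."""
    n, p = X.shape
    beta = np.zeros(p)
    A = K + lam2 * np.eye(n)
    h = np.zeros(n)
    for _ in range(n_iter):
        alpha = np.linalg.solve(A, y - X @ beta)   # kernel ridge on the partial residual
        h = K @ alpha
        beta = lasso_cd(X, y - h, lam1, beta)      # sparse variant effects given h
    return beta, h

# Simulated toy data (not from the paper): 200 samples, 50 variants, 3 causal.
rng = np.random.default_rng(0)
n, p = 200, 50
G = rng.binomial(2, 0.3, size=(n, p)).astype(float)   # genotypes coded 0/1/2
X = (G - G.mean(axis=0)) / G.std(axis=0)              # standardized genotypes
Z = rng.normal(size=(n, 2))                           # hypothetical confounders
causal = [3, 17, 42]
y = X[:, causal].sum(axis=1) + np.sin(2 * Z[:, 0]) + 0.5 * Z[:, 1] ** 2
y = y + 0.5 * rng.normal(size=n)

beta_hat, h_hat = fit_kernel_lasso(X, rbf_kernel(Z), y, lam1=40.0, lam2=1.0)
selected = np.flatnonzero(beta_hat)
print("selected variant indices:", selected)
```

In this toy setup the penalty lam1 is on the scale of the unnormalized squared loss, so it grows with sample size. In practice both penalties would be chosen by cross-validation or a similar criterion.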

Updated: 2020-07-22