Integrating Biological Knowledge Into Case-Control Analysis Through Iterated Conditional Modes/Medians Algorithm.
Journal of Computational Biology (IF 1.4), Pub Date: 2020-07-09, DOI: 10.1089/cmb.2019.0319
Vitara Pungpapong 1, Min Zhang 2, Dabao Zhang 2

Logistic regression is an effective tool in case–control analysis. With advances in high-throughput technology, fast and efficient methods for fitting high-dimensional logistic regression have attracted much interest. This article considers an empirical Bayes model for logistic regression. A spike-and-slab prior is used for variable selection, which plays a vital role in building an effective predictive model while keeping the model interpretable. To increase the power of variable selection, we incorporate biological knowledge through an Ising prior. We develop the iterated conditional modes/medians (ICM/M) algorithm to fit the logistic model; it has a computational advantage over Markov chain Monte Carlo (MCMC) algorithms. An implementation of the ICM/M algorithm for both linear and logistic models is provided in the R package icmm, which is freely available on the Comprehensive R Archive Network (CRAN). Simulation studies were carried out to assess the performance of our method, with the lasso and adaptive lasso as benchmarks. Overall, the simulation studies show that ICM/M outperforms the others in terms of the number of false positives and has competitive predictive ability. An application to a real data set from a Parkinson's disease study is also presented for illustration. To identify important variables, our approach offers the flexibility to select variables based on local posterior probabilities while controlling the false discovery rate at a desired level, rather than relying only on regression coefficients.
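
The last point, selecting variables from local posterior probabilities while holding the false discovery rate at a chosen level, can be illustrated with a generic Bayesian FDR thresholding rule. The sketch below is an assumption for illustration, not necessarily the exact rule used in the article or in the icmm package: the function name `select_by_bayesian_fdr`, the argument `post_prob`, and the example probabilities are all hypothetical. It treats one minus each posterior probability as a local false discovery rate and keeps the largest set of variables whose average local false discovery rate stays at or below the target level.

```python
import numpy as np

def select_by_bayesian_fdr(post_prob, level=0.05):
    """Return sorted indices of variables selected at the given FDR level.

    post_prob[j] is a (hypothetical) local posterior probability that
    variable j has a nonzero coefficient.
    """
    post_prob = np.asarray(post_prob, dtype=float)
    local_fdr = 1.0 - post_prob                    # probability that variable j is null
    order = np.argsort(local_fdr, kind="stable")   # most confident variables first
    # Estimated FDR of the top-k set is the mean local fdr of those k variables.
    running_fdr = np.cumsum(local_fdr[order]) / np.arange(1, len(order) + 1)
    passing = np.nonzero(running_fdr <= level)[0]
    if passing.size == 0:
        return np.array([], dtype=int)             # nothing can be selected at this level
    k = passing[-1] + 1                            # largest set still within the level
    return np.sort(order[:k])

# Made-up posterior probabilities for five variables.
selected = select_by_bayesian_fdr([0.99, 0.2, 0.97, 0.6, 0.95], level=0.05)
print(selected)
```

Because the local false discovery rates are sorted in increasing order, the running average never decreases, so the selected set is always a prefix of the ranking. Raising `level` can only enlarge the selection.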

Updated: 2020-07-10