DNA methylation-based predictors of health: applications and statistical considerations
Nature Reviews Genetics (IF 39.1), Pub Date: 2022-03-18, DOI: 10.1038/s41576-022-00465-w
Paul D Yousefi, Matthew Suderman, Ryan Langdon, Oliver Whitehurst, George Davey Smith, Caroline L Relton

DNA methylation data have become a valuable source of information for biomarker development, because, unlike static genetic risk estimates, DNA methylation varies dynamically in relation to diverse exogenous and endogenous factors, including environmental risk factors and complex disease pathology. Reliable methods for genome-wide measurement at scale have led to the proliferation of epigenome-wide association studies and subsequently to the development of DNA methylation-based predictors across a wide range of health-related applications, from the identification of risk factors or exposures, such as age and smoking, to early detection of disease or progression in cancer, cardiovascular and neurological disease. This Review evaluates the progress of existing DNA methylation-based predictors, including the contribution of machine learning techniques, and assesses the uptake of key statistical best practices needed to ensure their reliable performance, such as data-driven feature selection, elimination of data leakage in performance estimates and use of generalizable, adequately powered training samples.




Updated: 2022-03-18