Robust phenotype prediction from gene expression data using differential shrinkage of co-regulated genes.
Scientific Reports (IF 3.8), Pub Date: 2018-01-19, DOI: 10.1038/s41598-018-19635-0
Kourosh Zarringhalam 1 , David Degras 1 , Christoph Brockel 2 , Daniel Ziemek 2

Discovery of robust diagnostic or prognostic biomarkers is key to optimizing therapeutic benefit for select patient cohorts, an idea commonly referred to as precision medicine. Most discovery studies that derive such markers from high-dimensional transcriptomics datasets are weakly powered, with sample sizes in the tens of patients. Highly regularized statistical approaches are therefore essential for making generalizable predictions. At the same time, prior knowledge-driven approaches have been successfully applied to the manual interpretation of high-dimensional transcriptomics datasets. In this work, we assess the impact of combining two orthogonal approaches for the discovery of biomarker signatures, namely (1) well-known lasso-based regression approaches and their more recent derivative, the group lasso, and (2) the discovery of significant upstream regulators in literature-derived biological networks. Our method integrates both approaches in a weighted group-lasso model and differentially weights gene sets based on inferred active regulatory mechanisms. Using nested cross-validation as well as independent clinical datasets, we demonstrate that our approach leads to increased accuracy and generalizable results. We implement our approach in a computationally efficient, user-friendly R package called creNET. The package can be downloaded at https://github.com/kouroshz/creNet and is accompanied by a parsed version of the STRING DB database.
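To make the idea of differentially shrinking gene sets concrete, the following is a minimal sketch of a weighted group-lasso penalty and its proximal (group soft-thresholding) step. It is not taken from creNET: the function names, the group labels, the weight values, and the restriction to non-overlapping groups are all assumptions made for illustration, as is the convention that a gene set judged more relevant receives a smaller weight and is therefore shrunk less.

```python
import math

def weighted_group_penalty(beta, groups, weights, lam):
    """Return lam * sum over groups g of weights[g] * ||beta_g||_2."""
    total = 0.0
    for g, idx in groups.items():
        norm = math.sqrt(sum(beta[i] ** 2 for i in idx))
        total += weights[g] * norm
    return lam * total

def group_soft_threshold(beta, groups, weights, lam, step=1.0):
    """Proximal step for the penalty above (assumes non-overlapping groups).

    Each group's coefficient block is scaled toward zero; a block whose
    Euclidean norm is at most step * lam * weights[g] is set to zero.
    """
    out = list(beta)
    for g, idx in groups.items():
        norm = math.sqrt(sum(beta[i] ** 2 for i in idx))
        thresh = step * lam * weights[g]
        scale = 0.0 if norm <= thresh else 1.0 - thresh / norm
        for i in idx:
            out[i] = scale * beta[i]
    return out

# Hypothetical example: two gene sets, each tied to one upstream regulator.
beta = [3.0, 4.0, 0.3, 0.4]
groups = {"regulator_A": [0, 1], "regulator_B": [2, 3]}
# Assumed weights: regulator_A is treated as more relevant (lighter penalty).
weights = {"regulator_A": 0.5, "regulator_B": 2.0}
lam = 1.0

penalty = weighted_group_penalty(beta, groups, weights, lam)
shrunk = group_soft_threshold(beta, groups, weights, lam)
print(penalty, shrunk)
```

In this toy setting the lightly penalized group keeps its coefficients, only slightly shrunk, while the heavily penalized group is removed as a whole, which is the group-level selection behaviour that distinguishes the group lasso from the ordinary lasso.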

Updated: 2018-01-19