ARIC: accurate and robust inference of cell type proportions from bulk gene expression or DNA methylation data
Briefings in Bioinformatics (IF 9.5), Pub Date: 2021-08-17, DOI: 10.1093/bib/bbab362
Wei Zhang, Hanwen Xu, Rong Qiao, Bixi Zhong, Xianglin Zhang, Jin Gu, Xuegong Zhang, Lei Wei, Xiaowo Wang

Quantifying cell proportions, especially those of rare cell types, is of great value for tracking signals associated with particular phenotypes or diseases. Although several methods have been proposed to infer cell proportions from multicomponent bulk data, they are substantially less effective at estimating the proportions of rare cell types, whose estimates are highly sensitive to feature outliers and collinearity. Here we propose a new deconvolution algorithm, ARIC, to estimate cell type proportions from gene expression or DNA methylation data. ARIC employs a novel two-step marker selection strategy: collinear feature elimination based on the component-wise condition number, followed by adaptive removal of outlier markers. This strategy systematically obtains effective markers for weighted $\upsilon$-support vector regression, ensuring robust and precise prediction of rare proportions. We show that ARIC accurately estimates fractions in both DNA methylation and gene expression data from different experiments. We further applied ARIC to survival prediction in ovarian cancer and condition monitoring in chronic kidney disease; the results demonstrate the high accuracy and robustness of ARIC as well as its clinical potential. Taken together, ARIC is a promising tool for deconvolving bulk data in which rare components are of vital importance.
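
The collinearity step described above can be illustrated with a small sketch. This is not the ARIC implementation: it assumes a Skeel-style component-wise condition number, $\| |A^{+}| |A| \|_\infty$, and a simple greedy rule for dropping markers, and it substitutes ordinary least squares for the weighted $\upsilon$-support vector regression. The function names, the `min_markers` parameter, and the synthetic reference matrix are all hypothetical.

```python
import numpy as np

def componentwise_cond(A):
    """Skeel-style component-wise condition number || |A^+| |A| ||_inf (assumed form)."""
    return np.linalg.norm(np.abs(np.linalg.pinv(A)) @ np.abs(A), ord=np.inf)

def drop_collinear_markers(A, min_markers):
    """Greedily drop rows (markers) as long as doing so lowers the condition number."""
    keep = list(range(A.shape[0]))
    best = componentwise_cond(A)
    while len(keep) > min_markers:
        # Condition number obtained by removing each remaining marker in turn.
        trials = [componentwise_cond(A[[k for k in keep if k != r]]) for r in keep]
        i = int(np.argmin(trials))
        if trials[i] >= best:
            break  # no single removal improves conditioning any further
        best = trials[i]
        keep.pop(i)
    return keep, best

def estimate_proportions(A, bulk):
    """Least-squares stand-in for the regression step; clip negatives, renormalize."""
    coef, *_ = np.linalg.lstsq(A, bulk, rcond=None)
    coef = np.clip(coef, 0.0, None)
    return coef / coef.sum()

# Synthetic, noise-free example: 30 markers, 3 cell types, one rare type (2%).
rng = np.random.default_rng(0)
reference = rng.uniform(0.0, 1.0, size=(30, 3))   # markers x cell types
true_props = np.array([0.70, 0.28, 0.02])
bulk = reference @ true_props                     # simulated bulk profile

cond_before = componentwise_cond(reference)
keep, cond_after = drop_collinear_markers(reference, min_markers=10)
est = estimate_proportions(reference[keep], bulk[keep])
print(len(keep), cond_before, cond_after, est)
```

Because the simulated bulk profile is noise-free, any full-rank subset of markers recovers the true proportions; with real data, the choice of markers and the regression weights would matter far more, which is the situation the two-step selection is designed for.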

Updated: 2021-08-17