A hidden Markov modeling approach for identifying tumor subclones in next-generation sequencing studies
Biostatistics (IF 1.8), Pub Date: 2020-04-13, DOI: 10.1093/biostatistics/kxaa013
Hyoyoung Choo-Wosoba, Paul S Albert, Bin Zhu

Allele-specific copy number alteration (ASCNA) analysis identifies copy number abnormalities in tumor cells. Unlike normal cells, tumor cells are heterogeneous: they form a mixture of dominant and minor subclones with distinct copy number profiles. Estimating the clonal proportion and identifying mainclone and subclone genotypes across the genome are important for understanding tumor progression. Several ASCNA tools have recently been developed, but they are limited to identifying subclone regions and do not identify the genotype of the subclones. In this article, we propose subHMM, a hidden Markov model-based approach that estimates both the subclone regions and the region-specific subclone genotype and clonal proportion. We specify a hidden state variable representing the conglomeration of clonal genotype and subclone status. We propose a two-step algorithm for parameter estimation. In the first step, a standard hidden Markov model with this conglomerated state variable is fit. In the second step, region-specific estimates of the clonal proportions are obtained by maximizing region-specific pseudo-likelihoods. We apply subHMM to renal cell carcinoma datasets in The Cancer Genome Atlas. In addition, we conduct simulation studies that show the good performance of the proposed approach. The R source code is available online at https://dceg.cancer.gov/tools/analysis/subhmm.

Keywords: Expectation–Maximization algorithm; Forward–backward algorithm; Somatic copy number alteration; Tumor subclones.
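
The forward-backward algorithm listed among the keywords is the standard way to compute posterior state probabilities in a hidden Markov model, which is the kind of computation that fitting a standard HMM in the first step relies on. As a rough illustration only, the Python sketch below runs a scaled forward-backward pass on a toy two-state model with discrete emissions. The state labels, the probabilities, and the observation coding are all invented for this example. They are not subHMM's state space, emission model, or parameter values, and the authors' actual implementation is the R code at the URL given in the abstract.

```python
import math


def forward_backward(init, trans, emit, obs):
    """Scaled forward-backward pass for a discrete-emission HMM.

    init[s]     : probability of starting in state s
    trans[r][s] : probability of moving from state r to state s
    emit[s][o]  : probability of observing symbol o in state s
    obs         : list of observed symbol indices

    Returns (posteriors, log_likelihood), where posteriors[t][s] is
    P(state at position t = s | all observations).
    """
    n_states = len(init)
    n_obs = len(obs)

    # Forward pass: alpha[t] is normalized to sum to 1 at every position,
    # and the normalizing constants are kept to recover the likelihood.
    alpha, scales = [], []
    for t in range(n_obs):
        if t == 0:
            raw = [init[s] * emit[s][obs[0]] for s in range(n_states)]
        else:
            raw = [
                emit[s][obs[t]]
                * sum(alpha[t - 1][r] * trans[r][s] for r in range(n_states))
                for s in range(n_states)
            ]
        c = sum(raw)
        scales.append(c)
        alpha.append([x / c for x in raw])

    # Backward pass, rescaled with the same constants as the forward pass.
    beta = [[1.0] * n_states for _ in range(n_obs)]
    for t in range(n_obs - 2, -1, -1):
        for s in range(n_states):
            beta[t][s] = sum(
                trans[s][r] * emit[r][obs[t + 1]] * beta[t + 1][r]
                for r in range(n_states)
            ) / scales[t + 1]

    # With this scaling, alpha[t][s] * beta[t][s] is already the posterior.
    posteriors = [
        [alpha[t][s] * beta[t][s] for s in range(n_states)]
        for t in range(n_obs)
    ]
    log_likelihood = sum(math.log(c) for c in scales)
    return posteriors, log_likelihood


# Toy model (hypothetical values): state 0 = "no subclone", state 1 = "subclone".
INIT = [0.9, 0.1]
TRANS = [[0.95, 0.05],
         [0.10, 0.90]]
# Observation symbols 0/1/2 stand for a made-up discretized signal.
EMIT = [[0.7, 0.2, 0.1],
        [0.1, 0.3, 0.6]]
OBS = [0, 0, 1, 2, 2, 2, 1, 0]

posteriors, log_likelihood = forward_backward(INIT, TRANS, EMIT, OBS)
for t, row in enumerate(posteriors):
    print(t, [round(p, 3) for p in row])
print("log-likelihood:", round(log_likelihood, 4))
```

In an Expectation-Maximization fit, posteriors of this kind form the E-step, and the M-step re-estimates the model parameters from them. The sketch stops at the E-step and does not attempt the region-specific pseudo-likelihood maximization described for the second step.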

Updated: 2020-04-17