scLRTD: A Novel Low Rank Tensor Decomposition Method for Imputing Missing Values in Single-Cell Multi-Omics Sequencing Data
IEEE/ACM Transactions on Computational Biology and Bioinformatics (IF 4.5), Pub Date: 2020-09-22, DOI: 10.1109/tcbb.2020.3025804
Zhijie Ni, Xiaoying Zheng, Xiao Zheng, Xiufen Zou

With the successful application of single-cell sequencing technology, a large amount of single-cell multi-omics sequencing (scMO-seq) data has been generated, enabling researchers to study heterogeneity between individual cells. One prominent problem in single-cell data analysis is the prevalence of dropouts, caused by failures in amplification during the experiments, so effective approaches for imputing the missing values are needed. Unlike general methods that impute a single type of single-cell data, we propose an imputation method called scLRTD, which uses low-rank tensor decomposition based on the nuclear norm to impute scMO-seq data and single-cell RNA-sequencing (scRNA-seq) data from different stages, tissues, or conditions. Four sets of simulated data and two sets of real scRNA-seq data, from mouse embryonic stem cells and hepatocellular carcinoma respectively, are used in numerical experiments, and scLRTD is compared with six other published methods. Error metrics and clustering results demonstrate the effectiveness of the proposed method. Moreover, we clearly identify two cell subpopulations after imputing the real scMO-seq data from hepatocellular carcinoma. Gene Ontology analysis further identifies 7 genes in the bile secretion pathway, which is related to metabolism in hepatocellular carcinoma. Survival analysis using the TCGA database also shows that the two cell subpopulations identified after imputation have distinct survival rates.
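
To make the idea of nuclear-norm-based low-rank tensor imputation concrete, the following is a minimal sketch and not the scLRTD algorithm itself, whose details the abstract does not give. It assumes a generic scheme: unfold the tensor along each mode, apply singular value thresholding (the proximal operator of the nuclear norm) to each unfolding, average the results, and keep the observed entries fixed. The function names (`unfold`, `fold`, `svt`, `impute_low_rank`), the genes × cells × conditions layout, and the values of `tau` and `n_iter` are all illustrative assumptions.

```python
import numpy as np


def unfold(tensor, mode):
    """Matricize a tensor: move `mode` to the front and flatten the other axes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def fold(matrix, mode, shape):
    """Inverse of `unfold` for a tensor of the given shape."""
    moved = (shape[mode],) + tuple(s for i, s in enumerate(shape) if i != mode)
    return np.moveaxis(matrix.reshape(moved), 0, mode)


def svt(matrix, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u * np.maximum(s - tau, 0.0)) @ vt


def impute_low_rank(tensor, observed, tau=1.0, n_iter=100):
    """Fill unobserved entries with an average of thresholded mode unfoldings.

    `observed` is a boolean array, True where `tensor` holds a measured value.
    This is a generic heuristic written for illustration (an assumption),
    not the published scLRTD procedure.
    """
    x = np.where(observed, tensor, 0.0)  # start from a zero-filled tensor
    for _ in range(n_iter):
        # Shrink the singular values of every mode unfolding, then average.
        estimate = sum(
            fold(svt(unfold(x, mode), tau), mode, x.shape)
            for mode in range(x.ndim)
        ) / x.ndim
        # Observed entries are never overwritten; only gaps take the estimate.
        x = np.where(observed, tensor, estimate)
    return x


# Hypothetical toy data: a rank-1 genes x cells x conditions tensor
# with roughly 30% of its entries dropped at random.
rng = np.random.default_rng(0)
genes, cells, conds = rng.uniform(1, 2, 20), rng.uniform(1, 2, 15), rng.uniform(1, 2, 3)
truth = np.einsum("i,j,k->ijk", genes, cells, conds)
observed = rng.random(truth.shape) < 0.7
imputed = impute_low_rank(np.where(observed, truth, np.nan), observed, tau=5.0)
print("error on dropped entries:", np.linalg.norm((imputed - truth)[~observed]))
```

In this sketch `tau` trades off how strongly low rank is enforced against how much the imputed values are shrunk toward zero, so it would need tuning on real data.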

Updated: 2020-09-22