Model-based clustering for flow and mass cytometry data with clinical information.
BMC Bioinformatics (IF 2.9), Pub Date: 2020-09-17, DOI: 10.1186/s12859-020-03671-7
Ko Abe (1), Kodai Minoura (1, 2), Yuka Maeda (3), Hiroyoshi Nishikawa (2, 3), Teppei Shimamura (1)

High-dimensional flow cytometry and mass cytometry allow systemic-level characterization of more than 10 protein profiles at single-cell resolution and provide a much broader landscape in many biological applications, such as disease diagnosis and prediction of clinical outcome. When associating clinical information with cytometry data, traditional approaches require two distinct steps: identifying cell populations, and then applying a statistical test to determine whether the difference between two population proportions is significant. These two-step approaches can lead to information loss and analysis bias. We propose a novel statistical framework, called LAMBDA (Latent Allocation Model with Bayesian Data Analysis), for simultaneous identification of unknown cell populations and discovery of associations between these populations and clinical information. LAMBDA uses separate probabilistic models for flow cytometry data and for mass cytometry data, each designed around the distributional properties of that data type. For mass cytometry data, we use a zero-inflated distribution based on the characteristics of the data. A simulation study confirms the usefulness of this model by evaluating the accuracy of the estimated parameters. We also demonstrate that LAMBDA can identify associations between cell populations and their clinical outcomes by analyzing real data. LAMBDA is implemented in R and is available from GitHub (https://github.com/abikoushi/lambda).
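
To illustrate what a zero-inflated distribution means in general, the following minimal Python sketch mixes a point mass at zero with a continuous component. It is not the LAMBDA model: the abstract does not state which continuous distribution LAMBDA uses, so the exponential component, the function names, and the parameter values here are all assumptions chosen only because they give closed-form estimates.

```python
import math
import random


def sample_zero_inflated(n, pi_zero, rate, rng):
    """Draw n values: exactly 0 with probability pi_zero, else Exponential(rate).

    The exponential component is an illustrative assumption, not LAMBDA's model.
    """
    return [0.0 if rng.random() < pi_zero else rng.expovariate(rate)
            for _ in range(n)]


def fit_zero_inflated(values):
    """Closed-form maximum-likelihood estimates (pi_zero, rate).

    The likelihood factorizes, so pi_zero is the fraction of exact zeros and
    rate is the reciprocal of the mean of the positive values.
    """
    positives = [v for v in values if v > 0.0]
    if not positives:
        raise ValueError("need at least one positive value to estimate the rate")
    pi_hat = 1.0 - len(positives) / len(values)
    rate_hat = len(positives) / sum(positives)
    return pi_hat, rate_hat


def log_likelihood(values, pi_zero, rate):
    """Log-likelihood of the zero-inflated exponential (requires 0 < pi_zero < 1)."""
    total = 0.0
    for v in values:
        if v == 0.0:
            total += math.log(pi_zero)
        else:
            total += math.log(1.0 - pi_zero) + math.log(rate) - rate * v
    return total


# Hypothetical parameter values, used only to exercise the functions.
rng = random.Random(0)
data = sample_zero_inflated(5000, 0.3, 2.0, rng)
pi_hat, rate_hat = fit_zero_inflated(data)
print(f"estimated zero fraction: {pi_hat:.3f}, estimated rate: {rate_hat:.3f}")
```

With enough simulated data, the estimates should land close to the values used for sampling, which is the same kind of check the abstract describes for its simulation study (comparing estimated parameters against known ones).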

Updated: 2020-09-18