PeacoQC: Peak-based selection of high quality cytometry data
Cytometry Part A (IF 2.5), Pub Date: 2021-09-22, DOI: 10.1002/cyto.a.24501
Annelies Emmaneel 1, 2 , Katrien Quintelier 1, 2, 3 , Dorine Sichien 4, 5 , Paulina Rybakowska 6 , Concepción Marañón 6 , Marta E Alarcón-Riquelme 6, 7 , Gert Van Isterdael 8, 9 , Sofie Van Gassen 1, 2 , Yvan Saeys 1, 2

In cytometry analysis, a large number of markers are measured for thousands or millions of cells, resulting in high-dimensional datasets. During the measurement of these samples, erroneous events can occur, such as clogs, speed changes, or slow uptake of the sample, which can influence the downstream analysis and even lead to false discoveries. As these issues can be difficult to detect manually, an automated approach is recommended. To filter out these erroneous events, we created a novel quality control algorithm, Peak Extraction And Cleaning Oriented Quality Control (PeacoQC), that allows for automated cleaning of cytometry data. The algorithm determines density peaks per channel and then removes low-quality events based on their position in the isolation tree and on their mean absolute deviation distance to these density peaks. To evaluate PeacoQC's cleaning capability, we compared it to three existing quality control algorithms (flowAI, flowClean and flowCut) on a wide variety of datasets. PeacoQC was able to filter out all the different types of anomalies in flow, mass and spectral cytometry data, while each of the other methods struggled with at least one type. In the quantitative comparison, PeacoQC obtained the highest median balanced accuracy and a running time similar to that of the other algorithms, while scaling better to large files. To ensure that the parameters chosen in the PeacoQC algorithm are robust, the cleaning tool was run on 16 public datasets. After inspection, only one sample was found where the parameters should be further optimized. The other 15 datasets were analyzed correctly, indicating a robust parameter choice. Overall, we present a fast and accurate quality control algorithm that outperforms existing tools and ensures high-quality data that can be used for further downstream analysis. An R implementation is available.
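
The deviation-based filtering step mentioned in the abstract can be illustrated with a small toy sketch. This is not the PeacoQC implementation (which is an R package) and it omits the isolation tree entirely. It assumes events are grouped into consecutive bins in acquisition order, uses the per-bin median as a crude stand-in for a density peak, and flags bins whose value lies more than a chosen number of mean absolute deviations from the overall median. The function names `bin_summaries` and `flag_bins`, the bin size, and the threshold of 3 are all hypothetical choices made for illustration.

```python
from statistics import mean, median


def bin_summaries(values, bin_size):
    """Split one channel's events (in acquisition order) into consecutive
    bins and return the median of each bin.

    Assumption: the bin median is only a stand-in for a density peak.
    """
    return [median(values[i:i + bin_size])
            for i in range(0, len(values), bin_size)]


def flag_bins(summaries, threshold=3.0):
    """Return one boolean per bin: True if the bin summary lies more than
    `threshold` mean absolute deviations from the median of all summaries.

    Assumption: `threshold` is an illustrative value, not a PeacoQC default.
    """
    center = median(summaries)
    deviation = mean(abs(s - center) for s in summaries)
    if deviation == 0:
        # All bins agree exactly, so nothing can be flagged.
        return [False] * len(summaries)
    return [abs(s - center) / deviation > threshold for s in summaries]


# Toy channel: a stable signal with one simulated disturbance
# (for example, a clog) affecting events 120 to 139.
signal = [100.0 + (i % 5) for i in range(200)]
signal[120:140] = [160.0 + (i % 5) for i in range(20)]

summaries = bin_summaries(signal, bin_size=20)
flags = flag_bins(summaries)

# Indices of the bins that would be removed in this toy setting.
flagged = [index for index, is_bad in enumerate(flags) if is_bad]
print(flagged)
```

In this toy data only the bin covering the simulated disturbance is flagged. A real analysis would work on peaks detected per channel and would combine this kind of distance criterion with the isolation-tree criterion described in the abstract.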

Updated: 2021-09-22