SigEMD: A powerful method for differential gene expression analysis in single-cell RNA sequencing data
Methods (IF 4.8), Pub Date: 2018-08-01, DOI: 10.1016/j.ymeth.2018.04.017
Tianyu Wang, Sheida Nabavi

Differential gene expression analysis is one of the significant efforts in single cell RNA sequencing (scRNAseq) analysis to discover the specific changes in expression levels of individual cell types. Since scRNAseq exhibits multimodality, large amounts of zero counts, and sparsity, it is different from the traditional bulk RNA sequencing (RNAseq) data. The new challenges of scRNAseq data promote the development of new methods for identifying differentially expressed (DE) genes. In this study, we proposed a new method, SigEMD, that combines a data imputation approach, a logistic regression model and a nonparametric method based on the Earth Mover's Distance, to precisely and efficiently identify DE genes in scRNAseq data. The regression model and data imputation are used to reduce the impact of large amounts of zero counts, and the nonparametric method is used to improve the sensitivity of detecting DE genes from multimodal scRNAseq data. By additionally employing gene interaction network information to adjust the final states of DE genes, we further reduce the false positives of calling DE genes. We used simulated datasets and real datasets to evaluate the detection accuracy of the proposed method and to compare its performance with those of other differential expression analysis methods. Results indicate that the proposed method has an overall powerful performance in terms of precision in detection, sensitivity, and specificity.
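The nonparametric step described in the abstract can be illustrated with a minimal sketch: compute the Earth Mover's Distance between one gene's expression values in two cell groups, then estimate significance by permuting the group labels. This is not the authors' SigEMD implementation. It leaves out the imputation, logistic regression, and gene-network adjustment steps, and the function names `emd_1d` and `permutation_pvalue`, the permutation count, and the toy values are all assumptions made for illustration.

```python
import random


def emd_1d(a, b):
    """Earth Mover's Distance between two 1-D samples.

    For one-dimensional data this equals the area between the two
    empirical cumulative distribution functions.
    """
    sa, sb = sorted(a), sorted(b)
    xs = sorted(set(sa) | set(sb))  # all distinct observed values
    i = j = 0
    dist = 0.0
    for k in range(len(xs) - 1):
        x = xs[k]
        # Advance counters so i/len(sa) and j/len(sb) are the CDFs at x.
        while i < len(sa) and sa[i] <= x:
            i += 1
        while j < len(sb) and sb[j] <= x:
            j += 1
        dist += abs(i / len(sa) - j / len(sb)) * (xs[k + 1] - x)
    return dist


def permutation_pvalue(a, b, n_perm=1000, seed=0):
    """Permutation test on group labels (illustrative assumption)."""
    rng = random.Random(seed)
    observed = emd_1d(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if emd_1d(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy expression values for one gene in two hypothetical cell groups.
    group_a = [0.0, 0.0, 0.1, 0.2, 3.1, 3.3, 3.0, 0.0]
    group_b = [0.0, 0.1, 0.0, 0.2, 0.1, 0.0, 0.3, 0.1]
    score, p = permutation_pvalue(group_a, group_b, n_perm=500)
    print(f"EMD = {score:.3f}, permutation p = {p:.4f}")
```

Because the distance compares whole distributions rather than means, it can respond to a bimodal group such as `group_a` even when many cells in both groups have near-zero expression, which is the property the abstract attributes to the nonparametric component.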

Updated: 2018-08-01