Machine Learning of Single-Cell Transcriptome Highly Identifies mRNA Signature by Comparing F-Score Selection with DGE Analysis.
Molecular Therapy - Nucleic Acids ( IF 8.8 ) Pub Date : 2020-02-13 , DOI: 10.1016/j.omtn.2020.02.004
Pengfei Liang 1 , Wuritu Yang 1 , Xing Chen 1 , Chunshen Long 1 , Lei Zheng 1 , Hanshuang Li 1 , Yongchun Zuo 1

Human preimplantation development is a complex process involving dramatic changes in transcriptional architecture. Understanding its spatiotemporal progression requires identifying the key genes involved. Although single-cell RNA sequencing (RNA-seq) can provide detailed clustering signatures, identifying the decisive factors remains difficult, and the experiments are costly and time-consuming. Computational methods for identifying genes that effectively mark developmental stages are therefore highly desirable. In this study, we developed a predictor called EmPredictor to identify the developmental stages of human preimplantation embryogenesis. First, we compared F-score feature selection with differential gene expression (DGE) analysis to find stage-specific signatures. Then, by training a support vector machine (SVM), we evaluated four types of signature subsets. A feature subset of 1,881 genes selected by the F-score algorithm gave the best predictive performance, reaching the highest accuracy, 93.3%, in cross-validation. Functional enrichment analysis further showed that the gene set chosen by the feature selection method was associated with more development-related pathways and cell fate determination biomarkers. This suggests that the F-score algorithm should be preferred for detecting key genes in multi-stage data from mammalian early development.
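The F-score ranking step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so the code below assumes one common multi-class form: for each gene, the sum of squared differences between the class means and the overall mean, divided by the sum of the within-class sample variances. The function names `f_scores` and `top_k_features` and the toy expression matrix are hypothetical and are not taken from EmPredictor.

```python
from collections import defaultdict
from statistics import mean, variance


def f_scores(X, y):
    """Return one F-score per column of X (rows = cells, columns = genes).

    Assumed formula (not confirmed by the abstract): between-class spread
    of the class means divided by the summed within-class sample variances.
    """
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)

    scores = []
    for j in range(len(X[0])):
        overall = mean(row[j] for row in X)
        between = 0.0
        within = 0.0
        for rows in rows_by_class.values():
            values = [row[j] for row in rows]
            between += (mean(values) - overall) ** 2
            within += variance(values)  # needs at least 2 cells per class
        # A gene with no within-class spread gets score 0 to avoid dividing by zero.
        scores.append(between / within if within > 0 else 0.0)
    return scores


def top_k_features(X, y, k):
    """Indices of the k genes with the highest F-score, best first."""
    scores = f_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Toy data: gene 0 separates the three stages, gene 1 does not.
X = [
    [0.0, 5.0],
    [1.0, 4.0],
    [10.0, 5.0],
    [11.0, 4.0],
    [20.0, 5.0],
    [21.0, 4.0],
]
y = ["stage_a", "stage_a", "stage_b", "stage_b", "stage_c", "stage_c"]

selected = top_k_features(X, y, 1)
```

In a pipeline like the one the abstract describes, the selected gene indices would then be used to subset the expression matrix before training and cross-validating a classifier such as an SVM; that step is omitted here.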




Updated: 2020-02-13