Proceedings of the National Academy of Sciences of the United States of America (IF 9.412), Pub Date: 2021-04-13, DOI: 10.1073/pnas.2023070118
Kevin E. Wu, Kathryn E. Yost, Howard Y. Chang, James Zou
Simultaneous profiling of multiomic modalities within a single cell is a grand challenge for single-cell biology. While there have been impressive technical innovations demonstrating feasibility—for example, generating paired measurements of single-cell transcriptome (single-cell RNA sequencing [scRNA-seq]) and chromatin accessibility (single-cell assay for transposase-accessible chromatin using sequencing [scATAC-seq])—widespread application of joint profiling is challenging due to its experimental complexity, noise, and cost. Here, we introduce BABEL, a deep learning method that translates between the transcriptome and chromatin profiles of a single cell. Leveraging an interoperable neural network model, BABEL can predict single-cell expression directly from a cell’s scATAC-seq and vice versa after training on relevant data. This makes it possible to computationally synthesize paired multiomic measurements when only one modality is experimentally available. Across several paired single-cell ATAC and gene expression datasets in human and mouse, we validate that BABEL accurately translates between these modalities for individual cells. BABEL also generalizes well to cell types within new biological contexts not seen during training. Starting from scATAC-seq of patient-derived basal cell carcinoma (BCC), BABEL generated single-cell expression that enabled fine-grained classification of complex cell states, despite having never seen BCC data. These predictions are comparable to analyses of experimental BCC scRNA-seq data for diverse cell types related to BABEL’s training data. We further show that BABEL can incorporate additional single-cell data modalities, such as protein epitope profiling, thus enabling translation across chromatin, RNA, and protein. BABEL offers a powerful approach for data exploration and hypothesis generation.
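
The abstract describes translation between modalities but gives no architectural detail, so the following is only a toy sketch of one way such a translator could be organized, not BABEL's actual implementation. It assumes one encoder and one decoder per modality meeting in a shared latent space, so that any encoder can be paired with any decoder. The linear layers, the mean-squared-error loss, the random untrained weights, and every name (`ToyTranslator`, `translate`, `paired_loss`) are hypothetical choices made for illustration.

```python
import random


def make_linear(n_in, n_out, rng):
    """Random weight matrix (n_out rows x n_in columns) for a toy linear layer."""
    return [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def apply_linear(weights, x):
    """Matrix-vector product: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class ToyTranslator:
    """Hypothetical two-encoder, two-decoder model sharing one latent space."""

    def __init__(self, n_genes, n_peaks, n_latent, seed=0):
        rng = random.Random(seed)
        # Assumption: each modality gets its own encoder into the shared space.
        self.encoders = {
            "rna": make_linear(n_genes, n_latent, rng),
            "atac": make_linear(n_peaks, n_latent, rng),
        }
        # Assumption: each modality gets its own decoder out of the shared space.
        self.decoders = {
            "rna": make_linear(n_latent, n_genes, rng),
            "atac": make_linear(n_latent, n_peaks, rng),
        }

    def translate(self, x, source, target):
        """Encode x from the source modality, then decode into the target."""
        latent = apply_linear(self.encoders[source], x)
        return apply_linear(self.decoders[target], latent)


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def paired_loss(model, rna, atac):
    """Sum of errors over all four encoder/decoder pairings for one paired cell."""
    truth = {"rna": rna, "atac": atac}
    return sum(
        mse(model.translate(truth[src], src, tgt), truth[tgt])
        for src in truth
        for tgt in truth
    )


model = ToyTranslator(n_genes=5, n_peaks=8, n_latent=3)
rna = [1.0, 0.0, 2.0, 0.0, 3.0]      # made-up expression values for one cell
atac = [1, 0, 0, 1, 1, 0, 1, 0]      # made-up binary peak accessibility
predicted_rna = model.translate(atac, "atac", "rna")
print(len(predicted_rna), paired_loss(model, rna, atac))
```

In this sketch, training on paired cells would mean minimizing `paired_loss`. Afterwards, `translate(atac, "atac", "rna")` would stand in for the missing modality when only scATAC-seq is available. A further modality such as protein epitope counts could be accommodated by adding one more encoder/decoder pair to the dictionaries.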

BABEL enables cross-modality translation between multiomic profiles at single-cell resolution [Genetics]