Mapping single-cell data to reference atlases by transfer learning
Nature Biotechnology (IF 46.9), Pub Date: 2021-08-30, DOI: 10.1038/s41587-021-01001-7
Mohammad Lotfollahi 1,2, Mohsen Naghipourfar 1, Malte D Luecken 1, Matin Khajavi 1, Maren Büttner 1, Marco Wagenstetter 1, Žiga Avsec 3, Adam Gayoso 4, Nir Yosef 4,5,6,7, Marta Interlandi 8, Sergei Rybakov 1,9, Alexander V Misharin 10, Fabian J Theis 1,2,9

Large single-cell atlases are now routinely generated to serve as references for analysis of smaller-scale studies. Yet learning from reference data is complicated by batch effects between datasets, limited availability of computational resources and sharing restrictions on raw data. Here we introduce a deep learning strategy for mapping query datasets on top of a reference called single-cell architectural surgery (scArches). scArches uses transfer learning and parameter optimization to enable efficient, decentralized, iterative reference building and contextualization of new datasets with existing references without sharing raw data. Using examples from mouse brain, pancreas, immune and whole-organism atlases, we show that scArches preserves biological state information while removing batch effects, despite using four orders of magnitude fewer parameters than de novo integration. scArches generalizes to multimodal reference mapping, allowing imputation of missing modalities. Finally, scArches retains coronavirus disease 2019 (COVID-19) disease variation when mapping to a healthy reference, enabling the discovery of disease-specific cell states. scArches will facilitate collaborative projects by enabling iterative construction, updating, sharing and efficient use of reference atlases.
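The idea of mapping a query onto a fixed reference while training only a small number of new parameters can be illustrated with a toy example. The sketch below is not the scArches API and not the architecture from the paper; it is a minimal NumPy illustration under stated assumptions: a single linear "encoder" layer with frozen gene weights, one weight row per batch label, and a simulated query whose batch effect is a constant offset in gene space. All names (`W_genes`, `W_batch`, `add_query_batch`, `fine_tune_new_row`) are hypothetical, and the training target (a batch-free embedding) is known here only because the data are simulated.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_hidden, n_ref_batches, n_cells = 50, 8, 3, 200

# Assumption: these weights stand in for a model already trained on the reference.
W_genes = rng.normal(size=(n_genes, n_hidden))        # frozen gene weights
W_batch = rng.normal(size=(n_ref_batches, n_hidden))  # frozen, one row per reference batch


def encode(x, batch_onehot, W_genes, W_batch):
    """Toy conditional layer: gene contribution plus a per-batch offset."""
    return x @ W_genes + batch_onehot @ W_batch


def add_query_batch(W_batch):
    """'Surgery' step: append one zero-initialised row for a new query batch."""
    return np.vstack([W_batch, np.zeros((1, W_batch.shape[1]))])


def fine_tune_new_row(x, target, W_genes, W_batch, lr=0.1, steps=200):
    """Gradient descent on mean squared error, updating only the last row."""
    W_batch = W_batch.copy()
    onehot = np.zeros((x.shape[0], W_batch.shape[0]))
    onehot[:, -1] = 1.0  # every query cell carries the new batch label
    for _ in range(steps):
        resid = encode(x, onehot, W_genes, W_batch) - target
        grad_new_row = 2.0 * resid.mean(axis=0)  # gradient w.r.t. the new row only
        W_batch[-1] -= lr * grad_new_row
    return W_batch


# Simulated query: clean expression plus a constant batch offset in gene space.
x_clean = rng.normal(size=(n_cells, n_genes))
offset = rng.normal(size=n_genes)
x_query = x_clean + offset
target = x_clean @ W_genes  # batch-free embedding (known only in this simulation)

W_expanded = add_query_batch(W_batch)
W_tuned = fine_tune_new_row(x_query, target, W_genes, W_expanded)

n_trainable = W_tuned[-1].size
n_frozen = W_genes.size + W_batch.size
```

In this toy setting the new row converges to the value that cancels the simulated batch offset, while `W_genes` and the reference rows of `W_batch` are never modified, so only the new row would need to be shared to extend the reference. A real model would train against its own reconstruction objective rather than a known target.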




Updated: 2021-08-30