scAuto as a comprehensive framework for single-cell chromatin accessibility data analysis
Computers in Biology and Medicine (IF 7.7), Pub Date: 2024-02-29, DOI: 10.1016/j.compbiomed.2024.108230
Meiqin Gong, Yun Yu, Zixuan Wang, Junming Zhang, Xiongyi Wang, Cheng Fu, Yongqing Zhang, Xiaodong Wang

Interpreting single-cell chromatin accessibility data is crucial for understanding the regulation of intercellular heterogeneity. Despite progress in computational methods for analyzing these data, a comprehensive analytical framework and a user-friendly online analysis tool are still lacking. To fill this gap, we developed a pre-trained deep learning-based framework, single-cell auto-correlation transformers (scAuto). Following DNABERT’s pre-training and fine-tuning methodology, scAuto learns a general understanding of DNA sequence grammar through self-supervised pre-training on the unlabeled human genome; it is then transferred to the single-cell chromatin accessibility analysis task and fine-tuned with supervision on scATAC-seq data. We extensively validated scAuto on the Buenrostro2018 dataset, demonstrating its superior performance on chromatin accessibility prediction, single-cell clustering, and data denoising. Based on scAuto, we further developed an interactive web server for single-cell chromatin accessibility data analysis, which integrates tutorial-style interfaces for users with limited programming skills. The platform is accessible at . We expect this work to help analyze single-cell chromatin accessibility data and facilitate the development of precision medicine.
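
As a rough illustration of the DNABERT-style self-supervised setup the abstract refers to, the sketch below tokenizes a DNA sequence into overlapping k-mers and masks a contiguous span of tokens so that a model could be trained to recover them. This is not the scAuto implementation: the function names, the choice of k = 6, the masking rate, and the span length are all assumptions made for illustration, and no model is trained here.

```python
import random


def kmer_tokenize(seq, k=6):
    """Split a DNA sequence into overlapping k-mers with stride 1.

    k=6 is an assumed value; DNABERT-style models use k in the range 3 to 6.
    """
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def mask_tokens(tokens, mask_rate=0.15, span=6, mask_token="[MASK]", seed=0):
    """Mask contiguous spans of tokens for masked-token pre-training.

    Returns (masked_tokens, labels). labels[i] holds the original token at
    masked positions and None elsewhere. mask_rate and span are hypothetical
    defaults, not values taken from the paper.
    """
    rng = random.Random(seed)
    masked = list(tokens)
    labels = [None] * len(tokens)
    # Number of spans needed to cover roughly mask_rate of the tokens.
    n_spans = max(1, int(len(tokens) * mask_rate / span))
    for _ in range(n_spans):
        start = rng.randrange(0, max(1, len(tokens) - span + 1))
        for i in range(start, min(start + span, len(tokens))):
            labels[i] = tokens[i]
            masked[i] = mask_token
    return masked, labels


if __name__ == "__main__":
    tokens = kmer_tokenize("ACGTACGTACGGTTAACCGGATCGATCGTTAGC", k=6)
    masked, labels = mask_tokens(tokens)
    print(masked)
    print(labels)
```

Masking whole spans rather than isolated tokens matters for overlapping k-mers, because a single masked k-mer could otherwise be read off directly from its neighbours.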

Updated: 2024-02-29