Biologically relevant transfer learning improves transcription factor binding prediction
Genome Biology (IF 12.3). Pub Date: 2021-09-27. DOI: 10.1186/s13059-021-02499-5
Gherman Novakovsky 1,2, Manu Saraswat 1,2, Oriol Fornes 1,2, Sara Mostafavi 1,2,3,4, Wyeth W. Wasserman 1,2
Deep learning has proven to be a powerful technique for transcription factor (TF) binding prediction but requires large training datasets. Transfer learning can reduce the amount of data required for deep learning, while improving overall model performance, compared to training a separate model for each new task. We assess a transfer learning strategy for TF binding prediction consisting of a pre-training step, wherein we train a multi-task model with multiple TFs, and a fine-tuning step, wherein we initialize single-task models for individual TFs with the weights learned by the multi-task model, after which the single-task models are trained at a lower learning rate. We corroborate that transfer learning improves model performance, especially if in the pre-training step the multi-task model is trained with biologically relevant TFs. We show the effectiveness of transfer learning for TFs with ~ 500 ChIP-seq peak regions. Using model interpretation techniques, we demonstrate that the features learned in the pre-training step are refined in the fine-tuning step to resemble the binding motif of the target TF (i.e., the recipient of transfer learning in the fine-tuning step). Moreover, pre-training with biologically relevant TFs allows single-task models in the fine-tuning step to learn useful features other than the motif of the target TF. Our results confirm that transfer learning is a powerful technique for TF binding prediction.
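The two-step strategy described above can be illustrated with a minimal, self-contained sketch. This is not the authors' implementation (the paper uses multi-task deep networks on ChIP-seq data); here a toy logistic-regression "model" on one-hot-encoded DNA stands in for the network, a single synthetic binding rule shared between the pre-training TF and the target TF stands in for biological relatedness, and the sequence length, learning rates, and dataset sizes are all illustrative assumptions. The point of the sketch is only the mechanics: pre-train on abundant data, warm-start the target model from the pre-trained weights, then fine-tune on scarce data at a lower learning rate.

```python
import numpy as np

rng = np.random.default_rng(0)
BASES = np.array(list("ACGT"))

def random_seqs(n, length=10):
    # random DNA sequences standing in for ChIP-seq peak regions
    return ["".join(rng.choice(BASES, size=length)) for _ in range(n)]

def one_hot(seqs):
    # flatten each sequence into a (length * 4) one-hot vector
    lut = {"A": 0, "C": 1, "G": 2, "T": 3}
    X = np.zeros((len(seqs), len(seqs[0]) * 4))
    for i, s in enumerate(seqs):
        for j, b in enumerate(s):
            X[i, j * 4 + lut[b]] = 1.0
    return X

def labels(seqs):
    # toy "binding" rule shared by the pre-training and target TFs:
    # bound iff the base at position 3 is G (a stand-in for a shared motif)
    return np.array([1.0 if s[3] == "G" else 0.0 for s in seqs])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, w, lr, epochs):
    # batch gradient descent on binary cross-entropy
    for _ in range(epochs):
        p = sigmoid(X @ w)
        w = w - lr * X.T @ (p - y) / len(X)
    return w

# Pre-training step: a large dataset standing in for the multi-TF compendium.
pre_seqs = random_seqs(2000)
w_pre = train(one_hot(pre_seqs), labels(pre_seqs),
              w=np.zeros(40), lr=0.5, epochs=300)

# Fine-tuning step: only ~20 "peaks" for the target TF; the single-task model
# is initialized with the pre-trained weights and trained at a lower learning rate.
ft_seqs = random_seqs(20)
w_ft = train(one_hot(ft_seqs), labels(ft_seqs),
             w=w_pre.copy(), lr=0.05, epochs=100)

# Evaluate the fine-tuned model on held-out sequences.
test_seqs = random_seqs(500)
X_test, y_test = one_hot(test_seqs), labels(test_seqs)
acc = np.mean((sigmoid(X_test @ w_ft) > 0.5) == y_test)
print(f"fine-tuned accuracy: {acc:.2f}")
```

Because the warm-started weights already encode the shared binding signal, the fine-tuned model performs well despite seeing only 20 target examples, which mirrors the paper's finding that transfer from biologically relevant TFs is effective even with ~500 peak regions.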

Updated: 2021-09-28