Uncovering tissue-specific binding features from differential deep learning.
Nucleic Acids Research ( IF 14.9 ) Pub Date : 2020-03-18 , DOI: 10.1093/nar/gkaa009
Mike Phuycharoen 1 , Peyman Zarrineh 2 , Laure Bridoux 3 , Shilu Amin 3 , Marta Losa 4 , Ke Chen 1 , Nicoletta Bobola 3 , Magnus Rattray 2

Transcription factors (TFs) can bind DNA in a cooperative manner, enabling a mutual increase in occupancy. Through this type of interaction, alternative binding sites can be preferentially bound in different tissues to regulate tissue-specific expression programmes. Recently, deep learning models have become state-of-the-art in various pattern analysis tasks, including applications in the field of genomics. We therefore investigate the application of convolutional neural network (CNN) models to the discovery of sequence features determining cooperative and differential TF binding across tissues. We analyse ChIP-seq data from MEIS, TFs which are broadly expressed across mouse branchial arches, and HOXA2, which is expressed in the second and more posterior branchial arches. By developing models predictive of MEIS differential binding in all three tissues, we are able to accurately predict HOXA2 co-binding sites. We evaluate transfer-like and multitask approaches to regularizing the high-dimensional classification task with a larger regression dataset, allowing for the creation of deeper and more accurate models. We test the performance of perturbation and gradient-based attribution methods in identifying the HOXA2 sites from differential MEIS data. Our results show that deep regularized models significantly outperform shallow CNNs as well as k-mer methods in the discovery of tissue-specific sites bound in vivo.
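To make the idea of perturbation-based attribution concrete, here is a minimal sketch of single-base in silico mutagenesis. It is not the authors' implementation: `toy_score` is a hypothetical stand-in for a trained CNN, the motif `GATTACA` is an arbitrary placeholder rather than a real MEIS or HOXA2 motif, and the function names are invented for illustration. The attribution of a position is taken to be the mean drop in score over its three possible substitutions.

```python
BASES = "ACGT"


def toy_score(seq: str, motif: str = "GATTACA") -> float:
    """Hypothetical stand-in for a trained model.

    Returns the best fraction of matching bases between the placeholder
    motif and any window of the sequence (1.0 means an exact match).
    """
    best = 0.0
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        matches = sum(a == b for a, b in zip(window, motif))
        best = max(best, matches / len(motif))
    return best


def perturbation_attribution(seq: str, score_fn) -> list:
    """Score each position by the mean drop in score_fn over the three
    single-base substitutions at that position."""
    reference = score_fn(seq)
    attributions = []
    for i, base in enumerate(seq):
        drops = []
        for alt in BASES:
            if alt == base:
                continue  # skip the unmutated base
            mutated = seq[:i] + alt + seq[i + 1:]
            drops.append(reference - score_fn(mutated))
        attributions.append(sum(drops) / len(drops))
    return attributions


# Usage: the placeholder motif occupies positions 4..10 of this sequence.
example = "CCCC" + "GATTACA" + "CCCC"
for position, value in enumerate(perturbation_attribution(example, toy_score)):
    print(position, example[position], round(value, 3))
```

With a real model, `score_fn` would be the network's predicted output for the tissue of interest. Gradient-based methods aim to approximate the same per-base importance from a single backward pass instead of three forward passes per position.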

Updated: 2020-03-02