Deep Learning of Sequence Patterns for CCCTC-Binding Factor-Mediated Chromatin Loop Formation
Journal of Computational Biology (IF 1.4), Pub Date: 2021-02-04, DOI: 10.1089/cmb.2020.0225
Shuzhen Kuang, Liangjiang Wang

The three-dimensional (3D) organization of the human genome is of crucial importance for gene regulation, and the CCCTC-binding factor (CTCF) plays an important role in chromatin interactions. However, it is still unclear what sequence patterns in addition to CTCF motif pairs determine chromatin loop formation. To discover the underlying sequence patterns, we have developed a deep learning model, called DeepCTCFLoop, to predict whether a chromatin loop can be formed between a pair of convergent or tandem CTCF motifs using only the DNA sequences of the motifs and their flanking regions. Our results suggest that DeepCTCFLoop can accurately distinguish the CTCF motif pairs forming chromatin loops from the ones not forming loops. It significantly outperforms CTCF-MP, a machine learning model based on word2vec and boosted trees, when using DNA sequences only. Furthermore, we show that DNA motifs binding to several transcription factors, including ZNF384, ZNF263, ASCL1, SP1, and ZEB1, may constitute the complex sequence patterns for CTCF-mediated chromatin loop formation. DeepCTCFLoop has also been applied to disease-associated sequence variants to identify candidates that may disrupt chromatin loop formation. Therefore, our results provide useful information for understanding the mechanism of 3D genome organization and may also help annotate and prioritize the noncoding sequence variants associated with human diseases.
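As a rough illustration of what "using only the DNA sequences of the motifs and their flanking regions" could look like as model input, the sketch below one-hot encodes a pair of sequences and labels the pair's orientation from the motif strands. The abstract does not specify the encoding, so the one-hot scheme, the function names (`one_hot`, `pair_orientation`, `encode_motif_pair`), and the example sequences are all assumptions made for illustration, not the authors' implementation.

```python
# Illustrative sketch only: the encoding and all names here are assumptions,
# not taken from the DeepCTCFLoop paper or its code.

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of [A, C, G, T]; unknown bases (e.g. N) become all zeros."""
    rows = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        idx = BASES.find(base)
        if idx >= 0:
            row[idx] = 1
        rows.append(row)
    return rows


def pair_orientation(left_strand, right_strand):
    """Classify a motif pair by the strands of the upstream (left) and downstream (right) motifs."""
    if left_strand == right_strand:
        return "tandem"
    if (left_strand, right_strand) == ("+", "-"):
        return "convergent"
    return "divergent"


def encode_motif_pair(left_seq, right_seq):
    """Return the one-hot matrices for the two motif-plus-flank sequences (hypothetical input format)."""
    return one_hot(left_seq), one_hot(right_seq)


if __name__ == "__main__":
    # Made-up short sequences standing in for a motif plus its flanking region.
    left, right = encode_motif_pair("ACGTN", "ttga")
    print(pair_orientation("+", "-"), len(left), len(right))
```

A sequence-only classifier of this kind would take the two encoded matrices as input and output a loop or no-loop probability; the model architecture itself is not described in the abstract and is not sketched here.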

Updated: 2021-02-05