Integrative prediction of gene expression with chromatin accessibility and conformation data.
Epigenetics & Chromatin (IF 4.2), Pub Date: 2020-02-06, DOI: 10.1186/s13072-020-0327-0
Florian Schmidt 1, 2, 3, 4 , Fabian Kern 1, 3, 5 , Marcel H Schulz 1, 2, 3, 6, 7

BACKGROUND: Enhancers play a fundamental role in orchestrating cell state and development. Although several methods have been developed to identify enhancers, linking them to their target genes is still an open problem. Several theories have been proposed on the functional mechanisms of enhancers, which triggered the development of various methods to infer promoter-enhancer interactions (PEIs). The advancement of high-throughput techniques describing the three-dimensional organization of chromatin paved the way to pinpoint long-range PEIs. Here we investigated whether including PEIs in computational models for the prediction of gene expression improves performance and interpretability.

RESULTS: We have extended our [Formula: see text] framework to include DNA contacts deduced from chromatin conformation capture experiments. We compared various methods for determining PEIs by using them in predictive models of gene expression built from chromatin accessibility data and predicted transcription factor (TF) motif data. We designed a novel machine learning approach that prioritizes TFs binding to distal loop and promoter regions according to their importance for gene expression regulation. Our analysis revealed a set of core TFs that are part of enhancer-promoter loops involving YY1 in different cell lines.

CONCLUSION: We present a novel approach that can be used to prioritize TFs involved in distal and promoter-proximal regulatory events by integrating chromatin accessibility, conformation, and gene expression data. We show that integrating chromatin conformation data can improve gene expression prediction and aid model interpretability.

Updated: 2020-04-22