A statistical framework for predicting critical regions of p53-dependent enhancers.
Briefings in Bioinformatics (IF 6.8), Pub Date: 2020-05-11, DOI: 10.1093/bib/bbaa053
Xiaohui Niu, Kaixuan Deng, Lifen Liu, Kun Yang, Xuehai Hu

p53 is the 'guardian of the genome' and regulates the cell cycle and apoptosis. Genomic p53 binding regions where activating transcription factors and cofactors such as p300 bind simultaneously are called 'p53-dependent enhancers', and they play an important role in tumorigenesis. Current experimental assays generally yield a broad peak for each enhancer element, which limits our knowledge of critical enhancer regions (CERs). Inspired by enhancer dissection with a CRISPR-Cas9 screening library across genome-wide p53 binding sites, we introduce a statistical framework called 'Computational CRISPR Strategy' (CCS), which predicts whether a given DNA fragment is a p53-dependent CER by using 7-mers as features and a random forest as the regressor. When trained on a p53 CRISPR enhancer dataset, CCS not only accurately fitted the top-ranked enriched single guide RNAs (sgRNAs) but also reproduced two known, experimentally validated CERs. When applied to an independent test dataset, a tiling of a 2-kb genomic region from CRISPR-deCDKN1A-Lib, the trained model generalized well, identifying a CER that contains five top-ranked sgRNAs. A feature importance analysis further indicates that the top-ranked 7-mers map onto informative TF motifs, including POU5F1 and SOX5, which are differentially enriched in p53-dependent CERs and are potential factors that turn a general p53 binding site into a p53-dependent CER; this makes the trained model interpretable. Our results demonstrate that CCS is an alternative to CRISPR experiments for screening the genome to map p53-dependent CERs.
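
To make the feature step concrete, the following is a minimal sketch of how a DNA fragment might be turned into a 7-mer feature vector and how a region might be tiled into fragments. It is not the authors' implementation: the abstract does not say whether CCS uses raw counts or frequencies, how it treats reverse complements or ambiguous bases, or what window size it tiles with. The function names `kmer_counts`, `feature_vector`, and `tile`, the use of raw counts, and the skipping of k-mers containing non-ACGT characters are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

K = 7  # k-mer length named in the abstract


def kmer_counts(seq, k=K):
    """Count overlapping k-mers in seq (assumption: raw counts, one strand).

    k-mers containing anything other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def feature_vector(seq, k=K):
    """Return a fixed-length list of 4**k counts in lexicographic k-mer order."""
    counts = kmer_counts(seq, k)
    return [counts["".join(p)] for p in product("ACGT", repeat=k)]


def tile(region, width, step):
    """Yield (start, fragment) windows over region (hypothetical tiling scheme)."""
    for start in range(0, len(region) - width + 1, step):
        yield start, region[start:start + width]


if __name__ == "__main__":
    fragment = "ACGTACGTAC"  # toy sequence, not real data
    vec = feature_vector(fragment)
    print(len(vec), sum(vec))
```

Each fragment yields one vector of 4^7 = 16384 entries. Vectors like these, paired with a per-fragment score such as sgRNA enrichment, could then be passed to any random forest regressor (for example scikit-learn's `RandomForestRegressor`); that pairing is likewise an assumption about the setup, not a detail given in the abstract.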

Updated: 2020-05-11