Theoretical characterisation of strand cross-correlation in ChIP-seq.
BMC Bioinformatics (IF 3), Pub Date: 2020-09-22, DOI: 10.1186/s12859-020-03729-6
Hayato Anzawa, Hitoshi Yamagata, Kengo Kinoshita

Strand cross-correlation profiles are used for both peak calling pre-analysis and quality control (QC) in chromatin immunoprecipitation followed by sequencing (ChIP-seq) analysis. Because they do not depend on peak calling, these profiles have the potential to give robust and accurate assessments of the signal-to-noise ratio (S/N), yet it remains unclear which aspects of quality they actually measure. We introduced a simple model to simulate the mapped read density of ChIP-seq and then derived the theoretical maximum and minimum of the cross-correlation coefficients between strands. The results suggest that the maximum coefficient of typical ChIP-seq samples is directly proportional to the total number of mapped reads and to the square of the ratio of signal reads, and inversely proportional to the number of peaks and to the length of read-enriched regions. Simulation analysis supported our results, and an evaluation using 790 ChIP-seq datasets obtained from a public database demonstrated high consistency between the calculated cross-correlation coefficients and the coefficients estimated from the theoretical relations and peak calling results. In addition, we found that mappability-bias correction improved sensitivity, enabling maximum coefficients to be distinguished from the noise level. Based on these insights, we proposed virtual S/N (VSN), a novel peak call-free metric for S/N assessment. We also developed PyMaSC, a tool to calculate strand cross-correlation and VSN efficiently. VSN achieved the most consistent S/N estimation across various ChIP targets and sequencing read depths. Furthermore, we demonstrated that combining VSN with pre-existing peak calling results enables estimation of the number of detectable peaks in subsequent experiments and assessment of peak calling results. We present the first theoretical insights into strand cross-correlation, and the results reveal both the potential and the limitations of strand cross-correlation analysis. Our quality assessment framework using VSN provides peak call-independent QC and will help in the evaluation of peak call analysis in ChIP-seq experiments.
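
To make the quantity discussed in the abstract concrete, the following is a minimal sketch of a strand cross-correlation profile computed on toy data. It is not PyMaSC's API and not the authors' read-density model: the function names `simulate_reads` and `strand_cross_correlation`, the Gaussian scatter of fragments around peak centres, and all parameter values are assumptions made purely for illustration. The profile is the Pearson correlation between forward-strand and reverse-strand 5' end counts as the forward strand is shifted by d bases; its maximum is expected near the fragment length.

```python
import numpy as np


def simulate_reads(genome_len, n_reads, signal_ratio, n_peaks, frag_len, seed=0):
    """Toy ChIP-seq reads (hypothetical model): 5' end positions per strand."""
    rng = np.random.default_rng(seed)
    n_signal = int(n_reads * signal_ratio)
    margin = 1_000  # keep peaks away from the sequence ends
    centers = rng.integers(margin, genome_len - margin, n_peaks)
    # Fragment midpoints scatter around peak centres (s.d. 20 bp, arbitrary).
    mids = rng.choice(centers, n_signal) + rng.normal(0, 20, n_signal).astype(int)
    on_fwd = rng.random(n_signal) < 0.5
    half = frag_len // 2  # use an even frag_len so the strand offset equals it
    sig_fwd = mids[on_fwd] - half   # forward reads start at the fragment's left end
    sig_rev = mids[~on_fwd] + half  # reverse reads start at the fragment's right end
    # Background reads fall uniformly and are split evenly between strands.
    bg = rng.integers(0, genome_len, n_reads - n_signal)
    fwd_pos = np.concatenate([sig_fwd, bg[::2]])
    rev_pos = np.concatenate([sig_rev, bg[1::2]])
    return fwd_pos, rev_pos


def strand_cross_correlation(fwd_pos, rev_pos, genome_len, max_shift):
    """Pearson correlation between strands for shifts 0..max_shift."""
    fwd = np.bincount(fwd_pos, minlength=genome_len)
    rev = np.bincount(rev_pos, minlength=genome_len)
    cc = np.empty(max_shift + 1)
    for d in range(max_shift + 1):
        # Pair forward position x with reverse position x + d.
        cc[d] = np.corrcoef(fwd[: genome_len - d], rev[d:])[0, 1]
    return cc


fwd_pos, rev_pos = simulate_reads(200_000, 20_000, 0.3, 50, 150)
cc = strand_cross_correlation(fwd_pos, rev_pos, 200_000, 300)
print("shift with maximum coefficient:", int(cc.argmax()))
print("maximum coefficient:", float(cc.max()))
```

Rerunning the sketch with a smaller `signal_ratio` or a larger `n_peaks` gives an informal feel for the dependence of the maximum coefficient on these quantities described above, although the toy model is far simpler than real data (for example, it ignores read length and mappability).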

Updated: 2020-09-23