Comparative analysis of ChIP-exo peak-callers: impact of data quality, read duplication and binding subtypes.
BMC Bioinformatics (IF 3), Pub Date: 2020-02-21, DOI: 10.1186/s12859-020-3403-3
Vasudha Sharma, Sharmistha Majumdar

BACKGROUND: ChIP (chromatin immunoprecipitation)-exo has emerged as an important and versatile improvement over conventional ChIP-seq: it reduces the level of noise, maps transcription factor (TF) binding locations precisely, up to single base-pair resolution, and enables prediction of the binding mode. The availability of numerous peak-callers for analyzing ChIP-exo reads has created a need to assess their performance and identify which tools perform reasonably well for the task.

RESULTS: This study compares peak-callers that report direct binding events with those that report indirect binding events. The effects of read strandedness and data duplication on peak-caller performance are investigated. The number of peaks reported by each peak-caller is compared, followed by a comparison of the annotated motifs present in the reported peaks. The significance of peaks is assessed based on the presence of a motif in the top peaks. Indirect binding tools are compared on their ability to identify annotated motifs and to predict the mode of protein-DNA interaction.

CONCLUSION: From the output of the peak-callers investigated in this study, it is concluded that tools that use self-learning algorithms, i.e. tools that estimate all the essential parameters from the aligned reads, perform better than algorithms that require the formation of peak-pairs. The latest tools that account for indirect binding of TFs appear to be an upgrade over the available tools, as they reveal valuable information about the mode of binding in addition to direct binding. Furthermore, the quality of ChIP-exo reads has important consequences for the output of data analysis.
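
The abstract does not specify how motif presence in top peaks was computed, so the following is only a minimal sketch of one way such a check might look. It assumes each peak is available as a (score, sequence) pair and that the motif can be written as a regular expression; the function names, the toy sequences, and the example motif are all hypothetical and are not taken from the paper.

```python
import re

# Hypothetical helpers and toy data for illustration; not the authors' pipeline.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def motif_fraction_in_top_peaks(peaks, motif_regex, top_n):
    """Fraction of the top_n highest-scoring peaks that contain the motif.

    peaks is a list of (score, sequence) tuples. The motif is searched on
    both strands by also scanning the reverse complement of each sequence.
    """
    if top_n <= 0:
        raise ValueError("top_n must be positive")
    pattern = re.compile(motif_regex, re.IGNORECASE)
    # Rank peaks by score, highest first, and keep the top_n.
    ranked = sorted(peaks, key=lambda p: p[0], reverse=True)[:top_n]
    if not ranked:
        return 0.0
    hits = sum(
        1
        for _, seq in ranked
        if pattern.search(seq) or pattern.search(reverse_complement(seq))
    )
    return hits / len(ranked)


# Toy example: four made-up peaks and a made-up motif.
toy_peaks = [
    (95.0, "TTGACGTCAAT"),
    (80.0, "AAAAAAAAAAA"),
    (60.0, "CCTGACGTCAG"),
    (10.0, "GGGGGGGGGGG"),
]
print(motif_fraction_in_top_peaks(toy_peaks, "TGACGTCA", top_n=3))
```

Comparing this fraction across peak-callers at a fixed `top_n` gives a rough sense of how well each tool's highest-ranked peaks are supported by a known motif. A real analysis would typically use a position weight matrix and a dedicated motif scanner rather than a regular expression.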

Updated: 2020-02-21