G4detector: Convolutional Neural Network to Predict DNA G-quadruplexes.
IEEE/ACM Transactions on Computational Biology and Bioinformatics (IF 4.5), Pub Date: 2021-04-19, DOI: 10.1109/tcbb.2021.3073595
Mira Barshai, Alice Aubert, Yaron Orenstein

G-quadruplexes (G4s) are nucleic acid secondary structures that form within guanine-rich DNA or RNA sequences. G4 formation can affect chromatin architecture and gene regulation, and has been associated with genomic instability, genetic diseases, and cancer progression. The data produced by the G4-seq experiment provide unprecedented detail on G4 formation in the genome. Still, running the experimental protocol on a whole genome is expensive and time-consuming. Thus, a computational method to predict G4 formation in new DNA sequences or whole genomes is highly desirable. Here, we present G4detector, a new method based on a convolutional neural network to predict G4s from DNA sequences. In addition to the sequence information, we improved prediction accuracy by incorporating RNA secondary structure information. To train and test G4detector, we compiled novel high-throughput benchmarks from the genomes of multiple species measured by the G4-seq protocol. We show that G4detector outperforms extant methods for the same task on all benchmark datasets and that models trained on human data extrapolate to various non-human species. The code and benchmarks are publicly available at github.com/OrensteinLab/G4detector.
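
For readers unfamiliar with how a convolutional network reads a DNA sequence, the sketch below may help. It is not the G4detector architecture and uses no trained weights. It only illustrates the general idea, assuming a one-hot encoding of the four bases and a single hand-set filter (the hypothetical `G_RUN_KERNEL`) that responds to a run of three guanines. All function names are invented for illustration.

```python
# Illustrative sketch only: NOT the G4detector model or its trained weights.
# All names here (one_hot, conv1d, G_RUN_KERNEL, g_run_score) are hypothetical.

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of 4 values (A, C, G, T).

    Bases outside ACGT (for example N) become an all-zero row.
    """
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def conv1d(encoded, kernel, bias=0.0):
    """Slide one filter (width x 4) along the sequence, then apply ReLU."""
    width = len(kernel)
    activations = []
    for start in range(len(encoded) - width + 1):
        total = bias
        for offset in range(width):
            for channel in range(4):
                total += encoded[start + offset][channel] * kernel[offset][channel]
        activations.append(max(0.0, total))  # ReLU non-linearity
    return activations


def global_max_pool(activations):
    """Keep the strongest filter response; 0.0 if the sequence is too short."""
    return max(activations) if activations else 0.0


# Hand-set filter (an assumption, not learned): weight 1 on G at each of 3 positions.
G_RUN_KERNEL = [[0.0, 0.0, 1.0, 0.0]] * 3


def g_run_score(seq):
    """Positive only if the sequence contains at least one GGG run."""
    return global_max_pool(conv1d(one_hot(seq), G_RUN_KERNEL, bias=-2.0))


if __name__ == "__main__":
    print(g_run_score("GGGTTAGGGTTAGGGTTAGGG"))  # G-rich, telomere-like repeat
    print(g_run_score("ATATATATAT"))             # no guanines
```

In a trained network, many such filters are learned from labelled data instead of being set by hand, and their pooled responses feed further layers that output a prediction score.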

Updated: 2021-04-19