SpliceViNCI: Visualizing the splicing of non-canonical introns through recurrent neural networks

Aparajita Dutta; Kusum Kumari Singh; Ashish Anand

doi:10.1101/2020.02.09.940551

Abstract

Most current computational models for splice junction prediction are based on the identification of canonical splice junctions. However, junctions lacking the consensus dimers GT and AG are also observed to undergo splicing. Identifying such junctions, called non-canonical splice junctions, is also essential for a comprehensive understanding of the splicing phenomenon. This work focuses on the identification of non-canonical splice junctions through the application of a bidirectional long short-term memory (BLSTM) network. Furthermore, we apply a back-propagation-based (integrated gradient) and a perturbation-based (occlusion) visualization technique to extract the non-canonical splicing features learned by the model. The features obtained are validated against existing knowledge from the literature. Integrated gradient extracts features that comprise contiguous nucleotides, whereas occlusion extracts features that are individual nucleotides distributed across the sequence.
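The two visualization techniques named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the BLSTM is replaced by a hypothetical logistic scoring function (`toy_score`) over a one-hot encoded sequence so that gradients can be written analytically, and the weights, the all-zero baseline, and the example sequence are assumptions chosen only for illustration. Occlusion blanks one position at a time and records the drop in score; integrated gradient averages the gradient along a straight path from the baseline to the input and scales it by the input.

```python
import math

ALPHABET = "ACGT"


def one_hot(seq):
    """Encode a nucleotide string as a list of 4-element one-hot rows."""
    return [[1.0 if base == a else 0.0 for a in ALPHABET] for base in seq]


def toy_score(x, weights, bias=0.0):
    """Hypothetical stand-in for a trained model: sigmoid of a weighted sum."""
    z = bias + sum(
        w * v for row_w, row_x in zip(weights, x) for w, v in zip(row_w, row_x)
    )
    return 1.0 / (1.0 + math.exp(-z))


def occlusion_scores(x, weights, bias=0.0):
    """Perturbation-based attribution: score drop when one position is zeroed."""
    reference = toy_score(x, weights, bias)
    scores = []
    for i in range(len(x)):
        occluded = [row[:] for row in x]
        occluded[i] = [0.0] * len(ALPHABET)
        scores.append(reference - toy_score(occluded, weights, bias))
    return scores


def integrated_gradients(x, weights, bias=0.0, steps=200):
    """Gradient-based attribution with an all-zero baseline (an assumption).

    For this toy model the gradient w.r.t. x[i][j] is p * (1 - p) * w[i][j],
    where p is the score, so only the factor p * (1 - p) varies along the path.
    The path integral is approximated with the midpoint rule.
    """
    mean_factor = 0.0
    for k in range(steps):
        alpha = (k + 0.5) / steps
        scaled = [[alpha * v for v in row] for row in x]
        p = toy_score(scaled, weights, bias)
        mean_factor += p * (1.0 - p)
    mean_factor /= steps
    # Sum over the four channels to get one attribution per position.
    return [
        sum(xv * w * mean_factor for xv, w in zip(row_x, row_w))
        for row_x, row_w in zip(x, weights)
    ]


# Illustrative usage with made-up weights favouring G at index 3 and T at index 4.
seq = "CAGGTA"
x = one_hot(seq)
weights = [[0.0] * 4 for _ in seq]
weights[3][2] = 2.0
weights[4][3] = 1.5

occ = occlusion_scores(x, weights, bias=-1.0)
ig = integrated_gradients(x, weights, bias=-1.0)
for position, (base, o, g) in enumerate(zip(seq, occ, ig)):
    print(position, base, round(o, 3), round(g, 3))
```

In this sketch the integrated gradient attributions sum approximately to the difference between the score of the input and the score of the baseline, which is a convenient sanity check when applying the method to a real network.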

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

Email: d.aparajita{at}iitg.ac.in (Aparajita Dutta), kusumsingh{at}iitg.ac.in (Kusum Kumari Singh)

The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.