Attentive gated neural networks for identifying chromatin accessibility

  • Original Article
Neural Computing and Applications

Abstract

Accessible chromatin is strongly associated with active gene regulatory regions. Enhancers and promoters commonly occur in accessible chromatin, and systematically discovering functional sites at the whole-genome level is indispensable. However, biological experiments are expensive and time-consuming, and current computational methods cannot fully learn the hidden key regulatory patterns of genomic contexts. Moreover, feature encoding methods for genetic sequences often ignore position information within sequences, while accurately identifying accessible regions depends greatly on capturing more informative sequence features. To address these issues, we first encode DNA sequences using position embeddings, which are produced by integrating the position information of the original sequences into embedding vectors. We then propose a novel deep learning framework, called attentive gated neural networks (AGNet), to automatically extract complex patterns for predicting chromatin accessibility from DNA sequences. Specifically, we combine gated neural networks (GNNs) with dual attention to extract multiple patterns and long-term associations from DNA sequences alone. Experimental results on five cell-type datasets show that AGNet outperforms published methods for accessibility prediction. Furthermore, the results not only reveal that AGNet can learn more of the regulatory patterns that underlie DNA sequences, but also validate the significance of position embeddings for accessibility prediction.
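As a rough illustration of the position-embedding idea described above, the following sketch splits a DNA sequence into overlapping k-mers, looks up a vector for each k-mer, and adds a position encoding to it. Everything specific here is an assumption made for illustration: the abstract does not state the k-mer length, the embedding dimension, how the embeddings are trained, or how position information is integrated. The sinusoidal encoding is the form popularized by Transformer models, the random lookup table merely stands in for learned embeddings, and the function names `kmer_tokens`, `sinusoidal_position`, and `position_embed` are hypothetical.

```python
import math
import random


def kmer_tokens(seq, k=3):
    """Split a DNA sequence into overlapping k-mers (stride 1)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def sinusoidal_position(pos, dim):
    """Encode one position: sine on even indices, cosine on odd indices."""
    vec = []
    for i in range(dim):
        angle = pos / (10000 ** (2 * (i // 2) / dim))
        vec.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return vec


def position_embed(seq, k=3, dim=8, seed=0):
    """Return one vector per k-mer: stand-in k-mer embedding + position encoding."""
    rng = random.Random(seed)
    table = {}  # hypothetical lookup table; a real model would use trained embeddings
    out = []
    for pos, kmer in enumerate(kmer_tokens(seq, k)):
        if kmer not in table:
            table[kmer] = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
        pe = sinusoidal_position(pos, dim)
        # Element-wise sum is one simple way to integrate position information.
        out.append([e + p for e, p in zip(table[kmer], pe)])
    return out


vectors = position_embed("ACGTACGT", k=3, dim=8)
print(len(vectors), len(vectors[0]))  # 6 k-mers, 8 dimensions each
```

In this toy input the k-mer `ACG` occurs at positions 0 and 4. Without the position term both occurrences would map to the same vector; with it they differ, which is the kind of positional signal the abstract says conventional encodings ignore.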

Acknowledgements

We sincerely thank the editors and the anonymous reviewers for their valuable comments. We also thank Min et al. for helpful discussions and consultation.

Funding

This work was primarily supported by the National Natural Science Foundation of China under Grants 61966037, 61463052 and 61365001, Science Foundation of Educational Department of Yunnan Province (Nos. 2019J0006 and 2019Y0003), Yunnan University’s Research Innovation Fund for Graduate Students (No. 2019152) and Yunnan Province University Key Laboratory Construction Plan Funding, China.

Author information

Corresponding author

Correspondence to Dongming Zhou.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Guo, Y., Zhou, D., Li, W. et al. Attentive gated neural networks for identifying chromatin accessibility. Neural Comput & Applic 32, 15557–15571 (2020). https://doi.org/10.1007/s00521-020-04879-7

  • DOI: https://doi.org/10.1007/s00521-020-04879-7
