Contextual embedding bootstrapped neural network for medical information extraction of coronary artery disease records

  • Original Article
  • Medical & Biological Engineering & Computing

Abstract

Coronary artery disease (CAD) is the major cause of death worldwide. New methods for early CAD diagnosis based on medical big data have great potential to reduce the risk of death from CAD. Neural networks (NNs) are a powerful tool for processing electronic medical records (EMRs): they can accurately extract structured data, unlocking medical information that can further improve CAD diagnosis. However, the time and labor required to annotate datasets are the main limitation on their application, especially for CAD records, which contain large amounts of natural language text and specialized biomedical content. In this study, we present an annotation-cost-saving NN approach for CAD records, bootstrapped by a deep language model whose contextual embeddings are pre-trained on a large unannotated CAD corpus. To demonstrate the feasibility of our approach and evaluate its performance, we performed a pre-training experiment and a term classification experiment, using unannotated and annotated CAD records, respectively. The results showed that our contextual embedding bootstrapped NN for CAD records performs better when the number of annotations is reduced.

Funding

This work was supported by the Shanghai Municipal Commission of Economy and Information (Grant nos. XX-XXFZ-02-20-2042 and XX-RGZN-01-19-6584).

Author information

Corresponding author

Correspondence to Junyi Yuan.

Ethics declarations

Ethics approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional review board and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. This article does not contain any studies with animals performed by any of the authors.

Informed consent

Verbal informed consent was obtained from all individual participants included in the study.

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Cen, X., Yuan, J., Pan, C. et al. Contextual embedding bootstrapped neural network for medical information extraction of coronary artery disease records. Med Biol Eng Comput 59, 1111–1121 (2021). https://doi.org/10.1007/s11517-021-02359-1

  • DOI: https://doi.org/10.1007/s11517-021-02359-1
