Transfer learning for informative-frame selection in laryngoscopic videos through learned features

Patrini, Ilaria; Ruperti, Michela; Moccia, Sara; Mattos, Leonardo S.; Frontoni, Emanuele; De Momi, Elena

doi:10.1007/s11517-020-02127-7

Original Article
Published: 24 March 2020

Volume 58, pages 1225–1238 (2020)

Medical & Biological Engineering & Computing

Ilaria Patrini¹,
Michela Ruperti¹,
Sara Moccia²,³ (ORCID: orcid.org/0000-0002-4494-8907),
Leonardo S. Mattos³,
Emanuele Frontoni² &
Elena De Momi¹

Abstract

Narrow-band imaging (NBI) laryngoscopy is an optical-biopsy technique used for screening and diagnosing cancer of the laryngeal tract. It reduces the risks associated with biopsy, but at the cost of some drawbacks, such as the large amount of data that must be reviewed to make a diagnosis. The purpose of this paper is to develop a deep-learning-based strategy for the automatic selection of informative laryngoscopic-video frames, reducing the amount of data to process for diagnosis. The strategy relies on transfer learning: learned features are extracted using six different convolutional neural networks (CNNs) pre-trained on natural images. To test the proposed strategy, the learned features were extracted from the NBI-InfFrames dataset. Support vector machines (SVMs) and a CNN-based approach were then used to classify frames as informative (I) or uninformative, the latter being blurred (B), with saliva or specular reflections (S), or underexposed (U). The best-performing learned-feature set was obtained with VGG 16, yielding a recall for class I of 0.97 when classifying frames with SVMs and 0.98 with the CNN-based classification. This work presents a valuable novel approach to the selection of informative frames in laryngoscopic videos and demonstrates the potential of transfer learning in medical image analysis.
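
The headline result above is reported as the recall of class I. As a minimal sketch, assuming recall is computed per class in the standard way (true positives divided by all frames that truly belong to the class), the snippet below derives it from a four-class confusion matrix. The function names and the toy labels are hypothetical and chosen for illustration only; they are not taken from the paper or from the NBI-InfFrames dataset.

```python
from typing import Dict, List, Sequence

# Class codes as used in the abstract:
# I = informative, B = blurred, S = saliva/specular reflections, U = underexposed.
CLASSES = ["I", "B", "S", "U"]


def confusion_matrix(y_true: Sequence[str], y_pred: Sequence[str],
                     labels: Sequence[str] = CLASSES) -> List[List[int]]:
    """Count predictions; rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_recall(matrix: List[List[int]],
                     labels: Sequence[str] = CLASSES) -> Dict[str, float]:
    """Recall for class c = TP_c / (TP_c + FN_c), i.e. diagonal over row sum."""
    recalls = {}
    for i, label in enumerate(labels):
        support = sum(matrix[i])  # number of frames whose true class is `label`
        recalls[label] = matrix[i][i] / support if support else float("nan")
    return recalls


# Toy labels invented for illustration (NOT data from the paper).
y_true = ["I", "I", "I", "I", "B", "B", "S", "S", "U", "U"]
y_pred = ["I", "I", "I", "B", "B", "B", "S", "I", "U", "U"]

cm = confusion_matrix(y_true, y_pred)
recall = per_class_recall(cm)
print(recall)
```

A high recall for class I means that few informative frames are wrongly discarded, which is presumably why the abstract highlights that figure for a frame-selection task.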

Author information

Authors and Affiliations

Department of Electronics, Information and Bioengineering, Politecnico di Milano, Piazza Leonardo Da Vinci 32, Milan, Italy
Ilaria Patrini, Michela Ruperti & Elena De Momi
Department of Information Engineering, Università Politecnica delle Marche, via Brecce Bianche 12, Ancona, Italy
Sara Moccia & Emanuele Frontoni
Department of Advanced Robotics, Istituto Italiano di Tecnologia, via Morego 30, Genoa, Italy
Sara Moccia & Leonardo S. Mattos

Corresponding author

Correspondence to Sara Moccia.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Ilaria Patrini and Michela Ruperti contributed equally to this paper.

About this article

Cite this article

Patrini, I., Ruperti, M., Moccia, S. et al. Transfer learning for informative-frame selection in laryngoscopic videos through learned features. Med Biol Eng Comput 58, 1225–1238 (2020). https://doi.org/10.1007/s11517-020-02127-7

Received: 12 June 2019
Accepted: 07 January 2020
Published: 24 March 2020
Issue Date: June 2020
DOI: https://doi.org/10.1007/s11517-020-02127-7
