
LRTD: long-range temporal dependency based active learning for surgical workflow recognition

  • Original Article
  • Published:

International Journal of Computer Assisted Radiology and Surgery

Abstract

Purpose

Automatic surgical workflow recognition from video is a fundamental yet challenging problem for developing computer-assisted and robot-assisted surgery. Existing deep learning approaches have achieved remarkable performance on surgical video analysis; however, they rely heavily on large-scale labelled datasets. Unfortunately, such annotations are rarely available in abundance, because producing them requires the domain knowledge of surgeons, and even for experts, annotating a sufficient amount of data is tedious and time-consuming.

Methods

In this paper, we propose a novel active learning method for cost-effective surgical video analysis. Specifically, we propose a non-local recurrent convolutional network, which introduces a non-local block to capture the long-range temporal dependency (LRTD) among continuous frames. We then formulate an intra-clip dependency score to represent the overall dependency within each clip. By ranking these scores across the clips in the unlabelled data pool, we select the clips with weak dependencies for annotation, since these are the most informative ones for network training.
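
To make the selection mechanism concrete, the sketch below illustrates the idea in PyTorch. This is a minimal sketch under stated assumptions, not the authors' released implementation: the block follows the standard embedded-Gaussian non-local formulation, and `intra_clip_dependency_score` is one plausible way to summarise the (T, T) attention map into a single scalar; the paper defines the exact formula.

```python
import torch
import torch.nn as nn


class NonLocalBlock(nn.Module):
    """Embedded-Gaussian non-local block over the T frame features of a clip."""

    def __init__(self, dim, inner_dim=None):
        super().__init__()
        inner_dim = inner_dim or dim // 2
        self.theta = nn.Linear(dim, inner_dim)  # query embedding
        self.phi = nn.Linear(dim, inner_dim)    # key embedding
        self.g = nn.Linear(dim, inner_dim)      # value embedding
        self.out = nn.Linear(inner_dim, dim)    # restore feature dimension

    def forward(self, x):
        # x: (T, dim) per-frame features of one clip, e.g. from a CNN backbone
        attn = torch.softmax(self.theta(x) @ self.phi(x).t(), dim=-1)  # (T, T)
        y = self.out(attn @ self.g(x))
        return x + y, attn  # residual output and the pairwise dependency map


def intra_clip_dependency_score(attn):
    """Summarise a (T, T) attention map into one scalar: here, the mean
    dependency between distinct frames (an illustrative choice)."""
    t = attn.size(0)
    off_diagonal = attn - torch.diag(torch.diag(attn))
    return off_diagonal.sum().item() / max(t * (t - 1), 1)
```

A clip whose frames attend weakly to one another yields a low score, flagging it as hard for the temporal model and thus worth annotating.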

Results

We validate our approach on a large surgical video dataset (Cholec80) by performing the surgical workflow recognition task. Using our LRTD-based selection strategy, we outperform other state-of-the-art active learning methods, which only consider neighbouring-frame information. With only 50% of the samples annotated, our approach can exceed the performance of full-data training.
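
Continuing the sketch above, a hypothetical selection round could rank the unlabelled pool by the dependency score and hand the lowest-scoring clips to annotators. The `budget` parameter and the pool layout below are assumptions for illustration, not the paper's exact protocol.

```python
def select_clips_for_annotation(clip_features, model, budget):
    """Return the indices of the `budget` unlabelled clips with the weakest
    intra-clip dependency, i.e. those assumed most informative to label."""
    scores = []
    with torch.no_grad():
        for feats in clip_features:  # each feats: a (T, dim) tensor per clip
            _, attn = model(feats)
            scores.append(intra_clip_dependency_score(attn))
    ranked = sorted(range(len(scores)), key=scores.__getitem__)
    return ranked[:budget]  # weakest dependencies first
```

Repeating such rounds until roughly half of the pool is labelled would correspond to the 50% budget reported above.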

Conclusion

By modelling the intra-clip dependency, our LRTD-based strategy shows a stronger capability to select informative video clips for annotation than other active learning methods, as demonstrated through evaluation on a popular public surgical dataset. The results also show the promising potential of our framework for reducing the annotation workload in clinical practice.



Acknowledgements

The work was partially supported by HK RGC TRS project T42-409/18-R, a grant from the National Natural Science Foundation of China (Project No. U1813204), and the CUHK T Stone Robotics Institute.

Author information

Corresponding author

Correspondence to Qi Dou.

Ethics declarations

Conflict of interest

Xueying Shi, Yueming Jin, Qi Dou and Pheng-Ann Heng declare that they have no conflict of interest.

Ethical approval

For this type of study, formal consent is not required.

Informed consent

This article contains patient data from publicly available datasets.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

See Tables 3 and 4, and Fig. 7.

Fig. 7: Ratio statistics about selected clips’ phases


About this article


Cite this article

Shi, X., Jin, Y., Dou, Q. et al. LRTD: long-range temporal dependency based active learning for surgical workflow recognition. Int J CARS 15, 1573–1584 (2020). https://doi.org/10.1007/s11548-020-02198-9
