An integrated neural network model for pupil detection and tracking

  • Data analytics and machine learning
  • Soft Computing

Abstract

The accurate detection and tracking of the pupil is important to many applications, such as human–computer interaction, driver fatigue detection, and the diagnosis of brain diseases. Existing approaches, however, face challenges in handling low-quality pupil images. In this paper, we propose an integrated pupil tracking framework, namely LVCF, based on deep learning. LVCF consists of the pupil detection model VCF, which is an end-to-end network, and an LSTM pupil motion prediction model, which applies LSTM to track the pupil’s position. The proposed network was trained and evaluated on 10,600 images and 75 videos taken from 3 realistic datasets. Within an error threshold of 5 pixels, VCF achieves an accuracy of more than 81%, and LVCF outperforms the state of the art by 9% in terms of percentage of pupils tracked. The LVCF project is available at https://github.com/UnderTheMangoTree/LVCF.
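
The accuracy figure quoted above is a detection rate under a pixel error threshold. The following is a minimal sketch of how such a rate could be computed, not the authors' evaluation code. It assumes the error is the Euclidean distance between the predicted and the labelled pupil center, which the abstract does not state; the function name `detection_rate` and the sample coordinates are hypothetical.

```python
import math


def detection_rate(predicted, ground_truth, threshold_px=5.0):
    """Return the fraction of frames whose predicted pupil center lies
    within threshold_px (Euclidean distance, an assumption) of the label."""
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth must have the same length")
    if not predicted:
        raise ValueError("at least one frame is required")
    hits = 0
    for (px, py), (gx, gy) in zip(predicted, ground_truth):
        # Count the frame as detected if the center error is within the threshold.
        if math.hypot(px - gx, py - gy) <= threshold_px:
            hits += 1
    return hits / len(predicted)


# Hypothetical pupil centers (x, y) in pixels, for illustration only.
predicted = [(100.0, 120.0), (103.0, 123.0), (110.0, 120.0), (98.0, 119.0)]
ground_truth = [(100.0, 120.0)] * 4

rate = detection_rate(predicted, ground_truth)
print(f"{rate:.2f}")  # three of the four frames fall within 5 pixels: 0.75
```

Sweeping `threshold_px` over a range of values would give the detection-rate-versus-error curve often used to compare pupil detectors.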

Notes

  1. http://www.robots.ox.ac.uk/~vgg/software/via/via.html.

Funding

The authors gratefully acknowledge the financial support provided by the National Defense Science and Technology Innovation Zone (No. ZT001007104).

Author information

Corresponding author

Correspondence to Lu Shi.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Shi, L., Wang, C., Tian, F. et al. An integrated neural network model for pupil detection and tracking. Soft Comput 25, 10117–10127 (2021). https://doi.org/10.1007/s00500-021-05984-y

  • DOI: https://doi.org/10.1007/s00500-021-05984-y
