Abstract
The accurate detection and tracking of the pupil is important to many applications, such as human–computer interaction, driver fatigue detection, and the diagnosis of brain diseases. Existing approaches, however, face challenges in handling low-quality pupil images. In this paper, we propose an integrated pupil tracking framework based on deep learning, named LVCF. LVCF consists of the pupil detection model VCF, which is an end-to-end network, and an LSTM pupil motion prediction model, which applies LSTM to track the pupil's position. The proposed network was trained and evaluated on 10600 images and 75 videos taken from 3 realistic datasets. Within an error threshold of 5 pixels, VCF achieves an accuracy of more than 81%, and LVCF outperforms the state of the art by 9% in terms of the percentage of pupils tracked. The LVCF project is available at https://github.com/UnderTheMangoTree/LVCF.
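The accuracy figure quoted above is measured against a 5-pixel error threshold. The abstract does not define the metric, so the following is only a minimal sketch under an assumption that is common in pupil detection work: a prediction counts as correct when the Euclidean distance between the predicted and labelled pupil centres is at most the threshold. The function name `detection_rate`, the inclusive comparison, and the sample coordinates are all hypothetical and are not taken from the LVCF code.

```python
import math


def detection_rate(predicted, ground_truth, threshold_px=5.0):
    """Return the fraction of predicted centres within threshold_px of the labels.

    Assumption: the error is the Euclidean distance in pixels between the
    predicted and labelled pupil centre, and the threshold is inclusive.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth must have the same length")
    if not predicted:
        return 0.0
    hits = sum(
        1
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
        if math.hypot(px - gx, py - gy) <= threshold_px
    )
    return hits / len(predicted)


# Hypothetical (x, y) pupil centres in pixels, for illustration only.
predicted = [(100.0, 100.0), (52.0, 43.0), (10.0, 10.0), (200.0, 120.0)]
ground_truth = [(101.0, 101.0), (50.0, 40.0), (20.0, 10.0), (200.0, 126.0)]

rate = detection_rate(predicted, ground_truth)
print(f"detection rate within 5 px: {rate:.2f}")
```

In this toy data, two of the four predictions fall inside the threshold. Sweeping `threshold_px` over a range of values would give the kind of error-versus-detection-rate curve often used to compare pupil detectors.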
Funding
The authors gratefully acknowledge the financial support provided by the National Defense Science and Technology Innovation Zone (No. ZT001007104).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Cite this article
Shi, L., Wang, C., Tian, F. et al. An integrated neural network model for pupil detection and tracking. Soft Comput 25, 10117–10127 (2021). https://doi.org/10.1007/s00500-021-05984-y
DOI: https://doi.org/10.1007/s00500-021-05984-y