Automatic camera calibration by landmarks on rigid objects

  • Original Paper
  • Machine Vision and Applications

Abstract

This article presents a new method for automatic calibration of surveillance cameras. We focus on traffic surveillance, so the camera is calibrated by observing vehicles; however, other rigid objects can be used instead. The proposed method uses keypoints (landmarks) that a convolutional neural network detects automatically on the observed objects. Fine-grained recognition identifies each vehicle (calibration object), and the 3D positions of the landmarks are known for the (very limited) set of known objects. The extracted keypoints are then used to calibrate the camera, yielding its internal (focal length) and external (rotation, translation) parameters and the scene scale. We collected a dataset in two parking lots and equipped it with a calibration ground truth by measuring multiple distances in the ground plane. This dataset appears to be more accurate than the existing comparable data (GT calibration error reduced from 4.62 % to 0.99 %). The experiments also show that our method outperforms the best existing alternative in terms of accuracy (error reduced from 6.56 % to 4.03 %), and that it is more flexible with respect to viewpoint changes and other factors.
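To make the roles of the focal length, rotation, and translation concrete, the following is a minimal sketch of a pinhole projection and the reprojection error between detected keypoints and projected 3D landmarks. It is not the solver used in the article: the landmark coordinates, the camera parameters, and the helper names (`rotation_x`, `project`, `mean_reprojection_error`) are hypothetical, and the coarse search over focal length with a fixed pose is only a toy stand-in for a joint estimation of all parameters.

```python
import math


def rotation_x(angle_rad):
    """Rotation matrix about the camera x-axis (a simple tilt), as nested lists."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[1.0, 0.0, 0.0],
            [0.0, c, -s],
            [0.0, s, c]]


def project(point_3d, focal, rotation, translation, principal_point=(0.0, 0.0)):
    """Project a 3D landmark (object coordinates) to pixel coordinates.

    Pinhole model: x_cam = R * X + t, then u = f * x / z + cx, v = f * y / z + cy.
    """
    x_cam = [sum(rotation[i][j] * point_3d[j] for j in range(3)) + translation[i]
             for i in range(3)]
    if x_cam[2] <= 0:
        raise ValueError("landmark is behind the camera")
    cx, cy = principal_point
    return (focal * x_cam[0] / x_cam[2] + cx, focal * x_cam[1] / x_cam[2] + cy)


def mean_reprojection_error(landmarks_3d, keypoints_2d, focal, rotation, translation):
    """Mean pixel distance between detected keypoints and projected landmarks."""
    total = 0.0
    for p3, (u, v) in zip(landmarks_3d, keypoints_2d):
        pu, pv = project(p3, focal, rotation, translation)
        total += math.hypot(pu - u, pv - v)
    return total / len(landmarks_3d)


# Hypothetical landmark positions on a vehicle, in metres (not from the article).
landmarks = [(-0.8, 0.0, 0.0), (0.8, 0.0, 0.0), (0.8, 0.0, 4.0),
             (-0.8, 0.0, 4.0), (0.0, -1.4, 2.0)]
true_focal = 1200.0                      # assumed focal length in pixels
R = rotation_x(math.radians(20.0))       # assumed camera tilt
t = [0.0, 1.0, 15.0]                     # assumed translation in metres

# Synthetic "detections": exact projections under the assumed parameters.
keypoints = [project(p, true_focal, R, t) for p in landmarks]

# Coarse 1D search over focal length with the pose held fixed (illustration only).
candidates = [800.0 + 50.0 * k for k in range(17)]  # 800, 850, ..., 1600
best_focal = min(
    candidates,
    key=lambda f: mean_reprojection_error(landmarks, keypoints, f, R, t),
)
```

With noise-free synthetic keypoints, the candidate that reproduces the detections exactly is the assumed focal length. With real detections the error would not reach zero, and the rotation and translation would have to be estimated together with the focal length.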

Notes

  1. https://medusa.fit.vutbr.cz/traffic.

Acknowledgements

This work was supported by The Ministry of Education, Youth and Sports of the Czech Republic from the National Programme of Sustainability (NPU II); project IT4Innovations excellence in science—LQ1602.

Author information

Corresponding author

Correspondence to Vojtěch Bartl.

About this article

Cite this article

Bartl, V., Špaňhel, J., Dobeš, P. et al. Automatic camera calibration by landmarks on rigid objects. Machine Vision and Applications 32, 2 (2021). https://doi.org/10.1007/s00138-020-01125-x

  • DOI: https://doi.org/10.1007/s00138-020-01125-x
