Abstract
This article presents a new method for automatic calibration of surveillance cameras. We focus on traffic surveillance, so the camera is calibrated by observing vehicles; however, other rigid objects can be used instead. The proposed method uses keypoints (landmarks) that a convolutional neural network automatically detects on the observed objects. Fine-grained recognition identifies each vehicle (calibration object), and the 3D positions of the landmarks are known for a (very limited) set of known objects. The extracted keypoints are then used to calibrate the camera, yielding its internal parameters (focal length), its external parameters (rotation, translation), and the scene scale. We collected a dataset in two parking lots and equipped it with a calibration ground truth by measuring multiple distances in the ground plane. This dataset appears to be more accurate than the existing comparable data (GT calibration error reduced from 4.62 % to 0.99 %). The experiments also show that our method outperforms the best existing alternative in terms of accuracy (error reduced from 6.56 % to 4.03 %), and that our solution is more flexible with respect to viewpoint changes and other conditions.
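The role of the calibrated parameters can be illustrated with a minimal pinhole-camera sketch. This is not the authors' algorithm: the landmark coordinates, the pose, the focal length, and all function names below are hypothetical values chosen for illustration. The rotation and translation are held fixed and only the focal length is scanned, whereas a real calibration estimates all of them jointly.

```python
import math

def rotation_x(angle):
    """Rotation matrix about the camera x-axis (tilt); angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def project(point, f, R, t):
    """Pinhole projection of a 3D point: camera coords R*X + t, then scale by f/z.

    The principal point is assumed to be at the image origin.
    """
    cam = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    if cam[2] <= 0:
        raise ValueError("point is behind the camera")
    return (f * cam[0] / cam[2], f * cam[1] / cam[2])

def reprojection_error(landmarks, observed, f, R, t):
    """Mean pixel distance between projected landmarks and observed keypoints."""
    total = 0.0
    for X, (u, v) in zip(landmarks, observed):
        pu, pv = project(X, f, R, t)
        total += math.hypot(pu - u, pv - v)
    return total / len(landmarks)

# Hypothetical 3D landmark positions on one vehicle model, in metres.
LANDMARKS = [(-0.8, 0.0, 0.0), (0.8, 0.0, 0.0),
             (-0.8, 0.5, 2.0), (0.8, 0.5, 2.0), (0.0, 1.4, 1.0)]

# Hypothetical "true" camera used to synthesise the observed keypoints.
R_TRUE = rotation_x(math.radians(20))
T_TRUE = (0.5, -1.0, 15.0)
F_TRUE = 1200.0
observed = [project(X, F_TRUE, R_TRUE, T_TRUE) for X in LANDMARKS]

# Coarse scan over candidate focal lengths (800, 850, ..., 1600 pixels),
# keeping the pose fixed (a simplification for this sketch).
candidates = [800.0 + 50.0 * i for i in range(17)]
best_f = min(
    candidates,
    key=lambda f: reprojection_error(LANDMARKS, observed, f, R_TRUE, T_TRUE),
)
print(best_f)
```

With noise-free synthetic keypoints, the reprojection error vanishes at the focal length that generated them, so the scan selects that candidate. With real detections the minimum would be nonzero and the pose would also have to be optimised.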
Notes
https://medusa.fit.vutbr.cz/traffic.
Acknowledgements
This work was supported by The Ministry of Education, Youth and Sports of the Czech Republic from the National Programme of Sustainability (NPU II); project IT4Innovations excellence in science—LQ1602.
About this article
Cite this article
Bartl, V., Špaňhel, J., Dobeš, P. et al. Automatic camera calibration by landmarks on rigid objects. Machine Vision and Applications 32, 2 (2021). https://doi.org/10.1007/s00138-020-01125-x
DOI: https://doi.org/10.1007/s00138-020-01125-x