Method for Whale Re-identification Based on Siamese Nets and Adversarial Training

Optical Memory and Neural Networks

Abstract

Training convolutional neural networks that perform well in one-shot learning settings can have a wide range of impacts on real-world datasets. In this paper, we explore an adversarial training method that learns a Siamese neural network end to end. The network consists of two models: a ConvNet model that learns image embeddings from an input image pair, and a head model that further learns the distance between those embeddings. We also present an adversarial mining approach that efficiently identifies and selects harder examples during training, which significantly boosts model performance on difficult cases. We carry out a comprehensive evaluation on a public whale identification dataset hosted on the Kaggle platform, and report state-of-the-art performance benchmarked against the results of 2000 other participants.
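The two-model layout and the idea of selecting harder examples can be illustrated with a minimal NumPy sketch. It is not the implementation from the paper: the single-layer `embed` branch, the `head_score` features (element-wise product and absolute difference), the random weights, and the simple "highest-scoring non-matching image" rule in `hardest_negatives` are all assumptions made for illustration only.

```python
import numpy as np

def embed(images, weights):
    """Shared branch: the same weights map every image to an embedding."""
    return np.tanh(images @ weights)

def head_score(a, b, head_weights):
    """Head: similarity in (0, 1) computed from symmetric pair features."""
    features = np.concatenate([a * b, np.abs(a - b)], axis=-1)
    return 1.0 / (1.0 + np.exp(-(features @ head_weights)))

def hardest_negatives(scores, labels):
    """For each image, index of the best-scoring image with a different label."""
    different = labels[:, None] != labels[None, :]
    masked = np.where(different, scores, -np.inf)
    return masked.argmax(axis=1)

# Toy data (hypothetical): 6 flattened "images", 3 identities, 2 images each.
rng = np.random.default_rng(0)
images = rng.normal(size=(6, 8))
labels = np.array([0, 0, 1, 1, 2, 2])
weights = rng.normal(size=(8, 4))      # stand-in for the ConvNet branch
head_weights = rng.normal(size=8)      # stand-in for the head model

emb = embed(images, weights)
# Broadcast to score every pair (i, j) with the same head.
scores = head_score(emb[:, None, :], emb[None, :, :], head_weights)
neg_idx = hardest_negatives(scores, labels)
```

Here `neg_idx[i]` names the non-matching image that the current model finds most similar to image `i`, which is the kind of pair a harder-example selection step would feed back into training.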

Author information

Corresponding author

Correspondence to W. Wang.

Ethics declarations

The authors declare that they have no conflicts of interest.

About this article

Cite this article

Wang, W., Solovyev, R.A., Stempkovsky, A.L. et al. Method for Whale Re-identification Based on Siamese Nets and Adversarial Training. Opt. Mem. Neural Networks 29, 118–132 (2020). https://doi.org/10.3103/S1060992X20020058

  • DOI: https://doi.org/10.3103/S1060992X20020058
