
SSD based on contour–material level for domain adaptation

Theoretical advances · Pattern Analysis and Applications

Abstract

In object detection, domain transfer has become a topic of growing interest. Ideally, a model trained on a label-rich domain could be applied to label-poor or unlabeled domains, saving substantial annotation time and effort; in practice, however, the distributions of different domains rarely match, and this mismatch causes a sharp drop in performance after transfer. In this work, to improve object detection under domain transfer, we tackle the domain shift at two levels: (1) the contour level, covering appearance, shape, and size, and (2) the material level, covering texture, shade, and color. We apply a different alignment strategy to each level: full alignment for contour-level adaptation and selective alignment for material-level adaptation. We build our domain adaptation framework on the single shot multibox detector (SSD), a recent state-of-the-art model that stands out among competing detectors for its real-time speed and effectiveness. We design two domain adapters, one at the contour level and one at the material level, to alleviate the domain discrepancy. Since aligning the feature distributions of source and target images with an adversarial loss has proven effective, each adapter is implemented as a domain classifier trained in an adversarial manner, and the classifiers at the two levels are further coupled by a consistency regularization within the SSD model. We empirically verify the effectiveness of our method, which outperforms three other state-of-the-art methods by a large margin of 5–10% in mean average precision (mAP) on various datasets, in both similar and dissimilar domain shift scenarios.
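The abstract describes the two adapters only at a high level, so the following is a minimal PyTorch sketch of the generic mechanism it outlines: a gradient-reversal layer feeding a small domain classifier at each feature level, with adversarial losses and a consistency term that encourages the contour-level and material-level classifiers to agree. All module names, layer sizes, and the exact form of the consistency loss are illustrative assumptions, not the authors' implementation.

```python
# Sketch of adversarial domain adapters with consistency regularization.
# Hypothetical names and shapes; this is NOT the paper's released code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GradientReversal(torch.autograd.Function):
    """Identity on the forward pass; negates (and scales) the gradient on
    the backward pass, so the backbone learns to fool the domain classifier
    while the classifier itself is trained normally."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None


class DomainClassifier(nn.Module):
    """Small fully convolutional domain classifier that outputs a
    per-location source-vs-target logit map."""

    def __init__(self, in_channels):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 256, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 1, kernel_size=1),
        )

    def forward(self, feat, lambd=1.0):
        return self.net(GradientReversal.apply(feat, lambd))


def adaptation_losses(contour_feat, material_feat, domain_label,
                      contour_clf, material_clf):
    """Adversarial losses for both adapters plus a consistency term.
    domain_label: 0.0 for source images, 1.0 for target images."""
    c_logits = contour_clf(contour_feat)    # contour-level adapter
    m_logits = material_clf(material_feat)  # material-level adapter

    adv = (F.binary_cross_entropy_with_logits(
               c_logits, torch.full_like(c_logits, domain_label))
           + F.binary_cross_entropy_with_logits(
               m_logits, torch.full_like(m_logits, domain_label)))

    # Consistency regularization: the image-level domain probabilities
    # predicted at the two levels should agree.
    cons = F.mse_loss(torch.sigmoid(c_logits).mean(),
                      torch.sigmoid(m_logits).mean())
    return adv, cons
```

In training, these two terms would be added, with weighting coefficients, to the standard SSD localization and classification losses; only labeled source images contribute to the detection loss, while both source and target images pass through the adapters.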



Author information


Corresponding author

Correspondence to Ning Jiang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Jiang, N., Fang, J., Xu, J. et al. SSD based on contour–material level for domain adaptation. Pattern Anal Applic 24, 1221–1229 (2021). https://doi.org/10.1007/s10044-021-00986-w

