
Self-supervised deep subspace clustering network for faces in videos

  • Original article
  • The Visual Computer

Abstract

Video face clustering is a challenging task with wide applications. Unlike ordinary image clustering, faces in videos usually appear as a series of tracks, which provide prior knowledge. Specifically, faces from the same track are considered to be the same person, while faces from different tracks that appear in the same frame are considered to be different people. Based on this prior knowledge, we propose the self-supervised deep subspace clustering network (SDSCN). SDSCN adopts an autoencoder to nonlinearly map the faces into a latent space and adds a fully connected layer between the encoder and decoder to exploit the self-expressiveness property. The prior knowledge is automatically incorporated into the loss function to guide training. We further propose an efficient training strategy for our network and clustering. Experiments on two public datasets (BBT0101 and Notting-Hill) demonstrate the advantages of our method. Specifically, our method achieves about 3–17% improvement in clustering accuracy on BBT0101 and about 6–23% improvement on Notting-Hill compared to state-of-the-art methods.
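The track prior and the self-expressiveness idea can be illustrated with a minimal NumPy sketch. It is not the SDSCN implementation: the abstract does not give the loss, so the function names (`track_constraints`, `self_expressive_loss`), the weights `lam_reg` and `lam_prior`, and the absolute-value penalty on cannot-link pairs are all assumptions made for illustration. The sketch also assumes that two tracks sharing any frame make all of their faces mutually cannot-link.

```python
import numpy as np

def track_constraints(track_ids, frame_ids):
    """Derive pairwise constraints from track priors (illustrative only).

    must[i, j]   is True if faces i and j lie in the same track.
    cannot[i, j] is True if faces i and j lie in different tracks that
                 share at least one frame (assumed reading of the prior).
    """
    track_ids = np.asarray(track_ids)
    frame_ids = np.asarray(frame_ids)
    n = len(track_ids)

    # Same track => same person (excluding the trivial pair i == i).
    must = (track_ids[:, None] == track_ids[None, :]) & ~np.eye(n, dtype=bool)

    # Track-by-frame occupancy, then track-by-track co-occurrence.
    tracks, t_idx = np.unique(track_ids, return_inverse=True)
    frames, f_idx = np.unique(frame_ids, return_inverse=True)
    occupancy = np.zeros((len(tracks), len(frames)), dtype=int)
    occupancy[t_idx, f_idx] = 1
    cooccur = (occupancy @ occupancy.T) > 0
    np.fill_diagonal(cooccur, False)

    # Lift the track-level relation to individual faces.
    cannot = cooccur[t_idx][:, t_idx]
    return must, cannot

def self_expressive_loss(Z, C, cannot, lam_reg=1.0, lam_prior=1.0):
    """Hypothetical loss: each latent code is rebuilt from the others.

    Z: (n, d) latent codes; C: (n, n) self-expression coefficients.
    The prior term (an assumption) penalises coefficients that link
    faces known to be different people.
    """
    C = C - np.diag(np.diag(C))              # rule out the trivial C = I
    recon = 0.5 * np.sum((Z - C @ Z) ** 2)   # self-expressiveness error
    reg = np.sum(C ** 2)                     # coefficient regulariser
    prior = np.sum(np.abs(C)[cannot])        # cannot-link penalty
    return recon + lam_reg * reg + lam_prior * prior

# Toy example: tracks 0 and 1 overlap in frame 2; track 2 is isolated.
track_ids = [0, 0, 1, 1, 2]
frame_ids = [1, 2, 2, 3, 9]
must, cannot = track_constraints(track_ids, frame_ids)
```

In a deep subspace clustering setting, `C` would be the weights of the fully connected layer between encoder and decoder, and an affinity matrix such as `|C| + |C|.T` would typically be passed to spectral clustering.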



Acknowledgements

The work was supported by Zhejiang Provincial Natural Science Foundation of China under Grant No. LY18F020034, and Natural Science Foundation of China under Grant No. 61801428.

Author information

Corresponding author

Correspondence to Pengyi Hao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Yunhao Qiu and Pengyi Hao have contributed equally.

About this article

Cite this article

Qiu, Y., Hao, P. Self-supervised deep subspace clustering network for faces in videos. Vis Comput 37, 2253–2261 (2021). https://doi.org/10.1007/s00371-020-01984-5

  • DOI: https://doi.org/10.1007/s00371-020-01984-5
