A novel approach based on fully connected weighted bipartite graph for zero-shot learning problems

  • Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

Zero-shot learning (ZSL) is a learning paradigm that aims to build a recognition model for the case where the training and testing classes are mutually exclusive. Recognizing mutually exclusive classes requires some kind of correlation between the training and testing classes. This paper proposes an inductive solution to the ZSL problem in two stages: (1) a supervised multiclass classifier is trained on the training set and then asked to assign each testing image to its nearest training class; (2) a mapping function, which maps a training class to a testing class, is used to obtain the final class for each testing image. The correlation between seen and unseen classes is obtained through this mapping function. We propose a graphical mapping function based on a fully connected bipartite graph between the training and testing classes. Each edge of the bipartite graph is assigned a weight calculated by exploiting the semantic space. The proposed model is evaluated on three well-known ZSL datasets, AWA2, CUB, and aPY, and obtains mean accuracies of 66.59%, 48.95%, and 32.91%, respectively. The corresponding F1 scores are 0.675, 0.565, and 0.492 on AWA2, CUB, and aPY, respectively.
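The two-stage idea can be illustrated with a minimal sketch. The abstract does not state how the edge weights are derived from the semantic space, so the cosine similarity used here is an assumption, as are the toy attribute vectors, the class names, and the function names `build_bipartite_weights` and `map_to_unseen`. The stage-1 classifier is replaced by a hard-coded list of predicted seen classes.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def build_bipartite_weights(seen_attrs, unseen_attrs):
    """Fully connected bipartite graph: one weighted edge per (seen, unseen) class pair.

    Assumption: the weight is the cosine similarity of the class semantic vectors.
    """
    return {
        (s, u): cosine(s_vec, u_vec)
        for s, s_vec in seen_attrs.items()
        for u, u_vec in unseen_attrs.items()
    }

def map_to_unseen(seen_label, weights, unseen_labels):
    """Stage 2: follow the heaviest edge leaving the predicted seen class."""
    return max(unseen_labels, key=lambda u: weights[(seen_label, u)])

# Hypothetical attribute vectors: [has_stripes, has_hooves, lives_in_water]
seen_attrs = {
    "horse": [0.0, 1.0, 0.0],
    "tiger": [1.0, 0.0, 0.0],
    "dolphin": [0.0, 0.0, 1.0],
}
unseen_attrs = {
    "zebra": [1.0, 1.0, 0.0],
    "whale": [0.0, 0.0, 1.0],
}

weights = build_bipartite_weights(seen_attrs, unseen_attrs)

# Stand-in for stage 1: seen-class predictions for three test images.
stage1_predictions = ["horse", "dolphin", "tiger"]
final = [map_to_unseen(p, weights, list(unseen_attrs)) for p in stage1_predictions]
print(final)
```

In this sketch every test image assigned to the same seen class ends up in the same unseen class; the actual mapping function in the paper may be more elaborate.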

Author information

Corresponding author

Correspondence to P. K. Bhagat.

About this article

Cite this article

Bhagat, P.K., Choudhary, P. & Singh, K.M. A novel approach based on fully connected weighted bipartite graph for zero-shot learning problems. J Ambient Intell Human Comput 12, 8647–8662 (2021). https://doi.org/10.1007/s12652-020-02615-6

  • DOI: https://doi.org/10.1007/s12652-020-02615-6
