Abstract
Zero-shot learning (ZSL) is a learning paradigm that aims to build a recognition model for the setting in which the training and testing classes are mutually exclusive. Recognizing classes never seen during training requires some correlation between the training (seen) and testing (unseen) classes. This paper proposes an inductive solution to the ZSL problem in two stages: (1) a supervised multiclass classifier is trained on the training set and then used to assign each testing image to its nearest training class; (2) a mapping function, which maps training classes to testing classes, is used to obtain the final class for each testing image. The mapping function captures the correlation between seen and unseen classes. We propose a graphical mapping function based on a fully connected bipartite graph between training and testing classes, in which each edge is assigned a weight calculated by exploiting the semantic space. The proposed model is evaluated on three well-known ZSL datasets, AWA2, CUB, and aPY, and obtains mean accuracies of 66.59%, 48.95%, and 32.91%, respectively. The corresponding F1 scores are 0.675, 0.565, and 0.492.
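The two-stage idea can be illustrated with a minimal sketch. The abstract does not state how the edge weights are computed or how the mapping picks a testing class, so the code below makes two assumptions: edge weights are cosine similarities between class attribute vectors, and each seen class maps to the unseen class with the heaviest edge. The function names, the toy attribute vectors, and the `nearest_seen` array (standing in for the stage-one classifier output) are all hypothetical.

```python
import numpy as np


def bipartite_weights(seen_attrs, unseen_attrs):
    """Edge weights of a fully connected bipartite graph (seen x unseen).

    Assumption: weight = cosine similarity of class attribute vectors.
    """
    s = seen_attrs / np.linalg.norm(seen_attrs, axis=1, keepdims=True)
    u = unseen_attrs / np.linalg.norm(unseen_attrs, axis=1, keepdims=True)
    return s @ u.T


def map_seen_to_unseen(weights):
    """Assumption: each seen class maps to its highest-weight unseen class."""
    return weights.argmax(axis=1)


def predict_unseen(nearest_seen, weights):
    """Stage 2: turn stage-one seen-class predictions into unseen-class labels."""
    return map_seen_to_unseen(weights)[nearest_seen]


# Toy semantic space: 3 seen classes and 2 unseen classes, 3 attributes each.
seen_attrs = np.array([[1.0, 0.0, 1.0],
                       [0.0, 1.0, 0.0],
                       [1.0, 1.0, 0.0]])
unseen_attrs = np.array([[1.0, 0.0, 0.9],
                         [0.1, 1.0, 0.0]])

weights = bipartite_weights(seen_attrs, unseen_attrs)

# Stand-in for stage 1: the nearest seen class chosen for four testing images.
nearest_seen = np.array([0, 1, 2, 0])

predictions = predict_unseen(nearest_seen, weights)
print(predictions)  # [0 1 1 0]
```

In this toy setting the first and last testing images land on seen class 0, whose attributes are closest to unseen class 0, while the other two are routed to unseen class 1. Any other weighting of the semantic space would plug into `bipartite_weights` without changing the rest of the sketch.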
About this article
Cite this article
Bhagat, P.K., Choudhary, P. & Singh, K.M. A novel approach based on fully connected weighted bipartite graph for zero-shot learning problems. J Ambient Intell Human Comput 12, 8647–8662 (2021). https://doi.org/10.1007/s12652-020-02615-6
DOI: https://doi.org/10.1007/s12652-020-02615-6