An innovative multi-label learning based algorithm for city data computing

GeoInformatica

Abstract

Investigating the correlation between example features and example labels is essential to solving classification problems. However, identifying and computing this correlation can be difficult for high-dimensional multi-label data. Feature embedding and label embedding have both been developed to tackle this challenge, and existing embedding methods usually learn a single shared subspace for labels and features so that the dimension of both is reduced simultaneously. By contrast, this paper proposes learning separate subspaces for features and labels by maximizing the independence between the components within each subspace while also maximizing the correlation between the two subspaces. The learned independent label components indicate the fundamental combinations of labels in multi-label datasets, which helps to reveal the correlation between labels. Furthermore, the learned independent feature components lead to a compact representation of example features. The connections between the proposed algorithm and existing embedding methods are discussed in detail. Experimental results on real-world multi-label datasets demonstrate the need to explore independent components in multi-label data and the effectiveness of the proposed algorithm.
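The idea of separate, internally independent subspaces that are then related by correlation can be illustrated with a small sketch. The code below is not the paper's algorithm: the abstract describes a joint objective, whereas this sketch assumes a simpler two-stage approximation (a FastICA-style fixed-point iteration on features and on labels separately, followed by canonical correlations between the two sets of components). The function names `whiten`, `fast_ica`, and `canonical_correlations`, the synthetic data, and all sizes are hypothetical choices made for illustration only.

```python
import numpy as np

def whiten(Z, k):
    """Center Z (n x d) and keep its top-k principal directions with unit variance."""
    Zc = Z - Z.mean(axis=0)
    U, _, _ = np.linalg.svd(Zc, full_matrices=False)
    # Columns of U are orthonormal, so scaling by sqrt(n) gives identity covariance.
    return U[:, :k] * np.sqrt(Z.shape[0])

def fast_ica(Zw, n_iter=200, seed=0):
    """Symmetric FastICA with a tanh nonlinearity on whitened data Zw (n x k)."""
    n, k = Zw.shape
    rng = np.random.default_rng(seed)
    W, _ = np.linalg.qr(rng.standard_normal((k, k)))  # random orthogonal start
    for _ in range(n_iter):
        G = np.tanh(Zw @ W.T)
        # Fixed-point update per row w: E[z g(w^T z)] - E[g'(w^T z)] w
        W_new = (G.T @ Zw) / n - (1.0 - G**2).mean(axis=0)[:, None] * W
        # Symmetric decorrelation keeps the rows of W orthonormal.
        U, _, Vt = np.linalg.svd(W_new)
        W = U @ Vt
    return Zw @ W.T

def canonical_correlations(A, B):
    """For zero-mean, identity-covariance A and B, canonical correlations
    are the singular values of the cross-covariance matrix."""
    return np.linalg.svd(A.T @ B / A.shape[0], compute_uv=False)

# Synthetic multi-label data (an assumption): features X and binary labels Y
# are both driven by the same k non-Gaussian latent sources H.
rng = np.random.default_rng(0)
n, k = 500, 3
H = rng.laplace(size=(n, k))
X = H @ rng.standard_normal((k, 20)) + 0.1 * rng.standard_normal((n, 20))
Y = (H @ rng.standard_normal((k, 8)) + 0.5 * rng.standard_normal((n, 8)) > 0).astype(float)

feat_comp = fast_ica(whiten(X, k))    # independent feature components
label_comp = fast_ica(whiten(Y, k))   # independent label components
rho = canonical_correlations(feat_comp, label_comp)
print("canonical correlations:", rho)
```

Because each set of components is whitened before the rotation, the correlation step reduces to a single SVD of the cross-covariance. A joint method of the kind the abstract describes would instead trade off independence and cross-subspace correlation within one objective.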

Notes

  1. http://mulan.sourceforge.net/datasets-mlc.html

  2. http://lear.inrialpes.fr/people/guillaumin/data.php

Acknowledgements

This work was supported in part by the Australian Research Council under Project DE180101438.

Author information

Corresponding author

Correspondence to Fazhi He.

Additional information

Mengqing Mei and Yongjian Zhong contributed equally.

About this article

Cite this article

Mei, M., Zhong, Y., He, F. et al. An innovative multi-label learning based algorithm for city data computing. Geoinformatica 24, 221–245 (2020). https://doi.org/10.1007/s10707-019-00383-w

  • DOI: https://doi.org/10.1007/s10707-019-00383-w
