A multi-label text classification method via dynamic semantic representation model and deep neural network

Applied Intelligence

Abstract

The continual growth of new words and text categories calls for more accurate and robust classification methods. In this paper, we propose a novel multi-label text classification method that combines a dynamic semantic representation model with a deep neural network (DSRM-DNN). DSRM-DNN first uses a word embedding model and a clustering algorithm to select semantic words. The selected words are then designated as the elements of DSRM-DNN and quantified by a weighted combination of word attributes. Finally, we construct a text classifier by combining a deep belief network with a back-propagation neural network. During classification, low-frequency words and new words are re-expressed in terms of the existing semantic words under a sparsity constraint. We evaluate DSRM-DNN on RCV1-v2, Reuters-21578, EUR-Lex, and Bookmarks. Experimental results show that our method outperforms state-of-the-art methods on these datasets.
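As a rough illustration of two steps named in the abstract (selecting semantic words by clustering word embeddings, and re-expressing a new word through the existing semantic words under a sparsity constraint), the following is a minimal sketch and not the authors' implementation. The abstract does not name the clustering algorithm or the sparse solver, so plain k-means and ISTA (iterative soft-thresholding for an L1-penalised least-squares problem) are assumptions here, as are the function names, the random toy embeddings, and the parameter values.

```python
import numpy as np


def select_semantic_words(embeddings, k, n_iter=20, seed=0):
    """Hypothetical helper: toy k-means, returning the indices of the
    words that lie closest to each of the k cluster centroids."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(embeddings), size=k, replace=False)
    centroids = embeddings[start].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every word to every centroid.
        dists = ((embeddings[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = embeddings[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    dists = ((embeddings[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    # One representative word per centroid; duplicates are merged.
    return np.unique(dists.argmin(axis=0))


def sparse_reexpress(new_vec, semantic_vecs, lam=0.1, n_iter=500):
    """Hypothetical helper: ISTA for
    min_a 0.5 * ||new_vec - D @ a||^2 + lam * ||a||_1,
    where the columns of D are the semantic word vectors."""
    D = semantic_vecs.T  # shape (dim, n_semantic_words)
    # Step size 1/L, with L the squared spectral norm of D.
    step = 1.0 / (np.linalg.norm(D, 2) ** 2 + 1e-12)
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - new_vec)
        z = a - step * grad
        # Soft-thresholding drives small coefficients to exactly zero.
        a = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return a


# Toy usage with random vectors standing in for real word embeddings.
rng = np.random.default_rng(1)
toy_embeddings = rng.normal(size=(30, 8))
semantic_idx = select_semantic_words(toy_embeddings, k=5)
semantic_vecs = toy_embeddings[semantic_idx]
new_word = 0.7 * semantic_vecs[0]  # pretend this is an unseen word
coeffs = sparse_reexpress(new_word, semantic_vecs, lam=0.01)
```

Here `coeffs` holds one weight per selected semantic word; a larger `lam` pushes more of those weights to exactly zero, which is one way to read the "sparse constraint" mentioned in the abstract.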

Acknowledgements

This work is supported by the National Natural Science Foundation of China (No. 61702310), the Major Fundamental Research Project of Shandong, China (No. ZR2019ZD03), and the Taishan Scholar Project of Shandong, China.

Author information

Corresponding author

Correspondence to Li Liu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, T., Liu, L., Liu, N. et al. A multi-label text classification method via dynamic semantic representation model and deep neural network. Appl Intell 50, 2339–2351 (2020). https://doi.org/10.1007/s10489-020-01680-w

  • DOI: https://doi.org/10.1007/s10489-020-01680-w
