Abstract
The growing number of new words and text categories calls for more accurate and robust classification methods. In this paper, we propose a novel multi-label text classification method that combines a dynamic semantic representation model with a deep neural network (DSRM-DNN). DSRM-DNN first uses a word embedding model and a clustering algorithm to select semantic words. The selected words are then designated as the elements of DSRM-DNN and quantified by a weighted combination of word attributes. Finally, we construct a text classifier by combining a deep belief network with a back-propagation neural network. During classification, low-frequency words and new words are re-expressed by the existing semantic words under a sparse constraint. We evaluate DSRM-DNN on RCV1-v2, Reuters-21578, EUR-Lex, and Bookmarks. Experimental results show that our method outperforms state-of-the-art methods.
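The two representation steps named in the abstract (selecting semantic words by clustering word embeddings, and re-expressing a new word through the selected words under a sparsity limit) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy 2-D embeddings, the helper names (`kmeans`, `select_semantic_words`, `sparse_code`), the choice of k-means with farthest-first initialization, and the use of greedy matching pursuit as the sparse solver are all assumptions made for illustration only.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iters=10):
    """Plain k-means with deterministic farthest-first initialization (assumed choice)."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids)))
    labels = []
    for _ in range(iters):
        # Assign every vector to its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels

def select_semantic_words(embeddings, k):
    """Pick, for each cluster, the word closest to the cluster centroid."""
    words = list(embeddings)
    vectors = [embeddings[w] for w in words]
    centroids, labels = kmeans(vectors, k)
    selected = []
    for j in range(k):
        members = [w for w, lab in zip(words, labels) if lab == j]
        if members:
            selected.append(min(members, key=lambda w: sq_dist(embeddings[w], centroids[j])))
    return selected

def sparse_code(target, atoms, max_atoms=1):
    """Greedy matching pursuit: approximate `target` with at most `max_atoms` atoms."""
    residual = list(target)
    coeffs = {}
    for _ in range(max_atoms):
        # Choose the atom most correlated with the current residual.
        best = max(atoms, key=lambda w: abs(dot(residual, atoms[w])) / math.sqrt(dot(atoms[w], atoms[w])))
        c = dot(residual, atoms[best]) / dot(atoms[best], atoms[best])
        coeffs[best] = coeffs.get(best, 0.0) + c
        residual = [r - c * a for r, a in zip(residual, atoms[best])]
    return coeffs

# Hypothetical toy embeddings: a finance-like group and a sports-like group.
embeddings = {
    "bank": (1.0, 0.1), "loan": (0.9, 0.2), "credit": (0.95, 0.0),
    "goal": (0.1, 1.0), "match": (0.2, 0.9), "team": (0.0, 0.95),
}
semantic_words = select_semantic_words(embeddings, k=2)
atoms = {w: embeddings[w] for w in semantic_words}

# A hypothetical new word is expressed through the selected semantic words only.
code = sparse_code((0.8, 0.15), atoms, max_atoms=1)
print(semantic_words, code)
```

On this toy data the sketch selects "bank" and "goal" as semantic words and represents the new word by "bank" alone, with a coefficient of about 0.81. Raising `max_atoms` relaxes the sparsity limit and lets more semantic words contribute to the representation.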
Acknowledgements
This work is supported by the National Natural Science Foundation of China (No. 61702310), the Major Fundamental Research Project of Shandong, China (No. ZR2019ZD03), and the Taishan Scholar Project of Shandong, China.
About this article
Cite this article
Wang, T., Liu, L., Liu, N. et al. A multi-label text classification method via dynamic semantic representation model and deep neural network. Appl Intell 50, 2339–2351 (2020). https://doi.org/10.1007/s10489-020-01680-w
DOI: https://doi.org/10.1007/s10489-020-01680-w