Spoken Language Identification Based on Particle Swarm Optimisation–Extreme Learning Machine Approach

Circuits, Systems, and Signal Processing

Abstract

Spoken language identification (LID) is the process of determining and classifying the natural language of given content and data. The first step is feature extraction, a mature stage for which standard LID features have already been developed using MFCC, SDC, GMM and the i-vector-based framework. However, the learning process still needs to be optimised so that the knowledge embedded in the extracted features is captured comprehensively. The extreme learning machine (ELM) is an effective learning model for classification and regression analysis that trains a single-hidden-layer neural network. Its learning process is nevertheless not fully optimised, because the weights between the input and hidden layers are selected at random. This study employs ELM as the LID learning model, based on the standard extracted features. The enhanced self-adjusting extreme learning machine (ESA–ELM), one of the optimisation techniques for ELM, is chosen as the benchmark and is improved by replacing its optimisation approach (EATLBO) with particle swarm optimisation (PSO) in order to achieve higher performance. The improved ESA–ELM is named particle swarm optimisation–extreme learning machine (PSO–ELM). Results on the same benchmark LID data set, which covers eight languages, show that the PSO–ELM LID outperforms the ESA–ELM LID, with an accuracy of 98.75% compared with 96.25%.
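The general idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (their source code is not public) and it does not reproduce ESA–ELM or EATLBO. It only shows a plain ELM, whose input weights are random and whose output weights are solved in closed form, and a basic global-best PSO that searches over those input weights. The function names, the hyperparameters (`n_hidden`, `n_particles`, `w`, `c1`, `c2`), the use of validation accuracy as the fitness, the clipping of weights to [-1, 1], and the synthetic Gaussian data standing in for LID features are all assumptions made for illustration.

```python
import numpy as np

def elm_hidden(X, W, b):
    """Sigmoid hidden-layer output of a single-hidden-layer network."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def elm_fit(X, T, W, b):
    """Output weights in closed form via the Moore-Penrose pseudo-inverse."""
    return np.linalg.pinv(elm_hidden(X, W, b)) @ T

def elm_predict(X, W, b, beta):
    return np.argmax(elm_hidden(X, W, b) @ beta, axis=1)

def accuracy(X, y, W, b, beta):
    return float(np.mean(elm_predict(X, W, b, beta) == y))

def pso_elm(X_tr, y_tr, X_val, y_val, n_hidden=20, n_particles=15,
            n_iter=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO over ELM input weights and biases (illustrative only)."""
    rng = np.random.default_rng(seed)
    n_in, n_cls = X_tr.shape[1], int(y_tr.max()) + 1
    T = np.eye(n_cls)[y_tr]                    # one-hot targets
    dim = n_in * n_hidden + n_hidden           # input weights + hidden biases

    def decode(p):
        return p[:n_in * n_hidden].reshape(n_in, n_hidden), p[n_in * n_hidden:]

    def fitness(p):                            # assumed fitness: validation accuracy
        W, b = decode(p)
        return accuracy(X_val, y_val, W, b, elm_fit(X_tr, T, W, b))

    # Each particle encodes the random input weights of one candidate ELM.
    pos = rng.uniform(-1, 1, (n_particles, dim))
    vel = np.zeros_like(pos)
    pbest, pbest_fit = pos.copy(), np.array([fitness(p) for p in pos])
    baseline = pbest_fit[0]                    # plain ELM: one random draw
    g = int(np.argmax(pbest_fit))
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]

    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, -1, 1)
        fit = np.array([fitness(p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
        g = int(np.argmax(pbest_fit))
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]

    W, b = decode(gbest)
    return W, b, elm_fit(X_tr, T, W, b), baseline, gbest_fit

def make_toy_data(n_per_class=40, n_classes=3, n_features=5, seed=1):
    """Synthetic Gaussian clusters; a stand-in for real LID feature vectors."""
    rng = np.random.default_rng(seed)
    centres = rng.normal(0, 2, (n_classes, n_features))
    X = np.vstack([c + rng.normal(0, 1, (n_per_class, n_features)) for c in centres])
    y = np.repeat(np.arange(n_classes), n_per_class)
    idx = rng.permutation(len(y))
    return X[idx], y[idx]

X, y = make_toy_data()
W, b, beta, baseline_acc, best_acc = pso_elm(X[:80], y[:80], X[80:], y[80:])
print(f"random-weight ELM: {baseline_acc:.3f}, PSO-selected ELM: {best_acc:.3f}")
```

Because the swarm's initial best already includes the first random draw, the PSO-selected accuracy in this sketch can never fall below the plain-ELM baseline on the split used for fitness. That figure is optimistic, since the same split drives the search, so a fair comparison would report accuracy on a separate held-out test set.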

Availability of Data and Materials

The data are available on Figshare at https://doi.org/10.6084/m9.figshare.6015173.v1.

Acknowledgements

The Malaysian government funded this project under research code DCP-2017-013/6.

Author information

Corresponding author

Correspondence to Sabrina Tiun.

Ethics declarations

Conflict of interest

All the authors declare that they have no conflict of interest.

Code Availability

The source code is not yet publicly available since this project is still ongoing.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Albadr, M.A.A., Tiun, S. Spoken Language Identification Based on Particle Swarm Optimisation–Extreme Learning Machine Approach. Circuits Syst Signal Process 39, 4596–4622 (2020). https://doi.org/10.1007/s00034-020-01388-9

  • DOI: https://doi.org/10.1007/s00034-020-01388-9
