A Novel Neural Network-Based Approach to Classification of Implicit Emotional Components in Ordinary Speech

Abstract

A neural network-based approach to the classification of implicit emotional components in ordinary speech is considered. Mel-frequency cepstral coefficients (MFCCs) were used as feature vectors, and a multilayer perceptron with one hidden layer was used as the classifier. It is shown that the developed neural-network system classifies such speech with an accuracy of up to 99% and is not inferior to human experts. In addition, two approaches to training the model were proposed and tested, and the influence of the MFCC calculation parameters on the resulting accuracy was studied. The personalized approach, in which the classifier is trained for each subject individually, was found to yield higher classification accuracy than the generalized one, i.e., training on a mixed sample of multiple subjects. Optimal parameters for the MFCC calculation were found. The results demonstrate the high quality of the developed approach, which can be applied to the development of brain-computer interfaces based on inner speech pattern recognition; this will be addressed in further research.
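To make the processing chain concrete, the sketch below is a minimal, hypothetical reconstruction of the MFCC-plus-single-hidden-layer-MLP scheme summarized above, assuming librosa for feature extraction and scikit-learn's MLPClassifier as the perceptron. The paper does not disclose its software stack, and every parameter value here (13 coefficients, 64 hidden units, window and hop sizes) is an illustrative assumption, not the optimum reported in the study.

```python
# Minimal sketch of the MFCC + one-hidden-layer MLP pipeline outlined in the
# abstract. Assumptions of this sketch (not taken from the paper): librosa and
# scikit-learn as the toolchain, 13 MFCCs, 64 hidden units, a 2048/512-sample
# FFT window and hop, and time-averaging of coefficients per utterance.
import numpy as np
import librosa
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier


def mfcc_features(wav_path: str, n_mfcc: int = 13,
                  n_fft: int = 2048, hop_length: int = 512) -> np.ndarray:
    """Load one recording and collapse it to a fixed-length feature vector
    by averaging each MFCC over time."""
    signal, sr = librosa.load(wav_path, sr=None)  # keep native sample rate
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc,
                                n_fft=n_fft, hop_length=hop_length)
    return mfcc.mean(axis=1)  # shape (n_mfcc,)


# Synthetic stand-in features so the sketch runs end to end; in the real
# experiment X would be np.vstack([mfcc_features(p) for p in wav_paths])
# and y the emotion label of each utterance.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 13))
y = rng.integers(0, 2, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Multilayer perceptron with a single hidden layer, as in the paper;
# the layer width is an arbitrary choice for this illustration.
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=1000, random_state=0)
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```

Under the personalized scheme favored by the study, one such classifier would be fitted per subject on that subject's recordings alone; the generalized scheme would instead pool the utterances of all subjects into a single training sample.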

Funding

This research was supported by the Russian Science Foundation (RSF) within the framework of the project “Development of stimulus-unrelated brain-computer interface for disabled people rehabilitation” (no. 20-19-00627, 2020–2022).

Author information

Corresponding author

Correspondence to D. G. Shaposhnikov.

Ethics declarations

The authors declare that they have no conflict of interest.

About this article

Cite this article

Shepelev, I.E., Bakhtin, O.M., Lazurenko, D.M. et al. A Novel Neural Network-Based Approach to Classification of Implicit Emotional Components in Ordinary Speech. Opt. Mem. Neural Networks 30, 26–36 (2021). https://doi.org/10.3103/S1060992X21010057
