ISSN 0021-3454 (print version)
ISSN 2500-0381 (online version)

DOI 10.17586/0021-3454-2023-66-10-818-827

UDC 004.934.2

APPROACH TO AUTOMATIC RECOGNITION OF EMOTIONS IN SPEECH TRANSCRIPTIONS

A. A. Dvoynikova
St. Petersburg Federal Research Center of the RAS, Speech and Multimodal Interfaces Laboratory; Junior Researcher


K. O. Kondratenko
St. Petersburg State University, Department of Phonetics and Methods of Teaching Foreign Languages


Reference for citation: Dvoynikova A. A., Kondratenko K. K. Approach to automatic recognition of emotions in speech transcriptions. Journal of Instrument Engineering. 2023. Vol. 66, N 10. P. 818–827 (in Russian). DOI: 10.17586/0021-3454-2023-66-10-818-827.

Abstract. The paper addresses emotion recognition in speech transcriptions, a task relevant to many fields. It analyzes how preprocessing methods (stop word removal, lemmatization, stemming) affect the accuracy of emotion recognition in Russian and English text. The experiments use orthographic transcriptions of dialogues from two multimodal corpora: RAMAS (Russian) and CMU-MOSEI (English). Both corpora are annotated with the following emotions: joy, surprise, fear, anger, sadness, disgust, and neutral. Text preprocessing includes removal of punctuation marks and stop words, tokenization, lemmatization, and stemming. The preprocessed text is vectorized using the TF-IDF, BoW, and Word2Vec methods. The classifiers are support vector machines and logistic regression. The proposed approach is a combination of these methods. The best result, measured by the weighted F-measure, is 92.63% for Russian and 47.21% for English. Additional experiments examine how the number of removed stop words affects emotion recognition from text. The results show that keeping stop words in the source text yields the highest classification accuracy.
Keywords: emotion recognition, text data preprocessing methods, stop word removal, multiclass classification, text data analysis
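
The steps named in the abstract can be illustrated with a minimal sketch in plain Python. It covers only punctuation and stop word removal, whitespace tokenization, one common TF-IDF variant, and a support-weighted F-measure. Lemmatization, stemming, Word2Vec, and the classifiers are omitted. The stop word list, the function names, and the exact TF-IDF formula (unsmoothed, natural logarithm) are assumptions made for illustration and are not taken from the article.

```python
import math
import string
from collections import Counter

# Hypothetical toy stop word list; the list used in the article is not given here.
STOP_WORDS = {"the", "a", "is", "i", "am", "so", "of", "this"}


def preprocess(text, remove_stop_words=False):
    """Lowercase, strip ASCII punctuation, split on whitespace, optionally drop stop words."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    tokens = cleaned.split()
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    Libraries often use smoothed or normalized variants instead.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to each class's support."""
    total = 0.0
    for label, support in Counter(y_true).items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = support - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += support * f1
    return total / len(y_true)


if __name__ == "__main__":
    utterances = ["I am so happy!", "This is a sad day."]
    # Compare the token streams with and without stop word removal.
    for keep in (False, True):
        print([preprocess(u, remove_stop_words=keep) for u in utterances])
    print(tf_idf([preprocess(u) for u in utterances]))
    print(weighted_f1(["joy", "joy", "anger"], ["joy", "anger", "anger"]))
```

Switching `remove_stop_words` on and off gives the two preprocessing conditions whose effect on classification the abstract compares. A real experiment would feed the resulting vectors to a classifier, which this sketch does not do.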

Acknowledgement: The section “An approach to classifying text data by emotions” was supported by the Russian Science Foundation (project No. 22-11-00321). The rest of the research was supported in part by the leading scientific school of the Russian Federation (grant No. NSh-17.2022.1.6) and by the budget theme of the St. Petersburg Federal Research Center of the Russian Academy of Sciences (No. FFZF-2022-0005).
