ISSN 0021-3454 (print version)
ISSN 2500-0381 (online version)

vol 68 / January, 2025

DOI 10.17586/0021-3454-2023-66-10-818-827

UDC 004.934.2


A. A. Dvoynikova
St. Petersburg Federal Research Center of the RAS, Speech and Multimodal Interfaces Laboratory; Junior Researcher

K. O. Kondratenko
St. Petersburg State University, Department of Phonetics and Methods of Teaching Foreign Languages

Reference for citation: Dvoynikova A. A., Kondratenko K. O. Approach to automatic recognition of emotions in speech transcriptions. Journal of Instrument Engineering. 2023. Vol. 66, N 10. P. 818–827 (in Russian). DOI: 10.17586/0021-3454-2023-66-10-818-827.

Abstract. This paper studies emotion recognition in speech transcriptions, a task relevant to many application fields. It analyzes how preprocessing methods (stop word removal, lemmatization, stemming) affect the accuracy of emotion recognition in Russian and English text data. The experiments use orthographic transcriptions of dialogues from the multimodal corpora RAMAS (Russian) and CMU-MOSEI (English). These corpora are annotated for the following emotions: joy, surprise, fear, anger, sadness, disgust, and neutral. Text preprocessing includes removal of punctuation marks and stop words, tokenization, lemmatization, and stemming. The resulting material is vectorized using the TF-IDF, BoW, and Word2Vec methods. The classifiers are support vector machines and logistic regression. The proposed approach combines the above methods. The highest emotion recognition performance, measured by the weighted F-measure, is 92.63 % for Russian and 47.21 % for English. In addition, experiments are conducted to determine how many stop words should be removed for effective emotion recognition from text data. The experimental results show that retaining stop words in the source text yields the highest text classification accuracy.
Keywords: emotion recognition, text data preprocessing methods, stop word removal, multiclass classification, text data analysis

Acknowledgement: This work was supported by the Russian Science Foundation (the section “An approach to classifying text data by emotions” was carried out within the framework of project No. 22-11-00321); the rest of the research was carried out partially within the framework of the leading scientific school of the Russian Federation (grant No. NSh-17.2022.1.6) and the budget theme of the St. Petersburg Federal Research Center of the Russian Academy of Sciences (No. FFZF-2022-0005).
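
The preprocessing, TF-IDF vectorization, and weighted F-measure steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The stop word list, the function names, and the smoothed IDF formula are assumptions made for illustration. Lemmatization, stemming, Word2Vec, and the classifiers (support vector machines, logistic regression) are omitted because they depend on external language resources and libraries; a classifier would take the vectors produced here as input.

```python
import math
import string
from collections import Counter

# Hypothetical toy stop word list; the list used in the paper is not given here.
STOP_WORDS = {"the", "a", "an", "is", "i", "am", "so", "this"}


def preprocess(text, remove_stop_words=False):
    """Lowercase, strip ASCII punctuation, split on whitespace,
    and optionally drop stop words."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    tokens = cleaned.split()
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens


def tfidf_vectors(token_lists):
    """Return (vocabulary, vectors) using raw term counts times a smoothed IDF.
    The smoothing form log((1 + N) / (1 + df)) + 1 is an assumed variant."""
    vocab = sorted({t for tokens in token_lists for t in tokens})
    n_docs = len(token_lists)
    # Document frequency: number of documents containing each term.
    df = Counter(t for tokens in token_lists for t in set(tokens))
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append([counts[t] * idf[t] for t in vocab])
    return vocab, vectors


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1
    return total / len(y_true)
```

Calling `preprocess` with `remove_stop_words=False` keeps every token, which corresponds to the setting the abstract reports as giving the highest classification accuracy. The weighted F-measure averages the per-class F1 scores in proportion to class frequency, so frequent classes contribute more to the final score than rare ones.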

