ISSN 0021-3454 (print version)
ISSN 2500-0381 (online version)

Contents
Vol. 67, No. 11 / November 2024
ARTICLE

METHOD OF CREATING MULTIMODAL DATABASES FOR AUDIOVISUAL ANALYSIS OF ENGAGEMENT AND EMOTIONS OF VIRTUAL COMMUNICATION PARTICIPANTS


For citation: Dvoynikova A. A., Karpov A. A. Method of creating multimodal databases for audiovisual analysis of engagement and emotions of virtual communication participants. Journal of Instrument Engineering. 2024. Vol. 67, N 11. P. 984–993 (in Russian). DOI: 10.17586/0021-3454-2024-67-11-984-993.

Abstract. A method is presented for creating multimodal databases designed to analyze behavioral manifestations of virtual communication participants. The proposed methodology is aimed at developing databases of group communication (more than two interlocutors) conducted via teleconferencing systems. The technique also accounts for the natural manifestations of behavioral aspects (engagement and emotions) of the conversation participants. These features constitute the novelty of the proposed technique, which consists of three main stages: data preparation, recording, and annotation. The technique was tested and validated in creating a new multimodal data corpus, ENERGI, containing Russian-language audiovisual recordings of group communication via teleconferencing systems. The corpus is designed for recognizing participants' engagement in communication and for analyzing the expression of emotions during dialogue. The proposed technique is universal and can be applied to collecting a variety of virtual communication corpora.
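The three-stage pipeline described in the abstract ends with annotating each recording segment with engagement and emotion labels. As a minimal sketch only (the paper's actual annotation schema, field names, scales, and label sets are not given here, so everything below is an assumption for illustration), one time-aligned annotation record for such a corpus could be structured like this:

```python
from dataclasses import dataclass, asdict
import json

# Hypothetical annotation record for one time-aligned segment of a
# teleconference recording. All field names, scales, and label sets are
# assumptions for illustration, not the actual schema of the ENERGI corpus.
@dataclass
class SegmentAnnotation:
    session_id: str      # teleconference session identifier
    participant_id: str  # anonymized participant identifier
    start_s: float       # segment start, seconds from session start
    end_s: float         # segment end, seconds from session start
    engagement: int      # e.g. an ordinal scale: 0 (disengaged) .. 3 (highly engaged)
    emotion: str         # e.g. a categorical label such as "neutral" or "happiness"

# Serialize one annotated segment to JSON for corpus storage.
segment = SegmentAnnotation("sess_001", "spk_03", 12.40, 17.85, 2, "happiness")
print(json.dumps(asdict(segment), ensure_ascii=False, indent=2))
```

A flat per-segment record of this kind keeps each participant's labels time-aligned with the audio and video tracks, which is the form that engagement- and emotion-recognition models typically consume.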
Keywords: methodology for database creation, multimodal database, engagement analysis, emotion analysis, data annotation, virtual communication

Acknowledgments: The work was carried out within the framework of budget topic No. FFZF-2022-0005.

References:
  1. Tkachenya A.V., Davydov A.G., Kiselev V.V., Khitrov M.V. Journal of Instrument Engineering, 2013, no. 2(56), pp. 61–66. (in Russ.)
  2. Cafaro A., Wagner J., Baur T., Dermouche S., Torres Torres M. et al. Proc. of the 19th ACM Intern. Conf. on Multimodal Interaction, 2017, pp. 350–359, DOI: 10.1145/3136755.313678.
  3. Guhan P., Agarwal M., Awasthi N., Reeves G., Manocha D. et al. arXiv preprint arXiv:2011.08690, 2020.
  4. Celiktutan O., Skordos E., Gunes H. IEEE Transactions on Affective Computing, 2017, no. 4(10), pp. 484–497, DOI: 10.1109/TAFFC.2017.2737019.
  5. Ringeval F., Sonderegger A., Sauer J., Lalanne D. Proc. of the 10th IEEE Intern. Conf. and Workshops on Automatic Face and Gesture Recognition, 2013, pp. 1–8, DOI: 10.1109/FG.2013.6553805.
  6. Kaur A., Mustafa A., Mehta L., Dhall A. 2018 Digital Image Computing: Techniques and Applications (DICTA), 2018, pp. 1–8, DOI: 10.1109/DICTA.2018.8615851.
  7. Gupta A., D'Cunha A., Awasthi K., Balasubramanian V. arXiv preprint arXiv:1609.01885, 2016.
  8. Sümer Ö., Goldberg P., D’Mello S., Gerjets P., Trautwein U., Kasneci E. IEEE Transactions on Affective Computing, 2021, no. 2(14), pp. 1012–1027, DOI: 10.1109/TAFFC.2021.3127692.
  9. Whitehill J., Serpell Z., Lin Y.C., Foster A., Movellan J.R. IEEE Transactions on Affective Computing, 2014, no. 1(5), pp. 86–98, DOI: 10.1109/TAFFC.2014.2316163.
  10. Psaltis A., Apostolakis K. C., Dimitropoulos K., Daras P. IEEE Transactions on Games, 2017, no. 3(10), pp. 292–303, DOI: 10.1109/TCIAIG.2017.2743341.
  11. Dvoynikova A.A., Kagirov I.A., Karpov A.A. Information and Control Systems, 2022, no. 5(120), pp. 12–22, DOI: 10.31799/1684-8853-2022-5-12-22. (in Russ.)
  12. Dvoynikova A.A., Markitantov M.V., Ryumina E.V., Uzdyaev M.Yu., Velichko A.N. et al. Informatics and Automation, 2022, no. 6(21), pp. 1097–1144, DOI: 10.15622/ia.21.6.2 (in Russ.)
  13. Dhall A., Goecke R., Gedeon T. Journal of LaTeX Class Files, 2007, no. 1(6).
  14. Kollias D., Zafeiriou S. arXiv preprint arXiv:1811.07770, 2018.
  15. Busso C., Bulut M., Lee C.C., Kazemzadeh A., Mower E. et al. Language Resources and Evaluation, 2008, no. 4(42), pp. 335–359, DOI: 10.1007/s10579-008-9076-6.
  16. Poria S., Hazarika D., Majumder N., Naik G., Cambria E. et al. Proc. of the 57th Annual Meeting of the Association for Computational Linguistics, 2019, pp. 527–536.
  17. Zadeh A.B., Liang P.P., Poria S., Cambria E., Morency L.P. Proc. of the 56th Annual Meeting of the Association for Computational Linguistics, 2018, pp. 2236–2246, DOI: 10.18653/v1/P18-1208.
  18. Perepelkina O., Kazimirova E., Konstantinova M. Proc. of the Intern. Conf. on Speech and Computer, 2018, pp. 501–510, DOI: 10.1007/978-3-319-99579-3_52.
  19. Jones S.R.G. American Journal of Sociology, 1992, no. 3(98), pp. 451–468.
  20. Viola P., Jones M. Proc. of the 2001 IEEE Computer Society Conf. on Computer Vision and Pattern Recognition (CVPR), 2001, vol. 1, pp. I-I, DOI: 10.1109/CVPR.2001.990517.
  21. Patent USA 3069654, Method and Means for Recognizing Complex Patterns, P.V.C. Hough, Priority 1962.
  22. Lausberg H., Sloetjes H. Behavior Research Methods, 2009, no. 3(41), pp. 841–849, DOI: 10.3758/BRM.41.3.841.
  23. Lyusin D.V. Psychological Diagnostics, 2006, vol. 4, pp. 3–22. (in Russ.)
  24. Lyusin D.V., Ovsyannikova V.V. Psychological Journal, 2013, no. 6(34), pp. 82–94. (in Russ.)
  25. Certificate of registration of the database 2023624954, Baza dannykh proyavleniy vovlechennosti i emotsiy russkoyazychnykh uchastnikov telekonferentsiy (ENERGI — ENgagement and Emotion Russian Gathering Interlocutors) (Database of Manifestations of Engagement and Emotions of Russian-Speaking Participants in Teleconferences (ENERGI - ENgagement and Emotion Russian Gathering Interlocutors)), A.A. Dvoynikova, A.A. Karpov, Priority 25.12.2023. (in Russ.)