<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">pribor</journal-id><journal-title-group><journal-title xml:lang="ru">Известия высших учебных заведений. Приборостроение</journal-title><trans-title-group xml:lang="en"><trans-title>Journal of Instrument Engineering</trans-title></trans-title-group></journal-title-group><issn pub-type="ppub">0021-3454</issn><issn pub-type="epub">2500-0381</issn><publisher><publisher-name>Национальный исследовательский университет ИТМО</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.17586/0021-3454-2025-68-12-1011-1019</article-id><article-id custom-type="elpub" pub-id-type="custom">pribor-436</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>ИНФОРМАТИКА И ИНФОРМАЦИОННЫЕ ПРОЦЕССЫ</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="en"><subject>INFORMATICS AND INFORMATION PROCESSES</subject></subj-group></article-categories><title-group><article-title>Многомодальный корпус данных взаимодействия участников виртуальной коммуникации ENERGI</article-title><trans-title-group xml:lang="en"><trans-title>ENERGI: a multimodal data corpus of interaction of participants in virtual communication</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Двойникова</surname><given-names>А. А.</given-names></name><name name-style="western" xml:lang="en"><surname>Dvoynikova</surname><given-names>A. 
A.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Анастасия Александровна Двойникова - лаборатория речевых и многомодальных интерфейсов, младший научный сотрудник</p><p>Санкт-Петербург</p></bio><bio xml:lang="en"><p>Anastasia A. Dvoynikova — St. Petersburg Institute for Informatics and Automation of the RAS, Laboratory of Speech and Multimodal Interfaces, Junior Researcher</p><p>St. Petersburg</p></bio><email xlink:type="simple">dvoynikova.a@iias.spb.su</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Величко</surname><given-names>А. Н.</given-names></name><name name-style="western" xml:lang="en"><surname>Velichko</surname><given-names>A. N.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Алёна Николаевна Величко — канд. техн. наук, лаборатория речевых и многомодальных интерфейсов; старший научный сотрудник</p><p>Санкт-Петербург</p></bio><bio xml:lang="en"><p>Alena N. Velichko — PhD; St. Petersburg Institute for Informatics and Automation of the RAS, Laboratory of Speech and Multimodal Interfaces; Senior Researcher</p><p>St. Petersburg</p></bio><email xlink:type="simple">velichko.a@iias.spb.su</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Карпов</surname><given-names>А. А.</given-names></name><name name-style="western" xml:lang="en"><surname>Karpov</surname><given-names>A. A.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Алексей Анатольевич Карпов — д-р техн. наук, профессор; лаборатория речевых и многомодальных интерфейсов; руководитель лаборатории</p><p>Санкт-Петербург</p></bio><bio xml:lang="en"><p>Alexey A. Karpov — Dr. Sci., Professor; St. Petersburg Institute for Informatics and Automation of the RAS, Laboratory of Speech and Multimodal Interfaces; Head of the Laboratory</p><p>St. 
Petersburg</p></bio><email xlink:type="simple">karpov@iias.spb.su</email><xref ref-type="aff" rid="aff-1"/></contrib></contrib-group><aff-alternatives id="aff-1"><aff xml:lang="ru"><institution>Санкт-Петербургский Федеральный исследовательский центр Российской академии наук</institution></aff><aff xml:lang="en"><institution>St. Petersburg Federal Research Center of the RAS</institution></aff></aff-alternatives><pub-date pub-type="collection"><year>2025</year></pub-date><pub-date pub-type="epub"><day>19</day><month>01</month><year>2026</year></pub-date><volume>68</volume><issue>12</issue><fpage>1011</fpage><lpage>1019</lpage><permissions><copyright-statement>Copyright &#x00A9; Национальный исследовательский университет ИТМО, 2026</copyright-statement><copyright-year>2026</copyright-year><copyright-holder xml:lang="ru">Национальный исследовательский университет ИТМО</copyright-holder><copyright-holder xml:lang="en">ITMO University</copyright-holder><license xlink:href="https://pribor.ifmo.ru/jour/about/submissions#copyrightNotice" xlink:type="simple"><license-p>https://pribor.ifmo.ru/jour/about/submissions#copyrightNotice</license-p></license></permissions><self-uri xlink:href="https://pribor.ifmo.ru/jour/article/view/436">https://pribor.ifmo.ru/jour/article/view/436</self-uri><abstract><p>Выполнен статистический анализ многомодального корпуса данных ENERGI (ENgagement and Emotion Russian Gathering Interlocutors), содержащего аудиовидеозаписи коммуникации на русском языке группы людей, полученные с использованием системы телеконференций Zoom. Данные корпуса размечены по трем классам: вовлеченности (высокий, средний, низкий) участников в разговор, эмоционального возбуждения (высокий, средний, низкий) и валентности эмоций (положительный, нейтральный, негативный), а также десяти классам коммуникативных жестов. 
Корпус содержит 6,4 часов видеозаписей групповых коммуникаций участников, всего 18 уникальных дикторов; разметка данных выполнена на 10-секундных временных интервалах. Преимущества ENERGI относительно других корпусов заключаются в многомодальности, русскоязычности, разнообразии дикторов, естественных условиях записи данных и расширенной аннотации по нескольким параметрам поведения участников коммуникации. Корпус может быть использован для разработки многомодальной автоматической системы анализа поведенческих аспектов участников групповой виртуальной коммуникации.</p></abstract><trans-abstract xml:lang="en"><p>This paper presents a statistical analysis of ENERGI (ENgagement and Emotion Russian Gathering Interlocutors), a multimodal data corpus of audio-video recordings of Russian-language group communication collected with the Zoom teleconferencing system. The corpus is annotated along three dimensions with three levels each: participant engagement (high, medium, low), emotional arousal (high, medium, low), and emotional valence (positive, neutral, negative); it is also annotated with ten classes of communicative gestures. The corpus contains 6.4 hours of video recordings of group communication from 18 unique speakers; annotation is performed over 10-second intervals. ENERGI’s advantages over other corpora include its multimodality, Russian-language data, speaker diversity, natural recording conditions, and extended annotation across several behavioral parameters of communication participants. 
The corpus can be used to develop a multimodal automated system for analyzing the behavioral aspects of participants in virtual group communications.</p></trans-abstract><kwd-group xml:lang="ru"><kwd>корпус данных</kwd><kwd>вовлеченность участников</kwd><kwd>эмоциональное возбуждение</kwd><kwd>валентность эмоций</kwd><kwd>коммуникативные жесты</kwd></kwd-group><kwd-group xml:lang="en"><kwd>data corpus</kwd><kwd>engagement of participants</kwd><kwd>emotional arousal</kwd><kwd>valence of emotions</kwd><kwd>communicative gestures</kwd></kwd-group><funding-group><funding-statement xml:lang="ru">Работа выполнена в рамках бюджетной темы СПб ФИЦ РАН № FFZF-2025-0003.</funding-statement><funding-statement xml:lang="en">The work was carried out within the framework of the budget theme of the St. Petersburg Federal Research Center of the RAS, No. FFZF-2025-0003.</funding-statement></funding-group></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">Уздяев М. Ю., Карпов А. А. Создание и анализ многомодального корпуса данных для автоматического распознавания агрессивного поведения людей // Научно-технический вестник информационных технологий, механики и оптики. 2024. Т. 24, № 5. С. 834–842.</mixed-citation><mixed-citation xml:lang="en">Uzdiaev M.Yu., Karpov A.A. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2024, no. 5(24), pp. 834–842. (in Russ.)</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">Gupta A., Balasubramanian V. Daisee: Towards user engagement recognition in the wild // arXiv preprint arXiv:1609.01885. 2016.</mixed-citation><mixed-citation xml:lang="en">Gupta A., Balasubramanian V. 
arXiv preprint, arXiv:1609.01885, 2016.</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">Ben-Youssef A., Clavel C., Essid S. et al. UE-HRI: a new dataset for the study of user engagement in spontaneous human-robot interactions // Proc. of the 19th ACM Intern. Conf. on Multimodal Interaction (ICMI). 2017. P. 464–472. DOI: 10.1145/3136755.3136814.</mixed-citation><mixed-citation xml:lang="en">Ben-Youssef A., Clavel C., Essid S. et al. Proceedings of the 19th ACM International Conference on Multimodal Interaction (ICMI), 2017, pp. 464–472, DOI: 10.1145/3136755.3136814.</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">Del Duchetto F., Baxter P., Hanheide M. Are you still with me? Continuous engagement assessment from a robot’s point of view // Frontiers in Robotics and AI. 2020. Vol. 7. DOI: 10.3389/frobt.2020.00116.</mixed-citation><mixed-citation xml:lang="en">Del Duchetto F., Baxter P., Hanheide M. Frontiers in Robotics and AI, 2020, vol. 7, DOI: 10.3389/frobt.2020.00116.</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Kaur A., Mustafa A., Mehta L., Dhall A. Prediction and localization of student engagement in the wild // 2018 Digital Image Computing: Techniques and Applications (DICTA). 2018. P. 1–8. DOI: 10.1109/DICTA.2018.8615851.</mixed-citation><mixed-citation xml:lang="en">Kaur A., Mustafa A., Mehta L., Dhall A. 2018 Digital Image Computing: Techniques and Applications (DICTA), 2018, pp. 1–8, DOI: 10.1109/DICTA.2018.8615851.</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Delgado K., Origgi J. M., Hasanpoor T. et al. Student engagement dataset // Proc. of the IEEE/CVF Intern. Conf. on Computer Vision. 2021. P. 
3628–3636.</mixed-citation><mixed-citation xml:lang="en">Delgado K., Origgi J.M., Hasanpoor T. et al. Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 3628–3636.</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Чураев Е. Н. Персонализированные модели распознавания психоэмоционального состояния и вовлеченности лиц по видео: автореф. дис. … канд. тех. наук. СПб, 2025. 134 с.</mixed-citation><mixed-citation xml:lang="en">Churaev E.N. Personalizirovannyye modeli raspoznavaniya psikhoemotsional’nogo sostoyaniya i vovlechonnosti lits po video (Personalized Models for Recognizing Psycho-Emotional State and Facial Engagement from Video), Candidate’s thesis, St. Petersburg, 2025, 134 p. (in Russ.)</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">Karimah S. N., Hasegawa S. Automatic engagement estimation in smart education/learning settings: a systematic review of engagement definitions, datasets, and methods // Smart Learning Environments. 2022. Vol. 9, N 1. P. 31. DOI: 10.1186/s40561-022-00212-y.</mixed-citation><mixed-citation xml:lang="en">Karimah S.N., Hasegawa S. Smart Learning Environments, 2022, no. 1(9), pp. 31, DOI: 10.1186/s40561-022-00212-y.</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">Celiktutan O., Skordos E., Gunes H. Multimodal human-human-robot interactions (mhhri) dataset for studying personality and engagement // IEEE Transactions on Affective Computing. 2017. Vol. 10, N 4. P. 484–497. DOI: 10.1109/TAFFC.2017.2737019.</mixed-citation><mixed-citation xml:lang="en">Celiktutan O., Skordos E., Gunes H. IEEE Transactions on Affective Computing, 2017, no. 4(10), pp. 
484–497, DOI: 10.1109/TAFFC.2017.2737019.</mixed-citation></citation-alternatives></ref><ref id="cit10"><label>10</label><citation-alternatives><mixed-citation xml:lang="ru">Pabba C., Kumar P. An intelligent system for monitoring students’ engagement in large classroom teaching through facial expression recognition // Expert Systems. 2022. Vol. 39, N 1. P. e12839. DOI: 10.1111/exsy.12839.</mixed-citation><mixed-citation xml:lang="en">Pabba C., Kumar P. Expert Systems, 2022, no. 1(39), pp. e12839, DOI: 10.1111/exsy.12839.</mixed-citation></citation-alternatives></ref><ref id="cit11"><label>11</label><citation-alternatives><mixed-citation xml:lang="ru">Chatterjee I., Goršič M., Clapp J. D., Novak D. Automatic estimation of interpersonal engagement during naturalistic conversation using dyadic physiological measurements // Frontiers in Neuroscience. 2021. Vol. 15. P. 757381. DOI: 10.3389/fnins.2021.757381.</mixed-citation><mixed-citation xml:lang="en">Chatterjee I., Goršič M., Clapp J. D., Novak D. Frontiers in Neuroscience, 2021, vol. 15, pp. 757381, DOI: 10.3389/fnins.2021.757381.</mixed-citation></citation-alternatives></ref><ref id="cit12"><label>12</label><citation-alternatives><mixed-citation xml:lang="ru">Sümer Ö., Goldberg P., D’Mello S. et al. Multimodal engagement analysis from facial videos in the classroom // IEEE Transactions on Affective Computing. 2021. Vol. 14, N 2. P. 1012–1027. DOI: 10.1109/TAFFC.2021.3127692.</mixed-citation><mixed-citation xml:lang="en">Sümer Ö., Goldberg P., D’Mello S. et al. IEEE Transactions on Affective Computing, 2021, no. 2(14), pp. 1012–1027, DOI: 10.1109/TAFFC.2021.3127692.</mixed-citation></citation-alternatives></ref><ref id="cit13"><label>13</label><citation-alternatives><mixed-citation xml:lang="ru">Vanneste P., Oramas J., Verelst T. et al. Computer vision and human behaviour, emotion and cognition detection: A use case on student engagement // Mathematics. 2021. Vol. 9, N 3. P. 287. 
DOI: 10.3390/math9030287.</mixed-citation><mixed-citation xml:lang="en">Vanneste P., Oramas J., Verelst T. et al. Mathematics, 2021, no. 3(9), pp. 287, DOI: 10.3390/math9030287.</mixed-citation></citation-alternatives></ref><ref id="cit14"><label>14</label><citation-alternatives><mixed-citation xml:lang="ru">Dresvyanskiy D., Sinha Y., Busch M. et al. DyCoDa: A multi-modal data collection of multi-user remote survival game recordings // Speech and Computer. SPECOM 2022. Lecture Notes in Computer Science. 2022. P. 163–177. DOI: 10.1007/978-3-031-20980-2_15.</mixed-citation><mixed-citation xml:lang="en">Dresvyanskiy D., Sinha Y., Busch M. et al. Speech and Computer. SPECOM 2022. Lecture Notes in Computer Science, 2022, pp. 163–177, DOI: 10.1007/978-3-031-20980-2_15.</mixed-citation></citation-alternatives></ref><ref id="cit15"><label>15</label><citation-alternatives><mixed-citation xml:lang="ru">Cafaro A., Wagner J., Baur T. et al. The NoXi database: multimodal recordings of mediated novice-expert interactions // Proc. of the ICMI. 2017. P. 350–359. DOI: 10.1145/3136755.3136780.</mixed-citation><mixed-citation xml:lang="en">Cafaro A., Wagner J., Baur T. et al. Proceedings of the ICMI, 2017, pp. 350–359, DOI: 10.1145/3136755.3136780.</mixed-citation></citation-alternatives></ref><ref id="cit16"><label>16</label><citation-alternatives><mixed-citation xml:lang="ru">Busso C., Bulut M., Lee C. C. et al. IEMOCAP: Interactive emotional dyadic motion capture database // Language resources and evaluation. 2008. Vol. 42, N 4. P. 335–359. DOI: 10.1007/s10579-008-9076-6.</mixed-citation><mixed-citation xml:lang="en">Busso C., Bulut M., Lee C.C. et al. Language resources and evaluation, 2008, no. 4(42), pp. 335–359, DOI: 10.1007/s10579-008-9076-6.</mixed-citation></citation-alternatives></ref><ref id="cit17"><label>17</label><citation-alternatives><mixed-citation xml:lang="ru">Ringeval F., Sonderegger A., Sauer J., Lalanne D. 
Introducing the RECOLA multimodal corpus of remote collaborative and affective interactions // 10th IEEE Intern. Conf. and Workshops on Automatic Face and Gesture Recognition (FG). 2013. P. 1–8. DOI: 10.1109/FG.2013.6553805.</mixed-citation><mixed-citation xml:lang="en">Ringeval F., Sonderegger A., Sauer J., Lalanne D. 2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), 2013, pp. 1–8, DOI: 10.1109/FG.2013.6553805.</mixed-citation></citation-alternatives></ref><ref id="cit18"><label>18</label><citation-alternatives><mixed-citation xml:lang="ru">Kossaifi J., Walecki R., Panagakis Y. et al. Sewa db: A rich database for audio-visual emotion and sentiment research in the wild // IEEE Transactions on Pattern Analysis and Machine Intelligence. 2019. Vol. 43, N 3. P. 1022–1040. DOI: 10.1109/TPAMI.2019.2944808.</mixed-citation><mixed-citation xml:lang="en">Kossaifi J., Walecki R., Panagakis Y. et al. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, no. 3(43), pp. 1022–1040. DOI: 10.1109/TPAMI.2019.2944808.</mixed-citation></citation-alternatives></ref><ref id="cit19"><label>19</label><citation-alternatives><mixed-citation xml:lang="ru">Двойникова А. А. Аналитический обзор многомодальных корпусов данных для распознавания эмоций // Альманах научных работ молодых ученых Университета ИТМО. 2023. Т. 1. С. 251–256.</mixed-citation><mixed-citation xml:lang="en">Dvoynikova A.A. Almanac of scientific works of young scientists of ITMO University, 2023, vol. 1, pp. 251–256. (in Russ.)</mixed-citation></citation-alternatives></ref><ref id="cit20"><label>20</label><citation-alternatives><mixed-citation xml:lang="ru">Свид. о рег. № 2023624954. База данных проявлений вовлеченности и эмоций русскоязычных участников телеконференций (ENERGI — ENgagement and Emotion Russian Gathering Interlocutors), А. А. Карпов, А. А. Двойникова. 
03.11.2023.</mixed-citation><mixed-citation xml:lang="en">Certificate of registration of the database 2023624954, Baza dannykh proyavleniy vovlechennosti i emotsiy russkoyazychnykh uchastnikov telekonferentsiy (ENERGI — ENgagement and Emotion Russian Gathering Interlocutors) (Database of Manifestations of Engagement and Emotions of Russian-Speaking Participants in Teleconferences (ENERGI - ENgagement and Emotion Russian Gathering Interlocutors)), A.A. Dvoynikova, A.A. Karpov, Priority 25.12.2023. (in Russ.)</mixed-citation></citation-alternatives></ref><ref id="cit21"><label>21</label><citation-alternatives><mixed-citation xml:lang="ru">Двойникова А. А., Карпов А. А. Методика создания многомодальных корпусов данных для аудиовизуального анализа вовлеченности и эмоций участников виртуальной коммуникации // Изв. вузов. Приборостроение. 2024. Т. 67, № 11. С. 984–993. DOI: 10.17586/0021-3454-2024-67-11-984-993.</mixed-citation><mixed-citation xml:lang="en">Dvoynikova A.A., Karpov A.A. Journal of Instrument Engineering, 2024, no. 11(67), pp. 984–993, DOI: 10.17586/0021-3454-2024-67-11-984-993. (in Russ.)</mixed-citation></citation-alternatives></ref><ref id="cit22"><label>22</label><citation-alternatives><mixed-citation xml:lang="ru">Sloetjes H., Wittenburg P. Annotation by category-ELAN and ISO DCR // Proc. of the 6th Intern. Conf. on Language Resources and Evaluation (LREC 2008). 2008.</mixed-citation><mixed-citation xml:lang="en">Sloetjes H., Wittenburg P. Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008), 2008.</mixed-citation></citation-alternatives></ref><ref id="cit23"><label>23</label><citation-alternatives><mixed-citation xml:lang="ru">Люсин Д. В. Новая методика для измерения эмоционального интеллекта: опросник ЭмИн // Психологическая диагностика. 2006. Т. 4. С. 3–22.</mixed-citation><mixed-citation xml:lang="en">Lyusin D.V. Psychological diagnostics, 2006, vol. 4, pp. 3–22. 
(in Russ.)</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The authors declare that there are no conflicts of interest present.</p></fn></fn-group></back></article>
