2025.Jan.28
The 4th BAIRAL Research Meeting for Fiscal Year 2024
“Advancing Mental Health Monitoring: A Deep Learning Framework for Multimodal Emotion Recognition”
◇BAIRAL (B’AI RA League)
BAIRAL is a study group run by young research assistants (RAs) of the B’AI Global Forum, Institute for AI and Beyond at the University of Tokyo. Aiming to achieve gender equality and to guarantee the rights of minorities in the AI era, the group examines the relationship between digital information technology and society. BAIRAL organizes research meetings every other month with guest speakers from a variety of fields.
◇Date & Venue
・Date: Thursday, Feb 20, 2025, 17:00 to 18:30 (JST)
・Language: English
・Format: Online – Zoom meeting (No registration required)
https://u-tokyo-ac-jp.zoom.us/j/81781587440?pwd=VG49FYhndanKOXPIxNkEUmelAz5qAp.1
Meeting ID: 817 8158 7440 / Passcode: 250220
◇Guest Speaker
Meishu Song (PhD Candidate, Graduate School of Education, University of Tokyo)
◇Speaker Bio
Meishu Song is a Research Scientist specializing in multimodal AI systems and mental health informatics. Her research lies at the intersection of deep learning, multimodal understanding, and healthcare applications, focusing on developing innovative solutions for mental health monitoring. She pioneered the development of personalized macro-micro frameworks for emotion recognition, achieving significant improvements in real-world applications.
Song’s work has made substantial contributions to multitask learning and multimodal fusion techniques, notably through the development of the Dynamic Restrained Uncertainty Weighting methodology. Her research has been published in prestigious venues including ICASSP and JMIR Mental Health, and she has successfully translated her academic work into practice, leading to the development of a mental healthcare AI product.
As a Cofounder and Research Scientist at SemoAI, she leads efforts to bridge the gap between advanced AI technologies and accessible mental healthcare solutions.
◇Abstract
The lecture will present an innovative approach to daily mental health monitoring based on multimodal deep learning analysis of speech and physiological signals. The speaker will introduce two comprehensive datasets: the Japanese Daily Speech Dataset (JDSD), comprising 20,827 speech samples from 342 participants, and the Japanese Daily Multimodal Dataset (JDMD), containing 6,200 records of Zero Crossing Mode (ZCM) and Proportional Integration Mode (PIM) signals from 298 participants. Both datasets were collected in naturalistic settings using non-intrusive wearable devices and smartphones.
The core innovation is a macro-micro framework that synthesizes global emotional patterns with individual-specific characteristics through a personalized crossmodal transformer mechanism. The architecture also incorporates a novel Dynamic Restrained Uncertainty Weighting technique for multimodal fusion and loss balancing. On these data, the framework substantially improves emotion recognition accuracy, achieving a Concordance Correlation Coefficient (CCC) of 0.503 against a baseline of 0.281 (a brief illustration of both quantities follows the abstract).
By leveraging advanced deep learning techniques and multimodal data integration, the proposed system aims to provide continuous, personalized emotional-state assessment while maintaining ecological validity. The work addresses critical challenges in mental health monitoring by offering a scalable, data-driven approach that bridges the gap between laboratory-based assessment and real-world deployment, potentially transforming how mental healthcare is delivered.
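For readers who would like a concrete sense of the two quantities named above, the short PyTorch sketch below shows (a) a generic uncertainty-weighted multitask loss in the style of Kendall et al. (2018), offered only as an illustrative stand-in for the family of loss-balancing techniques to which Dynamic Restrained Uncertainty Weighting appears to belong (the speaker's exact formulation is not reproduced here), and (b) the standard Concordance Correlation Coefficient (Lin, 1989) behind the reported 0.503 vs. 0.281 figures. All names in the sketch are illustrative, not code from the talk.

# Minimal sketch, assuming PyTorch. The loss is generic Kendall-style
# uncertainty weighting, NOT the speaker's Dynamic Restrained variant.
import torch
import torch.nn as nn

class UncertaintyWeightedLoss(nn.Module):
    """Balance several task/modality losses with learned log-variances."""

    def __init__(self, num_tasks: int):
        super().__init__()
        # One learnable log-variance s_i per task; exp(-s_i) acts as a
        # task weight, and the +s_i term keeps weights from collapsing.
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, losses: list[torch.Tensor]) -> torch.Tensor:
        total = 0.0
        for s, loss in zip(self.log_vars, losses):
            total = total + torch.exp(-s) * loss + s
        return total

def ccc(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Concordance Correlation Coefficient: 1.0 means perfect agreement.

    Unlike Pearson correlation, CCC also penalizes shifts in mean and
    scale, which is why it is the usual metric for continuous emotion
    recognition.
    """
    mu_p, mu_t = pred.mean(), target.mean()
    var_p = pred.var(unbiased=False)
    var_t = target.var(unbiased=False)
    cov = ((pred - mu_p) * (target - mu_t)).mean()
    return 2.0 * cov / (var_p + var_t + (mu_p - mu_t) ** 2)

In training, the per-modality losses (for instance, one for speech and one for actigraphy signals) would be passed to UncertaintyWeightedLoss so that the log-variances are optimized jointly with the network, letting the model balance the modalities itself rather than relying on hand-tuned weights.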
◇Organizer
B’AI Global Forum, Institute for AI and Beyond at the University of Tokyo
◇Inquiry
Priya Mu (Research assistant of the B’AI Global Forum)
priya-mu[at]g.ecc.u-tokyo.ac.jp (Please change [at] to @)