Second Place in the CVPR 2024 Affective Behavior Analysis In-The-Wild Competition

15 July 2024

Emotional mimicry is a fundamental aspect of human interaction: people replicate the emotional expressions of others, which fosters empathy and social bonding. It spans facial expressions, vocal tone, and body language, all of which matter for effective communication, particularly in therapeutic contexts. Understanding how emotional mimicry occurs can help therapists better understand their patients and mirror their emotions to gain their trust. The dataset consisted of video recordings, with image and sound, of people attempting to mimic emotions that were conveyed to them visually and acoustically. Because limited processing power and memory required models that run efficiently on the available hardware, we chose to analyze only the audio modality.

We used the Wav2Vec 2.0 architecture, pre-trained on a large corpus of podcast data. Because podcasts contain a wide range of colloquial and spontaneous speech, the model captures both linguistic and paralinguistic features, which are essential for analyzing emotional expression. A key aspect of our approach is the multi-task fusion strategy, which combines these audio features with a pre-trained Valence-Arousal-Dominance (VAD) model. Processing multiple emotional dimensions simultaneously improves the accuracy of our emotion intensity predictions.
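To make the fusion idea concrete, here is a minimal sketch in plain Python. It is not the implementation from the paper; it only illustrates one simple way to combine clip-level audio features with a three-value VAD estimate before predicting intensities. The function names, the mean pooling step, the single linear layer, and the number of targets (NUM_TARGETS = 6) are all assumptions made for illustration.

```python
import math
import random

NUM_TARGETS = 6  # assumption: number of emotion intensities to predict


def mean_pool(frames):
    """Average frame-level feature vectors over time into one clip vector."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def sigmoid(z):
    """Squash a real value into (0, 1), used here as an intensity score."""
    return 1.0 / (1.0 + math.exp(-z))


def fuse_and_predict(frames, vad, weights, bias):
    """Concatenate pooled audio features with VAD values, then apply
    one linear layer followed by a sigmoid (hypothetical fusion head)."""
    fused = mean_pool(frames) + list(vad)
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return [sigmoid(z) for z in logits]


# Toy data standing in for real model outputs (purely illustrative).
rng = random.Random(0)
feature_dim = 4
toy_frames = [[rng.uniform(-1, 1) for _ in range(feature_dim)] for _ in range(10)]
toy_vad = (0.6, 0.4, 0.5)  # valence, arousal, dominance
toy_weights = [
    [rng.uniform(-0.5, 0.5) for _ in range(feature_dim + 3)]
    for _ in range(NUM_TARGETS)
]
toy_bias = [0.0] * NUM_TARGETS

intensities = fuse_and_predict(toy_frames, toy_vad, toy_weights, toy_bias)
```

In a real system the frame features would come from the pre-trained audio model and the weights would be learned; the sketch only shows where the two information sources meet.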

This research was conducted in collaboration with Tobias Hallmen and Elisabeth André from the University of Augsburg, as well as Fabian Deuser and Norbert Oswald from the University of the Bundeswehr. Together, we developed a new method for assessing emotional mimicry in web videos more precisely. Our joint efforts resulted in a second-place finish in the Emotional Mimicry Intensity challenge at the 6th Workshop and Competition on Affective Behavior Analysis in the Wild (ABAW). The findings were presented at the CVPR 2024 conference in Seattle.

Unimodal Multi-Task Fusion for Emotional Mimicry Intensity Prediction
Tobias Hallmen, Fabian Deuser, Norbert Oswald, Elisabeth André

[arXiv] [CVPR 2024]