Multimodal Emotion Recognition Using BERT and ANN: A Hybrid Deep Learning Approach

Authors: Avasheen Shishir Temurkar (Research Scholar), Anuradha Purohit (Professor)

Abstract- Emotion recognition plays a vital role in enhancing human-computer interaction systems by enabling empathetic and context-aware AI solutions. This study introduces a hybrid deep learning architecture that integrates BERT for extracting contextual text features and an Artificial Neural Network (ANN) for processing MFCC-based acoustic features. By combining textual and audio modalities, the proposed model effectively addresses the limitations of single-modality approaches. The model is evaluated on the USC-IEMOCAP dataset, encompassing six emotion categories: ‘Happy’, ‘Sad’, ‘Angry’, ‘Neutral’, ‘Frustrated’, and ‘Excited’. It achieves competitive performance with a weighted F1-score of 0.91 and an accuracy of 86%, outperforming several state-of-the-art methods. The fusion of text and audio features enhances the model’s ability to capture subtle emotional nuances, demonstrating the potential of multimodal learning for robust emotion classification. This research underscores the value of hybrid architectures in advancing emotion recognition for real-world applications.
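The fusion strategy described in the abstract can be illustrated with a minimal NumPy sketch. All specifics here are assumptions, not details from the paper: a 768-dimensional BERT-base sentence embedding, a 39-dimensional MFCC vector (13 coefficients plus deltas and delta-deltas), feature-level concatenation, and a single-hidden-layer ANN with randomly initialised weights standing in for the trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dimensions (not specified in the abstract):
TEXT_DIM = 768   # BERT-base [CLS] embedding size
AUDIO_DIM = 39   # 13 MFCCs + deltas + delta-deltas
HIDDEN = 128     # hypothetical hidden-layer width
CLASSES = 6      # Happy, Sad, Angry, Neutral, Frustrated, Excited

def relu(x):
    return np.maximum(0.0, x)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Placeholder weights; a real system would learn these on IEMOCAP.
W1 = rng.normal(0, 0.02, (TEXT_DIM + AUDIO_DIM, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(0, 0.02, (HIDDEN, CLASSES))
b2 = np.zeros(CLASSES)

def classify(text_emb, mfcc_feats):
    """Feature-level fusion: concatenate modalities, then an ANN head."""
    fused = np.concatenate([text_emb, mfcc_feats])
    hidden = relu(fused @ W1 + b1)
    return softmax(hidden @ W2 + b2)

# Dummy inputs stand in for real BERT and MFCC extraction pipelines.
probs = classify(rng.normal(size=TEXT_DIM), rng.normal(size=AUDIO_DIM))
print(probs.shape)  # one probability per emotion class
```

The sketch shows only the fusion and classification head; the paper's actual pipeline, training procedure, and fusion point may differ.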

DOI: 10.61137/ijsret.vol.10.issue6.382