Enhancing Automatic Emotion Recognition for Clinical Applications: A Multimodal, Personalized Approach and Quantification of Emotional Reaction Intensity with Transformers

dc.contributor.advisor Washington, Peter
dc.contributor.author Qian, Yang
dc.contributor.department Computer Science
dc.date.accessioned 2024-03-11T22:19:58Z
dc.date.available 2024-03-11T22:19:58Z
dc.date.issued 2023
dc.description.degree M.S.
dc.identifier.uri https://hdl.handle.net/10125/107982
dc.subject Computer science
dc.subject Deep learning
dc.subject Emotion Recognition
dc.subject Facial Expression Recognition
dc.subject Multimodal Learning
dc.title Enhancing Automatic Emotion Recognition for Clinical Applications: A Multimodal, Personalized Approach and Quantification of Emotional Reaction Intensity with Transformers
dc.type Thesis
dcterms.abstract In the realm of artificial intelligence, Automated Emotion Recognition (AER) has emerged as a pivotal research area at the intersection of computer vision, natural language processing, and human-computer interaction, with particular relevance to healthcare, education, and entertainment. This thesis is primarily concerned with enhancing Facial Expression Recognition (FER), a crucial aspect of AER. In contrast to typical multimodal and Transformer-based methodologies, this work explores the potential of personalization and the quantification of emotional reaction intensity to improve AER.

The research draws upon advances in deep learning techniques, such as Convolutional Neural Networks (CNNs) and attention mechanisms, to improve FER. A comprehensive review of the literature is presented, covering emotion recognition, affective computing, face detection, and the role of deep learning in multimodal communication. The research methodology and experimental design, which involve emotion recognition datasets and integrated network architectures such as CNN-LSTM (Long Short-Term Memory) and CNN-Transformer, are delineated in subsequent chapters.

In the methodology chapter, a suite of independent experiments probes different facets of emotion recognition. The first experiment investigates model architecture, feature extraction, and data preprocessing techniques for FER, comparing traditional CNNs, transfer learning with ImageNet pre-trained models, and Vision Transformers (ViT) to identify the causes of their performance differences. The thesis further investigates personalization as a means of enhancing AER performance, demonstrated by a personalized CNN-LSTM emotion recognition model trained on individual-specific data. Finally, an Emotional Reaction Intensity Estimation experiment is performed using CNN-Transformer approaches.

The results show that personalization and quantification of emotional reaction intensity both improve emotion recognition: the experiments recorded accuracy gains of 1.5% and 1%, and an 8% improvement in the Pearson Correlation Coefficient, respectively. These findings underscore the value of personalization and emotional reaction intensity quantification in AER and highlight the need for more precise and robust emotion recognition systems.
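To illustrate the CNN-LSTM architecture named in the abstract, the following is a minimal PyTorch sketch: a shared CNN encodes each video frame and an LSTM aggregates the per-frame features over time. The class name, layer sizes, and seven-class output are illustrative assumptions, not the thesis's actual configuration; the personalization described in the abstract would correspond to fine-tuning such a model on one individual's data.

import torch
import torch.nn as nn

class CNNLSTMEmotionModel(nn.Module):
    """Sketch of a CNN-LSTM video emotion classifier (illustrative sizes)."""

    def __init__(self, num_emotions: int = 7, hidden_size: int = 128):
        super().__init__()
        # Small per-frame CNN encoder applied to every frame independently.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (B*T, 64, 1, 1)
        )
        # LSTM aggregates the per-frame features across the clip.
        self.lstm = nn.LSTM(input_size=64, hidden_size=hidden_size,
                            batch_first=True)
        self.classifier = nn.Linear(hidden_size, num_emotions)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, channels, height, width)
        b, t, c, h, w = frames.shape
        feats = self.cnn(frames.view(b * t, c, h, w)).flatten(1)  # (B*T, 64)
        feats = feats.view(b, t, -1)                              # (B, T, 64)
        out, _ = self.lstm(feats)                                 # (B, T, H)
        return self.classifier(out[:, -1])  # last time step -> class logits

# Usage: a batch of 2 clips, 16 frames each, 64x64 RGB.
logits = CNNLSTMEmotionModel()(torch.randn(2, 16, 3, 64, 64))
print(logits.shape)  # torch.Size([2, 7])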
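For reference, the Pearson Correlation Coefficient reported above measures the linear agreement between predicted intensities and ground-truth intensities over n samples. This is the standard definition, not a formula specific to this thesis:

r = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}
         {\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}\,
          \sqrt{\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}}

where y_i are ground-truth intensities, \hat{y}_i are predictions, and \bar{y}, \bar{\hat{y}} are their means; r ranges from -1 to 1, with 1 indicating perfect positive linear correlation.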
dcterms.extent 50 pages
dcterms.language en
dcterms.publisher University of Hawai'i at Manoa
dcterms.rights All UHM dissertations and theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission from the copyright owner.
dcterms.type Text
local.identifier.alturi http://dissertations.umi.com/hawii:11924
Files
Original bundle
Name: Qian_hawii_0085O_11924.pdf
Size: 11.32 MB
Format: Adobe Portable Document Format