Enhancing Automatic Emotion Recognition for Clinical Applications: A Multimodal, Personalized Approach and Quantification of Emotional Reaction Intensity with Transformers

Date
2023
Authors
Qian, Yang
Advisor
Washington, Peter
Department
Computer Science
Abstract
In the realm of artificial intelligence, Automated Emotion Recognition (AER) has emerged as a pivotal research area, intersecting computer vision, natural language processing, and human-computer interaction. This research is particularly relevant to fields such as healthcare, education, and entertainment. This thesis is primarily concerned with enhancing Facial Expression Recognition (FER), a crucial aspect of AER. In contrast to typical multimodal and Transformer-based methodologies, this work explores the potential of personalization and the quantification of emotional reaction intensity as routes to improving AER.

This research draws upon encouraging advances in deep learning techniques, such as Convolutional Neural Networks (CNNs) and attention mechanisms, to improve FER. A comprehensive review of the literature is presented, encompassing emotion recognition, affective computing, face detection, and the role of deep learning in multimodal communication. The research methodology and experimental design, which involve the use of emotion recognition datasets and integrated network methodologies such as CNN-LSTM (Long Short-Term Memory) and CNN-Transformer models, are delineated in subsequent chapters.

In the methodology chapter, a suite of independent experiments is designed to probe different facets of emotion recognition. The first experiment investigates model architecture, feature extraction, and data preprocessing techniques for FER: a comparative analysis of a traditional CNN, transfer learning with ImageNet pre-trained models, and Vision Transformers (ViT) on FER tasks, with the goal of explaining the causes of their performance differences. The thesis further investigates the potential of personalization to enhance AER performance, demonstrated by developing a personalized CNN-LSTM emotion recognition model trained on individual-specific data. Additionally, an Emotional Reaction Intensity Estimation experiment is performed using CNN-Transformer approaches.

The results reveal that a focus on personalization and on quantifying emotional reaction intensity yields significant improvements in emotion recognition: gains of 1.5% and 1% in accuracy, and an 8% improvement in the Pearson Correlation Coefficient, respectively. These findings underscore the relevance of personalization and emotional reaction intensity quantification in AER, and highlight the need for more precise and robust emotion recognition systems.
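
To make the transfer-learning baseline in the comparative experiment concrete, the following is a minimal PyTorch sketch, not the thesis code: it swaps the classification head of an ImageNet pre-trained backbone for an FER head and fine-tunes it. The ResNet-18 backbone and the 7-class emotion label set are illustrative assumptions; the abstract does not specify which pre-trained models or label sets were used.

import torch
import torch.nn as nn
from torchvision import models

# Load an ImageNet pre-trained backbone and replace its 1000-way
# classification head with a 7-way FER head (7 classes assumed).
backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
backbone.fc = nn.Linear(backbone.fc.in_features, 7)

# Fine-tune on FER images. The batch below is dummy data; real inputs
# would be 224x224 face crops normalized with ImageNet statistics.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 7, (8,))
loss = nn.CrossEntropyLoss()(backbone(images), labels)
loss.backward()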
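
The personalized CNN-LSTM experiment follows a common pattern: a CNN extracts spatial features from each frame, an LSTM aggregates those features over time, and a linear head classifies the sequence; personalization then amounts to training, or fine-tuning, such a model on clips from a single individual. The sketch below illustrates that pattern under assumed shapes (16-frame clips of 48x48 grayscale face crops, 7 classes) and is not the author's architecture.

import torch
import torch.nn as nn

class CNNLSTMEmotion(nn.Module):
    def __init__(self, num_classes: int = 7, hidden: int = 128):
        super().__init__()
        # Small per-frame CNN feature extractor (layer sizes illustrative).
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),  # -> (batch, 64)
        )
        # LSTM aggregates per-frame features across time.
        self.lstm = nn.LSTM(input_size=64, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, 1, H, W); fold time into batch for the CNN.
        b, t = x.shape[:2]
        feats = self.cnn(x.flatten(0, 1)).view(b, t, -1)
        _, (h_n, _) = self.lstm(feats)
        return self.head(h_n[-1])  # classify from the last hidden state

# Personalization: fit on one individual's clips (dummy data here).
model = CNNLSTMEmotion()
clips = torch.randn(4, 16, 1, 48, 48)  # 4 clips, 16 frames each
labels = torch.randint(0, 7, (4,))
loss = nn.CrossEntropyLoss()(model(clips), labels)
loss.backward()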
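
The 8% figure refers to the Pearson Correlation Coefficient (PCC), the usual metric for continuous intensity estimation: PCC = cov(p, t) / (std(p) * std(t)) for predictions p and targets t. A minimal sketch of the metric, averaged over intensity dimensions, follows; the seven reaction dimensions and array shapes are assumptions for illustration.

import numpy as np

def mean_pcc(pred: np.ndarray, target: np.ndarray) -> float:
    # pred, target: (num_samples, num_dims) continuous intensity scores.
    pccs = []
    for d in range(pred.shape[1]):
        # np.corrcoef returns the 2x2 correlation matrix; the off-diagonal
        # entry is cov(p, t) / (std(p) * std(t)).
        pccs.append(np.corrcoef(pred[:, d], target[:, d])[0, 1])
    return float(np.mean(pccs))

pred = np.random.rand(100, 7)    # e.g., 7 emotional-reaction dimensions
target = np.random.rand(100, 7)
print(mean_pcc(pred, target))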
Keywords
Computer science, Deep learning, Emotion Recognition, Facial Expression Recognition, Multimodal Learning
Extent
50 pages
Rights
All UHM dissertations and theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission from the copyright owner.