User Emotion Recognition Method Based on Facial Expression and Speech Signal Fusion
Fei Lu, Long Zhang, Guohui Tian
In human-computer interaction, recognizing a user's continuous emotions from facial expressions and speech is a pressing problem, and the key factors limiting recognition accuracy are data deficiencies that arise when fusing the speech and facial information, and abnormal frames in the video. To address these problems, a user emotion recognition system based on the multimodal fusion of facial expressions and speech is designed. For facial expressions, a Gabor-transform continuous emotion recognition method based on data increments is proposed. For speech, Mel-scale Frequency Cepstral Coefficients (MFCC) are used to extract speech features, and user emotions are recognized through transfer learning. Finally, in the late-fusion stage, multiple linear regression combines the two modalities. The proposed method is evaluated on the AVEC2013 dataset with Arousal-Valence labels, and the experimental results show that it improves the accuracy of user emotion recognition.
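As a rough illustration of the facial branch, the sketch below applies a small Gabor filter bank to a grayscale face frame with OpenCV. The kernel parameters, filter-bank size, and mean/std pooling are assumptions (the abstract does not specify them), and the data-increment step is omitted.

```python
import cv2
import numpy as np

def gabor_features(face_gray: np.ndarray) -> np.ndarray:
    """Filter a grayscale face crop with a small Gabor bank and pool
    each response into mean/std statistics (illustrative parameters)."""
    feats = []
    for theta in np.arange(0, np.pi, np.pi / 4):   # 4 orientations
        for lambd in (5.0, 10.0):                  # 2 wavelengths
            kernel = cv2.getGaborKernel(
                ksize=(21, 21), sigma=4.0, theta=theta,
                lambd=lambd, gamma=0.5, psi=0.0)
            response = cv2.filter2D(face_gray, cv2.CV_32F, kernel)
            feats.extend([response.mean(), response.std()])
    return np.asarray(feats, dtype=np.float32)     # 16-dim vector per frame
```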
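For the speech branch, a minimal sketch of MFCC extraction, assuming librosa; the file name, 16 kHz sample rate, and 13 coefficients are illustrative choices, and the transfer-learned model that consumes these features is not shown.

```python
import librosa

# Load an utterance (path and 16 kHz rate are assumptions).
y, sr = librosa.load("utterance.wav", sr=16000)

# 13 MFCCs per frame; transpose to (n_frames, 13) so each row is one
# frame's feature vector for the downstream transfer-learning model.
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T
print(mfcc.shape)
```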
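And for the late-fusion step, a sketch of multiple linear regression over per-frame unimodal outputs using scikit-learn; the synthetic arrays below stand in for the real facial and speech predictions and the Arousal ground truth from AVEC2013.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_frames = 500

# Stand-ins for per-frame continuous predictions from each modality
# and the ground-truth Arousal labels (all synthetic here).
face_pred = rng.uniform(-1, 1, n_frames)
speech_pred = rng.uniform(-1, 1, n_frames)
arousal_gt = 0.6 * face_pred + 0.3 * speech_pred + rng.normal(0, 0.1, n_frames)

# Multiple linear regression learns fusion weights for the two streams.
X = np.column_stack([face_pred, speech_pred])
fusion = LinearRegression().fit(X, arousal_gt)
fused = fusion.predict(X)
print("fusion weights:", fusion.coef_, "intercept:", fusion.intercept_)
```

The same regression would be fit separately for the Valence dimension; the learned coefficients indicate how much each modality contributes to the fused estimate.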
