AUDIOVISUAL EMOTION RECOGNITION VIA CROSS-MODAL ASSOCIATION IN KERNEL SPACE
Yongjin Wang, Ling Guan, Anastasios Venetsanopoulos

Abstract
In this paper, we introduce a new method for audiovisual multimodal emotion recognition. The proposed method identifies the optimal transformations capable of representing the coupled patterns between audio and visual information through cross-modal association. Specifically, a kernel machine technique is utilized to capture the nonlinear relationship between the two subsets of features. A hidden Markov model is subsequently applied to characterize the statistical dependence across successive time segments and to identify the inherent temporal structure of the features in the transformed domain. Information fusion at the feature and score levels is examined and compared. The effectiveness of the introduced solution is demonstrated through extensive experimentation.
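The abstract does not spell out the cross-modal association step; the sketch below shows one plausible instantiation, a regularized kernel CCA between audio and visual feature sets in NumPy. The function names, the RBF kernel choice, the regularization scheme, and the toy data are assumptions for illustration, not the authors' exact formulation.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """RBF (Gaussian) kernel matrix between rows of X and Y."""
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * sq)

def center_kernel(K):
    """Double-center a kernel matrix (mean removal in feature space)."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kernel_cca(Xa, Xv, gamma=1.0, reg=1e-3, n_components=2):
    """Regularized kernel CCA between audio (Xa) and visual (Xv) features.

    Returns dual coefficients (alpha, beta) whose projections
    Ka @ alpha and Kv @ beta are maximally correlated.
    One common regularized formulation; other variants exist.
    """
    n = Xa.shape[0]
    Ka = center_kernel(rbf_kernel(Xa, Xa, gamma))
    Kv = center_kernel(rbf_kernel(Xv, Xv, gamma))
    Ra = Ka + reg * np.eye(n)
    Rv = Kv + reg * np.eye(n)
    # Leading eigenvectors of this operator give the audio-side dual
    # directions; the visual-side directions follow by back-substitution.
    M = np.linalg.solve(Ra, Kv) @ np.linalg.solve(Rv, Ka)
    eigvals, eigvecs = np.linalg.eig(M)
    order = np.argsort(-eigvals.real)[:n_components]
    alpha = eigvecs[:, order].real
    beta = np.linalg.solve(Rv, Ka @ alpha)   # visual-side coefficients
    return alpha, beta, Ka, Kv

# Toy example with random "audio" and "visual" feature vectors
# (placeholders for, e.g., prosodic and facial-expression features).
rng = np.random.default_rng(0)
Xa = rng.normal(size=(100, 20))
Xv = rng.normal(size=(100, 30))
alpha, beta, Ka, Kv = kernel_cca(Xa, Xv)
proj_audio = Ka @ alpha    # transformed audio representation
proj_visual = Kv @ beta    # transformed visual representation
```

In a pipeline of this kind, the projected audio and visual representations would then feed a temporal model such as an HMM, with fusion carried out either by concatenating the projections (feature level) or by combining per-modality classifier outputs (score level).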