Multimodal Semi-automated Affect Detection from Conversational Cues, Gross Body Language, and Facial Features

User Modeling and User-Adapted Interaction 20 (2): 147--187 (June 2010)


We developed and evaluated a multimodal affect detector that combines conversational cues, gross body language, and facial features. The detector uses feature-level fusion to combine the sensory channels and linear discriminant analyses to discriminate between naturally occurring experiences of boredom, engagement/flow, confusion, frustration, delight, and a neutral state. Training and validation data for the affect detector were collected in a study in which 28 learners completed a 32-min tutorial session with AutoTutor, an intelligent tutoring system with conversational dialogue. Classification results supported a channel × judgment-type interaction: the face was the most diagnostic channel for spontaneous affect judgments (i.e., judgments made at any time in the tutorial session), whereas conversational cues were superior for fixed judgments (i.e., judgments made every 20 s in the session). The analyses also indicated that the accuracy of the multichannel model (face, dialogue, and posture) was statistically higher than that of the best single-channel model for fixed, but not spontaneous, affect expressions. However, multichannel models reduced the discrepancy (i.e., the variance in precision across the different emotions) of the discriminant models for both judgment types. The results also indicated that combining channels yielded superadditive effects for some affective states, but additive, redundant, and inhibitory effects for others. We explore the structure of the multimodal linear discriminant models and discuss the implications of some of our major findings.
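The two core techniques named in the abstract, feature-level fusion followed by linear discriminant classification, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature dimensions and synthetic data are hypothetical, and the classifier is a standard Gaussian linear discriminant (shared covariance, equal priors) over the six affect categories.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-channel feature vectors for n observations.
# Dimensions are illustrative only, not the paper's actual feature sets.
n = 120
face = rng.normal(size=(n, 5))       # facial-feature channel
dialogue = rng.normal(size=(n, 4))   # conversational-cue channel
posture = rng.normal(size=(n, 3))    # gross-body-language (posture) channel

# Feature-level fusion: concatenate the channels into one vector
# per observation before any classification takes place.
X = np.hstack([face, dialogue, posture])

labels = np.array(["boredom", "flow", "confusion",
                   "frustration", "delight", "neutral"])
y = np.arange(n) % len(labels)  # balanced synthetic class assignments

# Linear discriminant analysis with a shared covariance estimate:
# each class k gets a mean mu_k; all classes share one covariance.
means = np.stack([X[y == k].mean(axis=0) for k in range(len(labels))])
cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])  # regularized
prec = np.linalg.inv(cov)

def predict(X_new):
    # Linear discriminant score per class: x' S^-1 mu_k - 0.5 mu_k' S^-1 mu_k
    lin = X_new @ prec @ means.T                      # (n, K)
    const = 0.5 * np.sum((means @ prec) * means, axis=1)  # (K,)
    return labels[np.argmax(lin - const, axis=1)]

preds = predict(X)
```

Because the discriminant functions are linear in the fused feature vector, the fitted weights can be inspected channel by channel, which is what makes the superadditive/additive/redundant/inhibitory decomposition discussed in the abstract possible.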
