
Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Nov 18, 2024
Date Accepted: May 2, 2025

The final, peer-reviewed published version of this preprint can be found here:

Facial Emotion Recognition of 16 Distinct Emotions From Smartphone Videos: Comparative Study of Machine Learning and Human Performance

Keinert M, Pistrosch S, Mallol-Ragolta A, Schuller BW, Berking M

J Med Internet Res 2025;27:e68942

DOI: 10.2196/68942

PMID: 40601921

PMCID: 12268218

Facial Emotion Recognition of 16 Distinct Emotions from Smartphone Video: Comparing Machine-Learning vs Human Performance

  • Marie Keinert; 
  • Simon Pistrosch; 
  • Adria Mallol-Ragolta; 
  • Björn W. Schuller; 
  • Matthias Berking

ABSTRACT

Background:

The development of automatic emotion recognition models from smartphone videos is a crucial step toward the dissemination of psychotherapeutic app interventions that encourage emotional expression. Existing models focus mainly on the six basic emotions while neglecting other, therapeutically relevant emotions. To support this research, we introduce the novel Stress reduction Training through the Recognition of Emotions Wizard-of-Oz (STREs WoZ) dataset, which contains 14,412 smartphone videos of 63 individuals displaying 16 distinct, therapeutically relevant emotions.

Objective:

The aim of the present research is to develop automatic facial emotion recognition (FER) models for binary (positive vs negative) and multi-class emotion classification tasks, to assess the models' performance, and to compare it with that of human observers in two studies.

Methods:

In Study 1, automatic FER models using both appearance and deep-learnt features are developed for binary and multi-class emotion classification. In Study 2, three human observers are trained on the same tasks. A test set of 3018 facial emotion videos is then classified by both the automatic FER models and the human observers. Performance is assessed with unweighted average recall (UAR).
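The abstract does not spell out the metric, but unweighted average recall is conventionally defined as the mean of the per-class recalls, so every emotion class contributes equally regardless of how many test videos it has. A minimal sketch (the function name and toy labels are illustrative, not from the paper):

```python
from collections import defaultdict

def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls: each class weighs equally,
    independent of its prevalence in the test set."""
    hits = defaultdict(int)    # correct predictions per true class
    totals = defaultdict(int)  # test samples per true class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)

# Toy imbalanced binary example:
y_true = ["pos", "pos", "pos", "neg"]
y_pred = ["pos", "pos", "neg", "neg"]
# recall(pos) = 2/3, recall(neg) = 1/1, so UAR = (2/3 + 1) / 2
print(unweighted_average_recall(y_true, y_pred))  # ≈ 0.833
```

Note that plain accuracy on this example would be 3/4 = 0.75; UAR differs whenever classes are imbalanced, which is why it is the standard metric in affective-computing challenges.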

Results:

Results show that appearance features outperform deep-learnt features in both tasks, with the attention network using appearance features emerging as the best-performing model. The attention network achieves an accuracy of 92.2% in the binary classification task, comparable to human performance, but shows lower accuracy (59.0%-90.0%) in the multi-class task, falling short of human accuracy.

Conclusions:

Future studies are needed to enhance the performance of automatic FER models for practical use in psychotherapeutic apps. Nevertheless, this study makes an important first step toward advancing emotion-focused psychotherapeutic interventions via smartphone apps.





© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.