
Accepted for/Published in: JMIR Formative Research

Date Submitted: May 29, 2022
Open Peer Review Period: May 27, 2022 - Jun 8, 2022
Date Accepted: Aug 9, 2022
Date Submitted to PubMed: Aug 19, 2022

The final, peer-reviewed published version of this preprint can be found here:

Training and Profiling a Pediatric Facial Expression Classifier for Children on Mobile Devices: Machine Learning Study

Banerjee A, Mutlu OC, Kline A, Washington P, Wall D, Surabhi S

JMIR Form Res 2023;7:e39917

DOI: 10.2196/39917

PMID: 35962462

PMCID: 10131663

Training and Profiling a Pediatric Facial Expression Classifier for Children on Mobile Devices: Machine Learning Study

  • Agnik Banerjee; 
  • Onur Cezmi Mutlu; 
  • Aaron Kline; 
  • Peter Washington; 
  • Dennis Wall; 
  • Saimourya Surabhi

ABSTRACT

Background:

Implementing automated emotion recognition on mobile devices could provide an accessible diagnostic and therapeutic tool for those who struggle to recognize emotion, including children with developmental behavioral conditions such as autism. Although recent advances have been made in building more accurate emotion classifiers, existing models are too computationally expensive to be deployed on smartphones.

Objective:

In this study, we explored the deployment of several state-of-the-art emotion classifiers designed for use on mobile devices. We additionally explored various post-training optimization techniques, evaluating their effects on both classification performance and on-device efficiency using a Motorola Moto G6 phone.

Methods:

We collected images from 12 public datasets and used video frames crowdsourced from the GuessWhat app to train our classifiers. All images were annotated for 7 emotions: neutrality, fear, happiness, sadness, surprise, anger, and disgust. We tested 2 copies of each of 5 convolutional neural network architectures: MobileNetV3-Small 1.0x, MobileNetV2 1.0x, EfficientNetB0, MobileNetV3-Large 1.0x, and NASNetMobile. The first copy was trained on images from the public datasets only, whereas the second copy was additionally trained on the GuessWhat frames. We evaluated each model against the Child Affective Facial Expression (CAFE) set. We then performed weight pruning, weight clustering, and quantization-aware training where possible and profiled the performance of each model on the Moto G6.
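One of the optimizations named above, 8-bit quantization, maps each layer's float32 weights onto int8 values via a per-tensor scale and zero point. The following is a minimal NumPy sketch of that affine mapping, not code from the study itself; the function names are illustrative:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Affine (asymmetric) quantization of a float32 tensor to int8.

    Maps the observed range [w_min, w_max] onto the integer range
    [-128, 127] using a single scale and zero point for the tensor.
    """
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / 255.0          # float step per int8 step
    zero_point = int(round(-128.0 - w_min / scale))
    q = np.clip(np.round(weights / scale) + zero_point, -128, 127)
    return q.astype(np.int8), scale, zero_point

def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Recover approximate float32 values from the int8 tensor."""
    return (q.astype(np.float32) - zero_point) * scale
```

Quantization shrinks weight storage roughly 4x (int8 vs float32) at the cost of a bounded rounding error per weight; quantization-aware training, as used in the study, simulates this rounding during training so the network can learn to compensate for it.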

Results:

Our best model, a MobileNetV3-Large network pretrained on ImageNet and trained on all public datasets plus the GuessWhat images, achieved 65.21% balanced accuracy and a 64.50% F1-score on CAFE while achieving a 90-millisecond inference latency on the Moto G6. This balanced accuracy is only 1.89% lower than that of the current state-of-the-art model for CAFE, which has 13.91x more parameters and, even when fully optimized, was unable to run on the Moto G6 due to its size.
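Latency figures like the 90 ms above come from on-device profiling; the usual pattern is to discard a few warm-up runs (which absorb one-time costs such as interpreter setup and cache warming) and then average wall-clock time over many timed runs. A minimal sketch of such a harness, with illustrative names not taken from the study:

```python
import time

def mean_latency_ms(infer, batch, warmup: int = 10, runs: int = 100) -> float:
    """Average wall-clock inference latency in milliseconds.

    `infer` is any callable taking one input batch. Warm-up runs are
    executed but not timed, so one-time setup costs do not skew the mean.
    """
    for _ in range(warmup):
        infer(batch)
    start = time.perf_counter()
    for _ in range(runs):
        infer(batch)
    return (time.perf_counter() - start) / runs * 1000.0
```

Averaging over many runs smooths out scheduler jitter, which on a phone-class CPU can easily exceed the latency differences between candidate models.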

Conclusions:

This work demonstrates that with specialized design and optimization techniques, machine learning models can become lightweight enough to run on mobile devices while achieving high performance on difficult image classification tasks. The models developed in this study can be integrated into mobile health therapies to diagnose autism spectrum disorder (ASD) and to provide targeted therapeutic treatment to children.




© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license upon publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.