JMIR Preprints #42904: Validation of Three Computer-aided Facial Phenotyping Tools (DeepGestalt, GestaltMatcher, D-Score): A Comparative Diagnostic Accuracy Study


Validation of Three Computer-aided Facial Phenotyping Tools (DeepGestalt, GestaltMatcher, D-Score): A Comparative Diagnostic Accuracy Study

Alisa Maria Vittoria Reiter;
Jean Tori Pantel;
Magdalena Danyel;
Denise Horn;
Claus-Eric Ott;
Martin Atta Mensah

ABSTRACT

Background:

While characteristic facial features provide important clues for finding the correct diagnosis in genetic syndromes, valid assessment can be challenging. The next-generation phenotyping algorithm DeepGestalt analyses patient images and provides syndrome suggestions. GestaltMatcher matches a patient's image to images of other patients with similar facial features. The new D-Score provides a score for the degree of facial dysmorphism.

Objective:

We aimed to benchmark GestaltMatcher and D-Score and compare them to DeepGestalt.

Methods:

Using a retrospective convenience sample of 4796 images of patients with 486 different genetic syndromes (London Medical Database, GestaltMatcher Database, and literature images) and 323 inconspicuous control images, we determined the clinical utility of D-Score, GestaltMatcher, and DeepGestalt, evaluating sensitivity, specificity, accuracy, number of supported diagnoses, and potential biases such as age, sex, and ethnicity.

Results:

DeepGestalt suggested 340 distinct syndromes and GestaltMatcher 1101. The top-30 sensitivity was higher for DeepGestalt (87%) than for GestaltMatcher (76%). DeepGestalt generally assigned lower scores but provided higher scores for patient images than for inconspicuous control images, thus allowing the two cohorts to be separated with an area under the receiver operating characteristic curve (AUROC) of 0.73. GestaltMatcher could not separate the two classes (AUROC 0.55). Trained for this purpose, D-Score achieved the highest discriminatory power (AUROC 0.86). D-Scores increased with the age of the depicted individuals. Males yielded higher D-Scores than females. Ethnicity did not appear to influence D-Scores.
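For readers less familiar with the two metrics reported above, the following is a minimal sketch of how top-k sensitivity and AUROC can be computed. It is not the authors' evaluation code: the function names `top_k_sensitivity` and `auroc`, the syndrome labels, and all scores are hypothetical, and the AUROC is computed with the pairwise (Mann-Whitney) definition, which is assumed here only for illustration.

```python
from typing import Sequence


def top_k_sensitivity(ranked_suggestions: Sequence[Sequence[str]],
                      true_diagnoses: Sequence[str], k: int = 30) -> float:
    """Fraction of cases whose true diagnosis is among the first k suggestions."""
    hits = sum(truth in ranked[:k]
               for ranked, truth in zip(ranked_suggestions, true_diagnoses))
    return hits / len(true_diagnoses)


def auroc(patient_scores: Sequence[float],
          control_scores: Sequence[float]) -> float:
    """Probability that a random patient image scores higher than a random
    control image; ties count as one half (pairwise Mann-Whitney definition)."""
    wins = 0.0
    for p in patient_scores:
        for c in control_scores:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patient_scores) * len(control_scores))


# Hypothetical toy data, not taken from the study.
suggestions = [["A", "B", "C"], ["B", "C", "A"], ["C", "A", "B"], ["D", "E", "F"]]
truths = ["A", "C", "B", "A"]
toy_sensitivity = top_k_sensitivity(suggestions, truths, k=2)

toy_auroc = auroc([0.9, 0.8, 0.4], [0.3, 0.5, 0.4])
```

Under this definition, an AUROC of 0.5 means the scores do not separate patients from controls better than chance, and 1.0 means perfect separation.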

Conclusions:

The systems can be used according to specific needs. Algorithms such as D-Score could help clinicians with constrained resources or limited experience in syndromology decide whether a patient needs further genetic evaluation. Algorithms such as DeepGestalt could support diagnosing rather common genetic syndromes with facial abnormalities, whereas algorithms like GestaltMatcher could aid in identifying rare diagnoses with a characteristic face that are unknown to the clinician or not yet defined.

Citation

Please cite as:

Reiter AMV, Pantel JT, Danyel M, Horn D, Ott CE, Mensah MA

Validation of 3 Computer-Aided Facial Phenotyping Tools (DeepGestalt, GestaltMatcher, and D-Score): Comparative Diagnostic Accuracy Study

J Med Internet Res 2024;26:e42904

DOI: 10.2196/42904

PMID: 38477981

PMCID: 10973953


Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Sep 27, 2022

Date Accepted: Nov 17, 2023
