Cross-sectional Analysis of the Efficacy of an Artificial Intelligence Application in Dermatological Diagnosis
ABSTRACT
Background:
Dermatology is an ideal specialty for artificial intelligence (AI)-driven image recognition to improve diagnostic accuracy and patient care. This is especially important due to the lack of dermatologists in many parts of the world and the high frequency of cutaneous disorders and malignancies. AI-based applications for the identification of dermatological conditions are widely available, but research assessing their reliability and accuracy is lacking.
Objective:
To analyze the efficacy of an AI application as a preliminary diagnostic tool.
Methods:
This observational cross-sectional study included patients over the age of two years who visited the dermatology clinic. After informed consent was obtained, images of lesions from individuals with various skin disorders were uploaded to the app. The app was used to create a patient profile, identify lesion morphology, plot the lesion location on a human model, and answer questions regarding duration and symptoms. It presented eight differential diagnoses, which were compared with the clinical diagnosis. The model's performance was evaluated using sensitivity, specificity, accuracy, positive predictive value, negative predictive value, and F1 score. Categorical variables were compared using the chi-square test, and P<0.05 was considered statistically significant.
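As an illustration of the performance measures named above, the following minimal sketch computes them from the four cells of a one-vs-rest confusion matrix for a single diagnostic category. The class and function names (`BinaryCounts`, `diagnostic_metrics`) and the example counts are hypothetical and are not taken from the study; the formulas are the standard definitions.

```python
from dataclasses import dataclass


@dataclass
class BinaryCounts:
    """One-vs-rest confusion matrix cells for a single category (hypothetical container)."""
    tp: int  # app and clinician both assign the category
    fp: int  # app assigns the category, clinician does not
    fn: int  # clinician assigns the category, app does not
    tn: int  # neither assigns the category


def safe_div(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(c: BinaryCounts) -> dict:
    """Standard diagnostic accuracy measures from confusion matrix counts."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "sensitivity": safe_div(c.tp, c.tp + c.fn),
        "specificity": safe_div(c.tn, c.tn + c.fp),
        "ppv": safe_div(c.tp, c.tp + c.fp),
        "npv": safe_div(c.tn, c.tn + c.fn),
        "accuracy": safe_div(c.tp + c.tn, total),
        # F1 is the harmonic mean of PPV (precision) and sensitivity (recall).
        "f1": safe_div(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }


# Illustrative counts only; these are NOT data from the study.
example = BinaryCounts(tp=40, fp=10, fn=10, tn=140)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a multi-category setting such as this one, the same calculation would be repeated per category, treating that category as "positive" and all others as "negative".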
Results:
A total of 700 patients were included in the study. A wide variety of skin conditions were grouped into 12 categories. The AI model had a mean top-1 sensitivity of 71% (95% CI 61.5-74.3%), a top-3 sensitivity of 86.1% (95% CI 83.4-88.6%), and an all-8 sensitivity of 95.1% (95% CI 93.3-96.6%). The top-1 sensitivities for skin infestations, disorders of keratinization, other inflammatory conditions, and bacterial infections were 85.7%, 85.7%, 82.7%, and 81.8%, respectively. For photodermatoses and malignant tumors, the top-1 sensitivities were 33.3% and 10%, respectively. In each category, the clinical diagnosis and the app's top diagnosis were significantly associated (P<0.001).
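To make the top-k figures concrete, the sketch below shows one way a top-k sensitivity and a 95% confidence interval could be computed from ranked differential diagnoses. The function names, the toy cases, and the choice of the Wilson score interval are assumptions for illustration; the abstract does not state which interval method was used or how the mean top-1 value was averaged.

```python
import math


def top_k_counts(cases, k):
    """Count cases whose clinical diagnosis appears among the first k app suggestions.

    `cases` is a list of (clinical_diagnosis, ranked_app_diagnoses) pairs.
    Returns (hits, total).
    """
    hits = sum(1 for clinical, ranked in cases if clinical in ranked[:k])
    return hits, len(cases)


def wilson_interval(hits, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage).

    Assumption: the study may have used a different interval method.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = hits / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Toy cases for illustration only; these are NOT study data.
toy_cases = [
    ("scabies", ["scabies", "eczema", "tinea"]),
    ("psoriasis", ["eczema", "psoriasis", "tinea"]),
    ("melanoma", ["nevus", "seborrheic keratosis", "wart"]),
    ("tinea", ["eczema", "psoriasis", "tinea"]),
]

for k in (1, 3):
    hits, total = top_k_counts(toy_cases, k)
    low, high = wilson_interval(hits, total)
    print(f"top-{k}: {hits}/{total} = {hits / total:.2f} (95% CI {low:.2f}-{high:.2f})")
```

With only four toy cases the intervals are very wide; the narrow intervals reported above reflect the much larger sample of 700 patients.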
Conclusions:
The app correctly identified most dermatoses but performed poorly for photodermatoses and malignancies.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.