Testing Artificial Intelligence algorithms in the real world: Learnings from the SMARTI trial
ABSTRACT
Background:
A number of studies have shown promising performance of artificial intelligence (AI) algorithms for lesion diagnosis in skin cancer. To date, none of these have assessed algorithm performance in the real-world setting.
Objective:
To evaluate practical issues of implementing a convolutional neural network developed by MoleMap Ltd and Monash University eResearch in the clinical setting.
Methods:
Participants were recruited from the Alfred Hospital and Skin Health Institute, Melbourne from 1 November 2019 to 30 May 2021. Any skin lesions of concern and at least two additional lesions were imaged using a proprietary dermoscopic camera. Images were uploaded directly to the study database by the research nurse via a custom interface installed on a clinic laptop. Doctors recorded their diagnosis and management plan for each lesion in real time. A pre-post study design was used. In the pre-intervention period, treating doctors were blinded to AI lesion assessment. An interim safety analysis for AI accuracy was then performed. In the post-intervention period, the AI algorithm classified lesions as benign, malignant or uncertain after the doctors’ initial assessment had been made. Doctors then had the opportunity to record an updated diagnosis and management plan. After discussing the AI diagnosis with the patient, a final management plan was agreed upon (Figure 1). Two doctors saw each patient and entered their diagnoses independently: first a dermatology trainee, then a consultant dermatologist. Each lesion was later assessed remotely by a teledermatologist.
Results:
Participants at both sites were high risk: 79% had a previous melanoma and 14% were transplant recipients. In total, 743 lesions were imaged in 214 participants. Diagnoses and management decisions were provided by 28 dermatology trainees and 17 consultant dermatologists, and remote assessments by 3 experienced teledermatologists. Histopathology confirmed 45 melanomas. A dedicated research nurse was essential to oversee study processes, maintain study documents and assist with clinical workflow. In cases where AI algorithm and consultant dermatologist diagnoses were discordant, participant anxiety was an important factor in the final agreed decision on whether to biopsy.
Conclusions:
Whilst AI algorithms are likely to be of most use in the primary care setting, higher event rates in specialist settings are important for the initial assessment of algorithm safety and accuracy. This study highlighted the importance of considering workflow issues and doctor-patient-AI interactions prior to larger scale trials in community-based practices.
Clinical Trial: ClinicalTrials.gov identifier: NCT04040114
Trial Coordinating Centre: Melanoma and Skin Cancer Trials Limited
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.