Currently submitted to: JMIR Medical Education

Date Submitted: Mar 16, 2026
Open Peer Review Period: Mar 18, 2026 - May 13, 2026
(currently open for review)

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Using Generative Artificial Intelligence to Aid in Surgery Resident Selection: A Retrospective Comparative Study

  • Anastasia Turner; 
  • Kwadjo Nyarko; 
  • Nada Gawad; 
  • Isabelle Raiche

ABSTRACT

Background:

Surgery resident selection is a resource-intensive process. The advent of generative artificial intelligence (GAI) offers a new possibility to aid in resident selection, increasing the efficiency of file review without the burden of creating a customized machine-learning algorithm.

Objective:

Our study aimed to compare file review of general surgery applicants by GAI to file review by our program’s residency selection committee (RSC).

Methods:

GPT-4o, a publicly available GAI model, was used to score deidentified 2023-2024 Canadian Resident Matching Service (CaRMS) application files to our program based on our RSC’s file review scoresheet. GAI scores were compared to RSC-assigned scores for each application element, including CVs, personal letters, and reference letters. Rank lists generated from the two sets of scores were compared using Spearman’s rank correlation. GPT-4o was then used to create ten generic application files. These were scored by GAI and compared to GAI scores for the 2023-2024 CaRMS applicants using the Wilcoxon rank-sum test.
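The two comparisons described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the score lists below are invented placeholder values, and only the two named tests (Spearman's rank correlation for the rank lists, the Wilcoxon rank-sum test for the score distributions) come from the Methods.

```python
# Hypothetical sketch of the two statistical comparisons named in the Methods.
# `gai_scores` and `rsc_scores` are illustrative per-applicant totals,
# NOT data from the study.
from scipy.stats import spearmanr, ranksums

gai_scores = [24.5, 26.0, 23.1, 25.7, 22.9, 27.3]  # placeholder values
rsc_scores = [17.5, 21.0, 19.8, 14.2, 12.6, 22.4]  # placeholder values

# Spearman's rank correlation: do the two scorers rank applicants similarly?
rho, rho_p = spearmanr(gai_scores, rsc_scores)

# Wilcoxon rank-sum test: do the two score distributions differ?
stat, rs_p = ranksums(gai_scores, rsc_scores)

print(f"Spearman rho = {rho:.2f} (p = {rho_p:.3f})")
print(f"Rank-sum statistic = {stat:.2f} (p = {rs_p:.3f})")
```

With real data one list per application element (CV, personal letter, reference letters) would be tested separately, as the abstract reports per-element correlations.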

Results:

A total of 124 application files were included. Median GAI file review scores were consistently higher than RSC-assigned scores (24.46 vs. 17.54, p<0.05) and had less variance between applicants (6.96 vs. 20.80, p<0.05). Interrater reliability between GAI and RSC scores was poor across all application elements (0.16). Rank lists generated from GAI and RSC scores demonstrated a weakly positive correlation for each application element (0.25 to 0.37, p<0.05). Rank lists based on total file review scores demonstrated a moderately positive correlation (0.44, p<0.05). Median scores for GAI-created files did not differ significantly from those for CaRMS applicant files on application CVs (6.88, p=0.25), but were significantly higher for the other application elements and for global scores (27.51 vs. 24.46, p<0.05).

Conclusions:

GAI in its current form cannot reliably replicate human file review. Further research is needed to determine the potential role for GAI in residency selection.


Citation

Please cite as:

Turner A, Nyarko K, Gawad N, Raiche I

Using Generative Artificial Intelligence to Aid in Surgery Resident Selection: A Retrospective Comparative Study

JMIR Preprints. 16/03/2026:93908

DOI: 10.2196/preprints.93908

URL: https://preprints.jmir.org/preprint/93908


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.