Assessment of Large Language Model Performance on Medical School Essay-Style Concept Appraisal Questions: An Exploratory Study
Seysha Mehta;
Eliot N. Haddad;
Indira Bhavsar Burke;
Alana K. Majors;
Rie Maeda;
Sean M. Burke;
Abhishek Deshpande;
Amy Nowacki;
Christina C. Lindenmeyer;
Neil Mehta
ABSTRACT
Microsoft Copilot, a ChatGPT 4.0-based large language model, performed comparably to medical students in answering essay-style CAPPs, and assessors struggled to differentiate AI-generated responses from human ones. These results highlight the need to prepare students and educators for a future shaped by AI by fostering reflective learning practices and critical thinking.
Citation
Please cite as:
Mehta S, Haddad EN, Burke IB, Majors AK, Maeda R, Burke SM, Deshpande A, Nowacki A, Lindenmeyer CC, Mehta N
Assessment of Large Language Model Performance on Medical School Essay-Style Concept Appraisal Questions: Exploratory Study