Detecting Artificial Intelligence–Generated Versus Human-Written Medical Student Essays: Semirandomized Controlled Study

AI-Generated or Human-Written? - Detecting non-human authorship in medical student papers: Controlled Trial

Berin Doru;
Christoph Maier;
Johanna Sophie Busse;
Thomas Lücke;
Judith Schönhoff;
Elena Enax Krumova;
Maria Berger;
Marianne Tokic

ABSTRACT

Background:

Large Language Models (LLMs), exemplified by ChatGPT, have reached a level of sophistication where distinguishing between human and AI-generated texts is challenging.

Objective:

To assess the implications for medical texts, this experimental study investigated the ability of two blinded expert groups, one from medicine and one from the humanities, to differentiate between texts written by medical students and those generated by ChatGPT.

Methods:

The medical experts (n=22) were characterised by content familiarity and the humanities experts (n=13) by expertise in linguistic and formal textual analysis. All experts were presented with two pairs of texts on two different topics, each pair similar in content and structure: one text written by a medical student and the other generated by ChatGPT. They were asked to identify each text as human-written or AI-generated and to give reasons for their decision. They were also asked to rate several characteristics of each text: linguistic quality, style, logical coherence, scientific quality, recognition of knowledge limitations, formulation of future research questions, and spelling and grammatical errors.

Results:

About 70% of all participants correctly identified the text written by ChatGPT. No significant difference was found between the two groups in terms of correct identification. Only 14% of participants misidentified the author in both text pairs. Familiarity with the content did not play a major role; certain features of the writing style were more important in the decision-making process. In particular, characteristics in the linguistic categories of redundancy, repetition, and thread/coherence proved decisive for classifying a text as ChatGPT-generated.

Conclusions:

Authorial style and personal writing features should be investigated further, especially in view of the major change in academic writing brought about by the presence of LLMs.

Clinical Trial:

The project was submitted to the ethics committee of the Ruhr University Bochum, Germany, in April 2023. As this is not a clinical trial on human subjects, no study or trial registration was required.

Citation

Please cite as:

Doru B, Maier C, Busse JS, Lücke T, Schönhoff J, Enax Krumova E, Berger M, Tokic M

Detecting Artificial Intelligence–Generated Versus Human-Written Medical Student Essays: Semirandomized Controlled Study

JMIR Med Educ 2025;11:e62779

DOI: 10.2196/62779

PMID: 40053752

PMCID: 11914838

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Accepted for/Published in: JMIR Medical Education

Date Submitted: May 31, 2024

Date Accepted: Jan 16, 2025
