Accepted for/Published in: JMIR Medical Education

Date Submitted: May 31, 2024
Date Accepted: Jan 16, 2025

The final, peer-reviewed published version of this preprint can be found here:

Detecting Artificial Intelligence–Generated Versus Human-Written Medical Student Essays: Semirandomized Controlled Study

Doru B, Maier C, Busse JS, Lücke T, Schönhoff J, Enax Krumova E, Berger M, Tokic M

Detecting Artificial Intelligence–Generated Versus Human-Written Medical Student Essays: Semirandomized Controlled Study

JMIR Med Educ 2025;11:e62779

DOI: 10.2196/62779

PMID: 40053752

PMCID: 11914838

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

AI-Generated or Human-Written? - Detecting non-human authorship in medical student papers: Controlled Trial

  • Berin Doru
  • Christoph Maier
  • Johanna Sophie Busse
  • Thomas Lücke
  • Judith Schönhoff
  • Elena Enax Krumova
  • Maria Berger
  • Marianne Tokic

ABSTRACT

Background:

Large Language Models (LLMs), exemplified by ChatGPT, have reached a level of sophistication where distinguishing between human and AI-generated texts is challenging.

Objective:

To assess the implications for medical texts, the aim of this experimental study was to investigate the ability of two blinded expert groups, one medical and one humanistic, to differentiate between texts written by medical students and those generated by ChatGPT.

Methods:

The medical experts (n=22) were characterised by content familiarity and the humanities experts (n=13) by expertise in linguistic and formal textual analysis. All experts were presented with two pairs of texts on two different topics, each pair similar in content and structure: one text written by a medical student and the other generated by ChatGPT. They were asked to identify each text as human-written or AI-generated and to give reasons for their decision. They were also asked to rate several characteristics of each text: linguistic quality, style, logical coherence, scientific quality, recognition of knowledge limitations, formulation of future research questions, and spelling and grammatical errors.

Results:

About 70% of all participants correctly identified the text written by ChatGPT. No significant difference was found between the two groups in terms of correct identification. Only 14% of participants misidentified the author in both text pairs. Familiarity with the content did not play a major role; certain features of the writing style were more important in the decision-making process. In particular, characteristics in the linguistic categories of redundancy, repetition, and thread/coherence proved to be decisive for the acceptance of a ChatGPT text.
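The abstract does not state which statistical test was used to compare the two expert groups. As a minimal sketch of how such a comparison could be run for small groups, the Python snippet below implements a two-sided Fisher exact test on a 2x2 table of correct versus incorrect identifications. The function name and the counts are hypothetical and chosen only for illustration; they are not the study's data.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are groups, columns are (correct, incorrect) identifications.
    """
    row1, row2 = a + b, c + d
    col1 = a + c                      # total correct answers
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability that group 1 has x correct answers
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts, NOT taken from the study:
# group 1 (22 raters): 15 correct, 7 incorrect
# group 2 (13 raters):  9 correct, 4 incorrect
p_value = fisher_exact_two_sided(15, 7, 9, 4)
print(f"two-sided p-value: {p_value:.3f}")
```

With group sizes like these, an exact test avoids the large-sample approximation that a chi-square test relies on. A p-value above the chosen significance level would be read as "no significant difference" between the groups.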

Conclusions:

Authoring style and personal writing features should be investigated further, especially in view of the major change in academic writing emerging with the presence of LLMs.

Clinical Trial:

The project was submitted to the ethics committee of the Ruhr University Bochum, Germany, in April 2023. As this is not a clinical trial on human subjects, no study or trial registration was required.



© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.