Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Aug 4, 2023
Open Peer Review Period: Aug 4, 2023 - Sep 29, 2023
Date Accepted: Nov 20, 2023
The final, peer-reviewed published version of this preprint can be found here:

How Can the Clinical Aptitude of AI Assistants Be Assayed?

Thirunavukarasu A

How Can the Clinical Aptitude of AI Assistants Be Assayed?

J Med Internet Res 2023;25:e51603

DOI: 10.2196/51603

PMID: 38051572

PMCID: 10731545

How can the clinical aptitude of artificial intelligence assistants be assayed?

  • Arun Thirunavukarasu

ABSTRACT

Large language models are exhibiting remarkable performance in clinical contexts, with exemplar results ranging from expert-level attainment in medical examinations to answering patient queries on a social media website with greater accuracy and relevance than real doctors. Deployment of large language models in conventional healthcare settings is yet to be reported, and there remains an open question as to what evidence should be required before such deployment is warranted. Early validation studies use unvalidated surrogate variables to represent clinical aptitude, and it may be necessary to conduct prospective randomised controlled trials to justify use of a large language model for clinical advice or assistance, as potential pitfalls and pain points cannot be exhaustively predicted. As large language models continue to revolutionise the field, there is an opportunity to improve the rigour of artificial intelligence research to reward innovation resulting in real benefit to real patients.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.