JMIR Preprints #47737: Evaluating ChatGPT's Performance on UK Standardized Admission Tests: Insights from the BMAT, TMUA, LNAT, and TSA examinations

Evaluating ChatGPT's Performance on UK Standardized Admission Tests: Insights from the BMAT, TMUA, LNAT, and TSA examinations

Panagiotis Giannos;
Orestis Delardas

ABSTRACT

Background:

Large language models, such as ChatGPT by OpenAI, have demonstrated potential in various applications, including education and test preparation. Previous studies have primarily assessed ChatGPT's performance in medical education and professional settings. However, the model's potential in the context of standardized admission tests remains unexplored.

Objective:

This study evaluated ChatGPT's performance on various UK standardized admission tests, including the BioMedical Admissions Test (BMAT), Test of Mathematics for University Admission (TMUA), Law National Aptitude Test (LNAT), and Thinking Skills Assessment (TSA), to understand its potential as an innovative tool for education and test preparation in the UK.

Methods:

We used publicly available resources and official materials to compile a dataset of 509 questions from the BMAT, TMUA, LNAT, and TSA, and assessed ChatGPT's performance on it using the legacy GPT-3.5 model. The evaluation focused on multiple-choice questions to ensure consistency.

Results:

ChatGPT's performance varied across different tests and sections. In BMAT, the model showed stronger performance in Section 1 (up to 65.4% correct) than Section 2 (as low as 4.5% correct), with a maximum candidate percentile of ≤62% in Section 1 and a minimum of ≤1% in Section 2. In TMUA, despite high engagement, the correct answer percentages were low, ranging from 10.5% to 22.2% in Paper 1 and 11.1% to 20.0% in Paper 2, with corresponding candidate percentiles generally below 10%. In LNAT, the model achieved moderately successful performance, with 35.7% and 52.4% correct answer percentages in Paper 1 and Paper 2, respectively. The TSA performance fluctuated across years, with correct answer percentages ranging from 41.9% to 59.5% and the lowest estimated candidate percentile at 9%.

Conclusions:

ChatGPT shows promise as a supplemental tool for education and test preparation in certain subject areas and test formats assessing aptitude, skills, and reading comprehension, such as BMAT Section 1 and LNAT. However, limitations in other areas, such as scientific knowledge and applications in BMAT Section 2 and mathematics in TMUA, indicate the need for continuous development and integration with traditional learning strategies to fully harness its potential. This study contributes to the current application of AI-driven language models in education and encourages further research to optimize their performance and assess their effectiveness in broader educational contexts.

Citation

Please cite as:

Giannos P, Delardas O. Performance of ChatGPT on UK Standardized Admission Tests: Insights From the BMAT, TMUA, LNAT, and TSA Examinations. JMIR Med Educ 2023;9:e47737

DOI: 10.2196/47737

PMID: 37099373

PMCID: 10173042

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Medical Education

Date Submitted: Mar 30, 2023

Date Accepted: Apr 9, 2023

Per the authors' request, the PDF is not available.
