JMIR Preprints #94142: Artificial Intelligence for Predicting Patient Reported Outcome Measures (PROMs) Scores from Free Text: A Proof-of-Concept Study with the EuroQol-5D-3L and Transformer Models

Artificial Intelligence for Predicting Patient Reported Outcome Measures (PROMs) Scores from Free Text: A Proof-of-Concept Study with the EuroQol-5D-3L and Transformer Models

Arivazhagan Karunakaran;
Mehul Motani;
Evangelos Kontopantelis;
Jose Valderas

ABSTRACT

Background:

Patient-reported outcome measures (PROMs) have become an important tool for measuring a patient's health status from their own perspective; however, they are typically collected with standardized questionnaires, which do not account for each patient's unique experience of health. Recent advances in natural language processing (NLP) offer new possibilities for extracting PROM scores from unstructured, free-text patient narratives; however, the feasibility of this task and the minimum amount of data it requires remain uncertain.

Objective:

To assess the practicality of transformer-based models for predicting EuroQol EQ-5D-3L scores from patient narratives, and to evaluate minimum data requirements and the effects of narrative length and data augmentation.

Methods:

This proof-of-concept study used synthetically generated patient narratives to evaluate methodological feasibility. Three transformer models (BERT, BioBERT, DistilBERT) were fine-tuned for regression on patient narratives representing all 243 EQ-5D-3L health states. Model performance was compared across a range of sample sizes (n=100–850), narrative lengths (100–1000 words), and data augmentation conditions. Performance was assessed through fivefold cross-validation and additional validation on datasets created by ChatGPT and DeepSeek.
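As a rough illustration of two quantities named in the Methods, the sketch below enumerates the 243 EQ-5D-3L health states (five dimensions, three levels each, so 3^5 = 243) and builds a shuffled fivefold split. The function names, the random seed, and the choice of n=850 are assumptions made for illustration; the authors' actual data pipeline is not described in this abstract.

```python
from itertools import product
import random

# The five EQ-5D-3L dimensions, each rated at one of three levels (1-3).
DIMENSIONS = ("mobility", "self_care", "usual_activities",
              "pain_discomfort", "anxiety_depression")


def all_health_states():
    """Return every EQ-5D-3L profile as a 5-digit string, e.g. '11223'."""
    return ["".join(map(str, levels))
            for levels in product((1, 2, 3), repeat=len(DIMENSIONS))]


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for a shuffled k-fold split.

    Hypothetical helper: a plain shuffled split, not necessarily the
    splitting strategy used in the study.
    """
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f_no, fold in enumerate(folds) if f_no != i for j in fold]
        yield train, test


states = all_health_states()
splits = list(k_fold_indices(n_samples=850, k=5))
```

Each narrative would be held out exactly once across the five folds, which is the property that makes the cross-validated error an estimate of out-of-sample performance.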

Results:

Each model was able to predict EQ-5D-3L scores under every data configuration tested (n=100–850 patients; 100–1000-word narratives). The best results were obtained when the models were trained on 100-word narratives from the largest sample (n=850): mean squared error=0.03 (95% CI: 0.02–0.04), mean absolute error=0.13 (95% CI: 0.13–0.15), explained variance=0.77 (95% CI: 0.64–0.77), and intraclass correlation coefficient=0.85 (95% CI: 0.81–0.87). Models trained on shorter narratives (100 words) outperformed those trained on longer narratives (100–1000 words). Data augmentation also improved predictive performance.
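For readers unfamiliar with the four reported metrics, the following is a minimal sketch of how they could be computed from paired true and predicted scores. The abstract does not state which ICC form was used; the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), is assumed here, with the true and predicted scores treated as two raters. The toy values are invented purely to exercise the functions and are not study data.

```python
from statistics import fmean


def mse(y_true, y_pred):
    """Mean squared error."""
    return fmean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


def mae(y_true, y_pred):
    """Mean absolute error."""
    return fmean([abs(t - p) for t, p in zip(y_true, y_pred)])


def _pvar(xs):
    """Population variance."""
    m = fmean(xs)
    return fmean([(x - m) ** 2 for x in xs])


def explained_variance(y_true, y_pred):
    """1 - Var(residuals) / Var(y_true)."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1.0 - _pvar(residuals) / _pvar(y_true)


def icc_2_1(y_true, y_pred):
    """ICC(2,1) with true and predicted scores as two 'raters' (assumed form)."""
    n, k = len(y_true), 2
    rows = list(zip(y_true, y_pred))
    grand = fmean([v for r in rows for v in r])
    # Mean squares from a two-way ANOVA without replication.
    ms_rows = k * sum((fmean(r) - grand) ** 2 for r in rows) / (n - 1)
    ms_cols = n * sum((fmean(c) - grand) ** 2 for c in (y_true, y_pred)) / (k - 1)
    ss_total = sum((v - grand) ** 2 for r in rows for v in r)
    ss_err = ss_total - ms_rows * (n - 1) - ms_cols * (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)


# Toy values for illustration only (not study data).
y_true = [0.0, 0.5, 1.0]
y_pred = [0.1, 0.4, 1.0]
scores = {
    "mse": mse(y_true, y_pred),
    "mae": mae(y_true, y_pred),
    "explained_variance": explained_variance(y_true, y_pred),
    "icc": icc_2_1(y_true, y_pred),
}
```

Unlike a Pearson correlation, the absolute-agreement ICC penalizes systematic offsets between predicted and true scores, which may be why it is reported alongside explained variance.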

Conclusions:

Transformer models show promise for predicting EQ-5D-3L PROM scores from synthetically generated patient narratives; reliable performance required a minimum of 250 patients, each providing a narrative of around 100 words. The work provides both a methodological basis and empirical benchmarks for AI-based PROM systems. However, clinical implementation will require validation on real patient-authored narratives before adoption. If validated, this approach could provide evidence to support incorporating patients' narrative accounts of their experience into standardized outcome measures and to support patient-centred healthcare evaluation.

Citation

Please cite as:

Karunakaran A, Motani M, Kontopantelis E, Valderas J

Artificial Intelligence for Predicting Patient Reported Outcome Measures (PROMs) Scores from Free Text: A Proof-of-Concept Study with the EuroQol-5D-3L and Transformer Models

JMIR Preprints. 26/02/2026:94142

DOI: 10.2196/preprints.94142

URL: https://preprints.jmir.org/preprint/94142

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Currently submitted to: Journal of Medical Internet Research

Date Submitted: Feb 26, 2026

Open Peer Review Period: Feb 27, 2026 - Apr 24, 2026

NOTE: This is an unreviewed Preprint
