Currently submitted to: Journal of Medical Internet Research

Date Submitted: Feb 26, 2026
Open Peer Review Period: Feb 27, 2026 - Apr 24, 2026
(currently open for review)

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Artificial Intelligence for Predicting Patient Reported Outcome Measures (PROMs) Scores from Free Text: A Proof-of-Concept Study with the EuroQol-5D-3L and Transformer Models

  • Arivazhagan Karunakaran
  • Mehul Motani
  • Evangelos Kontopantelis
  • Jose Valderas

ABSTRACT

Background:

Patient-reported outcome measures (PROMs) have become an important tool for measuring a patient's health status from the patient's own perspective; however, they are typically collected using standardized questionnaires, which do not account for each patient's unique experience of health. Recent advances in natural language processing (NLP) offer new possibilities for extracting PROM scores from unstructured, free-text patient narratives; however, the feasibility of this task and the minimum data it requires remain uncertain.

Objective:

To assess the practicality of transformer-based models for predicting EuroQol EQ-5D-3L scores from patient narratives, and to evaluate minimum data requirements and the effects of narrative length and data augmentation.

Methods:

This proof-of-concept study used synthetically generated patient narratives to evaluate methodological feasibility. Three transformer models (BERT, BioBERT, DistilBERT) were fine-tuned for regression on patient narratives representing all 243 EQ-5D-3L health states. Model performance was compared across a range of sample sizes (n=100–850), narrative lengths (100–1000 words), and data augmentation conditions. Performance was assessed through fivefold cross-validation and additional validation on datasets created by ChatGPT and DeepSeek.
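
As a rough illustration of the study design described above, the sketch below enumerates the 243 EQ-5D-3L health states (five dimensions, three levels each, so 3^5 = 243) and partitions a sample of 850 narratives into five cross-validation folds. It is not the authors' code: the function names, the string encoding of a health state (for example "11111"), and the shuffling seed are assumptions made for illustration only.

```python
from itertools import product
import random

# The five EQ-5D-3L dimensions; each is rated at one of three levels.
DIMENSIONS = (
    "mobility",
    "self_care",
    "usual_activities",
    "pain_discomfort",
    "anxiety_depression",
)
LEVELS = (1, 2, 3)


def all_health_states():
    """Return every health state as a 5-digit string such as '11111'."""
    return [
        "".join(str(level) for level in combo)
        for combo in product(LEVELS, repeat=len(DIMENSIONS))
    ]


def five_fold_indices(n_samples, n_folds=5, seed=0):
    """Shuffle sample indices and deal them into n_folds disjoint folds.

    The seed is a hypothetical choice; the abstract does not report one.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


states = all_health_states()
folds = five_fold_indices(850)  # 850 is the largest sample size in the study
print(len(states), [len(fold) for fold in folds])
```

In each cross-validation round, one fold would serve as the held-out set and the remaining four as training data.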

Results:

Each model was able to predict EQ-5D-3L scores under every data configuration tested (n=100-850 patients; 100-1000-word narratives). The best results were obtained when the models were trained on 100-word narratives from the largest sample (n=850): mean squared error=0.03 (95% CI: 0.02-0.04), mean absolute error=0.13 (95% CI: 0.13-0.15), explained variance=0.77 (95% CI: 0.64-0.77), and intraclass correlation coefficient=0.85 (95% CI: 0.81-0.87). Models trained on shorter narratives (100 words) performed better than those trained on longer narratives (100-1000 words). Data augmentation also improved predictive performance.
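
For readers unfamiliar with the regression metrics reported above, the following minimal sketch shows how three of them are conventionally computed from true and predicted scores. It uses the standard textbook definitions and made-up toy values, not data from the study. The intraclass correlation coefficient is left out because the abstract does not state which ICC form was used.

```python
from statistics import fmean, pvariance


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between true and predicted scores."""
    return fmean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


def mean_absolute_error(y_true, y_pred):
    """Average of absolute differences between true and predicted scores."""
    return fmean([abs(t - p) for t, p in zip(y_true, y_pred)])


def explained_variance(y_true, y_pred):
    """1 - Var(residuals) / Var(y_true), using population variances."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1.0 - pvariance(residuals) / pvariance(y_true)


# Toy values for illustration only; these are not results from the study.
y_true = [0.0, 0.5, 1.0]
y_pred = [0.1, 0.5, 0.9]

mse = mean_squared_error(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
ev = explained_variance(y_true, y_pred)
print(round(mse, 4), round(mae, 4), round(ev, 4))
```

Lower values of the two error metrics and an explained variance closer to 1 both indicate closer agreement between predicted and true scores.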

Conclusions:

Transformer models show promise for predicting EQ-5D-3L PROM scores from synthetically generated patient narratives; reliable performance required a minimum of 250 patients, each providing a narrative of around 100 words. The work provides both a methodological basis and empirical standards for AI-based PROM systems. However, clinical implementation will require validation using real patient-authored narratives prior to adoption. If validated, this approach could provide evidence to support incorporating patients' narrative accounts of their experience into standardized outcome measures and to support patient-centred healthcare evaluation.


Citation

Please cite as:

Karunakaran A, Motani M, Kontopantelis E, Valderas J

Artificial Intelligence for Predicting Patient Reported Outcome Measures (PROMs) Scores from Free Text: A Proof-of-Concept Study with the EuroQol-5D-3L and Transformer Models

JMIR Preprints. 26/02/2026:94142

DOI: 10.2196/preprints.94142

URL: https://preprints.jmir.org/preprint/94142


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.