Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Sep 21, 2025
Open Peer Review Period: Sep 22, 2025 - Nov 17, 2025
Date Accepted: Feb 14, 2026

The final, peer-reviewed published version of this preprint can be found here:

Artificial Intelligence for Identifying Patient-Reported Outcome and Experience Measures in Oncology: Retrospective Cross-Sectional Study Using ClinicalTrials.gov

Soyer J, Hecini A, Juchet S, Desvignes-Gleizes C, Thiebaut M, Bertocchio JP

J Med Internet Res 2026;28:e84533

DOI: 10.2196/84533

PMID: 15886485

Artificial Intelligence for Identifying Patient-Reported Outcome and Experience Measures in Oncology: Retrospective Cross-Sectional Study Using ClinicalTrials.gov

  • Jessica Soyer; 
  • Akram Hecini; 
  • Sylvain Juchet; 
  • Céline Desvignes-Gleizes; 
  • Maxime Thiebaut; 
  • Jean-Philippe Bertocchio

ABSTRACT

Background:

Traditional healthcare models have evolved to increasingly recognize patients' perspectives as key to improving the quality of care, especially in oncology. In this context, Patient-Centered Measures (PCMs), including Patient-Reported Outcome Measures (PROMs) and Patient-Reported Experience Measures (PREMs), can enhance communication between patients and healthcare providers (HCPs) while facilitating individualized care. This tailored approach not only improves patient outcomes but also underscores the importance of research methodologies that actively account for variability in patient experiences, especially across different sociodemographic and clinical backgrounds. However, identifying PCMs in clinical research has traditionally been time-consuming.

Objective:

We hypothesized that artificial intelligence (AI) could automate and accelerate this identification. We aimed to estimate the proportion of clinical studies that included PROMs/PREMs, using either a traditional expert-based identification method or an AI-enriched approach.

Methods:

In a retrospective cross-sectional study using the ClinicalTrials.gov database, we focused on oncology studies between 2012 and 2022. Two methods were assessed for identifying PROMs/PREMs: i) a traditional expert-based method, in which an algorithm identified PROMs/PREMs from a list of 346 oncology-specific PROMs/PREMs (extracted from the PROQOLID™ database, Mapi Research Trust) and/or 11 PROM/PREM-specific terms; and ii) an AI-enriched method using a Bidirectional Encoder Representations from Transformers (BERT) model trained on 2,399 outcomes labeled by experts. To evaluate the performance of the algorithms, we compared the results returned by each method with the gold standard (expert decision). Each study was classified as using (or not using) at least one PROM/PREM and was described accordingly (logistic regression). To identify which PROMs/PREMs were most frequently used in clinical research, a Named Entity Recognition (NER) model was then applied.
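The list-and-term matching described for the traditional method can be pictured with a minimal Python sketch. It is not the authors' algorithm: the instrument and term lists below are short hypothetical stand-ins (the 346-instrument list and the 11 terms are not given in this abstract), and the function names are invented for illustration.

```python
import re
from typing import Iterable

# Hypothetical stand-ins: the study's actual 346-instrument list and
# 11 PROM/PREM-specific terms are not reproduced in the abstract.
INSTRUMENTS = ["EORTC QLQ-C30", "FACT-G"]
GENERIC_TERMS = ["patient-reported", "quality of life", "questionnaire"]


def compile_pattern(phrases: Iterable[str]) -> "re.Pattern[str]":
    """Build one case-insensitive pattern that matches any whole phrase."""
    # Longest phrases first so longer names win over shorter overlapping ones.
    escaped = sorted((re.escape(p) for p in phrases), key=len, reverse=True)
    # Lookarounds act as word boundaries that tolerate hyphens inside names.
    return re.compile(r"(?<!\w)(?:" + "|".join(escaped) + r")(?!\w)", re.IGNORECASE)


PATTERN = compile_pattern(INSTRUMENTS + GENERIC_TERMS)


def outcome_mentions_pcm(outcome_text: str) -> bool:
    """True if a single outcome description names an instrument or term."""
    return PATTERN.search(outcome_text) is not None


def study_uses_pcm(outcomes: Iterable[str]) -> bool:
    """A study counts as using a PROM/PREM if at least one outcome matches."""
    return any(outcome_mentions_pcm(text) for text in outcomes)


if __name__ == "__main__":
    example = ["Overall survival at 24 months",
               "Change in EORTC QLQ-C30 global health score"]
    print(study_uses_pcm(example))
```

A matcher of this kind only finds what is already on its lists, which is one plausible reason a classifier trained on expert-labeled outcomes could recover additional studies.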

Results:

A total of 24,491 studies were included. According to the traditional expert-based algorithm, 7,549 studies (31%) used at least one PROM/PREM, compared with 8,029 studies (33%) identified by the AI-enriched algorithm. This proportion increased from 2012 to 2022 (Chi-squared test, p<0.001). With 90% accuracy, the AI-enriched algorithm outperformed the traditional algorithm (83%) in identifying PROMs/PREMs (Chi-squared test, p<0.001). Breast and digestive cancers accounted for nearly 50% of all oncology studies using PROMs/PREMs, with the EORTC QLQ-C30 being the most frequently used instrument. As expected, phase 2 to 4 trials included PROMs/PREMs more frequently than preclinical or early phase 1 studies (OR[95% CI]=1.8[1.1-2.8] for phase 2; 3.6[2.3-5.8] for phase 3; 2.6[1.6-4.4] for phase 4). Among observational studies, cross-sectional and prospective studies incorporated PROMs/PREMs more than three times as frequently as retrospective studies (OR[95% CI]=4.6[3.3-6.4] and 3.2[2.5-4.1], respectively).
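To make the reported quantities concrete, the sketch below shows how accuracy against a gold standard and an odds ratio with a 95% confidence interval can be computed. It is a simplification under stated assumptions: the study reports odds ratios from logistic regression, whereas this example uses an unadjusted 2x2 table with a Wald interval, and all counts and labels are made-up toy values, not study data.

```python
import math
from typing import Sequence, Tuple


def accuracy(predicted: Sequence[bool], gold: Sequence[bool]) -> float:
    """Share of items where the algorithm agrees with the expert label."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int,
                       z: float = 1.96) -> Tuple[float, float, float]:
    """Unadjusted odds ratio and Wald CI from a 2x2 table.

    a: group of interest, with PROM/PREM     b: group of interest, without
    c: reference group, with PROM/PREM       d: reference group, without
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive for a Wald interval")
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of the log odds ratio
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


if __name__ == "__main__":
    # Toy labels (hypothetical): 3 of 4 predictions agree with the experts.
    print(accuracy([True, False, True, True], [True, False, False, True]))  # 0.75
    # Toy counts (hypothetical), not taken from the study.
    print(odds_ratio_wald_ci(20, 10, 10, 20))
```

An odds ratio compares odds rather than raw frequencies, so values such as 4.6 or 3.2 describe how much higher the odds of including a PROM/PREM are relative to the reference group.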

Conclusions:

Both approaches relied on human expertise; however, the approach that involved training an AI model proved more effective than the traditional method in identifying PROMs/PREMs in clinical studies. Future studies should explore the applicability of this approach to a broader range of pathologies and to other open-access databases.

Clinical Trial: NA


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.