Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Jan 14, 2025
Date Accepted: Aug 5, 2025

The final, peer-reviewed published version of this preprint can be found here:

Large Language Models for Automating Clinical Trial Criteria Conversion to Observational Medical Outcomes Partnership Common Data Model Queries: Validation and Evaluation Study

Lee KH, Jang S, Kim GJ, Park S, Kim D, Kwon OJ, Lee JH, Kim YH

Large Language Models for Automating Clinical Trial Criteria Conversion to Observational Medical Outcomes Partnership Common Data Model Queries: Validation and Evaluation Study

JMIR Med Inform 2025;13:e71252

DOI: 10.2196/71252

PMID: 41100527

PMCID: 12530336

Large Language Models for Automating Clinical Trial Criteria Conversion to OMOP CDM Queries: Accuracy and Efficiency Evaluation

  • Kye Hwa Lee
  • Sujung Jang
  • Grace Juyun Kim
  • Sukyoung Park
  • Doeun Kim
  • Oh Jin Kwon
  • Jae-Ho Lee
  • Young-Hak Kim

ABSTRACT

Background:

Clinical trials are vital for advancing medical knowledge but often face recruitment challenges. Feasibility assessments based on real-world data (RWD) show promise for improving trial design, but automated conversion of eligibility criteria into database queries remains limited by accuracy and readability issues.

Objective:

This study aimed to develop an automated system that converts free-text eligibility criteria from ClinicalTrials.gov into SQL queries compatible with hospital clinical data stored in the Observational Medical Outcomes Partnership Common Data Model (OMOP CDM) version 5.3. By leveraging GPT-4, we sought to improve accuracy and efficiency while minimizing human intervention.

Methods:

An automated system was developed to process free-text eligibility criteria from ClinicalTrials.gov into OMOP CDM-compliant SQL queries. The workflow included text segmentation, exclusion of non-clinical elements (e.g., consent requirements), text simplification, mapping to standardized concepts, and SQL generation. Using a development set of 30 clinical trials (10 each from breast cancer, diabetes, and cardiovascular disease domains), GPT-4's mapping accuracy was compared against expert mapping. The system was further validated with inclusion criteria from seven highly cited trials, and SQL outputs were tested on Asan Medical Center's OMOP CDM database. Two domain experts assessed performance using predefined criteria.
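The workflow described above can be illustrated with a minimal sketch. It is not the authors' system: the study used GPT-4 for these steps, whereas this example substitutes simple keyword rules so that it runs on its own. The text simplification step is omitted. The keyword list, the term-to-concept lookup, the concept IDs (placeholders, not real OMOP concept IDs), and all function names are assumptions made for illustration. The `condition_occurrence` table is reduced to two of its OMOP CDM columns.

```python
import re
import sqlite3

# Hypothetical keywords marking non-clinical criteria (e.g., consent requirements).
NON_CLINICAL = ("informed consent", "willing to", "able to comply")

# Hypothetical term -> concept_id lookup. The IDs are placeholders, NOT real OMOP IDs.
CONCEPT_LOOKUP = {"type 2 diabetes": 900001, "breast cancer": 900002}


def segment(criteria_text):
    """Split a criteria block into one criterion per line, dropping bullets/numbering."""
    lines = (
        re.sub(r"^\s*(?:[-*•]|\d+[.)])\s*", "", ln).strip()
        for ln in criteria_text.splitlines()
    )
    return [ln for ln in lines if ln]


def is_clinical(criterion):
    """Keep a criterion only if it contains none of the non-clinical keywords."""
    text = criterion.lower()
    return not any(keyword in text for keyword in NON_CLINICAL)


def map_concepts(criterion):
    """Return placeholder concept IDs whose term appears in the criterion."""
    text = criterion.lower()
    return [cid for term, cid in CONCEPT_LOOKUP.items() if term in text]


def build_sql(concept_ids):
    """Build a parameterized query against a simplified condition_occurrence table."""
    placeholders = ", ".join("?" for _ in concept_ids)
    return (
        "SELECT DISTINCT person_id FROM condition_occurrence "
        f"WHERE condition_concept_id IN ({placeholders})"
    )


def run_pipeline(criteria_text, conn):
    """Intersect the patients matching each mapped clinical inclusion criterion."""
    cohort = None
    for criterion in filter(is_clinical, segment(criteria_text)):
        ids = map_concepts(criterion)
        if not ids:
            continue  # unmapped terms would need expert review in practice
        rows = conn.execute(build_sql(ids), ids).fetchall()
        matched = {row[0] for row in rows}
        cohort = matched if cohort is None else cohort & matched
    return cohort or set()


def demo():
    """Run the sketch on a toy in-memory table with three patients."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE condition_occurrence "
        "(person_id INTEGER, condition_concept_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO condition_occurrence VALUES (?, ?)",
        [(1, 900001), (2, 900002), (3, 900001)],
    )
    text = "- Diagnosis of type 2 diabetes\n- Signed informed consent"
    return run_pipeline(text, conn)
```

In this toy run, the consent line is filtered out as non-clinical and the remaining criterion selects the patients carrying the placeholder diabetes concept. A real pipeline would replace the lookup with a mapping to standard vocabulary concepts and would also need to handle dates, measurements, and negation.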

Results:

GPT-4-generated SQL queries demonstrated high structural accuracy (3.99/4.0) and schema compliance (3.89/4.0). However, contextual accuracy scored lower (3.19/4.0), reflecting difficulties in handling complex eligibility conditions. Concept mapping achieved moderate domain-specific accuracy (3.43/4.0) and higher ID correctness (3.79/4.0). These results highlighted GPT-4’s strengths in syntax adherence and schema alignment but revealed challenges in mapping less common terms and executing queries efficiently.

Conclusions:

This study demonstrated GPT-4’s potential to automate the conversion of clinical trial eligibility criteria into OMOP CDM-based SQL queries with high structural accuracy and schema compliance. Despite notable successes in preprocessing and concept mapping, challenges remain in capturing nuanced clinical conditions and optimizing execution efficiency. Further refinements, such as domain-specific fine-tuning and hybrid approaches, are needed to address these gaps and enhance contextual accuracy.




© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.