Currently submitted to: JMIR Preprints
Date Submitted: Mar 24, 2026
Open Peer Review Period: Mar 24, 2026 - Mar 9, 2027
(currently open for review)
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
The Matchmaker's Dilemma: A Survey of Retrieval, Ranking, and LLM Systems in Clinical Trial Matching
ABSTRACT
Background:
Clinical trials are essential for evaluating medical interventions, yet approximately 80% fail to meet enrollment targets on time. Matching patients to suitable trials requires reviewing unstructured clinical notes against complex eligibility criteria that can contain dozens of inclusion and exclusion conditions. AI methods promise to automate this matching at scale, but each approach involves fundamental tradeoffs among speed, accuracy, interpretability, and auditability.
Objective:
This technical narrative review surveys the landscape of AI approaches to patient-trial matching, tracing the evolution from rule-based systems through sparse retrieval, dense retrieval, cross-encoder reranking, knowledge graphs, and large language model approaches. We identify critical gaps and provide an integrative framework for understanding tradeoffs across methods.
Methods:
We searched PubMed, IEEE Xplore, ACL Anthology, and arXiv using terms including "clinical trial matching," "patient recruitment AI," "eligibility criteria NLP," and "trial-patient retrieval" for literature published from 2015 through 2026. Through a technical narrative review, we synthesized algorithmic approaches, commercial platforms, evaluation methodologies, and explainability requirements.
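As a reproducibility aid, the following minimal Python sketch shows how one arm of this search could be reissued against PubMed via Biopython's Entrez wrapper; the query string, result cap, and contact email are illustrative placeholders, not artifacts of the original search.

from Bio import Entrez

Entrez.email = "you@example.org"  # placeholder; NCBI requires a contact address

# Combine the survey's stated search terms into one Boolean query.
query = ('"clinical trial matching" OR "patient recruitment AI" '
         'OR "eligibility criteria NLP" OR "trial-patient retrieval"')

# ESearch returns matching PubMed IDs; datetype="pdat" restricts results
# by publication date to the review's 2015-2026 window.
handle = Entrez.esearch(db="pubmed", term=query, retmax=100,
                        mindate="2015", maxdate="2026", datetype="pdat")
record = Entrez.read(handle)
handle.close()
print(record["Count"], "records; first IDs:", record["IdList"][:5])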
Results:
We identified seven algorithmic paradigms: rule-based systems, machine learning classifiers, BM25 sparse retrieval, BERT-based dense retrieval, cross-encoder reranking, large language models (including TrialGPT, which reports 87.3% criterion-level accuracy), and knowledge graph approaches. We catalogued sixteen commercial platforms and documented the TREC Clinical Trials Track and n2c2 2018 benchmarks. Each method resolves the core tradeoff differently: BM25 offers speed without semantic understanding; LLMs offer flexibility without auditability; hybrid architectures distribute these tradeoffs across components.
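To make the hybrid tradeoff concrete, here is a minimal Python sketch of a retrieve-then-rerank pipeline: BM25 produces a fast lexical shortlist, and a cross-encoder rescores it semantically. The trial texts, patient summary, and model checkpoint (cross-encoder/ms-marco-MiniLM-L-6-v2) are illustrative assumptions, not systems evaluated in this survey.

from rank_bm25 import BM25Okapi
from sentence_transformers import CrossEncoder

trials = [
    "Phase II trial of pembrolizumab in metastatic NSCLC, ECOG 0-1, no prior immunotherapy.",
    "Observational study of type 2 diabetes patients on metformin monotherapy.",
    "Phase III trial of adjuvant chemotherapy in stage II colon cancer, age 18-75.",
]
patient = "62-year-old with metastatic non-small cell lung cancer, ECOG 1, treatment-naive."

# Stage 1: sparse retrieval. BM25 is fast but purely lexical, so it cannot
# equate "NSCLC" with "non-small cell lung cancer" unless tokens overlap.
bm25 = BM25Okapi([t.lower().split() for t in trials])
scores = bm25.get_scores(patient.lower().split())
shortlist = sorted(range(len(trials)), key=lambda i: -scores[i])[:2]

# Stage 2: cross-encoder reranking. Jointly encoding each patient-trial pair
# recovers semantic matches at higher compute cost; applying the model only
# to the BM25 shortlist is what keeps the hybrid pipeline tractable at scale.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
pair_scores = reranker.predict([(patient, trials[i]) for i in shortlist])
best = max(zip(pair_scores, shortlist))[1]
print("Top match:", trials[best])

The same two-stage pattern generalizes: any fast first-stage retriever, sparse or dense, can feed a slower, more accurate second-stage scorer.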
Conclusions:
Six critical gaps define the frontier: the absence of operational-scale benchmarks, temporal and Boolean reasoning limitations, the tension between LLM flexibility and deterministic auditability, multi-method explainability disclosure requirements, proxy-label governance, and data heterogeneity across EHR systems. The matchmaker's dilemma, balancing competing goods with no perfect solution, frames both the progress and the unresolved challenges in this rapidly evolving field.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license upon publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.