JMIR Preprints #98279: A Recursive Learning Architecture for Zero-Shot Automated Clinical Coding, a methodological study

A Recursive Learning Architecture for Zero-Shot Automated Clinical Coding, a methodological study

Natalia Castaño Villegas;
Raul Escandon;
Katherine Monsalve;
Jose Zea;
Laura Velasquez

ABSTRACT

Background:

Automated clinical coding with large language models has shown promise, but most approaches depend on supervised fine-tuning, static label spaces, or opaque prediction mechanisms that are difficult to audit and update. These limitations are particularly relevant in ICD-10-CM coding, where models must navigate complex documentation patterns, ambiguity, and evolving coding rules. Recursive learning architectures may offer an alternative by enabling systems to improve through explicit natural-language memory rather than parameter updates.

Objective:

This study evaluated whether a recursive learning architecture with an externalized Learning File could improve zero-shot ICD-10-CM coding performance on discharge summaries, while preserving interpretability and enabling analysis of longitudinal learning dynamics.

Methods:

We developed PANDORA, a zero-shot coding system composed of a Coder, a Reviewer, and a persistent natural-language Learning File derived from prior coding errors. Using discharge summaries from MIMIC-IV and a Top-50 ICD-10-CM benchmark, we compared a no-memory baseline (Phase 1) against a memory-augmented configuration (Phase 4). Performance was assessed across 20 recursive training iterations and on a held-out testing set of 500 cases, using micro-F1, macro-F1, precision, and recall at both exact-code and ICD-3 levels. Error composition, representative memory-guided decisions, and temporal degradation associated with memory growth were also analyzed.
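The Coder, Reviewer, and Learning File loop described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the keyword-based `coder`, the `reviewer` that compares predictions against gold codes, the `SUPPRESS ... WHEN ...` rule format, and the two example notes are hypothetical stand-ins, not PANDORA's actual prompts, rules, or data. The only point it demonstrates is that behavior changes between iterations through a human-readable memory rather than through parameter updates.

```python
from dataclasses import dataclass, field

# Hypothetical keyword lookup standing in for an LLM-based Coder.
KEYWORD_TO_CODE = {"hypertension": "I10", "diabetes": "E11.9", "sepsis": "A41.9"}
CODE_TO_KEYWORD = {code: kw for kw, code in KEYWORD_TO_CODE.items()}


@dataclass
class LearningFile:
    """Persistent, human-readable memory: one natural-language rule per entry."""
    rules: list[str] = field(default_factory=list)

    def add(self, rule: str) -> None:
        if rule not in self.rules:  # skip verbatim duplicates
            self.rules.append(rule)


def coder(note: str, memory: LearningFile) -> set[str]:
    """Assign codes by keyword, then apply suppression rules from memory."""
    text = note.lower()
    predicted = {code for kw, code in KEYWORD_TO_CODE.items() if kw in text}
    for rule in memory.rules:
        head, _, phrase = rule.partition(" WHEN ")
        if head.startswith("SUPPRESS ") and phrase and phrase in text:
            predicted.discard(head.removeprefix("SUPPRESS "))
    return predicted


def reviewer(note: str, predicted: set[str], gold: set[str]) -> list[str]:
    """Turn false positives into rules when the note says the condition was ruled out."""
    lessons = []
    for code in sorted(predicted - gold):
        phrase = f"{CODE_TO_KEYWORD[code]} ruled out"
        if phrase in note.lower():
            lessons.append(f"SUPPRESS {code} WHEN {phrase}")
    return lessons


def run_iterations(cases, memory: LearningFile, n_iterations: int) -> list[int]:
    """Return the false-positive count per iteration as memory accumulates."""
    history = []
    for _ in range(n_iterations):
        false_positives = 0
        for note, gold in cases:
            predicted = coder(note, memory)
            false_positives += len(predicted - gold)
            for rule in reviewer(note, predicted, gold):
                memory.add(rule)
        history.append(false_positives)
    return history


# Toy training cases (invented for this sketch).
cases = [
    ("Sepsis ruled out. Known hypertension.", {"I10"}),
    ("Type 2 diabetes, sepsis on admission.", {"E11.9", "A41.9"}),
]
memory = LearningFile()
fp_history = run_iterations(cases, memory, n_iterations=2)
print(fp_history)    # false positives per iteration
print(memory.rules)  # the learned, human-readable rules
```

In this toy run the single learned rule removes the false positive in the second pass while leaving the true sepsis code in the other note untouched, which mirrors the precision-oriented behavior reported in the Results.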

Results:

In the held-out testing set, the memory-augmented system improved exact-code micro-F1 from 0.307 to 0.527 and precision from 0.203 to 0.515, while recall decreased from 0.630 to 0.540. At the ICD-3 level, micro-F1 improved from 0.372 to 0.560. Across training iterations, the memory-augmented condition achieved an exact-code micro-F1 of 0.605 versus 0.318 in the no-memory baseline. Gains were driven primarily by large reductions in false positives, indicating that the Learning File improved precision more than recall. A qualitative review showed that the system used accumulated rules to suppress unsupported codes and to recover context-sensitive diagnoses. However, performance declined after iteration 10 as the Learning File grew larger and less discriminative, suggesting that memory bloat is an important failure mode of recursive learning.
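As a reading aid, the sketch below shows one common way to compute micro-averaged precision, recall, and F1 for multi-label code sets at the exact-code and ICD-3 levels, and checks that the reported exact-code F1 values are consistent with the reported precision and recall. It assumes that the reported precision and recall are micro-averaged and that the ICD-3 level means truncating each code to its three-character category; the helper names and the toy gold and predicted sets are hypothetical and are not study data.

```python
def micro_prf(gold: list[set[str]], pred: list[set[str]]) -> tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 over per-case code sets."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    fp = sum(len(p - g) for g, p in zip(gold, pred))
    fn = sum(len(g - p) for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def to_icd3(codes: set[str]) -> set[str]:
    """Collapse ICD-10-CM codes to their three-character category (assumed ICD-3 level)."""
    return {code.split(".")[0][:3] for code in codes}


def f1_from(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Toy example (invented): I50.22 vs I50.9 is wrong at the exact level, right at ICD-3.
gold = [{"I50.9", "E11.9"}, {"N17.9"}]
pred = [{"I50.22", "E11.9", "I10"}, {"N17.9"}]

exact = micro_prf(gold, pred)
icd3 = micro_prf([to_icd3(g) for g in gold], [to_icd3(p) for p in pred])
print("exact-code P/R/F1:", [round(x, 3) for x in exact])
print("ICD-3      P/R/F1:", [round(x, 3) for x in icd3])

# Consistency check on the reported held-out values (precision, recall from the text).
print("baseline F1:", round(f1_from(0.203, 0.630), 3))
print("memory   F1:", round(f1_from(0.515, 0.540), 3))
```

The harmonic-mean check reproduces the reported exact-code micro-F1 values of 0.307 and 0.527, and it makes the trade-off explicit: the gain comes from precision rising far more than recall falls.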

Conclusions:

A recursive learning architecture with explicit natural-language memory substantially improved zero-shot ICD-10-CM coding performance, primarily through better precision and more controlled code assignment. The approach offers transparency benefits because improvements can be traced to human-readable learned rules rather than hidden parameter changes. However, recursive systems require active memory governance, as unchecked rule accumulation may degrade performance over time. These findings support memory-based adaptation as a promising direction for interpretable clinical coding systems and other high-stakes clinical NLP tasks.
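The abstract does not specify what memory governance would look like. One possible form, shown here purely as an assumption, is to track how often each rule helped or hurt and to keep only a capped number of net-beneficial rules. The `Rule` fields, the `prune` function, the counters, and the example rules are all hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    text: str
    helped: int = 0  # hypothetical count: times the rule prevented an error
    hurt: int = 0    # hypothetical count: times the rule introduced an error

    @property
    def net_benefit(self) -> int:
        return self.helped - self.hurt


def prune(rules: list[Rule], max_rules: int) -> list[Rule]:
    """Keep at most max_rules rules, dropping any without positive net benefit."""
    useful = [r for r in rules if r.net_benefit > 0]
    useful.sort(key=lambda r: r.net_benefit, reverse=True)
    return useful[:max_rules]


# Invented rules and counts for illustration only.
rules = [
    Rule("Do not code sepsis when it is documented as ruled out", helped=12, hurt=1),
    Rule("Always code I10 if blood pressure is mentioned", helped=2, hurt=9),
    Rule("Code acute kidney injury only with a documented diagnosis", helped=5, hurt=0),
    Rule("Prefer E11.9 when no complication is documented", helped=1, hurt=0),
]
kept = prune(rules, max_rules=2)
for rule in kept:
    print(rule.net_benefit, rule.text)
```

A size cap of this kind is only one option; merging overlapping rules or retiring rules that have not fired recently would be alternatives under the same assumption that rule usefulness can be measured during training.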

Citation

Please cite as:

Castaño Villegas N, Escandon R, Monsalve K, Zea J, Velasquez L

A Recursive Learning Architecture for Zero-Shot Automated Clinical Coding, a methodological study

JMIR Preprints. 16/04/2026:98279

DOI: 10.2196/preprints.98279

URL: https://preprints.jmir.org/preprint/98279

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Currently submitted to: JMIR AI

Date Submitted: Apr 16, 2026

Open Peer Review Period: May 25, 2026 - Jul 20, 2026

(currently open for review)
