Currently submitted to: JMIR AI
Date Submitted: Apr 16, 2026
Open Peer Review Period: May 25, 2026 - Jul 20, 2026
(currently open for review)
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
A Recursive Learning Architecture for Zero-Shot Automated Clinical Coding: A Methodological Study
ABSTRACT
Background:
Automated clinical coding with large language models has shown promise, but most approaches depend on supervised fine-tuning, static label spaces, or opaque prediction mechanisms that are difficult to audit and update. These limitations are particularly relevant in ICD-10-CM coding, where models must navigate complex documentation patterns, ambiguity, and evolving coding rules. Recursive learning architectures may offer an alternative by enabling systems to improve through explicit natural-language memory rather than parameter updates.
Objective:
This study evaluated whether a recursive learning architecture with an externalized Learning File could improve zero-shot ICD-10-CM coding performance on discharge summaries, while preserving interpretability and enabling analysis of longitudinal learning dynamics.
Methods:
We developed PANDORA, a zero-shot coding system composed of a Coder, a Reviewer, and a persistent natural-language Learning File derived from prior coding errors. Using discharge summaries from MIMIC-IV and a Top-50 ICD-10-CM benchmark, we compared a no-memory baseline (Phase 1) against a memory-augmented configuration (Phase 4). Performance was assessed across 20 recursive training iterations and on a held-out test set of 500 cases, using micro-F1, macro-F1, precision, and recall at both the exact-code and ICD-3 levels. We also analyzed error composition, representative memory-guided decisions, and temporal degradation associated with memory growth.
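The Coder, Reviewer, and Learning File interaction described above can be illustrated with a minimal sketch. This is not the PANDORA implementation: the class `LearningFile`, the function `run_iteration`, and the stubs `toy_coder` and `toy_reviewer` are hypothetical names, and the stubs replace the language-model calls with simple keyword logic so that the loop can run on its own. The only point illustrated is that adaptation happens by appending human-readable rules to a persistent memory, not by updating model parameters.

```python
from dataclasses import dataclass, field


@dataclass
class LearningFile:
    """Hypothetical persistent memory: a list of human-readable rules."""
    rules: list = field(default_factory=list)

    def add(self, rule: str) -> None:
        # Skip exact duplicates so repeated errors do not inflate the file.
        if rule not in self.rules:
            self.rules.append(rule)


def toy_coder(note: str, rules: list) -> set:
    """Stand-in for a zero-shot LLM coder (assumption, not the real Coder).

    Over-assigns codes by keyword match, then drops any code that a rule
    in the Learning File tells it to avoid.
    """
    keyword_to_code = {"diabetes": "E11.9", "hypertension": "I10"}
    text = note.lower()
    candidates = {code for kw, code in keyword_to_code.items() if kw in text}
    prefix = "Avoid unsupported code "
    blocked = {r[len(prefix):] for r in rules if r.startswith(prefix)}
    return candidates - blocked


def toy_reviewer(note: str, predicted: set, gold: set) -> list:
    """Stand-in for the Reviewer: turns each false positive into a rule."""
    return [f"Avoid unsupported code {code}" for code in sorted(predicted - gold)]


def run_iteration(cases, coder, reviewer, memory: LearningFile) -> LearningFile:
    """One recursive iteration: code each case, review it, update memory."""
    for note, gold in cases:
        predicted = coder(note, memory.rules)
        for rule in reviewer(note, predicted, gold):
            memory.add(rule)
    return memory


# Toy training case: diabetes is mentioned only as a negated history.
note = "No history of diabetes. Long-standing hypertension."
cases = [(note, {"I10"})]

memory = LearningFile()
before = toy_coder(note, memory.rules)   # no-memory prediction
run_iteration(cases, toy_coder, toy_reviewer, memory)
after = toy_coder(note, memory.rules)    # memory-augmented prediction
```

In this toy run the false positive from the no-memory pass is written to the memory as a plain-text rule and suppressed on the next pass. The blanket suppression rule is deliberately crude; a real Reviewer would presumably produce context-sensitive rules.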
Results:
On the held-out test set, the memory-augmented system improved exact-code micro-F1 from 0.307 to 0.527 and precision from 0.203 to 0.515, while recall decreased from 0.630 to 0.540. At the ICD-3 level, micro-F1 improved from 0.372 to 0.560. Across training iterations, the memory-augmented condition achieved an exact-code micro-F1 of 0.605 versus 0.318 in the no-memory baseline. Gains were driven primarily by large reductions in false positives, indicating that the Learning File raised precision at a modest cost in recall. A qualitative review showed that the system used accumulated rules to suppress unsupported codes and to recover context-sensitive diagnoses. However, performance declined after iteration 10 as the Learning File grew larger and less discriminative, suggesting that memory bloat is an important failure mode of recursive learning.
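As a consistency check on these figures, the sketch below computes micro-averaged F1 as the harmonic mean of precision and recall and confirms that the reported exact-code precision and recall reproduce the reported micro-F1 values. It also shows one plausible way to score at the ICD-3 level, assuming that ICD-3 means the three-character category of an ICD-10-CM code; the function names `f1_from`, `to_icd3`, and `micro_counts` are illustrative and are not taken from the study.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def to_icd3(code: str) -> str:
    """Reduce an ICD-10-CM code to its three-character category.

    Works for dotted ("E11.9") and undotted ("E119") forms.
    """
    return code.replace(".", "")[:3]


def micro_counts(pred_sets, gold_sets, level=lambda c: c):
    """Pool true positives, false positives, and false negatives over cases."""
    tp = fp = fn = 0
    for pred, gold in zip(pred_sets, gold_sets):
        p = {level(c) for c in pred}
        g = {level(c) for c in gold}
        tp += len(p & g)
        fp += len(p - g)
        fn += len(g - p)
    return tp, fp, fn


def micro_f1(tp: int, fp: int, fn: int) -> float:
    """Micro-F1 from pooled counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return f1_from(precision, recall)


# Reported held-out exact-code precision and recall.
baseline_f1 = round(f1_from(0.203, 0.630), 3)   # no-memory baseline
memory_f1 = round(f1_from(0.515, 0.540), 3)     # memory-augmented system

# Toy case: the right category with the wrong subcode counts as a miss
# at the exact-code level but as a hit at the ICD-3 level.
pred = [{"E11.9", "I10"}]
gold = [{"E11.65", "I10"}]
exact_f1 = micro_f1(*micro_counts(pred, gold))
icd3_f1 = micro_f1(*micro_counts(pred, gold, level=to_icd3))
```

The rounded values match the reported micro-F1 of 0.307 and 0.527. The toy case illustrates why ICD-3 scores are at least as high as exact-code scores: collapsing codes to categories can only turn subcode mismatches into matches.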
Conclusions:
A recursive learning architecture with explicit natural-language memory substantially improved zero-shot ICD-10-CM coding performance, primarily through better precision and more controlled code assignment. The approach offers transparency benefits because improvements can be traced to human-readable learned rules rather than hidden parameter changes. However, recursive systems require active memory governance, as unchecked rule accumulation may degrade performance over time. These findings support memory-based adaptation as a promising direction for interpretable clinical coding systems and other high-stakes clinical NLP tasks.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.