JMIR Preprints #91399: Adaptive Fast-Slow Large Language Model Framework for Multi-Dimensional Classification of Prenatal Ultrasound Reports: Comparative Study

Adaptive Fast-Slow Large Language Model Framework for Multi-Dimensional Classification of Prenatal Ultrasound Reports: Comparative Study

Wei Zhong;
Huihui Yan;
Yifan Liu;
Yan Liu;
Kai Yang;
Huimin Gao;
Zhengyang Yao;
Wenjing Hao;
Yousheng Yan;
Chenghong Yin

ABSTRACT

Background:

Phenotype-driven prenatal diagnosis relies on the precise correlation between ultrasound findings and genetic outcomes, yet this process is hindered by the unstructured nature of clinical ultrasound reports. While Large Language Models (LLMs) hold the potential to address this challenge, their application in this domain has not been systematically explored.

Objective:

To establish an effective LLM implementation framework for the clinical multi-dimensional classification of prenatal ultrasound reports, we evaluated the open-source DeepSeek-V3.2 family on real-world anomalous reports—covering both factual and subjective categories—while integrating Retrieval-Augmented Generation (RAG) and Chain-of-Thought (CoT) reasoning.

Methods:

From a cohort of 4,256 pregnancies, we extracted 254 reports with fetal anomalies. We deployed a high-speed base model (DeepSeek-V3.2-B) for four factual extraction tasks—primary classification, standardized terminology, anatomical system, and abnormality count—and a reasoning-enhanced model (DeepSeek-V3.2-R) for subjective severity assessment, explicitly evaluating the efficacy of RAG for subjective tasks. Finally, to validate the clinical utility of this approach, we performed a correlation analysis between the expert-validated multi-dimensional phenotypic profiles and definitive genetic outcomes derived from amniocentesis.
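The task-to-model assignment described above can be sketched as a small router. This is an illustrative sketch only, not the authors' implementation: the task keys paraphrase the five tasks named in the abstract, while the model labels `fast-base` and `slow-reasoning`, the `Route` type, and the `call_model` callback are hypothetical placeholders for whatever client a deployment uses.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Task names paraphrase the abstract; they are not identifiers from the study.
FACTUAL_TASKS = (
    "primary_classification",
    "standardized_terminology",
    "anatomical_system",
    "abnormality_count",
)
SUBJECTIVE_TASKS = ("severity_assessment",)


@dataclass(frozen=True)
class Route:
    model: str     # hypothetical label, not a real API model name
    use_cot: bool  # whether to request chain-of-thought reasoning


def route_task(task: str) -> Route:
    """Send factual tasks to the fast model and subjective tasks to the slow one."""
    if task in FACTUAL_TASKS:
        return Route(model="fast-base", use_cot=False)
    if task in SUBJECTIVE_TASKS:
        return Route(model="slow-reasoning", use_cot=True)
    raise ValueError(f"unknown task: {task}")


def classify_report(
    report: str,
    call_model: Callable[[str, str, str, bool], str],
) -> Dict[str, str]:
    """Run every task on one report; call_model is supplied by the caller."""
    results = {}
    for task in FACTUAL_TASKS + SUBJECTIVE_TASKS:
        route = route_task(task)
        results[task] = call_model(route.model, task, report, route.use_cot)
    return results
```

Keeping the routing rule separate from the model call means the same loop works with any locally deployed client, and a retrieval step could be toggled per route in the same way.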

Results:

While V3.2-B achieved high efficiency in factual tasks (accuracy and F1 > 90%), it underperformed in subjective severity grading (56.6% accuracy), exhibiting a recall of 0 for minor anomalies. Crucially, while RAG significantly improved both models' performance on internal retrieval datasets (P<.05), this benefit did not generalize to external test datasets (P>.05). In contrast, the V3.2-R model utilizing CoT reasoning achieved superior robustness (86% accuracy, F1=0.75) on external data without RAG; notably, introducing RAG to V3.2-R degraded performance to 81%, suggesting potential noise interference. Clinical validation against amniocentesis outcomes confirmed that accurate multi-dimensional phenotypic profiles significantly stratified pathogenic genetic risks.
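To illustrate how a classifier can reach moderate overall accuracy while having a recall of 0 for one class, the following sketch computes accuracy, per-class recall, and macro-averaged F1 in plain Python. The severity labels other than "minor" and the toy predictions are invented for illustration and are not data from the study; the abstract also does not state which F1 averaging the authors used, so macro averaging here is an assumption.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return precision, recall, and F1 for every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    out = {}
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        out[c] = {"precision": prec, "recall": rec, "f1": f1}
    return out


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    metrics = per_class_metrics(y_true, y_pred)
    return sum(m["f1"] for m in metrics.values()) / len(metrics)


# Invented toy labels: every "minor" case is predicted as "moderate".
y_true = ["minor", "minor", "moderate", "moderate", "severe", "severe"]
y_pred = ["moderate", "moderate", "moderate", "moderate", "severe", "severe"]

print(accuracy(y_true, y_pred))
print(per_class_metrics(y_true, y_pred)["minor"]["recall"])
print(macro_f1(y_true, y_pred))
```

In this toy case two thirds of the predictions are correct, yet no minor case is ever detected, which is why per-class recall is reported alongside accuracy.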

Conclusions:

Fast base models are efficient for factual classification, and RAG improves performance on data similar to the knowledge base, whereas CoT reasoning is indispensable for subjective assessment. We recommend adopting this adaptive "fast-slow" LLM framework in clinical practice to classify prenatal ultrasound anomalies efficiently across multiple dimensions. This privacy-preserving, locally deployable solution provides a scalable path to accelerate phenotype-genotype research and optimize invasive diagnostic decision-making.

Clinical Trial:

Medical Research Registration and Filing Information System of the National Health Security Information Platform of China (registration no. MR-11-24-002508)

Citation

Please cite as:

Zhong W, Yan H, Liu Y, Liu Y, Yang K, Gao H, Yao Z, Hao W, Yan Y, Yin C

Adaptive Fast-Slow Large Language Model Framework for Multidimensional Classification of Prenatal Ultrasound Reports: Comparative Study

J Med Internet Res 2026;28:e91399

DOI: 10.2196/91399

PMID: 42207158

PMCID: 13218277

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Jan 14, 2026

Date Accepted: May 4, 2026
