Currently submitted to: JMIR Medical Informatics
Date Submitted: May 18, 2026
Open Peer Review Period: Jun 2, 2026 - Jul 28, 2026
(currently open for review)
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Natural Language Processing–Based Severity Phenotyping of Metabolic Dysfunction–Associated Steatotic Liver Disease in Health Check-Up Records: Cross-Sectional Study
ABSTRACT
Background:
In health check-up settings oriented toward metabolic dysfunction–associated steatotic liver disease (MASLD), hepatic steatosis is commonly linked to cardiometabolic risk. Routine ultrasound reports often describe steatosis severity, but this information is usually embedded in free text. Many datasets therefore use a binary (present/absent) definition, which can miss stage-related differences across metabolic measures.
Objective:
We aimed to (1) extract ultrasound-reported hepatic steatosis severity stages from routine ultrasound narratives using natural language processing (NLP), (2) describe severity-associated metabolic patterns across stages, (3) examine whether these patterns differ by sex, and (4) test whether routine clinical indicators can identify moderate-to-severe ultrasound-reported steatosis.
Methods:
We conducted a cross-sectional analysis of 107,120 health check-up records from Shanghai Health and Medical Center (Oct 2024–Oct 2025). A rule-based NLP pipeline classified ultrasound narratives into five steatosis-severity stages (Normal, Trend, Mild, Moderate, Severe) and was validated against 450 physician-annotated narratives. We summarized metabolic indicators by stage and compared adjacent stages using bootstrap-based nonparametric methods. Analyses were repeated by sex, and women were further stratified by age (<50 vs ≥50 years). We also built a multivariable logistic regression model to identify moderate-to-severe ultrasound-reported steatosis and evaluated it by stratified 10-fold cross-validation using aggregated out-of-fold predictions.
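As an illustration only, a rule-based stager of the kind described above might look like the following minimal sketch. The keyword patterns, the English phrasing, the default to Normal when no rule fires, and the names `classify_stage` and `agreement` are all assumptions made for this example; the abstract does not specify the actual rules or the language of the reports.

```python
import re

# Hypothetical steatosis terms; the study's real lexicon is not given in the abstract.
TERM = r"(?:fatty\s+liver|hepatic\s+steatosis)"

# Explicit negation is checked first so that "no fatty liver" is not staged.
NEGATED = re.compile(rf"\bno\s+(?:evidence\s+of\s+)?{TERM}", re.I)

# Rules are ordered from most to least severe, so a phrase such as
# "moderate to severe fatty liver" resolves to the higher stage.
STAGE_RULES = [
    ("Severe", re.compile(rf"\bsevere\s+{TERM}", re.I)),
    ("Moderate", re.compile(rf"\bmoderate\s+{TERM}", re.I)),
    ("Mild", re.compile(rf"\bmild\s+{TERM}", re.I)),
    ("Trend", re.compile(rf"\b(?:tendency|trend)\s+(?:toward|towards|to|of)\s+{TERM}", re.I)),
]


def classify_stage(narrative: str) -> str:
    """Map one ultrasound narrative to one of the five stages."""
    if NEGATED.search(narrative):
        return "Normal"
    for stage, pattern in STAGE_RULES:
        if pattern.search(narrative):
            return stage
    # Assumption: a narrative with no steatosis mention is treated as Normal.
    return "Normal"


def agreement(narratives, physician_labels):
    """Fraction of narratives where the rule output matches the annotation."""
    hits = sum(classify_stage(n) == y for n, y in zip(narratives, physician_labels))
    return hits / len(physician_labels)
```

Validation against a physician-annotated sample would then amount to calling `agreement` on the annotated narratives, although a real evaluation would likely also report per-stage precision and recall.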
Results:
Metabolic burden increased across NLP-defined stages. Adjacent-stage bootstrap comparisons showed larger Mild-to-Moderate increases for body mass index (BMI) and alanine aminotransferase (ALT) than the preceding Trend-to-Mild increases. In contrast, fasting blood glucose (FBG), systolic blood pressure (SBP), and triglycerides (TG) changed more gradually across stages. Uric acid (UA) showed a similar direction but without statistical support for an inflection. Men had higher absolute levels, but stage-associated patterns were broadly similar between sexes and did not suggest a sex-by-severity interaction. In women, age-stratified analyses showed marker-specific severity-by-age heterogeneity; at the Moderate stage, interaction coefficients were mostly negative, indicating that the Moderate-versus-Normal contrast was not larger in women aged ≥50 years than in women aged <50 years. The prediction model for moderate-to-severe ultrasound-reported steatosis showed stable internal performance (area under the curve [AUC] 0.898±0.008; average precision [AP] 0.239; Brier score 0.025).
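One way to read the adjacent-stage comparison is as a bootstrap test of whether the Mild-to-Moderate increase in a marker exceeds the increase over the preceding step. The sketch below is a hedged illustration of that idea, not the authors' procedure: the choice of the median as the stage summary, the percentile interval, the resample count, and the function name `bootstrap_increment_diff` are assumptions.

```python
import random
from statistics import median


def bootstrap_increment_diff(trend, mild, moderate, n_boot=2000, seed=0):
    """Point estimate and 95% percentile CI for
    (Moderate - Mild) - (Mild - Trend), using stage medians."""
    rng = random.Random(seed)

    def resample(values):
        # Draw a same-size sample with replacement within one stage.
        return [rng.choice(values) for _ in values]

    diffs = []
    for _ in range(n_boot):
        t = median(resample(trend))
        m = median(resample(mild))
        mo = median(resample(moderate))
        diffs.append((mo - m) - (m - t))
    diffs.sort()

    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    point = (median(moderate) - median(mild)) - (median(mild) - median(trend))
    return point, lower, upper
```

Under this reading, an interval lying entirely above zero would be consistent with an inflection at the Mild-to-Moderate transition, while an interval covering zero would match the "more gradual" pattern.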
Conclusions:
Severity staging derived from ultrasound narratives can be recovered at scale using NLP and supports severity-graded risk stratification of ultrasound-reported steatosis in MASLD-oriented screening data. This approach moves beyond binary classification and highlights the Mild-to-Moderate transition for selected markers, especially BMI and ALT, using routinely collected measures without specialized imaging or invasive testing.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.