Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Feb 6, 2025
Date Accepted: Aug 29, 2025
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Risk factors, their interaction patterns, and scoring systems for liver cancer: comparison between patients with and without diabetes using tree-structured algorithms
ABSTRACT
Background:
Patients with diabetes are at higher risk of developing liver cancer. Nevertheless, risk factors and their interaction patterns have rarely been compared between patients with and without diabetes, nor have their interactions been incorporated into scoring system development.
Objective:
This study aims to compare risk factors, their interaction patterns, and resulting scoring systems for liver cancer risk according to diabetes and liver disease status using tree-structured algorithms.
Methods:
A retrospective cohort study was conducted using electronic health records from Hong Kong. Patients without a history of cancer who had used public healthcare services between 1997 and 2021 were identified and followed up until December 31, 2021. Scoring systems were developed from the aggregate results of individual survival trees in a random survival forest, and interaction patterns among factors were examined separately using a conditional inference survival tree.
Results:
Of the 190,971 patients included, 1,275 developed liver cancer during follow-up (median: 6.25 years). Across four scoring systems, alanine aminotransferase (ALT), age, sex, and triglycerides were commonly chosen as predictors irrespective of diabetes and liver disease status. In the overall systems, liver cirrhosis was additionally selected as a predictor, with chronic viral hepatitis uniquely chosen in patients with diabetes. In the absence of liver disease, fasting glucose and smoking were uniquely selected for patients with and without diabetes, respectively. Chronic viral hepatitis appeared as the strongest risk factor in patients with diabetes but not in those without. In the diabetes subpopulation, in the absence of chronic viral hepatitis, sex became the most important factor, followed by age, statin use, and ALT level. In the non-diabetes subpopulation, age was the dominant risk factor. For older patients (>55 years), uncontrolled lipids and male sex became key risk factors in statin users and non-users, respectively, when ALT was higher (>43.4 U/L), while smoking became a key risk factor when ALT was lower (≤43.4 U/L). For younger patients (≤55 years), sex remained the most significant factor.
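As a rough illustration only, the branching reported for the non-diabetes subpopulation can be written as nested conditions. The function name and the returned labels are assumptions made for this sketch; it encodes only the cutoffs stated above (55 years, 43.4 U/L) and is neither the fitted conditional inference survival tree nor a risk calculator.

```python
# Cutoffs taken from the Results text; everything else here is hypothetical.
AGE_CUTOFF_YEARS = 55
ALT_CUTOFF_U_PER_L = 43.4


def key_factor_non_diabetes(age_years: float, alt_u_per_l: float, statin_user: bool) -> str:
    """Return the key risk factor reported for a non-diabetes subgroup.

    Mirrors the split order described in the Results: age first, then ALT,
    then statin use. The returned strings are illustrative labels.
    """
    if age_years <= AGE_CUTOFF_YEARS:
        # Younger patients: sex remained the most significant factor.
        return "sex"
    if alt_u_per_l <= ALT_CUTOFF_U_PER_L:
        # Older patients with lower ALT: smoking became a key risk factor.
        return "smoking"
    # Older patients with higher ALT: the key factor depends on statin use.
    return "uncontrolled lipids" if statin_user else "male sex"
```

For example, `key_factor_non_diabetes(62, 50.0, statin_user=False)` follows the older, higher-ALT, non-statin branch and returns `"male sex"`.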
Conclusions:
Patients with and without diabetes exhibit distinct interaction patterns among key factors for liver cancer risk. The resulting scoring systems reflect interaction patterns among predictors in individual survival trees. This study may help identify targets for public health interventions and support clinical prediction of cancer risk according to diabetes status.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.