Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Feb 2, 2022
Open Peer Review Period: Feb 2, 2022 - Mar 30, 2022
Date Accepted: Apr 22, 2022
Non-invasive Diagnosis of Nonalcoholic Steatohepatitis and Advanced Liver Fibrosis: using Machine Learning Methods
ABSTRACT
Background:
Non-alcoholic steatohepatitis (NASH), advanced fibrosis, and the subsequent cirrhosis and hepatocellular carcinoma are becoming the most common etiologies of liver failure and liver transplantation. Yet at these potentially reversible stages, they can be diagnosed only with a liver biopsy, which is associated with various complications and high expense. Distinguishing the more benign isolated steatosis from the more severe NASH and cirrhosis informs the physician of the need for more aggressive management.
Objective:
To explore the feasibility of using machine learning methods for non-invasive diagnosis of non-alcoholic steatohepatitis and advanced liver fibrosis, and to compare machine learning methods with existing quantitative risk scores.
Methods:
We conducted a retrospective analysis of clinical data from a cohort of 492 patients with biopsy-proven non-alcoholic fatty liver disease (NAFLD), non-alcoholic steatohepatitis, or advanced fibrosis. We systematically compared five widely used machine learning algorithms for the prediction of NAFLD, NASH, and fibrosis using two variable encoding strategies. Then, we compared the machine learning methods with three existing quantitative scores and identified the important features for prediction.
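The abstract does not say what the two variable encoding strategies were. As a minimal sketch, one plausible pairing is raw continuous values versus a binary "above cutoff" flag. The variable names, the cutoffs, and the patient record below are all hypothetical placeholders, not values from the study.

```python
# Hypothetical cutoffs for illustration only; these are NOT taken from the study.
HYPOTHETICAL_CUTOFFS = {"alt": 40.0, "ast": 40.0, "bmi": 30.0}

# Fix the feature order so both encodings line up column by column.
FEATURE_ORDER = sorted(HYPOTHETICAL_CUTOFFS)  # ["alt", "ast", "bmi"]


def encode_continuous(record):
    """Encoding 1 (assumed): keep each variable as its raw numeric value."""
    return [float(record[name]) for name in FEATURE_ORDER]


def encode_categorical(record):
    """Encoding 2 (assumed): 1 if the value exceeds its cutoff, else 0."""
    return [1 if record[name] > HYPOTHETICAL_CUTOFFS[name] else 0
            for name in FEATURE_ORDER]


# Made-up patient record, used only to show the two encodings side by side.
patient = {"alt": 62.0, "ast": 35.0, "bmi": 31.5}
continuous = encode_continuous(patient)
categorical = encode_categorical(patient)
```

Either feature vector could then be passed to the same set of classifiers, so any difference in performance can be attributed to the encoding rather than to the model.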
Results:
The best-performing machine learning method, gradient boosting, achieved AUC scores of 0.9043, 0.8166, and 0.8360 for non-alcoholic fatty liver disease, non-alcoholic steatohepatitis, and advanced fibrosis, respectively. Gradient boosting also outperformed three existing risk scores for fibrosis. Body mass index (BMI), alanine aminotransferase (ALT), and platelets were the most important variables for prediction of non-alcoholic fatty liver disease; aspartate aminotransferase (AST), ALT, and platelets were the most important variables for prediction of non-alcoholic steatohepatitis; and AST followed by A1c were the most important variables for prediction of advanced fibrosis.
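To make the AUC figures concrete, the following sketch assumes that AUC here means the area under the receiver operating characteristic curve. It computes that quantity as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name, labels, and scores are illustrative and are not study data.

```python
def auc_from_scores(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts, over all (positive, negative) pairs, how often the positive
    case has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = disease present on biopsy) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_auc = auc_from_scores(labels, scores)  # 8 of 9 pairs ranked correctly
```

The same function applies to any numeric score, so a model's predicted probability and an existing risk score can be compared on the same patients. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.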
Conclusions:
It is feasible to use machine learning methods for prediction of non-alcoholic fatty liver disease, non-alcoholic steatohepatitis, and advanced fibrosis using routine clinical data, which can potentially be used to better identify patients who still need a liver biopsy. Additionally, understanding the relative importance of, and differences among, the predictors could improve understanding of the disease process and support the identification of novel treatment options.
Clinical Trial: N/A.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.