Accepted for/Published in: JMIR Formative Research
Date Submitted: Dec 17, 2024
Open Peer Review Period: Dec 17, 2024 - Feb 11, 2025
Date Accepted: May 12, 2025
Prediction of One-Year Activity in Systemic Lupus Erythematosus: A Hierarchical Machine Learning Approach
ABSTRACT
Background:
Systemic lupus erythematosus (SLE) is a chronic disease that can involve a broad spectrum of organs, including the neurological, renal, and vascular domains. Disease activity follows unpredictable patterns that vary across individuals and over time, which makes the prediction of activity events particularly challenging.
Objective:
This paper proposes a hierarchical machine learning model to predict 12-month SLE activity, defined as the occurrence of at least one of the following events within the following year: an SLE hospitalization, a newly involved organ domain, or a neurological, renal, or vascular manifestation. At each patient visit, the model considers all features at the current time point, together with information about the preceding 12 months and the patient’s earlier clinical history, to predict the outcome for the next 12 months.
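The outcome definition and the time ranges described above can be illustrated with a minimal sketch. It assumes each patient is represented by a list of contact dates and a list of activity-event dates; this layout, the function names, and the 365-day approximation of "12 months" are illustrative assumptions, not the authors' actual data model.

```python
from datetime import date, timedelta
from typing import List, Tuple


def has_event_within(contact: date, events: List[date], horizon_days: int = 365) -> int:
    """Return 1 if any activity event falls in (contact, contact + horizon], else 0."""
    end = contact + timedelta(days=horizon_days)
    return int(any(contact < e <= end for e in events))


def split_history(
    contact: date, past_contacts: List[date], lookback_days: int = 365
) -> Tuple[List[date], List[date]]:
    """Split earlier contacts into the last-12-months window and the older history."""
    start = contact - timedelta(days=lookback_days)
    recent = [d for d in past_contacts if start <= d < contact]
    earlier = [d for d in past_contacts if d < start]
    return recent, earlier


# Hypothetical patient: one visit being scored, two earlier contacts, one later event.
visit = date(2015, 1, 1)
label = has_event_within(visit, [date(2015, 6, 1)])
recent, earlier = split_history(visit, [date(2014, 6, 1), date(2012, 3, 1)])
```

Features would then be aggregated separately over `recent` and `earlier` and combined with the values recorded at the current contact.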
Methods:
The study cohort consists of 262 patients with at least one outpatient visit and one SLE admission from 2012 to 2020 at the Gemelli Hospital in Italy, yielding a retrospective longitudinal dataset of 5962 contacts. The data include demographics, laboratory results, clinical features (e.g., domain involvements and manifestations), treatments, and pathways (e.g., contact type, such as outpatient visit, hospitalization, or day hospital, and visit frequency). The variables cover three time ranges: the current contact, the last 12 months, and the patient’s earlier clinical history. The main model was developed by testing different machine learning approaches within a cross-validation setup. Its predicted probabilities were used in a risk stratification analysis that identified three groups of predictions: strong, moderate, and mild. Mild samples were then passed through a second, cascade model. The integration of the main model (applied to strong and moderate samples) with the cascade model (applied to mild samples) forms our final hierarchical model.
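The routing logic of the hierarchical model can be sketched as follows. The abstract does not state how the strong, moderate, and mild groups are delimited, so this sketch assumes they are defined by the distance of the main model's predicted probability from 0.5; the margins 0.1 and 0.3, the function names, and the callable model interface are all hypothetical.

```python
from typing import Callable, Dict

Features = Dict[str, float]
Model = Callable[[Features], float]  # returns a predicted probability of activity


def confidence_band(p: float, mild_margin: float = 0.1, strong_margin: float = 0.3) -> str:
    """Label a probability by how far it lies from 0.5 (assumed stratification rule)."""
    margin = abs(p - 0.5)
    if margin >= strong_margin:
        return "strong"
    if margin >= mild_margin:
        return "moderate"
    return "mild"


def hierarchical_predict(x: Features, main_model: Model, cascade_model: Model) -> float:
    """Keep the main model's output for strong/moderate cases; defer mild cases."""
    p_main = main_model(x)
    if confidence_band(p_main) == "mild":
        return cascade_model(x)
    return p_main


# Stand-in models for illustration only; the study uses a Random Forest (main)
# and a Decision Tree (cascade).
main = lambda x: x["p_main"]
cascade = lambda x: 1.0
print(hierarchical_predict({"p_main": 0.55}, main, cascade))  # mild: cascade decides
print(hierarchical_predict({"p_main": 0.70}, main, cascade))  # moderate: main decides
```

Under this design the cascade model only needs to be trained on, and applied to, the contacts for which the main model is least decisive.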
Results:
The hierarchical model, which combines the main Random Forest with the cascade Decision Tree, demonstrated enhanced performance, increasing the AUC from 0.696 (95% CI 0.672-0.719) for the main model alone to 0.743 (95% CI 0.717-0.769), particularly for patients with specific characteristics. Through the application of Explainable Artificial Intelligence (XAI) methods, we also identified the key features that significantly influence the model’s predictions. Among the 185 collected features, 15 emerged as the most impactful, including age at contact, response to therapy modifications, abnormal laboratory tests, and clinical manifestations. This analysis plays a crucial role in enhancing model transparency, which is essential for fostering the adoption of AI in healthcare settings.
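For readers unfamiliar with the reported metric, the following sketch shows one common way to compute an AUC and a 95% interval around it. The abstract does not say how the confidence intervals were obtained; the percentile bootstrap used here is an assumption, and the labels and scores are toy values rather than study data.

```python
import random
from typing import List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive sample outranks a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(
    labels: Sequence[int],
    scores: Sequence[float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for the AUC (resamples individual samples)."""
    auc(labels, scores)  # fail early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc(toy_labels, toy_scores))  # 0.75
```

Because the dataset contains repeated contacts per patient, resampling whole patients rather than individual contacts may be more appropriate in practice; the sketch resamples individual samples for brevity.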
Conclusions:
Our study introduces an explainable and reliable tool for predicting one-year SLE activity, supporting physicians with an advanced decision-support system to improve patient management. The model identifies key features that may help characterize patient phenotypes, enabling personalized treatment plans and better outcomes. Additionally, the methodology can be generalized for predictive analytics in other chronic autoimmune diseases.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.