Accepted for/Published in: JMIR Diabetes
Date Submitted: Aug 8, 2025
Open Peer Review Period: Aug 19, 2025 - Oct 14, 2025
Date Accepted: Dec 29, 2025
Cardiorespiratory Markers for Early Detection of Type 2 Diabetes: Machine Learning Models
ABSTRACT
Background:
The global prevalence of type 2 diabetes mellitus (T2DM) poses significant challenges due to its association with increased cardiovascular risk and complications like cardiovascular autonomic neuropathy (CAN). Early identification of autonomic dysfunction in T2DM is important for timely interventions and improved clinical outcomes.
Objective:
This study investigates heart rate variability (HRV), frequency response function (FRF), and impulse response (IR) metrics as physiological markers for machine learning-based early prediction of T2DM-associated autonomic dysfunction.
Methods:
Using ECG and respiratory signals from two PhysioNet datasets, we derived three complementary indices of cardiac autonomic nervous system function (HRV, FRF, and IR) for machine learning (ML) analysis. Three ML classifiers (logistic regression, linear SVM, and SVM with an RBF kernel) assessed the predictive value of individual and combined feature sets under NearMiss-1 undersampling and SMOTE oversampling. Whereas HRV derives autonomic metrics from ECG alone, FRF and IR use paired cardiorespiratory signals (ECG and respiration), enabling modeling of frequency-domain (FRF) and causal time-domain (IR) interactions between the cardiac and respiratory systems. This systems-based approach may capture subtle autonomic dysfunction in T2DM more effectively than HRV alone by reflecting integrated cardiorespiratory coupling.
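As an illustration of the undersampling step named above, the following is a minimal sketch of the NearMiss-1 idea: keep the majority-class samples whose average distance to their k nearest minority-class samples is smallest. The function name `nearmiss1`, the parameters `n_keep` and `k`, and the toy feature vectors are assumptions made for this example; they are not the authors' code, data, or settings.

```python
import math

def nearmiss1(majority, minority, n_keep, k=3):
    """Return the n_keep majority samples that are closest, on average,
    to their k nearest minority samples (the NearMiss-1 selection rule)."""
    def avg_dist_to_k_nearest(x):
        # Euclidean distances from x to every minority sample, smallest first
        dists = sorted(math.dist(x, m) for m in minority)
        nearest = dists[:k]
        return sum(nearest) / len(nearest)

    # Rank majority samples by that average distance and keep the closest ones
    ranked = sorted(majority, key=avg_dist_to_k_nearest)
    return ranked[:n_keep]

# Hypothetical 2-D feature vectors, not study data
majority = [(0.0, 0.0), (5.0, 5.0), (1.0, 1.0), (9.0, 9.0)]
minority = [(1.0, 0.0), (0.0, 1.0)]

# Undersample the majority class down to the minority class size
balanced_majority = nearmiss1(majority, minority, n_keep=len(minority), k=2)
print(balanced_majority)
```

In this toy case the two majority points lying next to the minority cluster are retained and the distant ones are discarded, which is the behavior that distinguishes NearMiss-1 from random undersampling.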
Results:
IR metrics were the most informative standalone feature set, capturing causal cardiorespiratory interactions and achieving an accuracy of 0.770 ± 0.179 (mean ± SD), precision of 0.783 ± 0.217, recall of 0.900 ± 0.224, and F1 score of 0.798 ± 0.140 with logistic regression and NearMiss-1. HRV metrics were the least informative standalone feature set; however, the combined HRV + FRF feature set with NearMiss-1 achieved the highest overall performance, with an accuracy of 0.830 ± 0.172, precision of 0.800 ± 0.183, recall of 0.933 ± 0.149, and F1 score of 0.853 ± 0.145 (SVM RBF). Under SMOTE, the HRV + IR feature set performed best, yielding an accuracy of 0.700 ± 0.128, precision of 0.783 ± 0.217, recall of 0.683 ± 0.207, and F1 score of 0.691 ± 0.097 with SVM RBF. It surpassed standalone IR in most metrics, although IR alone retained superior recall (0.950 ± 0.112) and F1 score (0.708 ± 0.038).
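For readers unfamiliar with how values reported as mean ± SD are typically obtained, the sketch below computes accuracy, precision, recall, and F1 score per cross-validation fold from confusion-matrix counts and then summarizes them across folds. The fold counts are hypothetical, and the use of the sample standard deviation is an assumption; the abstract does not state the number of folds or which SD estimator was used.

```python
from statistics import mean, stdev

def fold_metrics(tp, fp, fn, tn):
    """Compute the four reported metrics from one fold's confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical (tp, fp, fn, tn) counts for three folds, not study data
folds = [(3, 1, 0, 2), (2, 0, 1, 3), (3, 1, 0, 2)]
per_fold = [fold_metrics(*counts) for counts in folds]

# Summarize each metric as (mean, sample SD) across folds
summary = {}
for name in per_fold[0]:
    values = [m[name] for m in per_fold]
    summary[name] = (mean(values), stdev(values))

for name, (mu, sd) in summary.items():
    print(f"{name}: {mu:.3f} ± {sd:.3f}")
```

Because each metric is averaged over per-fold values, the mean F1 score generally differs from the F1 score computed from the mean precision and mean recall, which is why the reported figures need not satisfy the harmonic-mean identity exactly.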
Conclusions:
As the strongest individual feature set, IR offered robust, interpretable prediction with lower feature complexity. By modeling causal cardiorespiratory interactions, IR outperformed HRV in detecting early autonomic dysfunction in T2DM. Combining HRV with IR or with FRF, which captures frequency-specific cardiorespiratory coupling, further enhanced predictive performance by integrating complementary autonomic information. These results highlight systems-based IR metrics as objective markers for early detection and risk assessment of T2DM-associated cardiovascular autonomic dysfunction, supporting their use in proactive diabetes management. By integrating physiologically relevant features into ML classification, this study advances noninvasive tools for early disease detection, personalized risk stratification, and targeted interventions.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.