Accepted for/Published in: JMIR Medical Informatics
Date Submitted: May 24, 2025
Date Accepted: Dec 17, 2025
Resource Utilization Patterns in US Telehealth Services: A Machine Learning and Clustering Analysis Across Four Specialties
ABSTRACT
Background:
The expansion of telehealth services, particularly during the COVID-19 pandemic, has transformed healthcare delivery in the U.S. Telehealth promises greater access and resource efficiency by reducing wait times and appointment lengths, especially in specialties like Psychiatry, Behavioral Health, Bariatrics, and Sleep Medicine. However, disparities exist in adoption based on demographics, geography, and socioeconomic status, raising concerns about equitable access and optimal resource use.
Objective:
This study aims to evaluate how telehealth impacts healthcare resource utilization across four specialties by examining two key metrics: patient-to-provider ratios and appointment durations. It seeks to understand how factors such as patient demographics, facility characteristics, and social determinants influence telehealth adoption and efficiency, using a national dataset spanning from 2018 to 2023.
Methods:
We analyzed a deidentified dataset from Epic Cosmos covering outpatient visits across 48 U.S. states (2018–2023). After data preprocessing and feature engineering, we applied three machine learning models (random forest, XGBoost, and deep neural networks) to predict resource utilization. For the best-performing model, we assessed feature importance using SHAP values. We then used k-means clustering to group facilities into clusters within each specialty. Comparative analyses evaluated differences in utilization among clusters, both during and after the pandemic.
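The clustering step can be illustrated with a minimal sketch of k-means (Lloyd's algorithm). This is not the authors' implementation: the function name `kmeans`, the two facility features, and all data values below are hypothetical and chosen only to show how facilities might be grouped by utilization profile. A real analysis would typically standardize the features first and choose k with a diagnostic such as the elbow method.

```python
import random
from math import dist

def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize centroids from the data
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical facility features (made-up values, not study data):
# (telehealth share of visits, patients per provider)
facilities = [(0.10, 12.0), (0.12, 11.5), (0.08, 12.5),
              (0.40, 25.0), (0.45, 24.0), (0.42, 26.0)]
labels, centroids = kmeans(facilities, k=2)
print(labels)
```

In this toy input the first three facilities and the last three end up in separate clusters; which cluster receives which integer label depends on the random initialization.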
Results:
Telehealth use peaked in 2020 and has remained above pre-pandemic levels since then. Over 2018–2023, telehealth adoption reached 36.9% in Psychiatry, 23.9% in Behavioral Health, 21.2% in Bariatrics, and 16.8% in Sleep Medicine. Telehealth visits were consistently shorter than office visits (mean reduction: 10–15 minutes, P < .05), while patient-to-provider ratios varied significantly across specialties. Among the machine learning models, XGBoost regression achieved the best performance (R-squared = 0.96–0.99 for patient-to-provider ratios; R-squared = 0.61–0.69 for appointment durations). SHAP analysis identified visit type, telehealth use, facility size, rurality, and SVI household vulnerability as the strongest predictors. Comparative analyses showed significant differences across clusters (all P < .05).
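For readers less familiar with the R-squared values reported above, the following sketch shows how the coefficient of determination is computed from observed and predicted values. The helper `r_squared` and the appointment durations are hypothetical illustrations, not study data or the authors' code.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return 1 - ss_res / ss_tot

# Made-up appointment durations in minutes (illustrative only).
observed = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 28.0, 41.0, 49.0]
score = r_squared(observed, predicted)
print(round(score, 2))  # 0.98
```

A value near 1 means the predictions account for nearly all of the variance in the observed values, which is why the 0.96–0.99 range for patient-to-provider ratios indicates a much closer fit than the 0.61–0.69 range for appointment durations.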
Conclusions:
Telehealth has become a sustainable component of healthcare, enhancing access and efficiency across both rural and urban areas. However, its impact varies across specialties and regions, highlighting the need for targeted strategies such as staffing support for vulnerable populations, infrastructure investments in rural facilities, and reimbursement models that reflect telehealth’s resource use. This study provides robust evidence from machine learning and clustering analyses, demonstrating how telehealth shapes resource utilization and offering actionable insights for equitable and sustainable integration.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.