JMIR Preprints #54956: Patient is NOT all you need: Enhancing Machine Learning-based COVID-19 Screening Models with Epidemiological and Mobility Features

Patient is NOT all you need: Enhancing Machine Learning-based COVID-19 Screening Models with Epidemiological and Mobility Features

Hyunwoo Choo;
Hyo Jung Kim;
Dohyung Lee;
Soo-Yong Shin;
Jiwoo Lee;
Duhun Lee;
Eonji Kim;
Namsoo Oh;
Christina Kim;
Ahreum Jang;
Hyejung Kim;
Hae-Lee Park;
Sungtae Kim;
Myeongchan Kim

ABSTRACT

Background:

Since the coronavirus disease 2019 (COVID-19) pandemic, research using symptom data and machine learning for patient screening has surged. However, data on patient trajectories and epidemiological conditions, although crucial, remain underutilized.

Objective:

This study aimed to improve the screening performance of machine learning models by adding mobility and epidemic information to patient symptom data.

Methods:

Daily self-reported symptoms, location information, and test results were collected from 48,798 individuals using a smartphone application. These data were then combined with epidemic information from Our World in Data (OWID) and the national government to train five machine learning-based screening models that classify patient infection status. The models were logistic regression, XGBoost, LightGBM (LGBM), TabNet, and Google AutoML.
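
As an illustration of the data combination step described above, the following minimal Python sketch joins per-user symptom records with an epidemic table by region and date. All field names, regions, and values are hypothetical; they are not taken from the study's data set, and the join key is an assumption rather than the authors' documented pipeline.

```python
from datetime import date

# Hypothetical daily symptom records; field names are illustrative assumptions.
symptom_rows = [
    {"user": "u1", "day": date(2022, 3, 1), "region": "Seoul", "fever": 1, "cough": 0, "positive": 1},
    {"user": "u2", "day": date(2022, 3, 1), "region": "Busan", "fever": 0, "cough": 1, "positive": 0},
    {"user": "u3", "day": date(2022, 3, 2), "region": "Seoul", "fever": 0, "cough": 0, "positive": 0},
]

# Hypothetical epidemic table keyed by (region, day); values are made up.
epidemic = {
    ("Seoul", date(2022, 3, 1)): {"new_cases_per_100k": 410.0},
    ("Busan", date(2022, 3, 1)): {"new_cases_per_100k": 255.5},
}


def add_epidemic_features(rows, table, default=None):
    """Left-join epidemic features onto symptom rows by (region, day).

    Rows without a matching epidemic entry keep all their own fields and
    receive `default` for the epidemic feature.
    """
    merged = []
    for row in rows:
        extra = table.get((row["region"], row["day"]), {"new_cases_per_100k": default})
        merged.append({**row, **extra})
    return merged


features = add_epidemic_features(symptom_rows, epidemic)
```

A left join keeps every symptom record even when no epidemic entry exists for that region and day, so missing context appears as an explicit missing value instead of a silently dropped row.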

Results:

Adding mobility and epidemic data significantly improved the performance of all five models. The highest area under the receiver operating characteristic curve (AUROC) increased from 0.8712 without mobility and epidemic data to 0.9104 with them. This highlights the considerable impact of external information on the performance of machine learning models.
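
For readers unfamiliar with the metric, AUROC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it by pairwise comparison in plain Python. The labels and scores are toy values invented for illustration; they are unrelated to the study's results, and the function is not the authors' evaluation code.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy example (invented values): the same six cases scored by two hypothetical models.
labels = [1, 1, 0, 0, 1, 0]
scores_symptoms_only = [0.9, 0.4, 0.5, 0.2, 0.6, 0.6]
scores_with_context = [0.9, 0.7, 0.5, 0.2, 0.8, 0.6]

auc_symptoms_only = auroc(labels, scores_symptoms_only)
auc_with_context = auroc(labels, scores_with_context)
```

The pairwise form is quadratic in the number of cases, which is fine for a small illustration; at the scale of this study, a rank-based implementation from an established library would be the practical choice.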

Conclusions:

This study demonstrated the potential of combining contextual data, such as location information and epidemic data, with patient symptom data to improve the accuracy of machine learning models for diagnosing COVID-19. Considering additional contextual information can enhance the ability to screen for COVID-19.

Citation

Please cite as:

Choo H, Kim HJ, Lee D, Shin SY, Lee J, Lee D, Kim E, Oh N, Kim C, Jang A, Kim H, Park HL, Kim S, Kim M

Enhancing COVID-19 Screening Models With Epidemiological and Mobility Features: Machine-Learning Model Study

JMIR AI 2026;5:e54956

DOI: 10.2196/54956

PMID: 41812080

PMCID: 12978548

Accepted for/Published in: JMIR AI

Date Submitted: Dec 11, 2023

Date Accepted: Aug 8, 2025
