JMIR Preprints #11990: Detecting Hypoglycemia Incidents Reported in Patients’ Secure Messages: Using Cost-sensitive Learning and Oversampling to Reduce Data Imbalance

Detecting Hypoglycemia Incidents Reported in Patients’ Secure Messages: Using Cost-sensitive Learning and Oversampling to Reduce Data Imbalance

Jinying Chen;
John Lalor;
Weisong Liu;
Emily Druhl;
Edgard Granillo;
Varsha G. Vimalananda;
Hong Yu

ABSTRACT

Background:

Improper dosing of medications such as insulin can cause hypoglycemic episodes, which may lead to severe morbidity or even death. Although secure messaging was designed for exchanging nonurgent messages, patients sometimes report hypoglycemia events through it. Detecting these patient-reported adverse events may help alert clinical teams and enable early corrective actions to improve patient safety.

Objective:

We aimed to develop a natural language processing system, called HypoDetect (Hypoglycemia Detector), to automatically identify hypoglycemia incidents reported in patients’ secure messages.

Methods:

An expert in public health annotated 3,000 secure message threads as containing patient-reported hypoglycemia incidents or not. A physician independently annotated 100 threads randomly selected from this dataset to measure inter-annotator agreement. We used this dataset to develop and evaluate HypoDetect. HypoDetect incorporates three machine learning algorithms widely used for text classification: Linear Support Vector Machines, Random Forest, and Logistic Regression. We explored different learning features, including new knowledge-driven features. Because only 114 (3.8%) messages were annotated as positive, we investigated cost-sensitive learning and over-sampling methods to mitigate the challenge of imbalanced data.
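The two imbalance remedies named above can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the class counts stated in the abstract (114 positive, 2,886 negative), the common "balanced" weighting heuristic n / (k * n_c) for cost-sensitive learning, and plain random oversampling with replacement as a simpler stand-in for the Synthetic Minority Over-sampling Technique. The function names and the `thread-i` identifiers are hypothetical.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n / (k * n_c): rare classes get larger weights.

    A cost-sensitive learner would multiply each example's loss by the
    weight of its class, so errors on rare positives cost more.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


def random_oversample(items, labels, seed=0):
    """Duplicate minority-class items (with replacement) until every class
    is as large as the largest one. This is random oversampling, not SMOTE,
    which would instead synthesize new points between minority neighbors.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_items, out_labels = list(items), list(labels)
    for cls, count in counts.items():
        pool = [x for x, y in zip(items, labels) if y == cls]
        extra = [rng.choice(pool) for _ in range(target - count)]
        out_items.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_items, out_labels


# Class counts taken from the abstract; the thread IDs are placeholders.
labels = [1] * 114 + [0] * 2886
threads = [f"thread-{i}" for i in range(len(labels))]

weights = balanced_class_weights(labels)
resampled_threads, resampled_labels = random_oversample(threads, labels)

print({cls: round(w, 2) for cls, w in weights.items()})
print(Counter(resampled_labels))
```

With these counts, a positive example is weighted roughly 25 times as heavily as a negative one. In a cross-validation setting, oversampling should be applied to the training folds only, so that duplicated positives never appear in the held-out fold.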

Results:

The inter-annotator agreement was 0.976 (Cohen’s kappa). Using cross-validation, Logistic Regression with cost-sensitive learning achieved the best performance (area under the ROC curve=0.954, sensitivity=0.693, specificity=0.974, F1=0.590). Cost-sensitive learning and the ensembled Synthetic Minority Over-sampling Technique substantially improved the sensitivity of the baseline systems (absolute gains of 0.123 to 0.728). Our results show that a variety of features contributed to the best performance of HypoDetect.
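To make the relationship between the reported metrics concrete, the sketch below computes sensitivity, specificity, and F1 from a single confusion matrix. The four counts are hypothetical: they were chosen only so that, with 114 positives and 2,886 negatives, the rates land close to the reported values. The paper reports cross-validated scores, not these counts.

```python
def classification_metrics(tp, fn, fp, tn):
    """Return (sensitivity, specificity, precision, F1) from confusion counts."""
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    precision = tp / (tp + fp)     # fraction of flagged threads that are true positives
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, precision, f1


# Hypothetical counts (assumption): 79 + 35 = 114 positives, 75 + 2811 = 2886 negatives.
sensitivity, specificity, precision, f1 = classification_metrics(tp=79, fn=35, fp=75, tn=2811)

print(f"sensitivity={sensitivity:.3f} specificity={specificity:.3f} "
      f"precision={precision:.3f} F1={f1:.3f}")
```

Under these illustrative counts, about half of the flagged threads would be false alarms even at a specificity of 0.974, because negatives outnumber positives by roughly 25 to 1. This is why F1 stays well below the sensitivity and specificity figures on imbalanced data.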

Conclusions:

Despite the challenge of data imbalance, HypoDetect achieved promising results for the task of detecting hypoglycemia incidents from secure messages. The system has great potential to facilitate early detection and treatment of hypoglycemia.

Citation

Please cite as:

Chen J, Lalor J, Liu W, Druhl E, Granillo E, Vimalananda VG, Yu H

Detecting Hypoglycemia Incidents Reported in Patients’ Secure Messages: Using Cost-Sensitive Learning and Oversampling to Reduce Data Imbalance

J Med Internet Res 2019;21(3):e11990

DOI: 10.2196/11990

PMID: 30855231

PMCID: 6431826

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Aug 21, 2018

Open Peer Review Period: Aug 26, 2018 - Oct 11, 2018

Date Accepted: Feb 10, 2019

Per the author's request the PDF is not available.