JMIR Preprints #30328: Deriving Weight from Big Data: A Comparison of Body Weight Measurement Cleaning Algorithms

Deriving Weight from Big Data: A Comparison of Body Weight Measurement Cleaning Algorithms

Richard Evans;
Jennifer Burns;
Anne Annis;
Michelle Freitag;
Susan Raffa;
Laura Damschroder;
Wyndy Wiitala

ABSTRACT

Background:

Patient body weight is a frequently used measure in biomedical studies, yet there are no standard methods for processing and cleaning weight data. Conflicting documentation on constructing body weight measurements presents challenges for research and program evaluation.

Objective:

We sought to describe and compare methods for extracting and cleaning weight data from electronic health record (EHR) databases to develop guidelines for standardized approaches that promote reproducibility.

Methods:

We conducted a systematic review of studies published from 2008 to 2018 that used Veterans Health Administration (VHA) EHR weight data, and we documented the algorithms they used to construct patient weight. We applied these algorithms to a cohort of veterans with at least one Primary Care visit in 2016. The resulting weight measures were compared at the patient and site levels.

Results:

We identified 496 studies and included 62 that used weight as an outcome variable; 48% included a replicable algorithm. Algorithms ranged from cut-offs for implausible weights to complex models using measures within patient over time. After applying the algorithms, we found differences in the number of weight values retained (86% to 99% of raw data) and decreased variance (SD = 68 to 54), but little difference in average weights across methods (216 to 220 lbs.). The percent of patients with at least 5% weight loss over one year ranged from 18% to 24%.
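As a rough illustration of the weight-loss outcome reported above, the following Python sketch computes the share of patients with at least 5% weight loss at about one year. It is not the authors' method: the function names, the choice of the earliest measurement as baseline, the 60-day tolerance around the one-year mark, and the sample data are all assumptions made for this example.

```python
from datetime import date


def percent_change_at_one_year(measurements, window_days=60):
    """Percent weight change from the earliest measurement to the one
    closest to 365 days later; None if nothing falls inside the window.

    measurements: list of (date, weight_lb) tuples for one patient.
    window_days: hypothetical tolerance around the one-year mark.
    """
    if not measurements:
        return None
    ordered = sorted(measurements)
    base_date, base_weight = ordered[0]
    # Keep follow-up measurements within the tolerance of day 365.
    candidates = [
        (abs((d - base_date).days - 365), w)
        for d, w in ordered[1:]
        if abs((d - base_date).days - 365) <= window_days
    ]
    if not candidates:
        return None
    _, follow_weight = min(candidates)
    return 100.0 * (follow_weight - base_weight) / base_weight


def share_with_loss(patients, threshold_pct=5.0):
    """Fraction of patients with a computable change whose loss is at
    least threshold_pct; None if no patient has a computable change."""
    changes = [percent_change_at_one_year(m) for m in patients.values()]
    changes = [c for c in changes if c is not None]
    if not changes:
        return None
    return sum(c <= -threshold_pct for c in changes) / len(changes)


# Made-up data for illustration only.
patients = {
    "A": [(date(2016, 1, 10), 200.0), (date(2017, 1, 5), 188.0)],
    "B": [(date(2016, 3, 1), 250.0), (date(2017, 3, 10), 248.0)],
    "C": [(date(2016, 5, 1), 180.0)],  # no follow-up, so excluded
}
loss_share = share_with_loss(patients)
```

Because each cleaning algorithm retains a different set of measurements, both the baseline and follow-up values fed into a calculation like this can change, which is one way the reported percentage could vary across methods.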

Conclusions:

Determining the best method to assess weight using EHR data can be computationally demanding. Our results suggest that for many studies, applying simple cut-offs that require fewer computing resources and are easier to understand may be sufficient. We present guidelines for situations where more complex approaches may be warranted.
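To make the idea of a simple cut-off concrete, here is a minimal Python sketch that drops implausible weights and reports how much data is retained, along with the mean and SD of the cleaned values. The bounds of 75 and 700 lbs, the function names, and the sample values are assumptions for illustration; the abstract does not state which thresholds the reviewed studies used.

```python
from statistics import mean, stdev

# Hypothetical plausibility bounds in pounds (not taken from the source).
MIN_LB = 75.0
MAX_LB = 700.0


def apply_cutoffs(weights, low=MIN_LB, high=MAX_LB):
    """Keep only weights inside the inclusive range [low, high]."""
    return [w for w in weights if low <= w <= high]


def summarize(raw, cleaned):
    """Share of raw values retained, plus mean and SD of the cleaned set."""
    return {
        "retained_pct": 100.0 * len(cleaned) / len(raw),
        "mean_lb": mean(cleaned),
        "sd_lb": stdev(cleaned),
    }


# Made-up measurements, including likely entry errors (18.4, 2150.0, 0.0).
raw = [182.0, 185.5, 18.4, 210.0, 2150.0, 198.2, 0.0, 240.6]
cleaned = apply_cutoffs(raw)
summary = summarize(raw, cleaned)
```

A filter like this needs a single pass over the data and no per-patient history, which is why it is cheaper than approaches that model each patient's measurements over time.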

Citation

Please cite as:

Evans R, Burns J, Annis A, Freitag M, Raffa S, Damschroder L, Wiitala W

Deriving Weight From Big Data: Comparison of Body Weight Measurement–Cleaning Algorithms

JMIR Med Inform 2022;10(3):e30328

DOI: 10.2196/30328

PMID: 35262492

PMCID: 8943548

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Medical Informatics

Date Submitted: May 11, 2021

Open Peer Review Period: May 10, 2021 - Jul 5, 2021

Date Accepted: Jan 2, 2022
