Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Jan 28, 2025
Date Accepted: Apr 4, 2025

The final, peer-reviewed published version of this preprint can be found here:

Huang F, Hou J, Zhou N, Greco K, Lin C, Sweet SM, Wen J, Shen L, Gonzalez N, Zhang S, Liao KP, Cai T, Xia Z, Bourgeois FT, Cai T

Advancing the Use of Longitudinal Electronic Health Records: Tutorial for Uncovering Real-World Evidence in Chronic Disease Outcomes

J Med Internet Res 2025;27:e71873

DOI: 10.2196/71873

PMID: 40357530

PMCID: 12107207

Advancing the Use of Longitudinal Electronic Health Records: Tutorial for Uncovering Real-World Evidence in Chronic Disease Outcomes

  • Feiqing Huang
  • Jue Hou
  • Ningxuan Zhou
  • Kimberly Greco
  • Chenyu Lin
  • Sara Morini Sweet
  • Jun Wen
  • Lechen Shen
  • Nicolas Gonzalez
  • Sinian Zhang
  • Katherine P. Liao
  • Tianrun Cai
  • Zongqi Xia
  • Florence T. Bourgeois
  • Tianxi Cai

ABSTRACT

Managing chronic diseases requires ongoing monitoring of disease activity and therapeutic responses to optimize treatment plans. With the growing availability of disease-modifying treatments (DMTs), it is crucial to investigate comparative effectiveness and long-term outcomes beyond those available from randomized clinical trials (RCTs). We introduce a comprehensive pipeline for generating reproducible and generalizable real-world evidence (RWE) on disease outcomes by leveraging electronic health record (EHR) data. The pipeline links EHR data with registry information and applies algorithms based on longitudinal EHR features to evaluate therapies for chronic diseases, as illustrated through a case study of multiple sclerosis. Our approach addresses challenges in RWE generation for disease activity of chronic conditions, specifically the lack of direct observations on key outcomes and biases arising from imperfect or incomplete data. We present advanced machine learning techniques such as semi-supervised and ensemble methods to impute missing outcome data, further incorporating steps for calibrated causal analyses and bias correction.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.