Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Dec 4, 2020
Date Accepted: May 4, 2021

The final, peer-reviewed published version of this preprint can be found here:

Artificial Intelligence–Based Prediction of Lung Cancer Risk Using Nonimaging Electronic Medical Records: Deep Learning Approach

Yeh MCH, Li YC, Wang YH, Yang HC, Bai KJ, Wang HH

Artificial Intelligence–Based Prediction of Lung Cancer Risk Using Nonimaging Electronic Medical Records: Deep Learning Approach

J Med Internet Res 2021;23(8):e26256

DOI: 10.2196/26256

PMID: 34342588

PMCID: 8371476

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Artificial Intelligence to Predict Risk of Lung Cancer with Electronic Medical Records: A Deep Learning and Big Data Approach

  • Marvin Chia-Han Yeh
  • Yu-Chuan (Jack) Li
  • Yu-Hsiang Wang
  • Hsuan-Chia Yang
  • Kuan-Jen Bai
  • Hsiao-Hang Wang

ABSTRACT

Background:

Artificial intelligence can integrate complex features and may be used to predict the risk of developing lung cancer, thereby decreasing the need for unnecessary and expensive diagnostic interventions.

Objective:

To use electronic medical records to prescreen patients for their risk of developing lung cancer.

Methods:

Two million participants were randomly selected from the Taiwan National Health Insurance Research Database from 1999 to 2013. We built a predictive lung cancer screening model with neural networks that were trained and validated using pre-2012 data and tested prospectively on post-2012 data. An age- and gender-matched subgroup 10 times larger than the original lung cancer group was used to assess the predictive power of electronic medical records (EMRs). Discrimination (area under the curve [AUC]) and calibration analyses were performed.
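The temporal split and the discrimination metric described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the record fields (`year`, `label`, `score`), the helper names, and the choice to place 2012 itself in the test set are all assumptions, and the AUC is computed with a simple pairwise comparison rather than any particular library.

```python
from typing import Dict, List, Sequence, Tuple

Record = Dict[str, float]  # hypothetical fields: "year", "label", "score"


def temporal_split(records: List[Record], cutoff_year: int = 2012
                   ) -> Tuple[List[Record], List[Record]]:
    """Split records by time: earlier years for development, later for testing.

    Assumption: records from the cutoff year onward go to the test set.
    """
    development = [r for r in records if r["year"] < cutoff_year]
    test = [r for r in records if r["year"] >= cutoff_year]
    return development, test


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve as the probability that a randomly chosen
    case scores higher than a randomly chosen control (ties count as 0.5)."""
    case_scores = [s for y, s in zip(labels, scores) if y == 1]
    control_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not case_scores or not control_scores:
        raise ValueError("AUC needs at least one case and one control")
    wins = 0.0
    for c in case_scores:
        for n in control_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy, made-up records purely to exercise the functions.
toy_records = [
    {"year": 2009, "label": 0, "score": 0.10},
    {"year": 2011, "label": 1, "score": 0.70},
    {"year": 2012, "label": 1, "score": 0.90},
    {"year": 2012, "label": 0, "score": 0.10},
    {"year": 2013, "label": 1, "score": 0.40},
    {"year": 2013, "label": 0, "score": 0.60},
]
development_set, test_set = temporal_split(toy_records)
test_auc = auc([int(r["label"]) for r in test_set],
               [r["score"] for r in test_set])
```

Evaluating only on records after the cutoff mirrors the prospective testing idea: the model never sees the later period during training. The pairwise AUC is quadratic in the number of records, so a rank-based implementation would be preferable at the scale of the study.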

Results:

The analysis included 11,617 cases of lung cancer and 1,423,154 controls. The model achieved an AUC of 0.90 for the overall population and 0.87 in patients older than 55 years. The AUC in the matched subgroup was 0.82. The positive predictive value was highest (14.3%) among those older than 55 years with a preexisting history of lung disease.
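To illustrate why a positive predictive value can stay modest even when the AUC is high, the sketch below computes it from sensitivity, specificity, and prevalence. The case and control counts are taken from the results above and treated as an overall prevalence, which is a simplifying assumption. The sensitivity and specificity values are hypothetical operating points chosen for illustration; they are not reported in the abstract.

```python
def ppv_from_counts(true_positives: int, false_positives: int) -> float:
    """Positive predictive value: share of flagged patients who are cases."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no patients were flagged")
    return true_positives / flagged


def ppv_from_rates(sensitivity: float, specificity: float,
                   prevalence: float) -> float:
    """Positive predictive value via Bayes' rule."""
    true_pos_rate = sensitivity * prevalence
    false_pos_rate = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos_rate + false_pos_rate == 0:
        raise ValueError("no patients would be flagged")
    return true_pos_rate / (true_pos_rate + false_pos_rate)


# Counts reported in the Results section.
CASES = 11_617
CONTROLS = 1_423_154
prevalence = CASES / (CASES + CONTROLS)

# Hypothetical operating point (not from the paper).
example_ppv = ppv_from_rates(sensitivity=0.80, specificity=0.90,
                             prevalence=prevalence)
```

Because cases make up less than 1% of this population, false positives among the many controls dominate the flagged group. Restricting screening to a higher-prevalence subgroup raises the positive predictive value at the same operating point.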

Conclusions:

Our model achieved excellent performance at predicting lung cancer within one year and may be deployed for digital patient screening. Deep learning facilitates the effective use of EMRs to identify individuals at high risk for developing lung cancer.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.