Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Jun 1, 2020
Date Accepted: Oct 2, 2020
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Reliability and Performance Assessment of Federated Learning on Clinical Benchmark Data
ABSTRACT
Background:
Federated learning (FL) is a recently proposed machine learning framework that uses decentralized datasets. Because data transfer is not necessary for the learning process in FL, it has a great advantage in protecting personal privacy. Owing to this merit, many studies are being actively performed in diverse application areas.
Objective:
This study aims to evaluate the reliability and performance of FL on two benchmark datasets, including a clinical benchmark dataset.
Methods:
To evaluate FL in a realistic setting, we implemented FL with a client-server architecture in Python. The implemented client-server version of the FL software was deployed to Amazon Web Services (AWS). The Modified National Institute of Standards and Technology (MNIST) and Medical Information Mart for Intensive Care-III (MIMIC-III) datasets were used to evaluate the performance of FL. For testing in a realistic setting, the MNIST dataset was split across 10 different clients, with each client containing only a single digit. In addition, we conducted four different experiments: basic, imbalanced, skewed, and combined imbalanced and skewed. We also compared the performance of FL against a state-of-the-art (SOTA) result on in-hospital mortality prediction with the MIMIC-III dataset. Likewise, we conducted experiments with basic and imbalanced data distributions. In all experiments, performance was compared using the area under the receiver operating characteristic curve (AUROC) and the F1-score.
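The client-server scheme described above follows the usual federated averaging pattern: each client trains on its local data and the server aggregates the resulting weights, weighted by local sample counts. The sketch below illustrates that pattern with a toy logistic-regression "model" and simulated imbalanced clients; all function names, sizes, and the model itself are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One client's local training: logistic-regression gradient steps."""
    w = weights.copy()
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-X @ w))  # sigmoid predictions
        grad = X.T @ (preds - y) / len(y)     # mean gradient of log-loss
        w -= lr * grad
    return w

def federated_average(client_weights, client_sizes):
    """Server step: average client weights, weighted by sample counts."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Toy simulation: 3 clients with very different dataset sizes, loosely
# mimicking the paper's imbalanced setting, over 3 communication rounds.
rng = np.random.default_rng(0)
global_w = np.zeros(4)
sizes = [100, 30, 10]  # imbalanced local datasets (assumed values)
for _ in range(3):
    updates, counts = [], []
    for n in sizes:
        X = rng.normal(size=(n, 4))
        y = (X[:, 0] > 0).astype(float)  # simple separable labels
        updates.append(local_update(global_w, X, y))
        counts.append(n)
    global_w = federated_average(updates, counts)
```

Note that only model weights leave each client; the raw `X` and `y` never do, which is the privacy property the paper's Background highlights.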
Results:
FL on the basic MNIST setting with 10 clients achieved an AUROC of 0.997 and an F1-score of 0.946. The experiment with the imbalanced MNIST achieved an AUROC of 0.995 and an F1-score of 0.921. The experiment with the skewed MNIST achieved an AUROC of 0.992 and an F1-score of 0.905. Finally, the combined imbalanced and skewed experiment achieved an AUROC of 0.990 and an F1-score of 0.891. The basic FL experiment on in-hospital mortality using MIMIC-III achieved an AUROC of 0.850 and an F1-score of 0.944. The experiment with the imbalanced MIMIC-III dataset achieved an AUROC of 0.850 and an F1-score of 0.943.
Conclusions:
FL demonstrated comparable performance on the benchmark datasets. In addition, FL showed reliable performance in the imbalanced, skewed, and extreme distribution cases (i.e., when data distributions differ across hospitals). Because it does not require centralizing the data, FL can be a good method for achieving both high performance and privacy protection.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.