Previously submitted to: Journal of Medical Internet Research (no longer under consideration since Aug 21, 2020)
Date Submitted: Jul 8, 2020
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
A Digital Pathology Platform for Artificial Intelligence Data Sharing
ABSTRACT
Background:
High-quality learning materials are needed for artificial intelligence (AI) development but are rarely available in practice, and the shortage is especially severe in the medical field. In particular, annotating medical images (e.g., delineation of the tumor area by pathologists) is laborious and expensive, and the images are subject to privacy protection. These constraints are major barriers for AI developers seeking to access and reproduce medical image data.
Objective:
This study aimed to reduce barriers for AI researchers to access medical image datasets by working with pathologists to collate and share high-quality medical images, and to find practical ways to use diagnostic AI assistance to reduce the pathologists’ workload.
Methods:
Pathology slides of tumors of five organs (liver, colon, prostate, pancreas and biliary tract, and kidney) from histologically confirmed cases were selected for this study. The slides were scanned to obtain whole slide digital images, the patient information was de-identified, and the tumor area was annotated by a pathologist. An AI-assisted annotation process was used in parallel to reduce the pathologists' annotation workload and to draw complex lesion boundaries more accurately. As a result, all the data include annotations confirmed by experienced pathologists and can be used as an AI learning dataset.
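The abstract does not describe how de-identification or annotation was implemented, so the following is only a minimal sketch of the two steps named above, under stated assumptions. The metadata field names, the salted-hash pseudonym, the tile size, and the rule that a tile counts as tumor when its center falls inside the annotation polygon are all hypothetical choices for illustration, not the authors' method.

```python
import hashlib
from typing import Dict, List, Tuple

Point = Tuple[float, float]

# Hypothetical identifying fields; the source does not list the real ones.
IDENTIFYING_FIELDS = {"patient_name", "patient_id", "birth_date"}


def deidentify(metadata: Dict[str, str], salt: str) -> Dict[str, str]:
    """Drop identifying fields and attach a salted-hash pseudonym (assumed scheme)."""
    cleaned = {k: v for k, v in metadata.items() if k not in IDENTIFYING_FIELDS}
    digest = hashlib.sha256((salt + metadata["patient_id"]).encode("utf-8")).hexdigest()
    cleaned["case_key"] = digest[:16]  # shortened pseudonym used only as a lookup key
    return cleaned


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: toggle on each polygon edge crossed by a ray toward +x."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the ray's y coordinate
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def label_tiles(width: int, height: int, tile: int,
                polygon: List[Point]) -> Dict[Tuple[int, int], int]:
    """Label each tile 1 if its center lies inside the tumor polygon, else 0."""
    labels = {}
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            center = (left + tile / 2, top + tile / 2)
            labels[(left, top)] = int(point_in_polygon(center, polygon))
    return labels


if __name__ == "__main__":
    record = {"patient_name": "EXAMPLE", "patient_id": "0001", "organ": "liver"}
    print(deidentify(record, salt="demo-salt"))

    # A square tumor annotation covering the top-left quarter of a 400x400 image.
    tumor = [(0, 0), (200, 0), (200, 200), (0, 200)]
    labels = label_tiles(400, 400, 100, tumor)
    print(sum(labels.values()), "of", len(labels), "tiles labeled as tumor")
```

Labeling by tile center keeps the sketch short; a real pipeline would more likely rasterize the polygon and threshold the fraction of each tile covered by tumor.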
Results:
A web-based data-sharing platform for AI learning was built and unveiled in 2019. In total, 3,100 large datasets covering carcinomas of 5 organs were shared through this platform and made accessible to all researchers. The platform allowed users to search the data visually and intuitively, and all researchers could use the provided datasets free of charge for their research, except for commercial purposes. The platform also provided five image data pre-processing algorithms to help those learning AI modeling.
Conclusions:
We built and operated a web-based data-sharing platform that provides AI researchers with a high-quality digital pathology dataset annotated directly by pathologists. We hope that sharing the issues we encountered while collecting and sharing these valuable data will help researchers who want to build such a platform in the future.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.