Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Mar 5, 2025
Date Accepted: Oct 29, 2025

The final, peer-reviewed published version of this preprint can be found here:

Stevens ER, Hartman J, Testa P, Mansukhani A, Monina C, Shunk A, Ranson D, Imberg Y, Cote A, Prabhu D, Szerencsy A

Leveraging Machine Learning and Robotic Process Automation to Identify and Convert Unstructured Colonoscopy Results Into Actionable Data: Proof-of-Concept Study

JMIR Med Inform 2025;13:e73504

DOI: 10.2196/73504

PMID: 41264858

PMCID: 12634012

Leveraging Machine Learning and Robotic Process Automation to identify and convert unstructured colonoscopy results into actionable data: a proof of concept

  • Elizabeth R Stevens; 
  • Jager Hartman; 
  • Paul Testa; 
  • Ajay Mansukhani; 
  • Casey Monina; 
  • Amelia Shunk; 
  • David Ranson; 
  • Yana Imberg; 
  • Ann Cote; 
  • Dinesha Prabhu; 
  • Adam Szerencsy

ABSTRACT

Background:

Effective colorectal cancer (CRC) screening is a cornerstone of preventive health care. Existing electronic health record (EHR) tools that facilitate reminders for CRC screening follow-up do not adequately address clinician needs. With rising patient volumes and a focus on quality, our health system sought a more efficient way to ensure accurate documentation of CRC screening follow-up intervals from inbound colonoscopy reports. We developed an integrated end-to-end workflow solution using machine learning (ML) and robotic process automation (RPA) to extract and update follow-up dates from unstructured data.

Objective:

To automate data extraction from external, free-text colonoscopy reports to identify and document recommended follow-up dates for CRC screening in structured fields within the EHR.
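As a rough illustration of this objective, the sketch below turns a free-text recommendation into a structured recall date. It is not the study's method: the study used an ML model, whereas this example substitutes a simple regular expression as a stand-in. The pattern, the function names, and the sample phrasing are all hypothetical.

```python
import re
from datetime import date
from typing import Optional

# Hypothetical pattern for phrases such as "repeat colonoscopy in 5 years".
# A regex stand-in only; the study itself relied on an ML model.
INTERVAL_RE = re.compile(
    r"\b(?:repeat|follow[- ]?up|recall)\b[^.]*?\bin\s+(\d{1,2})\s+years?\b",
    re.IGNORECASE,
)


def extract_interval_years(report_text: str) -> Optional[int]:
    """Return the recommended follow-up interval in years, or None if absent."""
    match = INTERVAL_RE.search(report_text)
    return int(match.group(1)) if match else None


def recall_date(procedure_date: date, years: int) -> date:
    """Add whole years to the procedure date to get a structured recall date."""
    try:
        return procedure_date.replace(year=procedure_date.year + years)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap target year; use Feb 28.
        return procedure_date.replace(year=procedure_date.year + years, day=28)


if __name__ == "__main__":
    text = "Two small polyps removed. Recommend repeat colonoscopy in 5 years."
    years = extract_interval_years(text)
    if years is not None:
        print(recall_date(date(2025, 3, 5), years))
```

Reports with no recognizable recommendation return None, so they can be left for manual review rather than written to the structured field.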

Methods:

As proof of concept, we outline the process development, validity, and implementation of an approach that integrates available tools to automate data retrieval and entry within the EHR of a large academic health system. The health system uses Epic Systems as its EHR platform. This proof-of-concept process study consisted of six stages: 1) identification of gaps in documenting recommendations for follow-up CRC screening from external colonoscopy reports; 2) defining process objectives; 3) identification of technologies; 4) creation of process architecture; 5) process validation; and 6) health system-wide implementation. Chart review was performed to validate process outcomes and estimate impact.

Results:

We developed an automated process with three primary steps that leveraged ML and RPA to create a fully orchestrated workflow to update the CRC screening recall date based on colonoscopy reports received from external sources. Process validity was assessed with 690 scanned colonoscopy reports. From the organization-wide implementation go-live date until December 31, 2024, the system processed 16,563 external colonoscopy reports. Of these, 35.3% (5841) had a follow-up date that met the relevant ML model threshold and were thus identified as ready for RPA processing. This resulted in an estimated 27.2% increase in documentation accuracy.
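The routing step described above, in which only reports whose extracted follow-up date meets the model threshold are passed to RPA, can be sketched as follows. This is a minimal illustration under stated assumptions: the `Prediction` fields, the score scale, and the threshold value of 0.9 are hypothetical and are not reported in the abstract. The last function reproduces the reported share of reports ready for RPA processing.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Prediction:
    report_id: str
    follow_up_years: Optional[int]  # None when no recommendation was found
    score: float                    # hypothetical model score in [0, 1]


THRESHOLD = 0.9  # hypothetical value; the abstract does not state the threshold


def route(
    predictions: List[Prediction], threshold: float = THRESHOLD
) -> Tuple[List[Prediction], List[Prediction]]:
    """Split predictions into those ready for RPA and those left for review."""
    ready: List[Prediction] = []
    manual: List[Prediction] = []
    for p in predictions:
        if p.follow_up_years is not None and p.score >= threshold:
            ready.append(p)
        else:
            manual.append(p)
    return ready, manual


def share_ready(ready_count: int, total: int) -> float:
    """Percentage of processed reports that were ready for RPA, to 1 decimal."""
    return round(100 * ready_count / total, 1)


if __name__ == "__main__":
    # Counts taken from the Results paragraph: 5841 of 16,563 reports.
    print(share_ready(5841, 16563))  # 35.3
```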

Conclusions:

Implementing an automated workflow to extract and update CRC screening follow-up dates from external colonoscopy reports is feasible and has the potential to improve the accuracy of recommendation-based patient recall while reducing clinician documentation burden. Expanding similar processes to other types of unstructured data could provide another mechanism to address a lack of data integration and improve reporting for quality measures within the EHR. Automated workflows leveraging ML and RPA offer practical solutions to overcome interoperability challenges and enhance the use of unstructured data within health care systems.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.