Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Mar 21, 2022
Date Accepted: Dec 4, 2022

The final, peer-reviewed published version of this preprint can be found here:

Xu H, Maccari B, Guillain H, Herzen J, Agri F, Raisaro JL

An End-to-End Natural Language Processing Application for Prediction of Medical Case Coding Complexity: Algorithm Development and Validation

JMIR Med Inform 2023;11:e38150

DOI: 10.2196/38150

PMID: 36656627

PMCID: 9896350

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Development of an end-to-end NLP application for prediction of medical case coding complexity

  • He Xu
  • Bernard Maccari
  • Hervé Guillain
  • Julien Herzen
  • Fabio Agri
  • Jean Louis Raisaro

ABSTRACT

Background:

Medical coding is the process that converts clinical documentation into standard medical codes. Hospitals use these codes for several key purposes, such as insurance reimbursement and performance analysis, so optimizing the accuracy and efficiency of medical coding is crucial. With the rapid growth of natural language processing (NLP) technologies, several commercial rule-based and machine-learning-based solutions have been proposed to aid medical coding by automatically suggesting relevant codes for a medical case. However, their effectiveness is still limited to simple cases, and it is not yet clear how much value they can bring in improving coding efficiency and accuracy.

Objective:

Our study aims to propose an alternative approach for improving medical coding efficiency. Based on the analysis of the work organization of the coding team in the Lausanne University Hospital, Switzerland, we develop an end-to-end multimodal machine-learning-based application that can predict coding complexity in the pre-coding phase. The goal is to enable a more efficient redistribution of coding tasks based on the various levels of expertise within the coding unit to eventually minimize coding errors and improve coding throughput.

Methods:

We collected 2060 cases rated by coders from 1 (simplest) to 4 (most complex) to train and evaluate our machine learning (ML) approach. We asked two expert coders to rate 62 of the 2060 cases as the gold standard, and we used the agreement between the experts as the benchmark for model evaluation. Each case contains both clinical text and patient metadata from the hospital electronic health record. We extracted text features and metadata features, concatenated them, and fed them into an ML model. We built two models. The first was trained with cross-validation on 1751 cases and tested on 309 cases to assess the predictive power and generalizability of the proposed approach. The second was trained on 1998 cases and tested on the gold standard to validate the best model's performance against human benchmarks.
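The feature concatenation step described above can be illustrated with a minimal sketch. The abstract does not say which text encoder, metadata fields, or ML model were used, so everything here is an assumption made for illustration: a bag-of-words counter stands in for the text feature extractor, and the vocabulary, field names, and example case are hypothetical.

```python
from collections import Counter

# Hypothetical vocabulary and metadata fields; the abstract does not list the real ones.
VOCABULARY = ["fracture", "sepsis", "transfer"]
FIELDS = ["age", "length_of_stay_days"]


def text_features(text, vocabulary):
    """Bag-of-words counts over a fixed vocabulary (a stand-in for a real text encoder)."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def metadata_features(metadata, fields):
    """Numeric metadata values in a fixed field order; missing fields default to 0."""
    return [float(metadata.get(field, 0.0)) for field in fields]


def case_vector(case, vocabulary, fields):
    """Concatenate text features and metadata features into one model input vector."""
    return text_features(case["text"], vocabulary) + metadata_features(case["metadata"], fields)


# A made-up case, not taken from the study data.
case = {
    "text": "sepsis after hip fracture sepsis resolved",
    "metadata": {"age": 81, "length_of_stay_days": 12},
}
vector = case_vector(case, VOCABULARY, FIELDS)
print(vector)
```

The resulting vector has one entry per vocabulary word followed by one entry per metadata field, and it is this combined vector that a downstream classifier would consume.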

Results:

Our first model achieved a macro-F1 score of 0.51 and an accuracy of 0.59. It distinguished well between simple (complexity 1-2) and complex (complexity 3-4) cases, with a macro-F1 score of 0.65 and an accuracy of 0.71. Our second model achieved 61% agreement with the experts’ ratings and a macro-F1 score of 0.62 on the gold standard, whereas the two experts agreed with each other on 66% of the cases, with a macro-F1 score of 0.67.
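For readers unfamiliar with these metrics, the sketch below shows one way accuracy and macro-F1 could be computed for the 4-level ratings, and how the ratings could be collapsed into the simple (1-2) versus complex (3-4) grouping. It assumes that the "agreement" figure is the fraction of cases with identical ratings, which the abstract does not state explicitly. The ratings are invented toy values, not study data.

```python
def accuracy(y_true, y_pred):
    """Fraction of cases where the two ratings are identical."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all labels that occur."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denominator = 2 * tp + fp + fn
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


def to_binary(ratings):
    """Collapse complexity 1-2 into 'simple' and 3-4 into 'complex'."""
    return ["simple" if rating <= 2 else "complex" for rating in ratings]


# Toy ratings for illustration only.
reference = [1, 2, 3, 4, 1, 2, 3, 4]
predicted = [1, 1, 3, 4, 2, 2, 4, 4]

four_level_accuracy = accuracy(reference, predicted)
four_level_macro_f1 = macro_f1(reference, predicted)
binary_accuracy = accuracy(to_binary(reference), to_binary(predicted))
binary_macro_f1 = macro_f1(to_binary(reference), to_binary(predicted))
print(four_level_accuracy, four_level_macro_f1, binary_accuracy, binary_macro_f1)
```

In this toy example every error stays within the same group (1 versus 2, or 3 versus 4), so the binary scores are higher than the 4-level scores, which mirrors the direction of the difference reported above.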

Conclusions:

We proposed a multimodal modeling approach that leverages information from both clinical text and patients’ metadata to predict the complexity of coding a case in the pre-coding phase. The proposed approach yields an NLP model whose performance is comparable with that of human expert coders. By integrating this model into the hospital coding system, coders’ workloads will be better allocated, and domain experts will receive better decision support.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.