Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Mar 21, 2022
Date Accepted: Jul 12, 2022
Multi-Center Validation of Natural Language Processing Algorithms for Detection of Common Data Elements in Operative Notes for Total Hip Arthroplasty
ABSTRACT
Background:
Natural language processing (NLP) methods are powerful tools for extracting and analyzing critical information from free-text data. MedTaggerIE, an open-source NLP pipeline for information extraction based on text patterns, has been widely used to annotate clinical notes. MedTagger-THA, developed on top of MedTaggerIE, was previously shown to correctly identify surgical approach, fixation, and bearing surface from total hip arthroplasty (THA) operative notes at Mayo Clinic.
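To illustrate what extraction "based on text patterns" can look like, the following is a minimal Python sketch, assuming a small set of regular expressions for surgical approach. The pattern set, the label names, and the function `extract_approach` are hypothetical and are not the MedTagger-THA rules, which are not reproduced in this abstract.

```python
import re

# Hypothetical patterns for illustration only; these are NOT the MedTagger-THA rules.
APPROACH_PATTERNS = {
    "posterior": re.compile(r"\bposter(?:ior|olateral)\s+approach\b", re.IGNORECASE),
    "anterior": re.compile(r"\b(?:direct\s+)?anterior\s+approach\b", re.IGNORECASE),
    "lateral": re.compile(r"\b(?:direct\s+|antero)?lateral\s+approach\b", re.IGNORECASE),
}


def extract_approach(note_text):
    """Return the sorted list of approach labels whose pattern matches the note."""
    return sorted(
        label
        for label, pattern in APPROACH_PATTERNS.items()
        if pattern.search(note_text)
    )


if __name__ == "__main__":
    print(extract_approach("The hip was exposed through a standard posterior approach."))
    print(extract_approach("Diagnostic hip arthroscopy was performed."))
```

A note that mentions no approach term yields an empty list. A real system would also need to handle negation, section context, and site-specific wording, none of which this sketch attempts.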
Objective:
To assess the portability and generalizability of MedTagger-THA at two external institutions: University of Michigan and University of Iowa.
Methods:
We conducted an iterative test-apply-refine process involving three sites: the development site (i.e., the site that developed the initial NLP system; Mayo Clinic) and two deployment sites (Michigan Medicine and University of Iowa). Activities at the two deployment sites included extraction of operative notes, gold standard development (Michigan: registry data; Iowa: manual chart review), refinement of the NLP algorithms on training data, and evaluation on test data. Error analyses were conducted to understand language variation across sites. To further assess model specificity for approach and fixation, we applied the refined MedTagger-THA to operative notes for arthroscopic hip procedures and periacetabular osteotomy (PAO), because notes for neither procedure should contain approach or fixation (cemented, uncemented, hybrid, or reverse hybrid) terms.
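The two evaluation quantities described above can be sketched as follows. This is a minimal Python illustration, assuming one predicted label per note and using `None` to mean "no label extracted"; the function names and the example labels are hypothetical and do not come from the study data.

```python
def accuracy(predicted, gold):
    """Fraction of notes whose predicted label equals the gold standard label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("at least one note is required")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


def specificity_on_negative_controls(predicted):
    """Specificity on notes that should carry no label (e.g., arthroscopy or PAO).

    Every note is a true negative case, so any extracted label is a false
    positive: specificity = TN / (TN + FP).
    """
    if not predicted:
        raise ValueError("at least one note is required")
    true_negatives = sum(p is None for p in predicted)
    return true_negatives / len(predicted)


if __name__ == "__main__":
    # Made-up labels, for illustration only.
    gold = ["posterior", "anterior", "lateral", "lateral"]
    predicted = ["posterior", "anterior", "posterior", "lateral"]
    print(accuracy(predicted, gold))

    negative_control_predictions = [None, None, "anterior", None]
    print(specificity_on_negative_controls(negative_control_predictions))
```

Treating the arthroscopy and PAO notes as negative controls means no gold standard labels are needed for them: any approach or fixation term the system extracts counts against specificity.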
Results:
At the Michigan site, the study comprised THA-related notes for 2569 patient-date pairs. Prior to model refinement, the MedTagger-THA algorithm demonstrated excellent accuracy for approach (96.6%) and fixation (95.7%). These results were comparable to the internal accuracy at the development site (99.2% for approach and 90.7% for fixation). Model refinement improved accuracy slightly for both approach (97.0%) and fixation (96.1%). The specificity of approach identification was 88.9% for arthroscopy cases, and the specificity of fixation identification was 100% for both PAO and arthroscopy cases. At the Iowa site, the study comprised 100 operative notes (50 training and 50 test). The MedTagger-THA algorithm achieved moderate-high performance on the training data. After refinement, the model achieved high performance for approach (100.0%), fixation (98.0%), and bearing surface (92.0%).
Conclusions:
The MedTagger-THA algorithm generalized well to two external institutions for identifying approach, fixation, and bearing surface. When porting NLP models across institutions, model refinement is useful for improving accuracy.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.