Accepted for/Published in: JMIR Medical Informatics

Date Submitted: May 6, 2021
Date Accepted: Aug 2, 2021

The final, peer-reviewed published version of this preprint can be found here:

Automatic Classification of Thyroid Findings Using Static and Contextualized Ensemble Natural Language Processing Systems: Development Study

Shin D, Kam HJ, Jeon MS, Kim HY

JMIR Med Inform 2021;9(9):e30223

DOI: 10.2196/30223

PMID: 34546183

PMCID: 8493453

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Static and Contextualized Ensemble NLP Systems for Automatic Classification of Thyroid Findings based on Non-Standardized Multi-Institutional Electronic Medical Records in Korean

  • Dongyup Shin
  • Hye Jin Kam
  • Min-Seok Jeon
  • Ha Young Kim

ABSTRACT

Background:

Korean institutions and enterprises that collect electronic medical examination results from multiple medical institutions receive them in non-standardized, non-unified formats. To offer consistent services, groups of nurses experienced in examination work have established classification guidelines based on important keywords and manually classified individual test results. However, rule-based algorithms and labor-intensive manual work can be time-consuming and error-prone. We investigated natural language processing (NLP) architectures and proposed ensemble models to create automated classifiers.

Objective:

This study aimed to develop practical deep learning models, using electronic medical records (EMRs) from 284 health care institutions and open-source corpus datasets, that automatically classify thyroid findings into three condition labels: healthy, caution-required, and critical. The primary goal is to increase overall classification accuracy. In addition, there is a practical, industrial need to correctly predict healthy (negative) thyroid conditions, which account for most medical examination results, and to minimize the false-negative rate among predictions of a healthy thyroid condition.

Methods:

The datasets included thyroid and comprehensive medical examination reports. The text is written not only in complete sentences but also as lists of words or phrases. Therefore, we propose static and contextualized ensemble NLP-neTwork (SCENT) systems that reflect both static and contextual information and handle incomplete sentences. We built single-architecture ensemble models based on a convolutional neural network (CNN), long short-term memory (LSTM), and Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA) by training or fine-tuning each architecture multiple times. Based on comprehensive experiments, we propose two ensembles built from these single-architecture models: SCENT-v1, for the best classification performance, and SCENT-v2, for practical use. SCENT-v1 is an ensemble of the CNN and ELECTRA ensemble models, and SCENT-v2 is a hierarchical ensemble of the CNN, LSTM, and ELECTRA ensemble models. SCENT-v2 first classifies each record into one of the three labels using the ELECTRA ensemble model; records that the ELECTRA ensemble model predicts as “healthy” are then reclassified by an ensemble of the CNN and LSTM ensemble models.
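The two ensemble layouts described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how member predictions are combined, so probability averaging (soft voting) is an assumption here, and the names `soft_vote`, `scent_v1`, `scent_v2`, and `constant_model` are hypothetical. Each model is represented by a stand-in callable that maps a report text to three class probabilities.

```python
from typing import Callable, List, Sequence

LABELS = ["healthy", "caution-required", "critical"]

Probs = List[float]
# Stand-in for a trained single-architecture ensemble (CNN, LSTM, or ELECTRA):
# maps one report text to probabilities ordered as in LABELS.
Model = Callable[[str], Probs]


def soft_vote(models: Sequence[Model], text: str) -> Probs:
    """Average class probabilities over models (assumed combination rule)."""
    outputs = [model(text) for model in models]
    return [sum(p[i] for p in outputs) / len(outputs) for i in range(len(LABELS))]


def argmax_label(probs: Probs) -> str:
    """Return the label with the highest probability."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return LABELS[best]


def scent_v1(cnn: Model, electra: Model, text: str) -> str:
    """Flat ensemble of the CNN and ELECTRA ensemble models."""
    return argmax_label(soft_vote([cnn, electra], text))


def scent_v2(cnn: Model, lstm: Model, electra: Model, text: str) -> str:
    """Hierarchical ensemble: ELECTRA first, CNN+LSTM recheck on 'healthy'."""
    first_pass = argmax_label(electra(text))
    if first_pass != "healthy":
        return first_pass
    # Only records predicted as healthy are reclassified by CNN + LSTM.
    return argmax_label(soft_vote([cnn, lstm], text))


def constant_model(probs: Probs) -> Model:
    """Hypothetical stub that ignores the text and returns fixed probabilities."""
    return lambda text: probs
```

With stub models, `scent_v2` returns the ELECTRA label directly unless that label is "healthy", in which case the CNN and LSTM vote decides. This mirrors the stated purpose of the second stage, which is to catch non-healthy records that the first stage would otherwise pass as healthy.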

Results:

SCENT-v1 outperformed all the other models evaluated, achieving the highest F1-score (92.56%). SCENT-v2 had the second-highest recall (94.44%) and, among records predicted as healthy, the fewest misclassified caution-required cases and no misclassified critical cases.
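The error counts "under the prediction of the healthy thyroid condition" can be made concrete with a short Python sketch. The function name `healthy_prediction_breakdown` is hypothetical, and the labels in the usage note below are toy values chosen for illustration, not data from the study.

```python
from collections import Counter
from typing import Dict, Sequence

LABELS = ("healthy", "caution-required", "critical")


def healthy_prediction_breakdown(
    y_true: Sequence[str], y_pred: Sequence[str]
) -> Dict[str, int]:
    """Count the true labels of all records that were predicted 'healthy'.

    Non-healthy counts are the false negatives the abstract aims to minimize.
    """
    counts = Counter(t for t, p in zip(y_true, y_pred) if p == "healthy")
    return {label: counts[label] for label in LABELS}
```

For example, with toy true labels `["healthy", "healthy", "caution-required", "critical", "caution-required"]` and predictions `["healthy", "healthy", "healthy", "critical", "caution-required"]`, the breakdown contains two true healthy records, one caution-required record misclassified as healthy, and no critical records misclassified as healthy.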

Conclusions:

The proposed SCENT demonstrates good classification performance despite the unique characteristics of the Korean language and the problems of data scarcity and imbalance, especially the extremely small amount of critical-condition data. The results for SCENT-v1 indicate that combining the different perspectives of static and contextual input token representations can enhance classification performance. SCENT-v2 has a strong impact on the prediction of healthy thyroid conditions.



© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.