Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Apr 29, 2020
Date Accepted: Nov 11, 2020

The final, peer-reviewed published version of this preprint can be found here:

Use of BERT (Bidirectional Encoder Representations from Transformers)-Based Deep Learning Method for Extracting Evidences in Chinese Radiology Reports: Development of a Computer-Aided Liver Cancer Diagnosis Framework

Chen H, Liu H, Wang N, Huang Y, Zhang Z, Xu Y, Jiang R, Yang Z

J Med Internet Res 2021;23(1):e19689

DOI: 10.2196/19689

PMID: 33433395

PMCID: 7837998

Identifying Diagnosis Evidence of Liver Cancer in Chinese Radiology Reports Using BERT-based Deep Learning Method

  • Hui Chen
  • Honglei Liu
  • Ni Wang
  • Yanqun Huang
  • Zhiqiang Zhang
  • Yan Xu
  • Rui Jiang
  • Zhenghan Yang

ABSTRACT

Background:

Liver cancer remains a substantial disease burden in China. The dynamic enhanced computed tomography (CT) scan, one of the primary diagnostic tools for liver cancer, provides detailed diagnosis evidence that is recorded in free-text radiology reports.

Objective:

In this study, we combined knowledge-driven deep learning methods and data-driven natural language processing (NLP) methods to extract radiological features from these reports, and we designed a computer-aided liver cancer diagnosis framework.

Methods:

We collected 1089 CT radiology reports written in Chinese. We fine-tuned a pre-trained BERT (Bidirectional Encoder Representations from Transformers) language model to produce word embeddings. The embeddings served as the input to a BiLSTM (Bidirectional Long Short-Term Memory) and CRF (Conditional Random Field) model (BERT-BiLSTM-CRF), which extracted the features of hyperintense enhancement in the arterial phase (APHE) and hypointensity in the portal and delayed phases (PDPH). We also extracted features from the report content using a traditional rule-based NLP method. We then applied a random forest for liver cancer diagnosis and calculated the Gini impurity to identify the diagnosis evidence.
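As a rough illustration of the Gini impurity used to rank features, the sketch below computes the impurity of a label set and the impurity decrease from splitting on a single binary feature. It is a minimal example under stated assumptions, not the authors' implementation: the function names, the toy labels, and the toy feature vectors (`aphe`, `other`) are all hypothetical, and a random forest would aggregate such decreases over many splits and trees rather than score one split.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def impurity_decrease(feature, labels):
    """Gini decrease obtained by splitting the samples on a binary feature."""
    left = [y for x, y in zip(feature, labels) if x == 0]
    right = [y for x, y in zip(feature, labels) if x == 1]
    n = len(labels)
    weighted = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - weighted


# Hypothetical toy data: 1 = liver cancer, 0 = no liver cancer.
labels = [1, 1, 1, 0, 0, 0]
aphe = [1, 1, 1, 0, 0, 0]   # made-up feature that separates the classes
other = [1, 0, 1, 0, 1, 0]  # made-up feature that is only weakly informative

print(impurity_decrease(aphe, labels))
print(impurity_decrease(other, labels))
```

In this toy setting the feature with the larger impurity decrease is the one that would be ranked as stronger diagnosis evidence.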

Results:

The BERT-BiLSTM-CRF model predicted the APHE and PDPH features with F1 scores of 98.40% and 90.67%, respectively. The prediction model using the combined features performed better (F1 score, 88.55%) than the models using only the features obtained by BERT-BiLSTM-CRF (84.88%) or only those obtained by the traditional rule-based NLP method (83.52%). APHE and PDPH were the two most important features for liver cancer diagnosis.
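For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below shows one common way to compute it for extracted text spans, assuming exact-match scoring of (start, end, label) tuples. The abstract does not state the matching criterion the authors used, so that choice, the function name `span_f1`, and the example spans are assumptions made purely for illustration.

```python
def span_f1(gold, pred):
    """F1 over exact-match spans; each span is a (start, end, label) tuple."""
    gold, pred = set(gold), set(pred)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations for one report (character offsets are made up).
gold_spans = [(0, 3, "APHE"), (10, 14, "PDPH"), (20, 25, "APHE")]
pred_spans = [(0, 3, "APHE"), (10, 14, "PDPH"), (30, 33, "PDPH")]

print(span_f1(gold_spans, pred_spans))
```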

Conclusions:

We proposed a BERT-based deep learning method for extracting diagnosis evidence based on clinical knowledge. With the recognized APHE and PDPH features, liver cancer diagnosis achieved high performance, which increased further when these features were combined with the radiological features obtained by the traditional rule-based NLP method. The BERT-BiLSTM-CRF model achieved state-of-the-art performance in this study and could be extended to other kinds of Chinese clinical texts.

Clinical Trial: None


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.