
Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Jul 27, 2020
Date Accepted: Oct 26, 2020

The final, peer-reviewed published version of this preprint can be found here:

Yang X, Ma Y, He X, Zhang H, Bian J, Wu Y

Measurement of Semantic Textual Similarity in Clinical Texts: Comparison of Transformer-Based Models

JMIR Med Inform 2020;8(11):e19735

DOI: 10.2196/19735

PMID: 33226350

PMCID: 7721552

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Measuring Semantic Textual Similarity in Clinical Text: A Study of Transformer-based Models

  • Xi Yang; 
  • Yinghan Ma; 
  • Xing He; 
  • Hansi Zhang; 
  • Jiang Bian; 
  • Yonghui Wu

ABSTRACT

Background:

Semantic textual similarity (STS) is one of the fundamental tasks in natural language processing (NLP). Many STS shared tasks and corpora have been developed for the general English domain, yet such resources remain limited in the biomedical domain. In 2019, the n2c2 challenge developed a comprehensive clinical STS dataset and called for a community effort to solicit state-of-the-art solutions for clinical STS.

Objective:

This study presents the transformer-based clinical STS models we developed during our participation in the 2019 n2c2/OHNLP shared task on clinical STS, as well as new models we explored after the challenge.

Methods:

In this study, we explored three transformer-based models for clinical STS: BERT, XLNet, and RoBERTa. We examined transformer models pretrained on both general English text and clinical text. We also explored using a general English STS dataset as a supplementary corpus in addition to the clinical training set developed in this challenge. Furthermore, we investigated various ensemble methods to combine the different transformer models.
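As a minimal illustration of the ensembling and evaluation described above, the sketch below averages the sentence-pair similarity scores produced by several models and measures the Pearson correlation against gold-standard scores. The scores, weights, and function names here are hypothetical placeholders, not the authors' implementation.

```python
import numpy as np

def ensemble_predictions(model_scores, weights=None):
    """Combine per-model similarity scores (one row per model) into one prediction."""
    scores = np.asarray(model_scores, dtype=float)
    if weights is None:
        return scores.mean(axis=0)           # simple mean ensemble
    w = np.asarray(weights, dtype=float)
    return w @ scores / w.sum()              # weighted mean ensemble

# Hypothetical similarity scores on a 0-5 scale, as used in STS tasks.
bert_scores    = [4.2, 1.1, 3.5, 0.4]
xlnet_scores   = [4.0, 0.9, 3.8, 0.6]
roberta_scores = [4.5, 1.0, 3.6, 0.3]
gold           = [4.5, 1.0, 4.0, 0.5]

ensembled = ensemble_predictions([bert_scores, xlnet_scores, roberta_scores])

# Pearson correlation between ensembled predictions and gold scores,
# the evaluation metric used in the n2c2 clinical STS challenge.
r = np.corrcoef(ensembled, gold)[0, 1]
print(f"Pearson correlation: {r:.4f}")
```

In practice each row of scores would come from a fine-tuned transformer scoring the same sentence pairs; the ensemble step itself is model-agnostic.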

Results:

Our best submission, based on the XLNet model, achieved the third-best performance (Pearson correlation of 0.8864) in this challenge. After the challenge, we explored other transformer models and improved the performance to a correlation of 0.9065 using a RoBERTa model, which outperformed the best-performing system developed in this challenge (correlation of 0.9010).

Conclusions:

This study demonstrated the effectiveness of transformer-based models for measuring semantic similarity in clinical text. Our models can support clinical applications such as clinical text deduplication and summarization.



Per the author's request the PDF is not available.