Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Dec 31, 2019
Date Accepted: Mar 13, 2020

The final, peer-reviewed published version of this preprint can be found here:

Temporal Expression Classification and Normalization From Chinese Narrative Clinical Texts: Pattern Learning Approach

Pan X, Chen B, Weng H, Gong Y, Qu Y

JMIR Med Inform 2020;8(7):e17652

DOI: 10.2196/17652

PMID: 32716307

PMCID: 7418025

TNorm: A Pattern Learning Approach for Temporal Expression Classification and Normalization from Chinese Narrative Clinical Texts

  • Xiaoyi Pan
  • Boyu Chen
  • Heng Weng
  • Yongyi Gong
  • Yingying Qu

ABSTRACT

Background:

Temporal information frequently appears in descriptions of disease progression, prescriptions, medication, surgical progress, and discharge summaries in narrative clinical text. Accurate extraction and normalization of temporal expressions can improve the analysis and understanding of narrative clinical texts and thereby support clinical research and practice.

Objective:

This study aims to propose a novel approach for extracting and normalizing temporal expressions from Chinese narrative clinical text.

Methods:

TNorm, a rule-based and pattern learning-based approach, was developed for automatic temporal expression extraction and normalization from unstructured Chinese clinical text. TNorm consists of three stages: extraction, classification, and normalization. First, it applies a set of heuristic rules and automatically generated patterns to identify and extract temporal expressions from clinical texts. Then, it collects features of the extracted temporal expressions and uses machine learning algorithms to predict and classify their temporal types. Finally, it combines these features with the rule-based and pattern learning-based approach to normalize the extracted temporal expressions.
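
The three stages can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two regular expressions, the type labels, the function names, and the anchor date are all hypothetical, and a simple pattern lookup stands in for the machine learning classifier described above. Normalized values are written as ISO 8601 dates, which is also an assumption.

```python
import re
from datetime import date, timedelta

# Hypothetical patterns for illustration only; the abstract does not list
# TNorm's heuristic rules or learned patterns.
ABSOLUTE = re.compile(r"(\d{4})年(\d{1,2})月(\d{1,2})日")  # e.g. 2019年12月31日
RELATIVE = re.compile(r"(\d+)(天|周)前")                    # e.g. 3天前 (3 days ago)


def extract(text):
    """Stage 1: find temporal expressions, in order of appearance."""
    matches = list(ABSOLUTE.finditer(text)) + list(RELATIVE.finditer(text))
    return sorted(matches, key=lambda m: m.start())


def classify(match):
    """Stage 2: assign a temporal type.

    A lookup on the matching pattern stands in for the feature-based
    machine learning classifier (an assumption made for brevity).
    """
    return "absolute_date" if match.re is ABSOLUTE else "relative_date"


def normalize(match, temporal_type, anchor):
    """Stage 3: convert the expression to an ISO 8601 date string."""
    if temporal_type == "absolute_date":
        year, month, day = map(int, match.groups())
        return date(year, month, day).isoformat()
    amount, unit = int(match.group(1)), match.group(2)
    days = amount * (7 if unit == "周" else 1)
    # Relative expressions are resolved against an assumed anchor date.
    return (anchor - timedelta(days=days)).isoformat()


def tnorm_sketch(text, anchor):
    """Run the three stages and return (expression, type, value) tuples."""
    results = []
    for m in extract(text):
        temporal_type = classify(m)
        results.append((m.group(0), temporal_type, normalize(m, temporal_type, anchor)))
    return results


# "The patient was admitted on 2019-12-31 and developed a fever 3 days ago."
text = "患者于2019年12月31日入院，3天前出现发热。"
result = tnorm_sketch(text, anchor=date(2019, 12, 31))
for expression, temporal_type, value in result:
    print(expression, temporal_type, value)
```

In this sketch the admission date doubles as the anchor for the relative expression. A real system would need a policy for choosing the anchor and for handling expressions that do not form a valid calendar date.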

Results:

The evaluation dataset is a set of narrative clinical texts in Chinese containing 1,459 discharge summaries from a domestic Grade-A Class-three hospital. The results show that TNorm, combining temporal expression extraction and temporal type prediction, achieves a precision of 0.8491, a recall of 0.8328, and an F1 score of 0.8409 in temporal expression normalization.
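
As a quick consistency check, the reported F1 score can be recomputed from the reported precision and recall. The snippet below assumes the standard definition of F1 as the harmonic mean of precision and recall; the function name is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
f1 = f1_score(0.8491, 0.8328)
print(round(f1, 4))  # 0.8409, matching the reported F1 score
```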

Conclusions:

This study presents TNorm, an automatic approach that extracts and normalizes temporal expressions from Chinese narrative clinical texts. TNorm was evaluated on discharge summaries, and the experimental results demonstrate its effectiveness in temporal expression normalization.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.