Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Dec 31, 2019
Date Accepted: May 28, 2020

The final, peer-reviewed published version of this preprint can be found here:

Medical Knowledge Graph to Enhance Fraud, Waste, and Abuse Detection on Claim Data: Model Development and Performance Evaluation

Xiao J, Sun H, Zhu W, He Y, Zhang S, Xu X, Hou L, Li J, Ni Y, Xie G

Medical Knowledge Graph to Enhance Fraud, Waste, and Abuse Detection on Claim Data: Model Development and Performance Evaluation

JMIR Med Inform 2020;8(7):e17653

DOI: 10.2196/17653

PMID: 32706714

PMCID: 7413281

Building a Medical Knowledge Graph to Enhance the Fraud, Waste and Abuse Detection on Claim Data

  • Jin Xiao
  • Haixia Sun
  • Wei Zhu
  • Yilong He
  • Sheng Zhang
  • Xiaowei Xu
  • Li Hou
  • Jiao Li
  • Yuan Ni
  • Guotong Xie

ABSTRACT

Background:

Detection of fraud, waste, and abuse (FWA) is an important yet challenging problem for the health insurance industry. A key step in FWA detection is checking the clinical reasonableness of a claim, for example, whether the drug has an indication for the diagnosis. Currently, this check requires human experts with sufficient medical knowledge. To reduce this cost, there is a trend toward building intelligent systems that automatically detect suspicious claims with inappropriate diagnoses or medications. The core foundation of such a system is a comprehensive and reliable medical knowledge graph (MKG).
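The clinical reasonableness check described above can be illustrated with a minimal sketch. It assumes the knowledge graph is available as (head, relation, tail) triples; the function name `check_claim`, the relation labels, and the toy entries are hypothetical and are not taken from the study.

```python
# Toy knowledge graph stored as (head, relation, tail) triples.
# These entries are illustrative placeholders, not data from the study.
TRIPLES = {
    ("metformin", "indication", "type 2 diabetes"),
    ("amoxicillin", "indication", "bacterial pneumonia"),
    ("metformin", "contraindication", "severe renal impairment"),
}


def check_claim(diagnoses, drugs, triples=TRIPLES):
    """Return a list of (drug, reason) flags for a single claim.

    A drug is flagged when it is contraindicated for a listed diagnosis,
    or when no listed diagnosis is a known indication for it.
    """
    flags = []
    for drug in drugs:
        contraindicated = [
            d for d in diagnoses if (drug, "contraindication", d) in triples
        ]
        indicated = any((drug, "indication", d) in triples for d in diagnoses)
        if contraindicated:
            flags.append((drug, "contraindicated for " + ", ".join(contraindicated)))
        elif not indicated:
            flags.append((drug, "no indication for any listed diagnosis"))
    return flags


if __name__ == "__main__":
    # A claim whose drug matches the diagnosis produces no flags;
    # a claim whose drug has no matching indication is flagged for review.
    print(check_claim(["type 2 diabetes"], ["metformin"]))
    print(check_claim(["type 2 diabetes"], ["amoxicillin"]))
```

A flag here only marks a claim for human review; a missing edge may reflect an incomplete graph rather than an unreasonable claim.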

Objective:

This study aimed to build a medical knowledge graph to support fraud, waste, and abuse detection on claim data.

Methods:

We defined the schema of a medical knowledge graph for claim data processing. We applied Chinese natural language processing (NLP) methods to automatically build the medical knowledge graph from unstructured knowledge sources such as insurance agreement documents, textbooks, and drug labels. For entity identification, we integrated a machine learning method with a knowledge-driven method to identify symptoms, diagnoses, treatments, prognoses, examinations, and drugs. For relation identification, we used distant supervision to extract relations including indication, contraindication, side effect, interaction, taboo, check, and sign. We then developed a tool to enable human-machine collaboration for knowledge graph fusion and quality control. Finally, we applied the MKG in claim processing to assist FWA detection.
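The abstract does not detail the distant supervision pipeline, so the following is only a sketch of the general idea: sentences that mention both entities of an already known triple are automatically labeled with that triple's relation and can then serve as training data for a relation extractor. The function name `distant_label`, the seed triples, and the example sentences are hypothetical, and plain English substring matching stands in for the Chinese entity recognition used in the study.

```python
def distant_label(sentences, seed_triples):
    """Weakly label sentences using known (head, relation, tail) triples.

    A sentence that mentions both the head and the tail of a seed triple
    is assumed to express that triple's relation. This assumption is noisy,
    which is why distant supervision is usually paired with noise handling
    or manual review.
    """
    labeled = []
    for sentence in sentences:
        text = sentence.lower()
        for head, relation, tail in seed_triples:
            if head in text and tail in text:
                labeled.append((sentence, head, tail, relation))
    return labeled


if __name__ == "__main__":
    # Illustrative seed triples and sentences, not data from the study.
    seeds = [
        ("metformin", "indication", "type 2 diabetes"),
        ("aspirin", "side effect", "gastric bleeding"),
    ]
    corpus = [
        "Metformin is commonly prescribed for type 2 diabetes.",
        "Long-term aspirin use may cause gastric bleeding.",
        "Aspirin should be stored in a dry place.",
    ]
    for example in distant_label(corpus, seeds):
        print(example)
```

The third sentence mentions only one entity of a seed triple, so it receives no label.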

Results:

We collected 185,796 drug labels from the China Food and Drug Administration (CFDA), 88,892 kinds of disease information from medical textbooks (including symptoms, diagnosis, treatment, and prognosis), and 5,272 examinations with related examination information as knowledge sources. The final medical knowledge graph includes 1,616,549 nodes and 5,963,444 edges. We conducted extensive experiments to validate the accuracy of named entity recognition (NER) and relation extraction (RE). Our NER method achieved an F1 score of 0.83, and our RE method achieved an F1 score of 0.90. In addition, we randomly selected 100 claim documents and applied the MKG to detect suspected claims. The MKG helped detect 70% of the suspected claims.
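For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below computes it from counts of true positives, false positives, and false negatives. The abstract reports only the final F1 values, so the counts in the example are hypothetical and chosen purely for illustration.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f1_score(tp, fp, fn):
    """F1 = 2PR / (P + R), written in the equivalent count form 2TP / (2TP + FP + FN)."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical counts: 90 correct extractions, 10 spurious, 10 missed.
    p, r = precision_recall(90, 10, 10)
    print(f"precision={p:.2f} recall={r:.2f} f1={f1_score(90, 10, 10):.2f}")
```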

Conclusions:

We automatically constructed a medical knowledge graph (MKG) for FWA detection. The MKG-based method successfully identified suspected FWA cases (such as fraudulent diagnosis, excess prescription, and irrational prescription) in the claim documents, which helped improve the efficiency of claim processing.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.