Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Dec 31, 2019
Date Accepted: May 28, 2020
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Building a Medical Knowledge Graph to Enhance the Fraud, Waste and Abuse Detection on Claim Data
ABSTRACT
Background:
Detection of fraud, waste, and abuse (FWA) is an important yet challenging problem for the health insurance industry. A key step in FWA detection is to check the clinical reasonableness of a claim, e.g., whether the drug has an indication for the diagnosis. Currently, this check requires human experts with sufficient medical knowledge. To reduce the cost, there is a trend toward building intelligent systems that automatically detect suspicious claims with inappropriate diagnoses or medications. The core foundation of such a system is a comprehensive and reliable medical knowledge graph (MKG).
Objective:
This study aimed to build a medical knowledge graph to support fraud, waste, and abuse detection on claim data.
Methods:
In this paper, we defined the schema of a medical knowledge graph for claim data processing. We applied Chinese natural language processing (NLP) methods to automatically build the medical knowledge graph from unstructured knowledge sources such as insurance agreement documents, textbooks, and drug labels. For entity identification, we integrated a machine learning method with a knowledge-driven method to identify symptoms, diagnoses, treatments, prognoses, examinations, and drugs. For relation identification, we used distant supervision to extract relations including indication, contraindication, side effects, interaction, taboo, check, and sign. We then developed a tool to enable human-machine collaboration for knowledge graph fusion and knowledge graph quality control. Finally, we applied the MKG in claim processing to assist FWA detection.
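As a rough illustration of the distant supervision idea mentioned in the Methods, the sketch below labels sentences with a relation whenever both entities of a known seed fact appear in them. It is a minimal sketch under stated assumptions, not the authors' pipeline: the seed triples, the English example sentences, and the names `Triple` and `label_sentences` are hypothetical, and simple substring matching stands in for the Chinese NLP entity recognition described above.

```python
from typing import List, NamedTuple, Tuple


class Triple(NamedTuple):
    """A seed fact (head entity, relation, tail entity); hypothetical structure."""
    head: str
    relation: str
    tail: str


# Hypothetical seed facts, using two of the relation types named in the Methods.
SEED_TRIPLES = [
    Triple("metformin", "indication", "type 2 diabetes"),
    Triple("aspirin", "contraindication", "peptic ulcer"),
]


def label_sentences(
    sentences: List[str], seeds: List[Triple]
) -> List[Tuple[str, str, str, str]]:
    """Distant supervision: a sentence mentioning both entities of a seed
    triple is (noisily) assumed to express that triple's relation."""
    labeled = []
    for sentence in sentences:
        text = sentence.lower()
        for seed in seeds:
            # Substring matching is an assumption standing in for real NER.
            if seed.head in text and seed.tail in text:
                labeled.append((sentence, seed.head, seed.tail, seed.relation))
    return labeled


sentences = [
    "Metformin is a first-line therapy for type 2 diabetes.",
    "Aspirin should be avoided in patients with peptic ulcer.",
    "Metformin is taken with meals.",
]
training_examples = label_sentences(sentences, SEED_TRIPLES)
for example in training_examples:
    print(example)
```

The labeled examples would then serve as noisy training data for a relation extraction model; the third sentence is skipped because it mentions only one entity of any seed fact.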
Results:
As knowledge sources, we collected 185,796 drug labels from the China Food and Drug Administration (CFDA), 88,892 disease entries from medical textbooks (covering symptoms, diagnosis, treatment, and prognosis), and information on 5,272 examinations. The final medical knowledge graph includes 1,616,549 nodes and 5,963,444 edges. We conducted extensive experiments to validate the accuracy of the named entity recognition (NER) and relation extraction (RE). Our NER method achieved an F1 score of 0.83 and our RE method achieved an F1 score of 0.90. In addition, we randomly selected 100 claim documents and applied the MKG to detect suspected claims. The MKG helped detect 70% of the suspected claims.
Conclusions:
In this paper, we automatically constructed a medical knowledge graph (MKG) for FWA detection. The MKG-based method successfully identified suspected FWA cases (such as fraudulent diagnosis, excess prescription, and irrational prescription) in claim documents, which helped to improve the efficiency of claim processing.
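To make the clinical reasonableness check concrete, the following sketch stores typed edges and flags drugs on a claim that have no indication for, or are contraindicated with, any claimed diagnosis. It is only an illustration under assumptions: the class `MedicalKnowledgeGraph`, the function `review_claim`, the flag wording, and the example facts are hypothetical and are not taken from the paper; only the relation names "indication" and "contraindication" come from the Methods.

```python
from collections import defaultdict
from typing import List, Tuple


class MedicalKnowledgeGraph:
    """Toy typed-edge store: (head, relation) -> set of tail entities."""

    def __init__(self) -> None:
        self._edges = defaultdict(set)

    def add_edge(self, head: str, relation: str, tail: str) -> None:
        self._edges[(head, relation)].add(tail)

    def has_edge(self, head: str, relation: str, tail: str) -> bool:
        # .get avoids creating empty entries for unknown keys.
        return tail in self._edges.get((head, relation), set())


def review_claim(
    graph: MedicalKnowledgeGraph, diagnoses: List[str], drugs: List[str]
) -> List[Tuple[str, str]]:
    """Return (drug, reason) pairs for prescriptions that look suspicious."""
    flags = []
    for drug in drugs:
        if any(graph.has_edge(drug, "contraindication", d) for d in diagnoses):
            flags.append((drug, "contraindicated for a claimed diagnosis"))
        elif not any(graph.has_edge(drug, "indication", d) for d in diagnoses):
            flags.append((drug, "no indication for any claimed diagnosis"))
    return flags


# Hypothetical example facts and claim.
mkg = MedicalKnowledgeGraph()
mkg.add_edge("metformin", "indication", "type 2 diabetes")
mkg.add_edge("amoxicillin", "indication", "bacterial pneumonia")

flags = review_claim(
    mkg,
    diagnoses=["type 2 diabetes"],
    drugs=["metformin", "amoxicillin"],
)
print(flags)
```

In this toy claim, amoxicillin is flagged because the graph holds no indication edge linking it to the claimed diagnosis; a flagged item would still need review by a human expert before being treated as FWA.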
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.