Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Nov 18, 2024
Date Accepted: Mar 28, 2026
Date Submitted to PubMed: Apr 21, 2026
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
DQA4M: Application of a Data Quality Assessment Process for Medical Research Using Electronic Health Records
ABSTRACT
Background:
With the convergence of healthcare and information technology, the medical industry is generating a wide variety of healthcare data. These data contain valuable patient-related information, such as treatment and surgery records, and efforts to put them to use are increasingly active. However, because healthcare data are generated and collected from various sources for different purposes, many data quality issues have been reported during their use. Despite efforts by various organizations and researchers, including national institutions, resolving these data quality issues remains highly challenging.
Objective:
This study applies data quality assessment tools to real-world electronic health record (EHR) data to analyze the quality assessment process and proposes a data quality assessment procedure for medical research.
Methods:
In this study, we evaluated the quality of EMR CDM data at Gachon University Gil Hospital using several tools: OHDSI’s ACHILLES and DataQualityDashboard, which aim to improve health outcomes through medical data analysis; the DQe-c tool, developed by Harvard Medical School to evaluate the quality of EHR data repositories; and MOA-DQM, established by the Korea Institute of Drug Safety & Risk Management to assess the quality of CDM data for drug safety management. By conducting these quality assessments on real-world EHR data, we identified the features, elements, and processes of data quality evaluation. Based on these findings, we propose a Data Quality Assessment Process for Medical Research (DQA4M).
Results:
The DQA4M process was developed, consisting of four stages: data quality definition, quality validation rules, quality assessment, and assessment report. We then defined the meaning of data quality and established detailed validation rules for seven fundamental quality dimensions: conformance, consistency, completeness, uniqueness, plausibility, validity, and accuracy. Finally, data quality assessment items for healthcare research were developed in accordance with the DQA4M process.
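To make the idea of per-dimension validation rules concrete, the following is a minimal sketch of how four of the seven dimensions might be scored on a small table. It is not the DQA4M rule set: the record fields, the allowed gender codes, the fixed reference date, and all function names are hypothetical and chosen only for illustration.

```python
from datetime import date

# Hypothetical patient records; field names are illustrative, not from the paper.
RECORDS = [
    {"person_id": 1, "birth_date": date(1980, 5, 1), "gender": "F"},
    {"person_id": 2, "birth_date": None, "gender": "M"},
    {"person_id": 2, "birth_date": date(2090, 1, 1), "gender": "M"},
    {"person_id": 3, "birth_date": date(1975, 3, 9), "gender": "X"},
]

ALLOWED_GENDERS = {"F", "M"}      # assumed value set for the conformance rule
REFERENCE_DATE = date(2024, 11, 18)  # fixed so the scores are reproducible


def completeness(rows, field):
    """Share of rows in which `field` is not missing."""
    return sum(r[field] is not None for r in rows) / len(rows)


def uniqueness(rows, field):
    """Number of distinct `field` values divided by the number of rows."""
    return len({r[field] for r in rows}) / len(rows)


def conformance(rows, field, allowed):
    """Share of rows whose `field` value belongs to the allowed set."""
    return sum(r[field] in allowed for r in rows) / len(rows)


def plausibility(rows, field, reference):
    """Share of non-missing dates that are not later than `reference`."""
    present = [r[field] for r in rows if r[field] is not None]
    return sum(d <= reference for d in present) / len(present)


def assess(rows):
    """Run each rule and collect the scores into a simple report."""
    return {
        "Completeness": completeness(rows, "birth_date"),
        "Uniqueness": uniqueness(rows, "person_id"),
        "Conformance": conformance(rows, "gender", ALLOWED_GENDERS),
        "Plausibility": plausibility(rows, "birth_date", REFERENCE_DATE),
    }


if __name__ == "__main__":
    for dimension, score in assess(RECORDS).items():
        print(f"{dimension}: {score:.2f}")
```

Each rule returns a score between 0 and 1, so results from different dimensions can be placed side by side in an assessment report. Real rules for a CDM would be defined per table and column, and the remaining dimensions (consistency, validity, accuracy) would need cross-field or external reference checks that this sketch does not attempt.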
Conclusions:
Healthcare data hold great potential for diverse applications. The DQA4M process enhances the reliability of quality assessments and improves data usability, ultimately contributing to better utilization of healthcare data.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.