Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Mar 30, 2023
Date Accepted: Jan 14, 2024
Real-World Data Quality Framework for Oncology Research: Implementation and Evaluation for an Oncology Time to Treatment Discontinuation Use Case
ABSTRACT
Background:
The importance of real-world evidence is widely recognized in oncology observational studies. However, the lack of interoperable data quality standards in the fragmented health information technology landscape remains a major challenge. Adopting validated, systematic methods for evaluating data quality is therefore important for oncology outcomes research that leverages real-world data (RWD).
Objective:
This work aimed to implement real-world time to treatment discontinuation (rwTTD) for a systemic anticancer therapy (SACT) as a new use case for Use-case specific Relevance and Quality Assessment (UReQA), a framework that links data quality and relevance in fit-for-purpose RWD assessment.
Methods:
To define the rwTTD use case, we mapped the operational definition of rwTTD to RWD elements commonly available from oncology electronic health record (EHR)-derived datasets. We identified 20 tasks to check the completeness and plausibility of data elements concerning SACT usage, line of therapy (LOT), death date, and length of follow-up. Using descriptive statistics, we illustrated how to implement UReQA on two oncology databases (Datasets A and B) to estimate the rwTTD of an SACT drug (target SACT) for patients with advanced head and neck cancer diagnosed on/after January 1, 2015.
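As an illustration of what completeness and plausibility checks on such data elements could look like, the following is a minimal sketch only. The record layout, the field names (`sact_start`, `sact_end`, `death_date`), and the two rules shown are assumptions made for this example; they are not the 20 tasks defined in the study.

```python
from datetime import date

# Hypothetical patient records; field names and values are illustrative only.
records = [
    {"patient_id": 1, "sact_start": date(2016, 3, 1),
     "sact_end": date(2016, 9, 1), "death_date": None},
    {"patient_id": 2, "sact_start": date(2017, 5, 10),
     "sact_end": None, "death_date": date(2017, 4, 1)},
    {"patient_id": 3, "sact_start": None,
     "sact_end": None, "death_date": None},
]


def completeness(rows, field):
    """Return the fraction of rows with a non-missing value for `field`."""
    if not rows:
        return 0.0
    return sum(r[field] is not None for r in rows) / len(rows)


def implausible_ids(rows):
    """Return IDs of patients whose dates are out of chronological order.

    Assumed rules: SACT end must not precede SACT start, and the death
    date must not precede SACT start.
    """
    flagged = []
    for r in rows:
        start, end, death = r["sact_start"], r["sact_end"], r["death_date"]
        if start and end and end < start:
            flagged.append(r["patient_id"])
        elif start and death and death < start:
            flagged.append(r["patient_id"])
    return flagged


print(f"SACT start completeness: {completeness(records, 'sact_start'):.2f}")
print(f"Implausible records: {implausible_ids(records)}")
```

In this toy data, patient 2 is flagged because the recorded death date precedes the SACT start date, and patient 3 lowers the completeness of every field.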
Results:
Twelve hundred of 4808 patients (25%) in Dataset A and 237 of 4003 patients (6%) in Dataset B received the target SACT, suggesting better relevance of the former to estimate rwTTD of the target SACT. The two datasets differed with regard to terminology used for SACT drugs, LOT format, and target SACT LOT distribution over time. Dataset B appeared to have less complete SACT records, longer lags in incorporating the latest data, and incomplete mortality data, suggesting lack of fitness for estimating rwTTD.
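The relevance percentages quoted above follow directly from the reported counts. A short sketch of the arithmetic (the function name `relevance_pct` is hypothetical and used only for this example):

```python
def relevance_pct(n_target, n_total):
    """Percentage of patients who received the target SACT, rounded to a whole number."""
    return round(100 * n_target / n_total)


# Counts as reported in the Results.
print(f"Dataset A: {relevance_pct(1200, 4808)}%")
print(f"Dataset B: {relevance_pct(237, 4003)}%")
```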
Conclusions:
Fit-for-purpose data quality assessment demonstrated substantial variability in the quality of the two real-world datasets. The data quality specifications applied for rwTTD estimation can be expanded to support a broad spectrum of oncology use cases.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.