Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Feb 17, 2020
Date Accepted: Jan 17, 2021
Date Submitted to PubMed: Jan 18, 2021
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Automated Identification of Disease-Specific Clinical Outcomes Using ClinicalTrials.gov
ABSTRACT
Background:
Common clinical outcomes are vital for ensuring the comparability of clinical trial data and for enabling meta-analyses and inter-study comparisons. Traditionally, deciding which outcomes should be recommended as common for a particular disease has relied on assembling and surveying panels of subject-matter experts, which is usually a time-consuming and laborious process.
Objective:
The objectives of this work are to develop and evaluate a generalized pipeline that can automatically identify common outcomes specific to any given disease by finding, downloading, and analyzing data from previous clinical trials relevant to that disease.
Methods:
An automated pipeline was designed to interface with the ClinicalTrials.gov API and download the trials relevant to the input condition. The primary and secondary outcomes of those trials were parsed, grouped by text similarity, and ranked by frequency. The quality of the pipeline's output was assessed by comparing the top outcomes it identified for chronic obstructive pulmonary disease (COPD) with a list of 79 outcomes manually abstracted from 3 frequently cited expert reviews delineating clinical outcomes for COPD.
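The grouping and ranking step can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state which similarity measure or cutoff was used, so the standard-library `difflib.SequenceMatcher`, the greedy clustering strategy, the `threshold` value, and the sample outcome titles are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def group_and_rank(outcomes, threshold=0.8):
    """Greedily cluster outcome titles by string similarity, then rank by size.

    `threshold` is a hypothetical cutoff, not a value taken from the study.
    """
    groups = []  # each group holds a representative label and a member count
    for raw in outcomes:
        title = normalize(raw)
        for group in groups:
            similarity = SequenceMatcher(None, title, group["label"]).ratio()
            if similarity >= threshold:
                group["count"] += 1
                break
        else:
            # No existing group was similar enough, so start a new one.
            groups.append({"label": title, "count": 1})
    # Most frequent groups first; ties keep their first-seen order.
    return sorted(groups, key=lambda g: g["count"], reverse=True)


# Made-up outcome titles for illustration only (not data from the study).
sample = [
    "Change in FEV1",
    "change in  FEV1",
    "Change in FEV1 from baseline",
    "Number of exacerbations",
    "Number of exacerbations",
]
ranked = group_and_rank(sample)
for group in ranked:
    print(group["count"], group["label"])
```

With a character-level measure like this one, "Change in FEV1" and "Change in FEV1 from baseline" fall into separate groups at the assumed cutoff, which shows how sensitive the resulting ranking can be to the choice of similarity measure and threshold.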
Results:
The pipeline successfully downloaded and processed 3,876 studies related to COPD. Manual verification indicated that the pipeline downloaded and processed the same number of trials as was obtained from the self-service ClinicalTrials.gov portal. Evaluating the automatically identified outcomes against the manually abstracted ones showed that the pipeline achieved a recall of 91% and a precision of 77%. Assessment of the most frequent pipeline outcomes that were not included in the reviews indicated that they were relevant to COPD and could be considered in future research.
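For readers unfamiliar with the two metrics, the following sketch shows one common way to compute them. It assumes that recall is the share of reference (manually abstracted) outcomes that the pipeline also found, and that precision is the share of pipeline outcomes that appear in the reference list; the abstract does not spell out the exact matching procedure. The function name and the toy sets are hypothetical and do not reproduce the study's data.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) for two collections of outcome labels."""
    predicted, reference = set(predicted), set(reference)
    matched = predicted & reference  # outcomes present in both lists
    precision = len(matched) / len(predicted) if predicted else 0.0
    recall = len(matched) / len(reference) if reference else 0.0
    return precision, recall


# Toy labels for illustration only (not the 79 outcomes from the reviews).
predicted = {"fev1", "exacerbations", "mortality", "dyspnea"}
reference = {"fev1", "exacerbations", "dyspnea",
             "quality of life", "exercise capacity"}

precision, recall = precision_recall(predicted, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case three labels overlap, so precision is 3 of 4 and recall is 3 of 5. Exact set membership is a simplification; matching free-text outcomes in practice would likely need manual review or a similarity-based comparison.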
Conclusions:
An automated, evidence-based pipeline can identify clinical outcomes comparable in breadth and quality to those identified by the expert reviews. Moreover, such an approach can highlight relevant outcomes for further consideration.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.