Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Dec 23, 2018
Date Accepted: Mar 24, 2019
Crowdsourcing the Citation Screening Process for Systematic Reviews: Feasibility and Validation Study
ABSTRACT
Background:
Systematic reviews (SRs) are often cited as the highest level of evidence available, as they involve the identification and synthesis of published studies on a topic. Unfortunately, it is increasingly challenging for small teams to complete SR procedures in a reasonable time period, given the exponential rise in the volume of primary literature. Crowdsourcing has been postulated as a potential solution.
Objective:
The feasibility objective was to determine whether an online crowd would be willing to perform and complete abstract and full-text screening. The validation objective was to assess the quality of the crowd’s work, including retention of eligible citations (sensitivity) and the work performed on behalf of the investigative team (work-performed), defined as the percentage of citations excluded by the crowd.
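For readers who prefer formulas, the two validation metrics can be written as follows; this is a minimal formalization of the definitions given in this abstract, and the notation is ours rather than the authors’:

\[ \text{Sensitivity} = \frac{\text{eligible citations retained by the crowd}}{\text{all eligible citations}} \times 100\% \qquad \text{Work-performed} = \frac{\text{citations excluded by the crowd}}{\text{all citations screened}} \times 100\% \]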
Methods:
We performed a prospective study evaluating the feasibility and validity of crowdsourcing essential components of a systematic review, including abstract screening, document retrieval, and full-text assessment. Using the CrowdScreenSR citation screening software, we made 2323 articles from 6 systematic reviews available to an online crowd. Citations excluded by ≤ 75% of the crowd were moved forward for full-text assessment. For the validation component, the crowd’s performance was compared with the accepted gold standard of citation review by trained experts.
Results:
Of 312 potential crowd members, 117 (37.5%) commenced abstract screening, and 71 (22.8%) completed the minimum requirement of 50 citation assessments. The majority of participants were undergraduate or medical students (N=192, 61.5%), and most had some prior research experience (N=220, 70.5%). The crowd screened 16988 abstracts (median: 8 per citation, IQR: 7 – 8), and all citations achieved the minimum of 4 assessments after a median of 42 days (IQR: 26 – 67). Crowd members retrieved 83.5% (N=774/927) of the articles that progressed to the full-text phase. A total of 7604 full-text assessments were completed (median: 7 per citation, IQR: 3 – 11). Citations from all but one review achieved the minimum of 4 assessments after a median of 36 days (IQR: 24 – 70); the remaining review was still incomplete after 3 months. When complete crowd member agreement at both levels was required for exclusion, sensitivity was 100% (95% CI: 97.9 – 100%) and work-performed was 68.3% (95% CI: 66.4 – 70.1%). Using the pre-defined alternative 75% exclusion threshold, sensitivity remained 100% and work-performed increased to 72.9% (95% CI: 71.0 – 74.6%, P<.001). Finally, when a simple majority threshold was considered, sensitivity decreased marginally to 98.9% (95% CI: 96.0 – 99.7%, P=.25), and work-performed increased substantially to 80.4% (95% CI: 78.7 – 82.0%, P<.001).
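As an illustration only, and not the authors’ CrowdScreenSR implementation, the three exclusion thresholds compared above can be expressed as a simple vote-aggregation rule. The function and variable names below are hypothetical, and the handling of the exact 75% boundary follows the “excluded by ≤ 75% of the crowd were moved forward” wording in the Methods.

# Hypothetical sketch of threshold-based exclusion; not the CrowdScreenSR source code.
def crowd_excludes(exclude_votes: int, total_votes: int, threshold: float) -> bool:
    """Return True if the crowd excludes the citation at the given threshold.

    threshold = 1.0  -> complete crowd member agreement required for exclusion
    threshold = 0.75 -> pre-defined 75% exclusion threshold
    threshold = 0.5  -> simple majority
    """
    if total_votes == 0:
        return False  # no assessments yet; the citation stays in the review
    if threshold >= 1.0:
        return exclude_votes == total_votes  # unanimous exclusion only
    # Per the Methods wording, citations excluded by <= 75% move forward,
    # so exclusion requires strictly more than the threshold.
    return exclude_votes / total_votes > threshold

# Example: 6 of 8 crowd members vote to exclude a citation.
print(crowd_excludes(6, 8, 1.0))   # False: not unanimous, moves to full text
print(crowd_excludes(6, 8, 0.75))  # False: exactly 75%, still moves forward
print(crowd_excludes(6, 8, 0.5))   # True: excluded under a simple majority rule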
Conclusions:
Crowdsourcing of citation screening for systematic reviews is feasible and has reasonable sensitivity and specificity. By expediting the screening process, crowdsourcing could permit the investigative team to focus on more complex SR tasks. Future directions should focus on developing a user-friendly online platform that allows research teams to crowdsource their reviews.
Citation
Per the authors' request, the PDF is not available.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.