JMIR Preprints #37893: Table Correction: Identifying COVID-19 Outbreaks From Contact-Tracing Interview Forms for Public Health Departments: Development of a Natural Language Processing Pipeline

Current Preprint Settings

(as selected by the authors)

1. When the manuscript is submitted, allow peer review from:

(a) Anybody (open community peer review)
(b) Editor-selected reviewers (closed peer review)

2. When the manuscript is submitted, display the preprint PDF to:

(a) Anybody, anytime
(b) Logged-in users only
(c) Anybody, anytime (title and abstract only)
(d) No one

3. When the manuscript is accepted, display the accepted manuscript PDF to:

(a) Anybody, anytime
(b) Logged-in users only
(c) Anybody, anytime (title and abstract only)
(d) No one

Table Correction: Identifying COVID-19 Outbreaks From Contact-Tracing Interview Forms for Public Health Departments: Development of a Natural Language Processing Pipeline

John Caskey;
Iain L McConnell;
Madeline Oguss;
Dmitriy Dligach;
Rachel Kulikoff;
Brittany Grogan;
Crystal Gibson;
Elizabeth Wimmer;
Traci E DeSalvo;
Edwin E Nyakoe-Nyasani;
Matthew Churpek;
Majid Afshar

ABSTRACT

Background:

In Wisconsin, COVID-19 case interview forms contain free-text fields that need to be mined to identify potential outbreaks for targeted policy making. We developed an automated pipeline to ingest the free text into a pretrained neural language model to identify businesses and facilities as outbreaks.

Objective:

We aimed to examine the precision and recall of our natural language processing pipeline against existing outbreaks and potentially new clusters.

Methods:

Data on cases of COVID-19 were extracted from the Wisconsin Electronic Disease Surveillance System (WEDSS) for Dane County between July 1, 2020, and June 30, 2021. Features from the case interview forms were fed into a Bidirectional Encoder Representations from Transformers (BERT) model that was fine-tuned for named entity recognition (NER). We also developed a novel location-mapping tool to provide addresses for relevant NER. Precision and recall were measured against manually verified outbreaks and valid addresses in WEDSS.

Results:

There were 46,798 cases of COVID-19, with 4,183,273 total BERT tokens and 15,051 unique tokens. The recall and precision of the NER tool were 0.67 (95% CI 0.66-0.68) and 0.55 (95% CI 0.54-0.57), respectively. For the location-mapping tool, the recall and precision were 0.93 (95% CI 0.92-0.95) and 0.93 (95% CI 0.92-0.95), respectively. Across monthly intervals, the NER tool identified more potential clusters than were verified in WEDSS.

Conclusions:

We developed a novel pipeline of tools that identified existing outbreaks and novel clusters with associated addresses. Our pipeline ingests data from a statewide database and may be deployed to assist local health departments for targeted interventions.

Citation

Please cite as:

Caskey J, McConnell IL, Oguss M, Dligach D, Kulikoff R, Grogan B, Gibson C, Wimmer E, DeSalvo TE, Nyakoe-Nyasani EE, Churpek M, Afshar M

Correction: Identifying COVID-19 Outbreaks From Contact-Tracing Interview Forms for Public Health Departments: Development of a Natural Language Processing Pipeline

JMIR Public Health Surveill 2022;8(3):e37893

DOI: 10.2196/37893

PMID: 35324453

PMCID: 8990338

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on it's website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressively prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Public Health and Surveillance

Date Submitted: Mar 10, 2022

Date Accepted: Mar 17, 2022

Table Correction: Identifying COVID-19 Outbreaks From Contact-Tracing Interview Forms for Public Health Departments: Development of a Natural Language Processing Pipeline

ABSTRACT

Citation

Copyright

JMIR Preprints

Accepted for/Published in: JMIR Public Health and Surveillance

Date Submitted: Mar 10, 2022

Date Accepted: Mar 17, 2022

Table Correction: Identifying COVID-19 Outbreaks From Contact-Tracing Interview Forms for Public Health Departments: Development of a Natural Language Processing Pipeline

ABSTRACT

Citation

Per the author's request the PDF is not available.

Copyright