Targeted COVID-19 and Human Resource for Health News Information Extraction with a Multi-Component Deep Learning Framework
ABSTRACT
Background:
Global pandemics like COVID-19 place a heavy strain on health care systems and health workers worldwide. These crises generate a vast amount of news published online across the globe. This extensive corpus of articles has the potential to provide valuable insights into the nature of ongoing events and to guide interventions and policies. However, the sheer volume of information is beyond the capacity of human experts to process and analyze effectively.
Objective:
The aim of this study was to explore how natural language processing (NLP) can be leveraged to build a system for rapid analysis of a high volume of news articles. A further objective was to create a workflow that combines human and machine analysis to derive insights that support health workforce strategic policy dialogue, advocacy, and decision-making.
Methods:
We reviewed open-source news coverage of COVID-19 and its impacts on the health workforce, published from January 2020 to June 2022 and collected through WHO Epidemic Intelligence from Open Sources (EIOS), by combining NLP models (classification and extractive summarization) with human-generated analyses. Our DeepCovid system was trained on 2.8 million news articles in English from more than 3,000 internet sources across hundreds of jurisdictions.
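As an illustration of the classification stage, the sketch below filters articles with hand-written keyword rules. It is a minimal example under stated assumptions: the term lists and function names are hypothetical, and the abstract does not list the rules or models actually used in DeepCovid.

```python
import re

# Hypothetical term lists for illustration only; the study's actual
# hand-designed rules are not given in this abstract.
COVID_TERMS = ("covid", "coronavirus", "sars-cov-2")
WORKFORCE_TERMS = ("health worker", "nurse", "doctor", "physician", "midwife")


def mentions_any(text, terms):
    """Return True if any term starts at a word boundary in the text."""
    lowered = text.lower()
    return any(re.search(r"\b" + re.escape(term), lowered) for term in terms)


def is_relevant(text):
    """Keep an article only if it mentions both COVID-19 and the health workforce."""
    return mentions_any(text, COVID_TERMS) and mentions_any(text, WORKFORCE_TERMS)


articles = [
    "Nurses strike over COVID-19 staffing shortages.",
    "Stock markets rally as coronavirus fears ease.",
    "New hospital wing opens for doctors in training.",
]
relevant = [article for article in articles if is_relevant(article)]
```

Requiring a match from both term lists is what narrows a broad news corpus to the intersection of the two topics; a rule this simple would still need human review of its output.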
Results:
Rule-based classification with hand-designed rules narrowed the dataset to 8,508 articles, whose high relevance was confirmed in human-led evaluation. DeepCovid's automated information targeting component reached a very strong binary classification performance of 98.98 ROC-AUC and 47.21 PR-AUC. Its information extraction component attained good performance in automatic extractive summarization, with a mean ROUGE score of 47.76. DeepCovid's final summaries were used by human experts to write reports on the COVID-19 pandemic.
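The two classification metrics can diverge sharply when relevant articles are rare, because PR-AUC depends on how many irrelevant items outrank each relevant one while ROC-AUC averages over all positive-negative pairs. The sketch below computes both on a toy ranking, on a 0 to 1 scale rather than the 0 to 100 scale reported above. It assumes PR-AUC is estimated as average precision, which is a common choice but is not confirmed by the abstract; the data and function names are illustrative.

```python
def roc_auc(labels, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Ties between a positive and a negative count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive item."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return total / hits


# Toy ranking: 20 articles, only 2 relevant, placed 2nd and 5th by the model.
scores = list(range(20, 0, -1))
labels = [0] * 20
labels[1] = 1
labels[4] = 1

auc = roc_auc(labels, scores)          # 32 of 36 positive-negative pairs ordered correctly
ap = average_precision(labels, scores)  # (1/2 + 2/5) / 2
```

In this toy case ROC-AUC is close to 0.89 while average precision is 0.45, which shows how a high ROC-AUC and a much lower PR-AUC can coexist on an imbalanced dataset.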
Conclusions:
It is feasible to combine high-performing NLP models with human-generated analyses to benefit open-source health workforce intelligence. The DeepCovid approach can contribute to an agile and timely global view, providing information complementary to the scientific literature.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.