Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Jul 25, 2020
Date Accepted: Jan 31, 2021
Date Submitted to PubMed: Feb 10, 2021
Revealing Opinions for COVID-19 Questions: Context Retriever and Opinion Aggregating Question-Answering
ABSTRACT
Background:
COVID-19 has posed severe challenges to global public health because it is highly contagious and can be lethal. Numerous studies are ongoing or have recently been published. However, research on COVID-19 remains largely ongoing and inconclusive.
Objective:
A potential approach to accelerating COVID-19 research is to borrow information from existing research on other viruses in the same coronavirus family. We develop a natural language processing method for answering factoid questions related to COVID-19 using published articles as knowledge sources.
Methods:
Given a question, a BM25-based context retriever model first selects the most relevant passages from the articles. Second, for each selected context passage, an answer is obtained using a pre-trained BERT question-answering model. Third, an opinion aggregator, which combines a biterm topic model (BTM) with k-means clustering, aggregates all answers into several opinions.
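The retrieval step can be illustrated with a minimal sketch of BM25 scoring. This is not the authors' implementation: the whitespace tokenizer, the parameter values k1 = 1.5 and b = 0.75, the smoothed IDF variant, and the function names are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real system
    # would likely use a more careful tokenizer.
    return text.lower().split()


def bm25_scores(query, passages, k1=1.5, b=0.75):
    """Return one BM25 score per passage for the given query.

    k1 and b are commonly used defaults, not values from the paper.
    """
    docs = [tokenize(p) for p in passages]
    if not docs:
        return []
    n_docs = len(docs)
    # Average passage length; fall back to 1 if every passage is empty.
    avg_len = sum(len(d) for d in docs) / n_docs or 1.0

    # Document frequency: number of passages containing each term.
    df = Counter()
    for d in docs:
        df.update(set(d))

    query_terms = tokenize(query)
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF variant that stays non-negative (an assumption).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


def top_passages(query, passages, n=2):
    """Return the n highest-scoring passages, best first."""
    scores = bm25_scores(query, passages)
    ranked = sorted(range(len(passages)), key=lambda i: scores[i], reverse=True)
    return [passages[i] for i in ranked[:n]]
```

In a pipeline like the one described, each passage returned by `top_passages` would then be passed, together with the question, to the question-answering model.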
Results:
We apply the proposed pipeline to extract answers, opinions, and the most frequent words for six questions from the COVID-19 Open Research Dataset Challenge (CORD-19). By showing the longitudinal distributions of the opinions, we uncover the trends of opinions and popular words in the publications across the following periods: before 1990, 1990-2000, 2000-2010, 2011-2019, and after 2019. The changes in the opinions and popular words agree with several distinct characteristics and challenges of COVID-19, including a higher risk for older people and people with pre-existing medical conditions, high contagion and rapid transmission, and a more urgent need for screening and testing. The opinions and the popular words also provide additional insights for the COVID-19 related questions.
Conclusions:
Compared with other methods for literature retrieval and answer generation, opinion aggregation in our method leads to more interpretable, robust, and comprehensive question-specific literature reviews. The results demonstrate the usefulness of the proposed method in answering COVID-19 related questions with main opinions and in capturing the trends of research about COVID-19 and other relevant strains of coronavirus in recent years.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.