Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Jul 8, 2020
Date Accepted: Sep 3, 2020
Date Submitted to PubMed: Sep 15, 2020
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
CoV-Seq: SARS-CoV-2 Genome Analysis and Visualization
ABSTRACT
Background:
COVID-19 became a global pandemic shortly after it emerged in late 2019. SARS-CoV-2 genomes are being sequenced and shared on public repositories at a fast pace. To keep up with these updates, scientists must frequently refresh and reclean their datasets, a process that is ad hoc and labor-intensive. Furthermore, scientists with limited bioinformatics or programming knowledge may find it difficult to analyze SARS-CoV-2 genomes.
Objective:
To address these challenges, we developed CoV-Seq, an integrated webserver to enable simple and rapid analysis of SARS-CoV-2 genomes.
Methods:
Given a new sequence, CoV-Seq automatically predicts gene boundaries and identifies genetic variants, which are displayed in an interactive genome visualizer and are downloadable for further analysis. A command-line interface is available for high-throughput processing. In addition, we aggregate all publicly available SARS-CoV-2 sequences from GISAID, NCBI, ENA, and CNGB, and extract genetic variants from these sequences for download and downstream analysis. The CoV-Seq database is updated weekly.
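As an illustration of what identifying genetic variants involves, the following is a minimal sketch, assuming the new sequence has already been pairwise-aligned to a reference genome with gaps written as "-". It is not taken from the CoV-Seq code base: the function name call_variants and the (position, reference allele, alternate allele) tuple layout are hypothetical, and a real pipeline would typically merge adjacent gap columns into single insertion or deletion records.

```python
from typing import List, Tuple

# Hypothetical record layout: (1-based reference position, ref allele, alt allele).
Variant = Tuple[int, str, str]


def call_variants(ref_aln: str, qry_aln: str) -> List[Variant]:
    """List column-wise differences between two already-aligned sequences.

    Gaps are written as '-'. A gap in the reference marks an inserted base,
    reported at the position of the preceding reference base. Ambiguous
    query bases ('N') are skipped rather than reported as variants.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")

    variants: List[Variant] = []
    ref_pos = 0  # number of reference bases consumed so far
    for r, q in zip(ref_aln.upper(), qry_aln.upper()):
        if r != "-":
            ref_pos += 1
        if r == q or q == "N":
            continue
        variants.append((ref_pos, r, q))
    return variants


if __name__ == "__main__":
    # Toy alignment: one substitution, one deleted base, one inserted base.
    for pos, ref, alt in call_variants("ACGT-A", "ATG-CA"):
        print(pos, ref, alt)
```

Skipping "N" in the query is one possible design choice: uncalled bases are common in assembled viral genomes and would otherwise be counted as substitutions.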
Results:
CoV-Seq is implemented in Python and JavaScript. The web server is available at http://covseq.baidu.com/ and the source code is available from https://github.com/boxiangliu/covseq.
Conclusions:
We have developed CoV-Seq, an integrated web service for fast and easy analysis of custom SARS-CoV-2 sequences. The web server provides an interactive module for the analysis of custom sequences and a weekly updated database of genetic variants from all publicly accessible SARS-CoV-2 sequences. We hope CoV-Seq will help improve our understanding of the genetic underpinnings of COVID-19.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.