This track shows deletions that have been found in the sequences uploaded to the GISAID database as of June 6, 2020.
Three confidence levels of deletion calls are shown:
- deletions found in at least 1 GISAID sequence
- deletions found in at least 2 GISAID sequences
- deletions found in at least 2 GISAID sequences and
validated with raw reads.
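The three levels are nested, so each deletion call can be labeled with the strictest level it satisfies. The following is a minimal sketch of that logic, not the code used to build the track; the function name, its parameters, and the numeric labels 1 to 3 are assumptions made for illustration.

```python
def confidence_level(n_sequences: int, validated_with_reads: bool) -> int:
    """Return the strictest level (1-3) a deletion call meets, or 0 if none.

    n_sequences: number of GISAID sequences carrying the deletion.
    validated_with_reads: whether raw reads confirmed the deletion.
    Both names and the numeric labels are hypothetical.
    """
    if n_sequences >= 2 and validated_with_reads:
        return 3  # at least 2 sequences and validated with raw reads
    if n_sequences >= 2:
        return 2  # at least 2 sequences
    if n_sequences >= 1:
        return 1  # at least 1 sequence
    return 0      # not observed


# A deletion seen in one sequence stays at level 1 even if reads support it,
# because the validated level also requires at least 2 sequences.
print(confidence_level(1, True))
print(confidence_level(5, True))
```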
We accessed all GISAID SARS-CoV-2 sequences on June 6, 2020. We kept only
high-coverage sequences spanning the entire SARS-CoV-2 genome (>=29,000 bp),
leaving 12,403 sequences.
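The length part of this filter can be sketched as follows. This is an illustration under stated assumptions: the input is assumed to be FASTA, the function names are hypothetical, and only the 29,000 bp threshold comes from the text. The coverage criterion is not defined here, so it is not implemented.

```python
from typing import Iterable, Iterator, List, Tuple

MIN_LENGTH = 29000  # threshold stated in the text


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def filter_full_length(
    records: Iterable[Tuple[str, str]], min_length: int = MIN_LENGTH
) -> List[Tuple[str, str]]:
    """Keep only records whose sequence is at least min_length bases long."""
    return [(name, seq) for name, seq in records if len(seq) >= min_length]
```

With a real file, `filter_full_length(read_fasta(open(path)))` would return the retained records, where `path` is a placeholder for a local FASTA export.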
We aligned these sequences using MAFFT.
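Once sequences are aligned, a deletion appears as a run of gap characters in a sequence where the reference has bases. The sketch below shows one way to read such runs off a pairwise row of an alignment; it is an assumption about how deletion calls could be derived, not a description of the pipeline used for this track, and the function name is hypothetical. It does not distinguish true deletions from missing sequence at the ends of a genome.

```python
from typing import List, Tuple


def find_deletions(ref_aln: str, query_aln: str) -> List[Tuple[int, int]]:
    """Return deletions as 0-based, half-open (start, end) reference coordinates.

    ref_aln and query_aln are two rows of the same alignment, using '-' for gaps.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned rows must have equal length")

    deletions = []
    ref_pos = 0   # position in the ungapped reference
    start = None  # start of the deletion currently being extended, if any

    for ref_base, query_base in zip(ref_aln, query_aln):
        if ref_base == "-":
            # Column absent from the reference (insertion or shared gap):
            # it does not advance the reference coordinate.
            continue
        if query_base == "-":
            if start is None:
                start = ref_pos
        elif start is not None:
            deletions.append((start, ref_pos))
            start = None
        ref_pos += 1

    if start is not None:
        deletions.append((start, ref_pos))
    return deletions
```

For example, `find_deletions("ACGTACGT", "AC--ACGT")` reports a single two-base deletion covering reference positions 2 and 3.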
We validated several deletions with the raw reads from NCBI's SRA Run browser.
Additionally, NYU Langone Health provided us with the aligned reads for many of
The raw data can be explored interactively with the
Table Browser, combined with other datasets in the
Data Integrator tool,
or downloaded directly as "microdel.txt.gz" from
the download server.
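After downloading the file, its first rows can be inspected without assuming a column layout. The sketch below assumes the file is gzip-compressed, tab-separated text, which is an assumption rather than something stated here; the function name is hypothetical.

```python
import csv
import gzip
from itertools import islice
from typing import List


def peek_rows(path: str, n: int = 5) -> List[List[str]]:
    """Return the first n rows of a gzip-compressed, tab-separated file."""
    with gzip.open(path, "rt", newline="") as handle:
        # QUOTE_NONE keeps quote characters as literal text in each field.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return list(islice(reader, n))
```

Calling `peek_rows("microdel.txt.gz")` on a local copy returns each row as a list of strings, which is enough to see how many fields a row has before writing a fuller parser.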
Please refer to our
mailing list archives
for questions, or our
Data Access FAQ
for more information.
We thank all of the labs that submitted their sequences to the GISAID database.
The full acknowledgement table can be found at
We thank the public health laboratories VIDRL and MDU-PHL at The Peter Doherty Institute for
Infection and Immunity for providing over 1,000 high-quality raw reads to NCBI.
We thank Matthew T Maurano, Matija Snuderl, and Adriana Heguy of the NYU Langone
SARS-CoV2 Sequencing Team for providing many of their raw reads.
Chrisman, Brianna Sierra, Kelley Paskov, Nate Stockham, Kevin Tabatabaei, Jae-Yoon Jung, Peter Washington, Maya Varma, Min Woo Sun, Sepideh Maleki, and Dennis P. Wall. "Indels in SARS-CoV-2 occur at template-switching hotspots." BioData Mining 14, no. 1 (2021): 1-16. https://doi.org/10.1186/s13040-021-00251-0