This track displays amino acid and nucleotide mutations in three SARS-CoV-2 Variants of
Concern (B.1.1.7, B.1.351, and P.1) and one Variant of Interest (B.1.429),
as defined in late January 2021.
Mutations were identified from viral sequences at GISAID.
Variant incidence and geographic distribution information is available from links to the
Outbreak.info web resource
on the track details pages.
The related track
B.1.1.7 in USA
displays a phylogenetic tree of the first B.1.1.7 variant sequences collected in the United States.
Track colors are based on
lineage (Rambaut et al.) conventions at the time this track was generated.
The Greek-letter names as defined by the World Health Organization (WHO) and published by
the CDC are listed in the table.
Mutations in the amino acid track are named with the format:
[Reference amino acid][1-based coordinate in peptide][Alternate amino acid]. E.g., L452R
Mutations in the nucleotide track are named with the format:
[Reference nucleotide][1-based coordinate in genome][Alternate nucleotide]. E.g., T22918G
Insertions and deletions in both tracks are named:
[del/ins]_[1-based genomic coordinate of first affected nucleotide]. E.g., del_21991
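
The naming formats above are regular enough to parse mechanically. The following is a minimal sketch, not part of the track's own tooling: the Mutation type and the parse_mutation_name function are hypothetical names, and accepting "*" as a stop-codon symbol is an assumption not stated in the text.

```python
import re
from typing import NamedTuple, Optional


class Mutation(NamedTuple):
    kind: str            # "substitution", "del", or "ins"
    position: int        # 1-based coordinate (peptide or genome, depending on the track)
    ref: Optional[str]   # reference residue/base; None for indels
    alt: Optional[str]   # alternate residue/base; None for indels


# [Reference][1-based coordinate][Alternate], e.g. L452R or T22918G.
# Allowing "*" for a stop codon is an assumption made for this sketch.
_SUBSTITUTION = re.compile(r"^([A-Z*])(\d+)([A-Z*])$")
# [del/ins]_[1-based genomic coordinate], e.g. del_21991.
_INDEL = re.compile(r"^(del|ins)_(\d+)$")


def parse_mutation_name(name: str) -> Mutation:
    """Split a mutation name into its kind, coordinate, and alleles."""
    match = _SUBSTITUTION.match(name)
    if match:
        return Mutation("substitution", int(match.group(2)),
                        match.group(1), match.group(3))
    match = _INDEL.match(name)
    if match:
        return Mutation(match.group(1), int(match.group(2)), None, None)
    raise ValueError(f"unrecognized mutation name: {name!r}")
```

For example, parse_mutation_name("del_21991") yields a Mutation of kind "del" at position 21991 with no alleles. The name alone does not say whether a substitution is amino acid or nucleotide, so the caller needs to know which track the name came from.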
For each virus variant, SARS-CoV-2 genome sequences containing all characteristic
mutations of the lineage were downloaded from GISAID using the lineage search feature
(restricting to complete, high-coverage genomes, and restricting to earliest sample
collection dates when there were too many results for the download limit of 10,000
sequences per query).
Sequences were aligned to the
SARS-CoV-2 reference genome
script from the
Single-nucleotide substitutions were extracted from the alignment using the UCSC tool
(available on the download server
or from bioconda;
also requires the
SARS-CoV-2 reference sequence).
Single-nucleotide substitutions present at a frequency
of at least 0.95 were retained; all others were discarded.
For indel detection, the minimap2
suite of tools was used as follows:
minimap2 --cs [Reference Sequence] [Set of Unaligned Sequences] | paftools.js call -L 10000 -
Indels present at a frequency of at least 0.85 were retained.
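
The frequency cutoffs described above (0.95 for substitutions, 0.85 for indels) can be illustrated with a short sketch. This is not the pipeline's actual code: the retain_frequent function and its input shape (one set of mutation names per sequence) are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, List, Set


def retain_frequent(per_sequence: Iterable[Set[str]], min_freq: float) -> List[str]:
    """Return mutation names found in at least min_freq of the sequences.

    per_sequence holds one set of mutation names per genome sequence
    (a hypothetical input shape, chosen for this sketch).
    """
    sequences = list(per_sequence)
    total = len(sequences)
    if total == 0:
        return []
    # Count in how many sequences each mutation occurs.
    counts = Counter(name for muts in sequences for name in muts)
    return sorted(name for name, n in counts.items() if n / total >= min_freq)


# Hypothetical calls: a stricter threshold for substitutions than for indels.
# substitutions = retain_frequent(subs_per_sequence, 0.95)
# indels = retain_frequent(indels_per_sequence, 0.85)
```

Using a set per sequence means a mutation is counted at most once per genome, so the ratio is the fraction of sequences carrying it.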
The results were then combined and formatted by
The entire pipeline was run using
You can download the data files for this track from the
UCSC Download Server.
The data can be explored interactively with the
Table Browser or the Data Integrator. The data can be
accessed from scripts through our API.
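
As a rough illustration of script access, the sketch below builds a request URL for the UCSC REST API's getData/track endpoint. The build_track_url helper and the track name "exampleTrackName" are hypothetical placeholders; the real track name must be looked up (for example, from the API's track listing), and the coordinates shown are arbitrary.

```python
from typing import Optional
from urllib.parse import quote

API_BASE = "https://api.genome.ucsc.edu"


def build_track_url(genome: str, track: str, chrom: Optional[str] = None,
                    start: Optional[int] = None, end: Optional[int] = None) -> str:
    """Build a getData/track URL; parameters left as None are omitted."""
    params = {"genome": genome, "track": track,
              "chrom": chrom, "start": start, "end": end}
    # Join name=value pairs with semicolons, the separator used in the API docs.
    query = ";".join(f"{key}={quote(str(value))}"
                     for key, value in params.items() if value is not None)
    return f"{API_BASE}/getData/track?{query}"


# "exampleTrackName" is a placeholder, not the real name of this track.
url = build_track_url("wuhCor1", "exampleTrackName",
                      chrom="NC_045512v2", start=21000, end=23000)
```

The resulting URL can be fetched with any HTTP client; the response is JSON. Fetching is left out of the sketch so that it runs without network access.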
This work is made possible by the open sharing of genetic data by research
groups from all over the world. We gratefully acknowledge their contributions.
We thank Rob Lanfear at the Australian National University for developing and maintaining the
sarscov2phylo web resource.
We also thank
the Su, Wu, and Andersen labs at Scripps Research for creating the
Outbreak.info web resource.
The lineageVariants scripts were developed and run at UCSC by Nick Keener.
Rambaut A, Holmes EC, O'Toole Á, Hill V, McCrone JT, Ruis C, du Plessis L, Pybus OG.
A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology.
Nat Microbiol. 2020 Nov;5(11):1403-1407.
Rambaut A, Loman N, Pybus O, Barclay W, Barrett J,
Carabelli A, Connor T, Peacock T, Robertson DL, Volz E, et al.
Preliminary genomic characterisation of an emergent SARS-CoV-2 lineage in the UK
defined by a novel set of spike mutations.
Virological. 2020 Dec 18.
Volz E, Mishra S, Chand M, Barrett JC, Johnson E,
Geidelberg L, Hinsley WR, Laydon DJ, Dabrera G, O'Toole Á, et al.
Transmission of SARS-CoV-2 Lineage B.1.1.7 in England: Insights from linking
epidemiological and genetic data.
Virological. 2020 Dec 31.
Tegally et al., December 21, 2020.
Emergence and rapid spread of a new severe acute respiratory syndrome-related coronavirus 2 (SARS-CoV-2) lineage with multiple spike mutations in South Africa. medRxiv preprint.
Zhang et al., January 20, 2021.
Emergence of a novel SARS-CoV-2 strain in Southern California, USA. medRxiv preprint.
Voloch et al., December 26, 2020.
Genomic characterization of a novel SARS-CoV-2 lineage from Rio de Janeiro, Brazil. medRxiv preprint.
Lanfear, Rob (2020). A global phylogeny of SARS-CoV-2 sequences from GISAID. Zenodo DOI:
Li H.
Minimap2: pairwise alignment for nucleotide sequences.
Bioinformatics. 2018 Sep 15;34(18):3094-3100.
PMID: 29750242; PMC: PMC6137996
Gangavarapu, Karthik; Alkuzweny, Manar; Cano, Marco; Haag, Emily; Latif, Alaa Abdel; Mullen, Julia L.; Rush, Benjamin; Tsueng, Ginger; Zhou, Jerry; Andersen, Kristian G.; Wu, Chunlei; Su, Andrew I.; Hughes, Laura D. Outbreak.info