135 species Basewise Conservation by PhyloP (phyloP135way)
 

Position: chrII:14,646,376-14,667,875

Total Bases in view: 21,500

Statistics on: 21,500 bases (100.0000% coverage)

Database: ce11 Table: phyloP135way
Chrom | Data start | Data end | # of data values | Each data value spans # bases | Bases covered | Minimum | Maximum | Range | Mean | Variance | Standard deviation
chrII | 14646376 | 14667875 | 21,500 | 1 | 21,500 (100.00%) | -6.501 | 20 | 26.501 | 3.21293 | 28.3532 | 5.32478

25 bin histogram on 21500 values (zero count bins not shown)
bin | range minimum | range maximum | count | relative frequency | log2(frequency) | cumulative relative frequency (CRF) | 1.0 - CRF
0 | -6.501 | -5.44096 | 1 | 4.65116e-05 | -14.392 | 4.65116e-05 | 0.999953
1 | -5.44096 | -4.38092 | 4 | 0.000186047 | -12.392 | 0.000232558 | 0.999767
2 | -4.38092 | -3.32088 | 14 | 0.000651163 | -10.5847 | 0.000883721 | 0.999116
3 | -3.32088 | -2.26084 | 56 | 0.00260465 | -8.58469 | 0.00348837 | 0.996512
4 | -2.26084 | -1.2008 | 434 | 0.020186 | -5.6305 | 0.0236744 | 0.976326
5 | -1.2008 | -0.14076 | 3729 | 0.173442 | -2.52748 | 0.197116 | 0.802884
6 | -0.14076 | 0.91928 | 6447 | 0.29986 | -1.73764 | 0.496977 | 0.503023
7 | 0.91928 | 1.97932 | 2907 | 0.135209 | -2.88673 | 0.632186 | 0.367814
8 | 1.97932 | 3.03936 | 1681 | 0.078186 | -3.67695 | 0.710372 | 0.289628
9 | 3.03936 | 4.0994 | 1035 | 0.0481395 | -4.37663 | 0.758512 | 0.241488
10 | 4.0994 | 5.15944 | 709 | 0.0329767 | -4.92241 | 0.791488 | 0.208512
11 | 5.15944 | 6.21948 | 557 | 0.025907 | -5.27052 | 0.817395 | 0.182605
12 | 6.21948 | 7.27952 | 424 | 0.0197209 | -5.66413 | 0.837116 | 0.162884
13 | 7.27952 | 8.33956 | 386 | 0.0179535 | -5.79959 | 0.85507 | 0.14493
14 | 8.33956 | 9.3996 | 312 | 0.0145116 | -6.10665 | 0.869581 | 0.130419
15 | 9.3996 | 10.4596 | 304 | 0.0141395 | -6.14412 | 0.883721 | 0.116279
16 | 10.4596 | 11.5197 | 281 | 0.0130698 | -6.25762 | 0.896791 | 0.103209
17 | 11.5197 | 12.5797 | 299 | 0.013907 | -6.16805 | 0.910698 | 0.0893023
18 | 12.5797 | 13.6398 | 296 | 0.0137674 | -6.1826 | 0.924465 | 0.0755349
19 | 13.6398 | 14.6998 | 279 | 0.0129767 | -6.26793 | 0.937442 | 0.0625581
20 | 14.6998 | 15.7598 | 224 | 0.0104186 | -6.58469 | 0.94786 | 0.0521395
21 | 15.7598 | 16.8199 | 105 | 0.00488372 | -7.6778 | 0.952744 | 0.0472558
25 | 20 | 21.06 | 1016 | 0.0472558 | -4.40336 | 1 | -2.22045e-16
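
The histogram columns can be recomputed from the summary statistics. The following Python sketch assumes equal-width bins spanning the reported range and a relative frequency defined as the bin count divided by the number of values; the function names are illustrative and are not part of any UCSC tool.

```python
import math

def bin_edges(minimum, value_range, n_bins, index):
    """Return the (lower, upper) limits of one equal-width bin."""
    width = value_range / n_bins
    lower = minimum + index * width
    return lower, lower + width

def bin_frequencies(count, total):
    """Return (relative frequency, log2 of the relative frequency)."""
    rel = count / total
    return rel, math.log2(rel)

# Inputs taken from the statistics above: minimum -6.501, range 26.501,
# 25 bins, 21500 values; bin 6 holds 6447 values.
lower, upper = bin_edges(-6.501, 26.501, 25, 6)
rel, log2_rel = bin_frequencies(6447, 21500)
print(f"bin 6 spans {lower:.5f} to {upper:.5f}")
print(f"relative frequency {rel:.6f}, log2 {log2_rel:.5f}")
```

The results should agree with the bin 6 row of the histogram to the precision shown there.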

Data last updated at UCSC: 2018-12-14

Description

This track shows multiple alignments of 135 species (112 nematodes, 22 flatworms, and one Ciona intestinalis sequence) together with measurements of evolutionary conservation from two methods (phastCons and phyloP) in the PHAST package, computed across all 135 species. The multiple alignments were generated using multiz and other tools in the UCSC/Penn State Bioinformatics comparative genomics alignment pipeline. Conserved elements identified by phastCons are also displayed in this track.

The phylogenetic tree was derived by counting the k-mers shared between the sequences to obtain a 'distance' matrix, then applying the simple neighbor-joining algorithm of the PHYLIP 'neighbor' program to build this binary tree. This tree is not necessarily biologically correct, but it serves as a useful guide tree for the multiz alignment procedure. See also: Phylip distance operations, assembly and alignment-free phylogeny reconstruction, and recapitulating phylogenies using k-mers.
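
The first step of that procedure, turning shared k-mer content into a distance matrix, might be sketched as follows. The exact distance measure used for this track is not stated here, so the Jaccard distance below is an assumption, as are the toy sequences, the choice of k=4, and all function names.

```python
from itertools import combinations

def kmers(seq, k):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def kmer_distance(a, b, k):
    """Jaccard distance between two k-mer sets (an assumed measure)."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    if not union:
        return 0.0
    return 1.0 - len(ka & kb) / len(union)

def distance_matrix(seqs, k):
    """Symmetric dict-of-dicts distance matrix keyed by sequence name."""
    names = list(seqs)
    dist = {n: {m: 0.0 for m in names} for n in names}
    for n, m in combinations(names, 2):
        dist[n][m] = dist[m][n] = kmer_distance(seqs[n], seqs[m], k)
    return dist

# Toy sequences, purely illustrative.
toy = {"sp1": "ACGTACGTGA", "sp2": "ACGTACGTTA", "sp3": "TTTTGGGGCC"}
matrix = distance_matrix(toy, k=4)
for name in toy:
    print(name, [round(matrix[name][other], 3) for other in toy])
```

A matrix of this kind would then be handed to a neighbor-joining implementation to obtain the binary guide tree.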

PhastCons (which has been used in previous Conservation tracks) is a hidden Markov model-based method that estimates the probability that each nucleotide belongs to a conserved element, based on the multiple alignment. It considers not just each individual alignment column, but also its flanking columns. By contrast, phyloP separately measures conservation at individual columns, ignoring the effects of their neighbors. As a consequence, the phyloP plots have a less smooth appearance than the phastCons plots, with more "texture" at individual sites. The two methods have different strengths and weaknesses. PhastCons is sensitive to "runs" of conserved sites, and is therefore effective for picking out conserved elements. PhyloP, on the other hand, is more appropriate for evaluating signatures of selection at particular nucleotides or classes of nucleotides (e.g., third codon positions, or first positions of miRNA target sites).

Another important difference is that phyloP can measure acceleration (faster evolution than expected under neutral drift) as well as conservation (slower than expected evolution). In the phyloP plots, sites predicted to be conserved are assigned positive scores (and shown in blue), while sites predicted to be fast-evolving are assigned negative scores (and shown in red). The absolute values of the scores represent -log10 p-values under a null hypothesis of neutral evolution. The phastCons scores, by contrast, represent probabilities of negative selection and range between 0 and 1.
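
As a reading aid, a phyloP score can be converted back to a p-value and a direction. This sketch assumes the logarithm is base 10, which is the usual PHAST convention but is an assumption here; the function name is illustrative.

```python
def interpret_phylop(score):
    """Return (direction, p_value) for one phyloP score, assuming -log10 scaling."""
    p_value = 10.0 ** (-abs(score))
    if score > 0:
        direction = "conserved"
    elif score < 0:
        direction = "accelerated"
    else:
        direction = "neutral"
    return direction, p_value

# Example scores; -6.501 is the minimum reported in the statistics above.
for s in (3.2, -6.501, 0.0):
    direction, p = interpret_phylop(s)
    print(f"score {s:+.3f}: {direction}, p = {p:.3g}")
```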

Both phastCons and phyloP treat alignment gaps and unaligned nucleotides as missing data.

See also: lastz parameters and other details, and chain minimum score and gap parameters used in these alignments.

Missing sequence in the assemblies is highlighted in the track display by regions of yellow when zoomed out and Ns displayed at base level (see Gap Annotation, below).

Organism | Species | Assembly name | Browser or NCBI source | Alignment type
C. elegans | Caenorhabditis elegans | Feb. 2013 (WBcel235/ce11) | Feb. 2013 (WBcel235/ce11) | reference
A. ceylanicum | Ancylostoma ceylanicum | Mar. 2014 (WS243/Acey_2013.11.30.genDNA/ancCey1) | Mar. 2014 (WS243/Acey_2013.11.30.genDNA/ancCey1) | net
Acrobeloides_nanus | Acrobeloides nanus | Jun. 2018 (v1) | GCA_900406225.1 | net
Ancylostoma_caninum | Ancylostoma caninum | Jul. 2018 (A_caninum_9.3.2.ec.cg.pg) | GCA_003336725.1 | net
Ancylostoma_duodenale | Ancylostoma duodenale | Jan. 2015 (A_duodenale_2.2.ec.cg.pg) | GCA_000816745.1 | net
Angiostrongylus_cantonensis | Angiostrongylus cantonensis | Nov. 2016 (ASM188428v1) | GCA_001884285.1 | net
Ascaris_suum | Ascaris suum | Nov. 2017 (ASM18702v3) | GCA_000187025.3 | net
Barber pole worm | Haemonchus contortus | Jul. 2013 (WormBase WS239/haeCon2) | Jul. 2013 (WormBase WS239/haeCon2) | net
Brugia_malayi | Brugia malayi | Mar. 2008 (ASM299v2) | GCF_000002995.3 | net
Brugia_pahangi | Brugia pahangi | Sep. 2015 (Brugia_pa_1.0) | GCA_001280985.1 | net
Bursaphelenchus_xylophilus | Bursaphelenchus xylophilus | Oct. 2011 (ASM23113v1) | GCA_000231135.1 | net
C. angaria | Caenorhabditis angaria | Apr. 2012 (WS232/ps1010rel8/caeAng2) | Apr. 2012 (WS232/ps1010rel8/caeAng2) | net
C. brenneri | Caenorhabditis brenneri | Nov. 2010 (C. brenneri 6.0.1b/caePb3) | Nov. 2010 (C. brenneri 6.0.1b/caePb3) | net
C. briggsae | Caenorhabditis briggsae | Apr. 2011 (WS225/cb4) | Apr. 2011 (WS225/cb4) | net
C. intestinalis | Ciona intestinalis | Apr. 2011 (Kyoto KH/ci3) | Apr. 2011 (Kyoto KH/ci3) | net
C. japonica | Caenorhabditis japonica | Aug. 2010 (WUSTL 7.0.1/caeJap4) | Aug. 2010 (WUSTL 7.0.1/caeJap4) | net
C. remanei | Caenorhabditis remanei | Jul. 2007 (WS220/caeRem4) | Jul. 2007 (WS220/caeRem4) | net
C. sp. 5 ju800 | Caenorhabditis sp5 ju800 | Jan. 2012 (WS230/Caenorhabditis_sp_5-JU800-1.0/caeSp51) | Jan. 2012 (WS230/Caenorhabditis_sp_5-JU800-1.0/caeSp51) | net
C. tropicalis | Caenorhabditis tropicalis | Nov. 2010 (WS226/WUSTL 3.0.1/caeSp111) | Nov. 2010 (WS226/WUSTL 3.0.1/caeSp111) | net
C_briggsae | Caenorhabditis briggsae | Jul. 2014 (CB4) | GCA_000004555.3 | net
C_latens | Caenorhabditis latens | Aug. 2017 (CaeLat1.0) | GCA_002259235.1 | net
C_nigoni | Caenorhabditis nigoni | Nov. 2017 (nigoni.pc_2016.07.14) | GCA_002742825.1 | net
C_sp21_LS_2015 | Caenorhabditis sp. 21 LS-2015 | Aug. 2018 (CPARV_v1) | GCA_900536235.1 | net
C_sp26_LS_2015 | Caenorhabditis sp. 26 LS-2015 | Aug. 2018 (CZANZ_v1) | GCA_900536285.1 | net
C_sp31_LS_2015 | Caenorhabditis sp. 31 LS-2015 | Aug. 2018 (CUTEL_v1) | GCA_900536295.1 | net
C_sp32_LS_2015 | Caenorhabditis sp. 32 LS-2015 | Aug. 2018 (CSULS_v1) | GCA_900536325.1 | net
C_sp34_TK_2017 | Caenorhabditis sp. 34 TK-2017 | Jun. 2017 (Sp34_v7) | GCA_003052745.1 | net
C_sp38_MB_2015 | Caenorhabditis sp. 38 MB-2015 | Aug. 2018 (CQUIO_v1) | GCA_900536415.1 | net
C_sp39_LS_2015 | Caenorhabditis sp. 39 LS-2015 | Aug. 2018 (CWAIT_v1) | GCA_900536345.1 | net
C_sp40_LS_2015 | Caenorhabditis sp. 40 LS-2015 | Aug. 2018 (CTRIB_v1) | GCA_900536305.1 | net
Clonorchis_sinensis | Clonorchis sinensis | Nov. 2011 (C_sinensis-2.0) | GCA_000236345.1 | net
Dicrocoelium_dendriticum | Dicrocoelium dendriticum | Sep. 2014 (D_dendriticum_Leon_v1_0_4) | GCA_000950715.1 | net
Dictyocaulus_viviparus | Dictyocaulus viviparus | Mar. 2015 (D_viviparus_9.2.1.ec.pg) | GCA_000816705.1 | net
Diploscapter_coronatus | Diploscapter coronatus | Jun. 2017 (ASM220778v1) | GCA_002207785.1 | net
Diploscapter_pachys | Diploscapter pachys | Sep. 2017 (DipSp1Ass11Ann3) | GCA_002287525.1 | net
Dirofilaria_immitis | Dirofilaria immitis | Aug. 2013 (ASM107739v1) | GCA_001077395.1 | net
Ditylenchus_destructor | Ditylenchus destructor | Mar. 2016 (ASM157970v1) | GCA_001579705.1 | net
Dog heartworm | Dirofilaria immitis | Sep. 2013 (WS240/D. immitis v2.2/dirImm1) | Sep. 2013 (WS240/D. immitis v2.2/dirImm1) | net
Dugesia_japonica | Dugesia japonica | Jan. 2017 (Djap_assembly_v1) | GCA_001938525.1 | net
Echinococcus_canadensis | Echinococcus canadensis | May 2016 (ECANG7) | GCA_900004735.1 | net
Echinococcus_granulosus | Echinococcus granulosus | Jan. 2014 (ASM52419v1) | GCA_000524195.1 | net
Echinococcus_multilocularis | Echinococcus multilocularis | Dec. 2015 (EMULTI002) | GCA_000469725.3 | net
Elaeophora_elaphi | Elaeophora elaphi | Nov. 2013 (EEL001) | GCA_000499685.1 | net
Eye worm | Loa loa | Jul. 2012 (WS235/L_loa_Cameroon_isolate/loaLoa1) | Jul. 2012 (WS235/L_loa_Cameroon_isolate/loaLoa1) | net
Fasciola_gigantica | Fasciola gigantica | Jan. 2018 (ASM286751v1) | GCA_002867515.1 | net
Fasciola_hepatica | Fasciola hepatica | Apr. 2018 (Fasciola_10x_pilon) | GCA_900302435.1 | net
Filarial worm | Brugia malayi | May. 2014 (WS244/B_malayi-3.1/bruMal2) | May. 2014 (WS244/B_malayi-3.1/bruMal2) | net
Girardia_tigrina | Girardia tigrina | Jan. 2017 (gtig.1) | GCA_001938485.1 | net
Globodera_ellingtonae | Globodera ellingtonae | Sep. 2016 (ASM172322v1) | GCA_001723225.1 | net
Globodera_pallida | Globodera pallida | May 2014 (GPAL001) | GCA_000724045.1 | net
Globodera_rostochiensis | Globodera rostochiensis | Apr. 2016 (nGr) | GCA_900079975.1 | net
Gyrodactylus_salaris | Gyrodactylus salaris | Jun. 2014 (Gsalaris_v1) | GCA_000715275.1 | net
H. bacteriophora/m31e | Heterorhabditis bacteriophora | Aug. 2011 (WS229/H. bacteriophora 7.0/hetBac1) | Aug. 2011 (WS229/H. bacteriophora 7.0/hetBac1) | net
Haemonchus_contortus | Haemonchus contortus | Aug. 2013 (HCON) | GCA_000469685.1 | net
Heligmosomoides_polygyrus_bakeri | Heligmosomoides polygyrus bakeri | Sep. 2016 (nHp_v2.0) | GCA_900096555.1 | net
Heterodera_glycines | Heterodera glycines | Apr. 2008 (HG2) | GCA_000150805.1 | net
Hymenolepis_microstoma | Hymenolepis microstoma | Dec. 2015 (HMIC002) | GCA_000469805.2 | net
Loa_loa | Loa loa | Jul. 2012 (Loa_loa_V3.1) | GCF_000183805.2 | net
M. hapla | Meloidogyne hapla | Sep. 2008 (M. hapla VW9 WS210/melHap1) | Sep. 2008 (M. hapla VW9 WS210/melHap1) | net
M. incognita | Meloidogyne incognita | Feb. 2008 (M. incognita WS245/PRJEA28837/melInc2) | Feb. 2008 (M. incognita WS245/PRJEA28837/melInc2) | net
Macrostomum_lignano | Macrostomum lignano | Aug. 2017 (Mlig_3_7) | GCA_002269645.1 | net
Meloidogyne_arenaria | Meloidogyne arenaria | May 2018 (ASM313380v1) | GCA_003133805.1 | net
Meloidogyne_floridensis | Meloidogyne floridensis | Jun. 2014 (nMf_1_1) | GCA_000751915.1 | net
Meloidogyne_graminicola | Meloidogyne graminicola | Nov. 2017 (Mgraminicola_V1) | GCA_002778205.1 | net
Meloidogyne_incognita | Meloidogyne incognita | May 2017 (Meloidogyne_incognita_V3) | GCA_900182535.1 | net
Meloidogyne_javanica | Meloidogyne javanica | Apr. 2017 (ASM90000394v1) | GCA_900003945.1 | net
Microworm | Panagrellus redivivus | Feb. 2013 (WS240/Pred3/panRed1) | Feb. 2013 (WS240/Pred3/panRed1) | net
N. americanus | Necator americanus | Dec. 2013 (WS242/N_americanus_v1/necAme1) | Dec. 2013 (WS242/N_americanus_v1/necAme1) | net
Necator_americanus | Necator americanus | Dec. 2013 (N_americanus_v1) | GCF_000507365.1 | net
Nippostrongylus_brasiliensis | Nippostrongylus brasiliensis | Aug. 2017 (NbL5_MIMR_Canu1.5) | GCA_900200055.1 | net
O. volvulus | Onchocerca volvulus | Nov. 2013 (WS241/O_volvulus_Cameroon_v3/oncVol1) | Nov. 2013 (WS241/O_volvulus_Cameroon_v3/oncVol1) | net
Oesophagostomum_dentatum | Oesophagostomum dentatum | Dec. 2014 (O_dentatum_10.0.ec.cg.pg) | GCA_000797555.1 | net
Onchocerca_flexuosa | Onchocerca flexuosa | Aug. 2017 (O_flexuosa_1.0.allpaths.pg.lrna) | GCA_002249935.1 | net
Onchocerca_ochengi | Onchocerca ochengi | Mar. 2016 (O_ochengi_Ngaoundere) | GCA_000950515.2 | net
Onchocerca_volvulus | Onchocerca volvulus | Feb. 2014 (ASM49940v2) | GCA_000499405.2 | net
Opisthorchis_viverrini | Opisthorchis viverrini | Jul. 2014 (OpiViv1.0) | GCA_000715545.1 | net
Oscheius_MCB | Oscheius sp. MCB | Feb. 2015 (ASM93487v1) | GCA_000934875.1 | net
Oscheius_TEL_2014 | Oscheius sp. TEL-2014 | Jan. 2016 (ASM151353v1) | GCA_001513535.1 | net
Oscheius_tipulae | Oscheius tipulae | May 2017 (Oscheius_tipulae_assembly_v2) | GCA_900184235.1 | net
P. exspectatus | Pristionchus exspectatus | Mar. 2014 (WS243/P_exspectatus_v1/priExs1) | Mar. 2014 (WS243/P_exspectatus_v1/priExs1) | net
P. pacificus | Pristionchus pacificus | Aug. 2014 (WS221/P_pacificus-v2/priPac3) | Aug. 2014 (WS221/P_pacificus-v2/priPac3) | net
Parapristionchus_giblindavisi | Parapristionchus giblindavisi | Jun. 2018 (Parapristionchus_genome) | GCA_900491355.1 | net
Parascaris_univalens | Parascaris univalens | Aug. 2017 (ASM225921v1) | GCA_002259215.1 | net
Parastrongyloides_trichosuri | Parastrongyloides trichosuri | Sep. 2014 (P_trichosuri_KNP) | GCA_000941615.1 | net
Pig roundworm | Ascaris suum | Sep. 2012 (WS229/AscSuum_1.0/ascSuu1) | Sep. 2012 (WS229/AscSuum_1.0/ascSuu1) | net
Pine wood nematode | Bursaphelenchus xylophilus | Nov. 2011 (WS229/B. xylophilus Ka4C1/burXyl1) | Nov. 2011 (WS229/B. xylophilus Ka4C1/burXyl1) | net
Plectus_sambesii | Plectus sambesii | Nov. 2017 (Psam_v1.0) | GCA_002796945.1 | net
Pristionchus_arcanus | Pristionchus arcanus | Jun. 2018 (P._arcanus_genome) | GCA_900490705.1 | net
Pristionchus_entomophagus | Pristionchus entomophagus | Jun. 2018 (P._entomophagus_genome) | GCA_900490825.1 | net
Pristionchus_exspectatus | Pristionchus exspectatus | May 2018 (Pristionchus_exspectatus_de_novo_assembly) | GCA_900380275.1 | net
Pristionchus_maxplancki | Pristionchus maxplancki | Jun. 2018 (Prisstionchus_maxplancki_genome) | GCA_900490775.1 | net
Pristionchus_pacificus | Pristionchus pacificus | Oct. 2017 (El_Paco) | GCA_000180635.3 | net
Rhabditophanes_KR3021 | Rhabditophanes sp. KR3021 | Sep. 2014 (Rhabditophanes_sp_KR3021) | GCA_000944355.1 | net
Romanomermis_culicivorax | Romanomermis culicivorax | Jan. 2014 (nRc.2.0) | GCA_001039655.1 | net
Rotylenchulus_reniformis | Rotylenchulus reniformis | Jun. 2015 (RREN1.0) | GCA_001026735.1 | net
Schistosoma_haematobium | Schistosoma haematobium | Jun. 2014 (SchHae_1.0) | GCA_000699445.1 | net
Schistosoma_japonicum | Schistosoma japonicum | Apr. 2009 (ASM15177v1) | GCA_000151775.1 | net
Schistosoma_mansoni | Schistosoma mansoni | Dec. 2011 (ASM23792v2) | GCA_000237925.2 | net
Schmidtea_mediterranea | Schmidtea mediterranea | Oct. 2017 (ASM260089v1) | GCA_002600895.1 | net
Setaria_digitata | Setaria digitata | Jan. 2018 (Sdigitata) | GCA_900083525.1 | net
Setaria_equina | Setaria equina | Mar. 2018 (Setequ3.0) | GCA_003012265.1 | net
Spirometra_erinaceieuropaei | Spirometra erinaceieuropaei | Sep. 2014 (S_erinaceieuropaei) | GCA_000951995.1 | net
Steinernema_carpocapsae | Steinernema carpocapsae | Sep. 2014 (S_carpo_v1) | GCA_000757645.1 | net
Steinernema_feltiae | Steinernema feltiae | Sep. 2014 (S_felt_v1) | GCA_000757705.1 | net
Steinernema_glaseri | Steinernema glaseri | Sep. 2014 (S_glas_v1) | GCA_000757755.1 | net
Steinernema_monticolum | Steinernema monticolum | Dec. 2013 (S_monti_v1) | GCA_000505645.1 | net
Steinernema_scapterisci | Steinernema scapterisci | Sep. 2014 (S_scapt_v1) | GCA_000757745.1 | net
Strongyloides_papillosus | Strongyloides papillosus | Nov. 2014 (S_papillosus_LIN) | GCA_000936265.1 | net
Strongyloides_stercoralis | Strongyloides stercoralis | Nov. 2014 (S_stercoralis_PV0001) | GCA_000947215.1 | net
Strongyloides_venezuelensis | Strongyloides venezuelensis | Jun. 2015 (S_venezuelensis_HH1) | GCA_001028725.1 | net
Subanguina_moxae | Subanguina moxae | Apr. 2015 (SAMX_assembly_v0.8) | GCA_000981365.1 | net
Taenia_asiatica | Taenia asiatica | Sep. 2016 (Taenia_asiatica_TASYD01_v1) | GCA_001693035.2 | net
Taenia_multiceps | Taenia multiceps | Jul. 2018 (T_multiceps_v2.0) | GCA_001923025.2 | net
Taenia_saginata | Taenia saginata | Oct. 2016 (ASM169307v2) | GCA_001693075.2 | net
Taenia_solium | Taenia solium | Nov. 2016 (MEX_genome_complete.1-6-13) | GCA_001870725.1 | net
Teladorsagia_circumcincta | Teladorsagia circumcincta | Sep. 2017 (T_circumcincta.14.0.ec.cg.pg) | GCA_002352805.1 | net
Threadworm | Strongyloides ratti | Sep. 2014 (S. ratti ED321/strRat2) | Sep. 2014 (S. ratti ED321/strRat2) | net
Toxocara_canis | Toxocara canis | Dec. 2014 (Toxocara_canis_adult_r1.0) | GCA_000803305.1 | net
Trichinella | Trichinella spiralis | Jan. 2011 (WS225/Trichinella_spiralis-3.7.1/triSpi1) | Jan. 2011 (WS225/Trichinella_spiralis-3.7.1/triSpi1) | net
Trichinella_T6 | Trichinella sp. T6 | Nov. 2015 (T6_ISS34_r1.0) | GCA_001447435.1 | net
Trichinella_T8 | Trichinella sp. T8 | Nov. 2015 (T8_ISS272_r1.0) | GCA_001447745.1 | net
Trichinella_T9 | Trichinella sp. T9 | Nov. 2015 (T9_ISS409_r1.0) | GCA_001447505.1 | net
Trichinella_britovi | Trichinella britovi | Nov. 2015 (T3_ISS120_r1.0) | GCA_001447585.1 | net
Trichinella_murrelli | Trichinella murrelli | Jul. 2017 (ASM222148v1) | GCA_002221485.1 | net
Trichinella_nativa | Trichinella nativa | Nov. 2015 (T2_ISS10_r1.0) | GCA_001447565.1 | net
Trichinella_nelsoni | Trichinella nelsoni | Nov. 2015 (T7_ISS37_r1.0) | GCA_001447455.1 | net
Trichinella_papuae | Trichinella papuae | Nov. 2015 (T10_ISS1980_r1.0) | GCA_001447755.1 | net
Trichinella_patagoniensis | Trichinella patagoniensis | Nov. 2015 (T12_ISS2496_r1.0) | GCA_001447655.1 | net
Trichinella_pseudospiralis | Trichinella pseudospiralis | Nov. 2015 (T4_ISS588_r1.0) | GCA_001447725.1 | net
Trichinella_spiralis | Trichinella spiralis | Jan. 2011 (Trichinella_spiralis-3.7.1) | GCF_000181795.1 | net
Trichinella_zimbabwensis | Trichinella zimbabwensis | Nov. 2015 (T11_ISS1029_r1.0) | GCA_001447665.1 | net
Trichuris_muris | Trichuris muris | Mar. 2014 (TMUE2.2) | GCA_000612645.1 | net
Trichuris_trichiura | Trichuris trichiura | Mar. 2014 (TTRE2.1) | GCA_000613005.1 | net
Whipworm | Trichuris suis | Jul. 2014 (WS243/T. suis DCEP-RM93M male/triSui1) | Jul. 2014 (WS243/T. suis DCEP-RM93M male/triSui1) | net
Wuchereria_bancrofti | Wuchereria bancrofti | Feb. 2016 (Wb_PNG_Genome_assembly_pt22) | GCA_001555675.1 | net

Table 1. Genome assemblies included in the 135-way Conservation track.

Downloads for data in this track are available:

Display Conventions and Configuration

The track configuration options allow the user to display the three different sets of scores by all, subclass, individually, or any combination of these. In full and pack display modes, conservation scores are displayed as a wiggle track (histogram) in which the height reflects the value of the score. The conservation wiggles can be configured in a variety of ways to highlight different aspects of the displayed information. Click the Graph configuration help link for an explanation of the configuration options.

Pairwise alignments of each species to the C. elegans genome are displayed below the conservation histogram as a grayscale density plot (in pack mode) or as a wiggle (in full mode) that indicates alignment quality. In dense display mode, conservation is shown in grayscale using darker values to indicate higher levels of overall conservation as scored by phastCons.

Checkboxes on the track configuration page allow selection of the species to include in the pairwise display. Configuration buttons are available to select all of the species (Set all), deselect all of the species (Clear all), or use the default settings (Set defaults). Note that excluding species from the pairwise display does not alter the conservation score display.

To view detailed information about the alignments at a specific position, zoom the display in to 30,000 or fewer bases, then click on the alignment.

Gap Annotation

The Display chains between alignments configuration option enables display of gaps between alignment blocks in the pairwise alignments in a manner similar to the Chain track display. The following conventions are used:

  • Single line: No bases in the aligned species. Possibly due to a lineage-specific insertion between the aligned blocks in the C. elegans genome or a lineage-specific deletion between the aligned blocks in the aligning species.
  • Double line: Aligning species has one or more unalignable bases in the gap region. Possibly due to excessive evolutionary distance between species or independent indels in the region between the aligned blocks in both species.
  • Pale yellow coloring: Aligning species has Ns in the gap region. Reflects uncertainty in the relationship between the DNA of both species, due to lack of sequence in relevant portions of the aligning species.

Genomic Breaks

Discontinuities in the genomic context (chromosome, scaffold or region) of the aligned DNA in the aligning species are shown as follows:

  • Vertical blue bar: Represents a discontinuity that persists indefinitely on either side, e.g. a large region of DNA on either side of the bar comes from a different chromosome in the aligned species due to a large scale rearrangement.
  • Green square brackets: Enclose shorter alignments consisting of DNA from one genomic context in the aligned species nested inside a larger chain of alignments from a different genomic context. The alignment within the brackets may represent a short misalignment, a lineage-specific insertion of a transposon in the C. elegans genome that aligns to a paralogous copy somewhere else in the aligned species, or other similar occurrence.

Base Level

When zoomed-in to the base-level display, the track shows the base composition of each alignment. The numbers and symbols on the Gaps line indicate the lengths of gaps in the C. elegans sequence at those alignment positions relative to the longest non-C. elegans sequence. If there is sufficient space in the display, the size of the gap is shown. If the space is insufficient and the gap size is a multiple of 3, a "*" is displayed; other gap sizes are indicated by "+".
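
The labeling rule for the Gaps line can be expressed as a small function. This is a sketch only: it assumes that "sufficient space" means the decimal gap size fits within the number of character columns available, which the description does not spell out, and the function name is illustrative.

```python
def gap_label(gap_size, columns_available):
    """Return the text shown on the Gaps line for one gap."""
    size_text = str(gap_size)
    if len(size_text) <= columns_available:
        return size_text  # enough room to print the size itself
    # Not enough room: "*" for multiples of 3, "+" for any other size.
    return "*" if gap_size % 3 == 0 else "+"

for size, room in ((7, 1), (12, 2), (12, 1), (10, 1)):
    print(size, room, gap_label(size, room))
```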

Codon translation is available in base-level display mode if the displayed region is identified as a coding segment. To display this annotation, select the species for translation from the pull-down menu in the Codon Translation configuration section at the top of the page. Then, select one of the following modes:

  • No codon translation: The gene annotation is not used; the bases are displayed without translation.
  • Use default species reading frames for translation: The annotations from the genome displayed in the Default species to establish reading frame pull-down menu are used to translate all the aligned species present in the alignment.
  • Use reading frames for species if available, otherwise no translation: Codon translation is performed only for those species where the region is annotated as protein coding.
  • Use reading frames for species if available, otherwise use default species: Codon translation is done on those species that are annotated as being protein coding over the aligned region using species-specific annotation; the remaining species are translated using the default species annotation.
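
The four modes above amount to a choice of which annotation supplies the reading frame for each aligned species. A minimal sketch of that decision follows; the enum and function names are illustrative and do not come from the browser code.

```python
from enum import Enum

class FrameMode(Enum):
    NONE = "no codon translation"
    DEFAULT_ONLY = "use default species reading frames"
    SPECIES_OR_NONE = "species frames if available, otherwise no translation"
    SPECIES_OR_DEFAULT = "species frames if available, otherwise default species"

def frame_source(mode, species_is_annotated):
    """Return 'species', 'default', or None (no translation) for one species."""
    if mode is FrameMode.NONE:
        return None
    if mode is FrameMode.DEFAULT_ONLY:
        return "default"
    if species_is_annotated:
        return "species"
    # Species lacks a protein-coding annotation over the region.
    return "default" if mode is FrameMode.SPECIES_OR_DEFAULT else None

for mode in FrameMode:
    print(mode.name, frame_source(mode, True), frame_source(mode, False))
```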

Codon translation uses the following gene tracks as the basis for translation, depending on the species chosen (Table 2).

Gene Track | Species
Ensembl Genes v92 | C. elegans, Ciona intestinalis
WormBase WS245 genes | C. angaria, C. japonica, C. briggsae, C. sp. 5 ju800, C. remanei, C. brenneri, C. tropicalis, P. exspectatus, P. pacificus, Pine wood nematode, N. americanus, A. ceylanicum, Pig roundworm, Barber pole worm, Whipworm, Microworm, Filarial worm, Dog heartworm, O. volvulus, Eye worm, M. incognita, M. hapla, H. bacteriophora/m31e, Trichinella
no annotations | all others
Table 2. Gene tracks used for codon translation.

Methods

Pairwise alignments with the C. elegans genome were generated for each species using lastz from repeat-masked genomic sequence. Pairwise alignments were then linked into chains using a dynamic programming algorithm that finds maximally scoring chains of gapless subsections of the alignments organized in a kd-tree. Please note the specific parameters for the alignments. High-scoring chains were then placed along the genome, with gaps filled by lower-scoring chains, to produce an alignment net. For more information about the chaining and netting process and parameters for each species, see the description pages for the Chain and Net tracks.

The resulting best-in-genome pairwise alignments were progressively aligned using multiz/autoMZ, following the tree topology diagrammed above, to produce multiple alignments. The multiple alignments were post-processed to add annotations indicating alignment gaps, genomic breaks, and base quality of the component sequences. The annotated multiple alignments, in MAF format, are available for bulk download. An alignment summary table containing an entry for each alignment block in each species was generated to improve track display performance at large scales. Framing tables were constructed to enable visualization of codons in the multiple alignment display.

Phylogenetic Tree Model

Both phastCons and phyloP are phylogenetic methods that rely on a tree model containing the tree topology, branch lengths representing evolutionary distance at neutrally evolving sites, the background distribution of nucleotides, and a substitution rate matrix. The all species tree model for this track was generated using the phyloFit program from the PHAST package (REV model, EM algorithm, medium precision) using multiple alignments of 4-fold degenerate sites extracted from the 135-way alignment (msa_view). The 4d sites were derived from the NCBI RefSeq gene set, filtered to select single-coverage long transcripts.

This same tree model was used in the phyloP calculations; however, the background frequencies were modified to maintain reversibility. The resulting tree model covers all species.

PhastCons Conservation

The phastCons program computes conservation scores based on a phylo-HMM, a type of probabilistic model that describes both the process of DNA substitution at each site in a genome and the way this process changes from one site to the next (Felsenstein and Churchill 1996, Yang 1995, Siepel and Haussler 2005). PhastCons uses a two-state phylo-HMM, with a state for conserved regions and a state for non-conserved regions. The value plotted at each site is the posterior probability that the corresponding alignment column was "generated" by the conserved state of the phylo-HMM. These scores reflect the phylogeny (including branch lengths) of the species in question, a continuous-time Markov model of the nucleotide substitution process, and a tendency for conservation levels to be autocorrelated along the genome (i.e., to be similar at adjacent sites). The general reversible (REV) substitution model was used. Unlike many conservation-scoring programs, phastCons does not rely on a sliding window of fixed size; therefore, short highly-conserved regions and long moderately conserved regions can both obtain high scores. More information about phastCons can be found in Siepel et al. 2005.

The phastCons parameters used were: expected-length=45, target-coverage=0.3, rho=0.3.
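
To make these parameters concrete, the sketch below derives the two transition probabilities of the phylo-HMM and computes posterior probabilities of the conserved state with the forward-backward algorithm. It assumes the parameterization described in Siepel et al. 2005 (mu = 1/expected-length, nu = mu * coverage / (1 - coverage)). The per-column likelihoods are invented toy values; in phastCons they come from the phylogenetic models, with rho scaling the branch lengths of the conserved model, which is not modeled here. All function names are illustrative.

```python
def transition_probs(expected_length, target_coverage):
    """Return (mu, nu): P(conserved -> non-conserved), P(non-conserved -> conserved)."""
    mu = 1.0 / expected_length
    nu = mu * target_coverage / (1.0 - target_coverage)
    return mu, nu

def posterior_conserved(lik_cons, lik_non, mu, nu):
    """Forward-backward posterior of the conserved state at each column."""
    n = len(lik_cons)
    prev_c = nu / (mu + nu)  # stationary probability of the conserved state
    prev_n = 1.0 - prev_c
    fwd = []
    for i in range(n):
        if i > 0:  # apply one transition step before the emission
            prev_c, prev_n = (prev_c * (1 - mu) + prev_n * nu,
                              prev_c * mu + prev_n * (1 - nu))
        cons, non = prev_c * lik_cons[i], prev_n * lik_non[i]
        total = cons + non
        prev_c, prev_n = cons / total, non / total  # rescale against underflow
        fwd.append((prev_c, prev_n))
    bwd = [(1.0, 1.0)] * n
    for i in range(n - 2, -1, -1):
        next_c = lik_cons[i + 1] * bwd[i + 1][0]
        next_n = lik_non[i + 1] * bwd[i + 1][1]
        b_c = (1 - mu) * next_c + mu * next_n
        b_n = nu * next_c + (1 - nu) * next_n
        total = b_c + b_n
        bwd[i] = (b_c / total, b_n / total)
    return [f_c * b_c / (f_c * b_c + f_n * b_n)
            for (f_c, f_n), (b_c, b_n) in zip(fwd, bwd)]

mu, nu = transition_probs(expected_length=45, target_coverage=0.3)
# Toy column likelihoods under each state (hypothetical values).
lik_cons = [0.9, 0.8, 0.9, 0.1, 0.1]
lik_non = [0.2, 0.3, 0.2, 0.6, 0.7]
post = posterior_conserved(lik_cons, lik_non, mu, nu)
print(f"mu = {mu:.5f}, nu = {nu:.5f}")
print([round(p, 3) for p in post])
```

Because both transition probabilities are small, the posterior changes gradually between neighboring columns, which is the smoothing behavior that distinguishes phastCons from phyloP.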

PhyloP Conservation

The phyloP program supports several different methods for computing p-values of conservation or acceleration, for individual nucleotides or larger elements (http://compgen.cshl.edu/phast/). Here it was used to produce separate scores at each base (--wig-scores option), considering all branches of the phylogeny rather than a particular subtree or lineage (i.e., the --subtree option was not used). The scores were computed by performing a likelihood ratio test at each alignment column (--method LRT), and scores for both conservation and acceleration were produced (--mode CONACC).
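
The general shape of a likelihood ratio test score can be illustrated as follows. This is a simplified sketch, not the PHAST implementation: it assumes the alternative model adds a single tree-scale parameter, that twice the log-likelihood difference is compared to a chi-square distribution with one degree of freedom, and that the sign is set by whether the fitted scale is below 1 (conservation) or above 1 (acceleration). The actual program may handle the two-sided case and boundary conditions differently, and the input values are hypothetical.

```python
import math

def lrt_score(lnl_null, lnl_alt, scale):
    """Signed -log10 p-value from a one-parameter likelihood ratio test."""
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    magnitude = -math.log10(p_value) if p_value > 0.0 else float("inf")
    # Positive for slower-than-neutral columns, negative for faster ones.
    return magnitude if scale < 1.0 else -magnitude

# Hypothetical log-likelihoods for one alignment column.
print(round(lrt_score(-100.0, -96.0, scale=0.4), 3))
print(round(lrt_score(-100.0, -96.0, scale=2.5), 3))
```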

Conserved Elements

The conserved elements were predicted by running phastCons with the --viterbi option. The predicted elements are segments of the alignment that are likely to have been "generated" by the conserved state of the phylo-HMM. Each element is assigned a log-odds score equal to its log probability under the conserved model minus its log probability under the non-conserved model. The "score" field associated with this track contains transformed log-odds scores, taking values between 0 and 1000. (The scores are transformed using a monotonic function of the form a * log(x) + b.) The raw log odds scores are retained in the "name" field and can be seen on the details page or in the browser when the track's display mode is set to "pack" or "full".
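
The transformation of raw log-odds scores into the 0 to 1000 range might look like the sketch below. The coefficients a and b used for this track are not given, so the values in the example are placeholders chosen only for illustration, and the clamping to the range ends is an assumption.

```python
import math

def transformed_score(log_odds, a, b):
    """Map a raw log-odds score to an integer in [0, 1000] via a * log(x) + b."""
    if log_odds <= 0:
        return 0  # log is undefined here; treat as the lowest score
    value = a * math.log(log_odds) + b
    return int(min(1000.0, max(0.0, round(value))))

# Placeholder coefficients, not the ones used by UCSC.
A, B = 150.0, 100.0
for raw in (1, 10, 100, 1000):
    print(raw, transformed_score(raw, A, B))
```

Because the function is monotonic, ranking elements by the transformed score gives the same order as ranking by the raw log-odds score, up to ties introduced by rounding and clamping.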

Credits

This track was created using the following programs:

  • Alignment tools: lastz (formerly blastz) and multiz by Minmei Hou, Scott Schwartz and Webb Miller of the Penn State Bioinformatics Group
  • Chaining and Netting: axtChain, chainNet by Jim Kent at UCSC
  • Conservation scoring: phastCons, phyloP, phyloFit, tree_doctor, msa_view and other programs in PHAST by Adam Siepel at Cold Spring Harbor Laboratory (original development done at the Haussler lab at UCSC).
  • MAF Annotation tools: mafAddIRows by Brian Raney, UCSC; mafAddQRows by Richard Burhans, Penn State; genePredToMafFrames by Mark Diekhans, UCSC
  • Tree image generator: phyloPng by Galt Barber, UCSC
  • Conservation track display: Kate Rosenbloom, Hiram Clawson (wiggle display), and Brian Raney (gap annotation and codon framing) at UCSC

The phylogenetic tree is based on Murphy et al. (2001) and general consensus in the vertebrate phylogeny community as of March 2007.

References

Phylo-HMMs, phastCons, and phyloP:

Felsenstein J, Churchill GA. A Hidden Markov Model approach to variation among sites in rate of evolution. Mol Biol Evol. 1996 Jan;13(1):93-104. PMID: 8583911

Pollard KS, Hubisz MJ, Rosenbloom KR, Siepel A. Detection of nonneutral substitution rates on mammalian phylogenies. Genome Res. 2010 Jan;20(1):110-21. PMID: 19858363; PMC: PMC2798823

Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, Clawson H, Spieth J, Hillier LW, Richards S, et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 2005 Aug;15(8):1034-50. PMID: 16024819; PMC: PMC1182216

Siepel A, Haussler D. Phylogenetic Hidden Markov Models. In: Nielsen R, editor. Statistical Methods in Molecular Evolution. New York: Springer; 2005. pp. 325-351.

Yang Z. A space-time process model for the evolution of DNA sequences. Genetics. 1995 Feb;139(2):993-1005. PMID: 7713447; PMC: PMC1206396

Chain/Net:

Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784

Multiz:

Blanchette M, Kent WJ, Riemer C, Elnitski L, Smit AF, Roskin KM, Baertsch R, Rosenbloom K, Clawson H, Green ED, et al. Aligning multiple genomic sequences with the threaded blockset aligner. Genome Res. 2004 Apr;14(4):708-15. PMID: 15060014; PMC: PMC383317

Lastz (formerly Blastz):

Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468

Harris RS. Improved pairwise alignment of genomic DNA. Ph.D. Thesis. Pennsylvania State University, USA. 2007.

Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961

Phylogenetic Tree:

Bernard G, Ragan MA, Chan CX. Recapitulating phylogenies using k-mers: from trees to networks. F1000Res. 2016;5:2789. PMID: 28105314

Fan H, Ives AR, Surget-Groba Y, Cannon CH. An assembly and alignment-free method of phylogeny reconstruction from next-generation sequencing data. BMC Genomics. 2015;16(1):522. PMID: 26169061

Murphy WJ, Eizirik E, O'Brien SJ, Madsen O, Scally M, Douady CJ, Teeling E, Ryder OA, Stanhope MJ, de Jong WW, Springer MS. Resolution of the early placental mammal radiation using Bayesian phylogenetics. Science. 2001 Dec 14;294(5550):2348-51. PMID: 11743200