Schema for RepeatMasker - Repeating Elements by RepeatMasker
  Database: gasAcu1    Primary Table: rmsk (chrI_rmsk)    Row Count: 13,580   Data last updated: 2006-08-17
Format description: RepeatMasker .out record
field      example        SQL type              info    description
bin        585            smallint(5) unsigned  range   Indexing field to speed chromosome range queries.
swScore    322            int(10) unsigned      range   Smith Waterman alignment score
milliDiv   259            int(10) unsigned      range   Base mismatches in parts per thousand
milliDel   0              int(10) unsigned      range   Bases deleted in parts per thousand
milliIns   0              int(10) unsigned      range   Bases inserted in parts per thousand
genoName   chrI           varchar(255)          values  Genomic sequence name
genoStart  2653           int(10) unsigned      range   Start in genomic sequence
genoEnd    2738           int(10) unsigned      range   End in genomic sequence
genoLeft   -28183176      int(11)               range   -#bases after match in genomic sequence
strand     +              char(1)               values  Relative orientation + or -
repName    (TAG)n         varchar(255)          values  Name of repeat
repClass   Simple_repeat  varchar(255)          values  Class of repeat
repFamily  Simple_repeat  varchar(255)          values  Family of repeat
repStart   2              int(11)               range   Start (if strand is +) or -#bases after match (if strand is -) in repeat sequence
repEnd     86             int(11)               range   End in repeat sequence
repLeft    0              int(11)               range   -#bases after match (if strand is +) or start (if strand is -) in repeat sequence
id         1              char(1)               values  First digit of id field in RepeatMasker .out file. Best ignored.
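
The bin field can be recomputed from genoStart and genoEnd. The sketch below follows the standard Genome Browser binning scheme; the constants (a 128 kb finest bin, each coarser level 8 times larger, offsets starting at 585) are an assumption taken from general knowledge of that scheme and are not stated on this page, and the function name is illustrative.

```python
# Assumed constants of the standard binning scheme (not given on this page).
BIN_OFFSETS = (512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0)
BIN_FIRST_SHIFT = 17  # finest bins span 2**17 bases (128 kb)
BIN_NEXT_SHIFT = 3    # each coarser level spans 8 times more


def bin_from_range(start: int, end: int) -> int:
    """Return the smallest bin that fully contains the 0-based, half-open range."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        # Move one level up: bins become 8 times larger.
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError(f"range {start}-{end} does not fit the standard scheme")


# The example row (genoStart 2653, genoEnd 2738) falls in the first finest bin.
print(bin_from_range(2653, 2738))
```

Under these assumptions, every feature that lies entirely within the first 128 kb of a chromosome gets bin 585, which matches the example value in the schema.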

Sample Rows
 
bin  swScore  milliDiv  milliDel  milliIns  genoName  genoStart  genoEnd  genoLeft   strand  repName    repClass        repFamily       repStart  repEnd  repLeft  id
585  322      259       0         0         chrI      2653       2738     -28183176  +       (TAG)n     Simple_repeat   Simple_repeat   2         86      0        1
585  30       27        0         0         chrI      2974       3011     -28182903  +       AT_rich    Low_complexity  Low_complexity  1         37      0        2
585  21       57        0         0         chrI      3034       3069     -28182845  +       AT_rich    Low_complexity  Low_complexity  1         35      0        3
585  197      86        0         54        chrI      3079       3116     -28182798  +       (TAAA)n    Simple_repeat   Simple_repeat   3         37      0        4
585  594      0         0         0         chrI      3700       3766     -28182148  +       (CA)n      Simple_repeat   Simple_repeat   2         67      0        5
585  225      132       0         0         chrI      5103       5141     -28180773  +       (TAG)n     Simple_repeat   Simple_repeat   1         38      0        6
585  247      266       0         0         chrI      5456       5520     -28180394  +       (TAG)n     Simple_repeat   Simple_repeat   2         65      0        7
585  184      77        51        0         chrI      5714       5753     -28180161  +       (CAGAGA)n  Simple_repeat   Simple_repeat   4         44      0        8
585  855      0         0         0         chrI      8948       9043     -28176871  +       (CA)n      Simple_repeat   Simple_repeat   1         95      0        9
585  194      174       0         21        chrI      9143       9190     -28176724  +       GA-rich    Low_complexity  Low_complexity  3         48      0        1
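
Rows like these can be read into typed records. The following is a minimal sketch that assumes the table has been exported as tab-separated text with columns in the schema order above; the `COLUMNS` list and the `parse_rmsk_line` helper are illustrative names, not part of any Genome Browser tool.

```python
# Column names and converters, in the order given by the schema above.
COLUMNS = [
    ("bin", int), ("swScore", int), ("milliDiv", int), ("milliDel", int),
    ("milliIns", int), ("genoName", str), ("genoStart", int), ("genoEnd", int),
    ("genoLeft", int), ("strand", str), ("repName", str), ("repClass", str),
    ("repFamily", str), ("repStart", int), ("repEnd", int), ("repLeft", int),
    ("id", str),
]


def parse_rmsk_line(line: str) -> dict:
    """Parse one tab-separated rmsk row (assumed export format) into a dict."""
    values = line.rstrip("\r\n").split("\t")
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(values)}")
    return {name: convert(value) for (name, convert), value in zip(COLUMNS, values)}


# First sample row, written out with tabs.
first_row = "\t".join([
    "585", "322", "259", "0", "0", "chrI", "2653", "2738", "-28183176", "+",
    "(TAG)n", "Simple_repeat", "Simple_repeat", "2", "86", "0", "1",
])
record = parse_rmsk_line(first_row)
print(record["repName"], record["genoStart"], record["genoEnd"])
```

The id column is kept as a string because the schema declares it as char(1) and recommends ignoring it.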

Note: all start coordinates in our database are 0-based, not 1-based. See explanation here.
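
As an illustration of the coordinate convention, the sketch below derives a few quantities from genoStart, genoEnd and genoLeft. It assumes that end coordinates are exclusive (so that a range is half-open), which is the usual reading of a 0-based start but is not spelled out in the note; the function names are hypothetical.

```python
def genomic_length(geno_start: int, geno_end: int) -> int:
    """Length of a match, assuming a 0-based start and an exclusive end."""
    return geno_end - geno_start


def to_one_based(geno_start: int, geno_end: int) -> tuple:
    """Convert to a 1-based, inclusive range as used in many file formats."""
    return geno_start + 1, geno_end


def chrom_size(geno_end: int, geno_left: int) -> int:
    """genoLeft is minus the number of bases after the match, so subtract it."""
    return geno_end - geno_left


# Values from the first two sample rows.
print(genomic_length(2653, 2738))
print(to_one_based(2653, 2738))
print(chrom_size(2738, -28183176), chrom_size(3011, -28182903))
```

Both sample rows give the same sequence length for chrI under this reading, which is a quick consistency check on the interpretation of genoLeft.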

RepeatMasker (rmsk) Track Description
 

Description

This track was created by using Arian Smit's RepeatMasker program, which screens DNA sequences for interspersed repeats and low complexity DNA sequences. The program outputs a detailed annotation of the repeats that are present in the query sequence (represented by this track), as well as a modified version of the query sequence in which all the annotated repeats have been masked (generally available on the Downloads page). RepeatMasker uses the Repbase Update library of repeats from the Genetic Information Research Institute (GIRI). Repbase Update is described in Jurka (2000) in the References section below.

Display Conventions and Configuration

In full display mode, this track displays up to ten different classes of repeats:

  • Short interspersed nuclear elements (SINE), which include Alu elements
  • Long interspersed nuclear elements (LINE)
  • Long terminal repeat elements (LTR), which include retroposons
  • DNA repeat elements (DNA)
  • Simple repeats (microsatellites)
  • Low complexity repeats
  • Satellite repeats
  • RNA repeats (including RNA, tRNA, rRNA, snRNA, scRNA, srpRNA)
  • Other repeats, which include class RC (Rolling Circle)
  • Unknown

The level of color shading in the graphical display reflects the amount of base mismatch, base deletion, and base insertion associated with a repeat element. The higher the combined number of these, the lighter the shading.
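
The shading rule can be sketched from the milliDiv, milliDel and milliIns fields. The page does not give the actual gray scale, so the linear mapping to ten levels and the cap of 1000 parts per thousand below are purely hypothetical choices for illustration.

```python
def combined_divergence(milli_div: int, milli_del: int, milli_ins: int) -> int:
    """Sum of mismatch, deletion and insertion rates, in parts per thousand."""
    return milli_div + milli_del + milli_ins


def shade_level(total: int, levels: int = 10, max_total: int = 1000) -> int:
    """Map a combined rate to a shade index: 0 is darkest, levels - 1 is lightest.

    The linear scale and both defaults are assumptions, not the browser's values.
    """
    clamped = min(max(total, 0), max_total)
    return min(levels - 1, clamped * levels // max_total)


# First sample row (259, 0, 0) versus the perfect (CA)n match (0, 0, 0).
print(shade_level(combined_divergence(259, 0, 0)))
print(shade_level(combined_divergence(0, 0, 0)))
```

Whatever scale is used, the only property the text guarantees is monotonicity: a higher combined value never produces a darker shade.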

A "?" at the end of the "Family" or "Class" (for example, DNA?) signifies that the curator was unsure of the classification. At some point in the future, either the "?" will be removed or the classification will be changed.
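
When grouping repeats by class or family, it can help to separate the name from the uncertainty marker. A minimal sketch, with a hypothetical helper name:

```python
def split_uncertain(label: str) -> tuple:
    """Return (name, is_uncertain), stripping one trailing "?" if present."""
    if label.endswith("?"):
        return label[:-1], True
    return label, False


print(split_uncertain("DNA?"))
print(split_uncertain("Simple_repeat"))
```

Keeping the flag alongside the cleaned name means that "DNA" and "DNA?" can be counted together while the curator's doubt is still recorded.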

Methods

Data are generated using the RepeatMasker -s flag. Additional flags may be used for certain organisms. Repeats are soft-masked. Alignments may extend through repeats, but are not permitted to initiate in them. See the FAQ for more information.
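
Soft-masking conventionally means that repeat bases are written in lowercase while the rest of the sequence stays in uppercase. The sketch below applies that convention to a sequence string, assuming 0-based, half-open intervals such as (genoStart, genoEnd); the function names are illustrative and this is not how RepeatMasker itself produces its output.

```python
def soft_mask(sequence: str, intervals) -> str:
    """Uppercase the sequence, then lowercase every base inside an interval."""
    chars = list(sequence.upper())
    for start, end in intervals:
        # Clip each interval to the sequence bounds before masking.
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = chars[i].lower()
    return "".join(chars)


def masked_fraction(sequence: str) -> float:
    """Fraction of bases that are lowercase, i.e. soft-masked."""
    if not sequence:
        return 0.0
    return sum(c.islower() for c in sequence) / len(sequence)


masked = soft_mask("ACGTACGTAC", [(2, 5)])
print(masked, masked_fraction(masked))
```

Because the original bases are preserved, tools can still read through a soft-masked repeat, which is what allows alignments to extend through repeats.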

Credits

Thanks to Arian Smit, Robert Hubley and GIRI for providing the tools and repeat libraries used to generate this track.

References

Smit AFA, Hubley R, Green P. RepeatMasker Open-3.0. http://www.repeatmasker.org. 1996-2010.

Repbase Update is described in:

Jurka J. Repbase Update: a database and an electronic journal of repetitive elements. Trends Genet. 2000 Sep;16(9):418-420. PMID: 10973072

For a discussion of repeats in mammalian genomes, see:

Smit AF. Interspersed repeats and other mementos of transposable elements in mammalian genomes. Curr Opin Genet Dev. 1999 Dec;9(6):657-63. PMID: 10607616

Smit AF. The origin of interspersed repeats in the human genome. Curr Opin Genet Dev. 1996 Dec;6(6):743-8. PMID: 8994846