Schema for Segmental Dups - Duplications of >1000 Bases of Non-RepeatMasked Sequence

field | example | SQL type | info | description
bin | 585 | smallint(6) | range | Indexing field to speed chromosome range queries.
chrom | chr1 | varchar(255) | values | Reference sequence chromosome or scaffold
chromStart | 10000 | int(10) unsigned | range | Start position in chromosome
chromEnd | 87112 | int(10) unsigned | range | End position in chromosome
name | chr15:101906152 | varchar(255) | values | Other chromosome involved
score |  | int(10) unsigned | range | Score based on the raw BLAST alignment score. Set to 0 and not used in later versions.
strand |  | char(1) | values | Value should be + or -
otherChrom | chr15 | varchar(255) | values | Other chromosome or scaffold
otherStart | 101906152 | int(10) unsigned | range | Start in other sequence
otherEnd | 101981189 | int(10) unsigned | range | End in other sequence
otherSize | 75037 | int(10) unsigned | range | Total size of other chromosome
uid | 11764 | int(10) unsigned | range | Unique id shared by the query and subject
posBasesHit | 1000 | int(10) unsigned | range | For future use
testResult | N/A | varchar(255) | values | For future use
verdict | N/A | varchar(255) | values | For future use
chits | N/A | varchar(255) | values | For future use
ccov | N/A | varchar(255) | values | For future use
alignfile | align_both/0009/both0046049 | varchar(255) | values | Alignment file path
alignL | 77880 | int(10) unsigned | range | Spaces/positions in alignment
indelN |  | int(10) unsigned | range | Number of indels
indelS | 3611 | int(10) unsigned | range | Indel spaces
alignB | 74269 | int(10) unsigned | range | Bases aligned
matchB | 73743 | int(10) unsigned | range | Aligned bases that match
mismatchB | 526 | int(10) unsigned | range | Aligned bases that do not match
transitionsB | 331 | int(10) unsigned | range | Number of transitions
transversionsB | 195 | int(10) unsigned | range | Number of transversions
fracMatch | 0.992918 | float | range | Fraction of matching bases
fracMatchIndel | 0.991969 | float | range | Fraction of matching bases with indels
jcK | 0.00711601 | float | range | K-value calculated with Jukes-Cantor
k2K | 0.00711937 | float | range | Kimura K
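
The bin field can be illustrated with a short sketch. It assumes this table uses the standard UCSC binning scheme (smallest bins of 128 kb, each level eight times larger); the schema above does not state this, and the function name bin_from_range is chosen here for illustration only. Under that assumption the example coordinates chr1:10000-87112 fall in bin 585, which agrees with the example value.

```python
# Constants of the standard UCSC binning scheme (an assumption for this table).
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]
BIN_FIRST_SHIFT = 17  # smallest bins span 2**17 bases (128 kb)
BIN_NEXT_SHIFT = 3    # each higher level is 8 times larger


def bin_from_range(start: int, end: int) -> int:
    """Return the smallest bin that fully contains the half-open range [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError(f"range {start}-{end} is outside the standard scheme")


# chromStart and chromEnd taken from the example column above.
print(bin_from_range(10000, 87112))
```

A range query can then restrict its search to the handful of bins that may overlap the region of interest before comparing chromStart and chromEnd.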

bin | chrom | chromStart | chromEnd | name | score | strand | otherChrom | otherStart | otherEnd | otherSize | uid | posBasesHit | testResult | verdict | chits | ccov | alignfile | alignL | indelN | indelS | alignB | matchB | mismatchB | transitionsB | transversionsB | fracMatch | fracMatchIndel | jcK | k2K
585 | chr1 | 10000 | 87112 | chr15:101906152 |  |  | chr15 | 101906152 | 101981189 | 75037 | 11764 | 1000 | N/A | N/A | N/A | N/A | align_both/0009/both0046049 | 77880 |  | 3611 | 74269 | 73743 | 526 | 331 | 195 | 0.992918 | 0.991969 | 0.00711601 | 0.00711937
585 | chr1 | 10000 | 20818 | chr12:10043 |  |  | chr12 | 10043 | 20853 | 10810 | 7822 | 1000 | N/A | N/A | N/A | N/A | align_both/0003/both0017002 | 10947 |  | 266 | 10681 | 10500 | 181 | 105 |  | 0.983054 | 0.980942 | 0.0171404 | 0.017154
585 | chr1 | 10000 | 19844 | chrX:156020216 |  |  | chrX | 156020216 | 156030574 | 10358 | 3851 | 1000 | N/A | N/A | N/A | N/A | align_both/0014/both0071854 | 10437 |  | 672 | 9765 | 9598 | 167 | 107 |  | 0.982898 | 0.980889 | 0.0172999 | 0.0173215
585 | chr1 | 10169 | 37148 | chr1:180723 |  |  | chr1 | 180723 | 207666 | 26943 |  | 1000 | N/A | N/A | N/A | N/A | align_both/0014/both0071547 | 27025 |  | 128 | 26897 | 26628 | 269 | 164 | 105 | 0.989999 | 0.988896 | 0.0100684 | 0.0100743
585 | chr1 | 10464 | 40733 | chr2:113572720 |  |  | chr2 | 113572720 | 113602409 | 29689 | 1980 | 1000 | N/A | N/A | N/A | N/A | align_both/0014/both0071629 | 30313 |  | 668 | 29645 | 29285 | 360 | 233 | 127 | 0.987856 | 0.986426 | 0.0122431 | 0.0122543
585 | chr1 | 10485 | 19844 | chr16:10426 |  |  | chr16 | 10426 | 19533 | 9107 | 14041 | 1000 | N/A | N/A | N/A | N/A | align_both/0009/both0046306 | 9377 |  | 288 | 9089 | 8970 | 119 |  |  | 0.986907 | 0.985389 | 0.0132084 | 0.0132182
585 | chr1 | 10485 | 40733 | chr9:10843 |  |  | chr9 | 10843 | 40515 | 29672 | 3547 | 1000 | N/A | N/A | N/A | N/A | align_both/0014/both0071825 | 30274 |  | 628 | 29646 | 29475 | 171 | 107 |  | 0.994232 | 0.993729 | 0.00579036 | 0.00579252
585 | chr1 | 18392 | 87112 | chr19:60000 |  |  | chr19 | 60000 | 128672 | 68672 | 21278 | 1000 | N/A | N/A | N/A | N/A | align_both/0013/both0065164 | 68799 |  | 206 | 68593 | 68280 | 313 | 189 | 124 | 0.995437 | 0.994828 | 0.00457709 | 0.00457824
585 | chr1 | 20863 | 40733 | chr12:22923 |  |  | chr12 | 22923 | 44900 | 21977 | 7824 | 1000 | N/A | N/A | N/A | N/A | align_both/0003/both0017004 | 22015 |  | 2183 | 19832 | 19613 | 219 | 136 |  | 0.988957 | 0.987861 | 0.0111249 | 0.0111326
585 | chr1 | 70007 | 87112 | chr6:60000 |  |  | chr6 | 60000 | 77075 | 17075 | 2927 | 1000 | N/A | N/A | N/A | N/A | align_both/0014/both0071692 | 17136 |  |  | 17044 | 16908 | 136 |  |  | 0.992021 | 0.990916 | 0.0080221 | 0.00802624
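
The alignment columns appear to be related by simple arithmetic. The sketch below checks those relationships against the first sample row (chr1:10000-87112). The relationships are inferred from the sample values, not stated in the schema, and the distance formulas are the textbook Jukes-Cantor and Kimura two-parameter corrections, assumed here to be what jcK and k2K hold. The helper names are hypothetical.

```python
import math

# Values copied from the first sample row (chr1:10000-87112).
row = {
    "otherStart": 101906152, "otherEnd": 101981189, "otherSize": 75037,
    "alignL": 77880, "indelS": 3611, "alignB": 74269,
    "matchB": 73743, "mismatchB": 526,
    "transitionsB": 331, "transversionsB": 195,
    "fracMatch": 0.992918, "jcK": 0.00711601, "k2K": 0.00711937,
}


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor distance for a mismatch fraction p."""
    return -0.75 * math.log(1 - 4 * p / 3)


def kimura_2p(ts: float, tv: float) -> float:
    """Kimura two-parameter distance for transition and transversion fractions."""
    return -0.5 * math.log(1 - 2 * ts - tv) - 0.25 * math.log(1 - 2 * tv)


def check_row(r: dict, tol: float = 1e-6) -> dict:
    """Return one boolean per inferred relationship (assumed, not documented)."""
    n = r["alignB"]
    return {
        "otherSize = otherEnd - otherStart":
            r["otherSize"] == r["otherEnd"] - r["otherStart"],
        "alignL = alignB + indelS":
            r["alignL"] == r["alignB"] + r["indelS"],
        "alignB = matchB + mismatchB":
            r["alignB"] == r["matchB"] + r["mismatchB"],
        "mismatchB = transitionsB + transversionsB":
            r["mismatchB"] == r["transitionsB"] + r["transversionsB"],
        "fracMatch = matchB / alignB":
            abs(r["matchB"] / n - r["fracMatch"]) < tol,
        "jcK = Jukes-Cantor(mismatchB / alignB)":
            abs(jukes_cantor(r["mismatchB"] / n) - r["jcK"]) < tol,
        "k2K = Kimura(transitionsB / alignB, transversionsB / alignB)":
            abs(kimura_2p(r["transitionsB"] / n, r["transversionsB"] / n) - r["k2K"]) < tol,
    }


for relation, holds in check_row(row).items():
    print(f"{relation}: {holds}")
```

In this row otherSize equals otherEnd minus otherStart, which suggests the column holds the length of the matching segment and not the length of the whole chromosome.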

Description

This track shows regions detected as putative genomic duplications within the golden path. The following display conventions are used to distinguish levels of similarity:

Light to dark gray: 90 - 98% similarity
Light to dark yellow: 98 - 99% similarity
Light to dark orange: greater than 99% similarity
Red: duplications of greater than 98% similarity that lack sufficient Segmental Duplication Database evidence (most likely missed overlaps)

For a region to be included in the track, at least 1 kb of the total sequence (containing at least 500 bp of non-RepeatMasked sequence) had to align with a sequence identity of at least 90%.
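
The display conventions and inclusion criteria above can be sketched as two small functions. This is an illustration under stated assumptions, not the browser's implementation: it assumes similarity is taken from the fracMatch column, the handling of the shared 98% and 99% boundaries is a guess, and the function and parameter names (including nonrepeat_bases, which has no counterpart column in this table) are hypothetical. The red class depends on external Segmental Duplication Database evidence and cannot be derived from these columns.

```python
def meets_inclusion_criteria(aligned_bases: int, nonrepeat_bases: int,
                             frac_match: float) -> bool:
    """At least 1 kb aligned, at least 500 bp non-RepeatMasked, at least 90% identity."""
    return aligned_bases >= 1000 and nonrepeat_bases >= 500 and frac_match >= 0.90


def similarity_class(frac_match: float) -> str:
    """Map a similarity fraction to the color family described above."""
    if frac_match < 0.90:
        return "not in track"
    if frac_match < 0.98:   # boundary choice is an assumption
        return "gray"
    if frac_match <= 0.99:  # boundary choice is an assumption
        return "yellow"
    return "orange"


# fracMatch values from the first two sample rows.
print(similarity_class(0.992918))
print(similarity_class(0.983054))
```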

Methods

Segmental duplications play an important role in both genomic disease and gene evolution. This track displays an analysis of the global organization of these long-range segments of identity in genomic sequence.

Large recent duplications (>= 1 kb and >= 90% identity) were detected by identifying high-copy repeats, removing these repeats from the genomic sequence ("fuguization") and searching all sequence for similarity. The repeats were then reinserted into the pairwise alignments, the ends of alignments trimmed, and global alignments were generated. For a full description of the "fuguization" detection method, see Bailey et al., 2001. This method has become known as WGAC (whole-genome assembly comparison); for example, see Bailey et al., 2002.

Credits

These data were provided by Ginger Cheng, Xinwei She, Archana Raja, Tin Louie and Evan Eichler at the University of Washington.

References

Bailey JA, Gu Z, Clark RA, Reinert K, Samonte RV, Schwartz S, Adams MD, Myers EW, Li PW, Eichler EE. Recent segmental duplications in the human genome. Science. 2002 Aug 9;297(5583):1003-7. PMID: 12169732

Bailey JA, Yavor AM, Massa HF, Trask BJ, Eichler EE. Segmental duplications: organization and impact within the current human genome project assembly. Genome Res. 2001 Jun;11(6):1005-17. PMID: 11381028; PMC: PMC311093