Contigs Track Settings
 
Assembly Contigs   (All Mapping and Sequencing tracks)

Assembly: Rhesus Jan. 2006 (MGSC Merged 1.0/rheMac2)
Data last updated at UCSC: 2006-02-15

Description

This track shows the whole genome shotgun (WGS) contigs of the macaque genome assembly (v. 1.0, Mmul_051212) from the Rhesus Macaque Genome Sequencing Consortium.

The Mmul_051212 release was produced by the Macaque Genome Sequencing Consortium, led by the Baylor College of Medicine Human Genome Sequencing Center in collaboration with the J. Craig Venter Joint Technology Center and the Genome Sequencing Center at Washington University in St. Louis. Each group carried out a preliminary assembly of the genome data using different and complementary approaches, based on the Atlas, Celera, and PCAP assembly systems. The assembly team led by Granger Sutton (JCVI) then melded the resulting data into the Mmul_051212 assembly. This collaborative venture also made use of published macaque maps (Rogers, 2006; Murphy, 2005), the BAC fingerprint map from the Michael Smith Genome Sciences Centre, and human synteny.

The clone-end sequence data were produced from several WGS libraries with inserts of 2-4 kb and 10 kb, fosmids with ~35 kb inserts, and BACs with 180 kb inserts. About 20.1 million reads were used in the assembly, representing about 14.9 Gb of sequence and about 5.1x coverage of the (clonable) macaque genome.
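
The read statistics above can be cross-checked with simple arithmetic. The following is a minimal sketch in Python, not part of the original release notes. The function names are hypothetical, and the "implied genome size" is only a value back-calculated from the rounded figures quoted above, not a number reported by the consortium.

```python
# Rounded figures quoted in the paragraph above.
TOTAL_BASES = 14.9e9   # about 14.9 Gb of sequence
READ_COUNT = 20.1e6    # about 20.1 million reads
FOLD_COVERAGE = 5.1    # about 5.1x coverage


def mean_read_length(total_bases, read_count):
    """Average number of bases per read."""
    return total_bases / read_count


def implied_genome_size(total_bases, fold_coverage):
    """Genome size implied by coverage = total bases / genome size."""
    return total_bases / fold_coverage


if __name__ == "__main__":
    print(f"mean read length: {mean_read_length(TOTAL_BASES, READ_COUNT):.0f} bases")
    print(f"implied genome size: {implied_genome_size(TOTAL_BASES, FOLD_COVERAGE) / 1e9:.2f} Gb")
```

With the rounded inputs, the mean read length comes out at roughly 741 bases and the implied (clonable) genome size at roughly 2.92 Gb.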

The products of the assemblers are a set of contigs and scaffolds. Scaffolds include sequence contigs that can be ordered and oriented with respect to each other as well as isolated contigs that could not be linked (single contig scaffolds or singletons). Reads that were not assembled are found in the collection of reads called UnassembledReads. The N50 of the contigs is 25.7 kb and the N50 of the scaffolds is 5.87 Mb. The N50 size is the length such that 50% of the assembled genome lies in blocks of the N50 size or longer.
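
The N50 definition above can be made concrete with a short sketch. This Python function is an illustration of the general N50 calculation, not the consortium's own tooling. The name `n50` and the example lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    The N50 is the largest length L such that blocks of length L or
    longer together contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("at least one length is required")

    ordered = sorted(lengths, reverse=True)  # longest block first
    half_total = sum(ordered) / 2

    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Hypothetical toy assembly: total 290, half is 145.
    # 80 + 70 = 150 reaches the halfway point, so the N50 is 70.
    print(n50([80, 70, 50, 40, 30, 20]))
```

Applied to the full list of contig lengths or scaffold lengths of an assembly, the same calculation would yield values such as the 25.7 kb and 5.87 Mb reported above.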

The total length of all contigs is 2.87 Gb. When the gaps between contigs in scaffolds are included, the total span of the assembly is 3.01 Gb.
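
The difference between the two totals above is the sequence attributed to gaps within scaffolds. A minimal Python sketch of that arithmetic follows. The function name is hypothetical, and the result is only as precise as the rounded figures quoted above.

```python
CONTIG_TOTAL = 2.87e9  # total length of all contigs, in bases
SPAN_TOTAL = 3.01e9    # total span including gaps within scaffolds


def gap_summary(contig_total, span_total):
    """Return (gap bases, gap fraction of the total span)."""
    gap_bases = span_total - contig_total
    return gap_bases, gap_bases / span_total


if __name__ == "__main__":
    gap_bases, gap_fraction = gap_summary(CONTIG_TOTAL, SPAN_TOTAL)
    print(f"gaps: {gap_bases / 1e9:.2f} Gb ({gap_fraction:.1%} of span)")
```

With these inputs, gaps account for about 0.14 Gb, a little under 5% of the total span.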

The Mmul_051212 assembly was tested by comparison with other available macaque data sets (finished BAC sequences and ESTs) to determine the extent of coverage (completeness). Over 97% of the sequences in these data sets are represented, indicating that the shotgun libraries used to sequence the genome were comprehensive.

The quality of the assembly was tested by aligning finished BACs (19 total) to the genome. All alignments showed linear alignment of the contigs and scaffolds to the finished BACs, suggesting misassemblies are rare.