Schema for LICR RNA-seq - RNA-seq from ENCODE/LICR


| field | description |
| --- | --- |
| qName | Query template name - name of a read |
| flag | Flags. 0x10 set for reverse complement. See SAM docs for others. |
| rName | Reference sequence name (often a chromosome) |
| pos | 1-based position |
| mapQ | Mapping quality, 0-255. This track treats 255 as best; the SAM specification itself reserves 255 for "mapping quality not available". |
| cigar | CIGAR-encoded alignment string |
| rNext | Reference sequence for next (mate) read. '=' if same as rName, '*' if no mate |
| pNext | Position (1-based) of next (mate) sequence. May be -1 or 0 if no mate |
| tLen | Size of DNA template for mated pairs. Negative (-size) for one read of each mate pair |
| seq | Query template sequence |
| qual | ASCII of Phred-scaled base quality + 33. Just '*' if no quality scores |
| tagTypeVals | Tab-delimited list of tag:type:value optional extra fields |
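As an illustration of how the fields above map onto a tab-delimited SAM text record, here is a minimal Python sketch. The record type, the helper names, and the example line are hypothetical: the read name, flag, mate fields, sequence and qualities are placeholders chosen for illustration, not rows from this track.

```python
from typing import List, NamedTuple


class SamRecord(NamedTuple):
    """One alignment, using the field names from the schema above."""
    qName: str
    flag: int
    rName: str
    pos: int          # 1-based leftmost position
    mapQ: int
    cigar: str
    rNext: str
    pNext: int
    tLen: int
    seq: str
    qual: str
    tagTypeVals: List[str]  # optional tag:type:value fields, kept as strings


def parse_sam_line(line: str) -> SamRecord:
    """Split one tab-delimited SAM text line into named fields."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 11:
        raise ValueError(f"expected at least 11 columns, got {len(cols)}")
    return SamRecord(
        cols[0], int(cols[1]), cols[2], int(cols[3]), int(cols[4]),
        cols[5], cols[6], int(cols[7]), int(cols[8]),
        cols[9], cols[10], cols[11:],
    )


def is_reverse_strand(rec: SamRecord) -> bool:
    """Bit 0x10 of the flag marks a reverse-complemented read."""
    return bool(rec.flag & 0x10)


# Hypothetical record: every value here is a placeholder.
example_line = "\t".join([
    "read1", "16", "chr1", "3001453", "255", "36M",
    "*", "0", "0", "ACGT" * 9, "I" * 36, "NM:i:0",
])
rec = parse_sam_line(example_line)
print(rec.rName, rec.pos, is_reverse_strand(rec))
```

Keeping the optional tags as raw strings avoids guessing at tag types; a fuller parser would split each one on its first two colons.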

| qName | flag | rName | pos | mapQ | cigar | rNext | pNext | tLen | seq | qual | tagTypeVals |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SOLEXA2_0001:1:9:9603:10862#0 | | chr1 | 3001453 | 255 | 36M | | | | ATGGGGTGTTTTTCTTTATGGGTCTGGGTTACTTCA | CCCAAB=CBBCC@@@@@@CCBDCCBDDBBDDDDDDD | NM:C:0 |
| SOLEXA2_0001:1:67:12011:5748#0 | | chr1 | 3006757 | | 36M | | | | CTCGTTTTTGTGTTATTCCANATGAATCTGCCGATT | CCCCCCCCC=BBCCC??AAA#BBBBBBBBBBBBBBB | NM:C:2 |
| SOLEXA2_0001:1:73:8501:3783#0 | | chr1 | 3008680 | 255 | 36M | | | | CGCTTGTTTTGNGACCAATTATGTGGTAAATTTTGG | CCCCCCCCCBA#BBBCCCCCCCCCCCCCCCCCCCCC | NM:C:1 |
| SOLEXA2_0001:1:58:14599:6722#0 | | chr1 | 3013910 | | 36M | | | | ACAGAAGGTACTCTACCCAATTCCTTCTACGAANCC | BBBBBBBBBBBBBBBBAAAAAAA'AABBDBB>>#>> | NM:C:2 |
| SOLEXA2_0001:1:88:12300:10435#0 | | chr1 | 3016982 | | 36M | | | | CCATTCTGCTGGACCAGAGATTATTCGGCGGGAGTC | :0::::7;77<;8>>00////00//AAA?ABB4B?8 | NM:C:2 |
| SOLEXA2_0001:1:27:19128:11074#0 | | chr1 | 3017825 | 255 | 36M | | | | TAATAGGTGTTGCTTGCATGTATATTGGTCCCATAG | CCCCCCCBCCCCCCCCCCCCCCCCCCCCCCCCCCCC | NM:C:0 |
| SOLEXA2_0001:1:93:7958:21493#0 | | chr1 | 3024375 | | 36M | | | | NGTTGATTTCAGCCTTGAGTTTGGTTATTTCCTGCA | #*'*'22222AAAAAAAAAAAAAAAAAAAAAAAAAA | NM:C:1 |
| SOLEXA2_0001:1:76:13399:19341#0 | | chr1 | 3029049 | | 36M | | | | ATGAGAATGAATACTTTCAACTCAAGGGACCAGCAA | CCCCCC@CCCB+BBBCCCCCCCCCC4CCCCCCCCCC | NM:C:2 |
| SOLEXA2_0001:1:68:19007:13481#0 | | chr1 | 3035458 | | 36M | | | | GTAATTACAGCGAGTGGGTCCTCGTGTTTCTTTGGG | CCCCCCCCCCCCBCCCCCBCACCCBCDCCDDD@@BD | NM:C:0 |
| SOLEXA2_0001:1:27:12089:8015#0 | | chr1 | 3036157 | 255 | 36M | | | | GGCACATAGGATTTGTAGTATATGCAATTTTGAGCA | >;<;>////0AAABB>BBBABBBB@BBBBBBABB@B | NM:C:0 |
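To make the qual and cigar encodings concrete, the following Python sketch decodes the Phred+33 quality string and the CIGAR string of the first sample row (pos 3001453, cigar 36M). The function names are hypothetical. The set of reference-consuming CIGAR operations (M, D, N, =, X) follows the SAM specification.

```python
import re
from typing import List, Tuple

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # operations that advance along the reference


def phred_scores(qual: str) -> List[int]:
    """Decode a Phred+33 quality string; '*' means no qualities stored."""
    if qual == "*":
        return []
    return [ord(ch) - 33 for ch in qual]


def parse_cigar(cigar: str) -> List[Tuple[int, str]]:
    """Split a CIGAR string into (length, operation) pairs."""
    if cigar == "*":
        return []
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Reject strings containing anything the pattern did not consume.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return ops


def reference_end(pos: int, cigar: str) -> int:
    """1-based inclusive end coordinate of an alignment starting at pos."""
    span = sum(n for n, op in parse_cigar(cigar) if op in REF_CONSUMING)
    return pos + span - 1


# Values copied from the first sample row.
quals = phred_scores("CCCAAB=CBBCC@@@@@@CCBDCCBDDBBDDDDDDD")
print(min(quals), max(quals))          # 28 35
print(parse_cigar("36M"))              # [(36, 'M')]
print(reference_end(3001453, "36M"))   # 3001488
```

Because pos is 1-based, the equivalent 0-based, half-open interval (as used by BED) for this read would be 3001452 to 3001488.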

Description

RNA-seq (Mortazavi et al., 2008) was used to generate high-resolution, genome-wide maps of the transcriptome in mouse (C57BL/6) tissues, primary cells, and cell lines at different developmental stages and ages.

Display Conventions and Configuration

This is a composite track that contains multiple data types (views). For each view, there are multiple subtracks that display individually on the browser. Instructions for configuring composite tracks are here. This track contains the following views:

Signal: Density graph (wiggle) of signal enrichment based on processed data.
Alignments: Mappings of short reads to the genome. See the SAM Format Specification for more information on the SAM/BAM file format.

Metadata for a particular subtrack can be found by clicking the down arrow in the list of subtracks.

Additional views are available on the Downloads page.

Methods

Cells were grown according to the approved ENCODE cell culture protocols.

RNA-seq

RNA was extracted from tissues and primary cells with Trizol® according to the manufacturer's protocol (Invitrogen). Long PolyA+ RNA was purified with the Dynabeads mRNA purification kit (Invitrogen). The mRNA libraries were prepared for strand-specific sequencing as described previously (Parkhomchuk et al., 2009).

Sequencing and Analysis

Samples were sequenced on Illumina Genome Analyzer II, Genome Analyzer IIx and HiSeq 2000 platforms for 36 cycles. Image analysis and base calling were performed using Illumina's RTA. Reads were aligned to the mouse genome (NCBI37/mm9) using TopHat (Trapnell et al., 2009). Wig files were generated by TopHat, and expression levels were calculated with Cufflinks (Trapnell et al., 2010).
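The Signal view is a density graph derived from the aligned reads. How TopHat builds its wig files is not described here, so the following Python sketch only illustrates the general idea of per-base read depth, using a difference array over alignment intervals. The function name and the example intervals are hypothetical, and spliced reads (CIGAR N operations) would need to be split into one interval per aligned block before being passed in.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def coverage_runs(
    intervals: Iterable[Tuple[int, int]]
) -> List[Tuple[int, int, int]]:
    """Turn 1-based inclusive (start, end) intervals on one chromosome
    into (start, end, depth) runs of constant non-zero read depth.

    Adjacent runs that happen to share the same depth are not merged.
    """
    delta = defaultdict(int)
    for start, end in intervals:
        delta[start] += 1      # depth rises where a read begins
        delta[end + 1] -= 1    # and falls just past where it ends

    runs = []
    depth = 0
    prev = None
    for point in sorted(delta):
        if prev is not None and depth > 0:
            runs.append((prev, point - 1, depth))
        depth += delta[point]
        prev = point
    return runs


# Two hypothetical overlapping 36-base reads.
runs = coverage_runs([(1, 36), (10, 45)])
print(runs)  # [(1, 9, 1), (10, 36, 2), (37, 45, 1)]
```

Each run corresponds to one line of a bedGraph-style signal once the start is converted to 0-based coordinates.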

Release Notes

This is Release 2 (Mar 2012). It contains a total of 22 RNA-seq experiments, 12 of which are new in this release.

Credits

These data were generated and analyzed in Bing Ren's laboratory at the Ludwig Institute for Cancer Research.

Contact: Yin Shen

References

Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 2008 Jul;5(7):621-8. PMID: 18516045

Parkhomchuk D, Borodina T, Amstislavskiy V, Banaru M, Hallen L, Krobitsch S, Lehrach H, Soldatov A. Transcriptome analysis by strand-specific sequencing of complementary DNA. Nucleic Acids Res. 2009 Oct;37(18):e123. PMID: 19620212; PMC: PMC2764448

Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009 May 1;25(9):1105-11. PMID: 19289445; PMC: PMC2672628

Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL, Wold BJ, Pachter L. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010 May;28(5):511-5. PMID: 20436464; PMC: PMC3146043

Data Release Policy

Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the Restricted Until column, above. The full data release policy for ENCODE is available here.