Schema for LICR RNA-seq - RNA-seq from ENCODE/LICR
Database: mm9
Primary Table: wgEncodeLicrRnaSeqCortexCellPapMAdult8wksC57bl6AlnRep1
BAM File: /gbdb/mm9/bbi/wgEncodeLicrRnaSeqCortexCellPapMAdult8wksC57bl6AlnRep1.bam
Format description: The fields of a SAM short read alignment, the text version of BAM.
See the SAM Format Specification for more details.
field        description
qName        Query template name - name of a read
flag         Flags. 0x10 set for reverse complement. See SAM docs for others.
rName        Reference sequence name (often a chromosome)
pos          1-based position
mapQ         Mapping quality 0-255, 255 is best
cigar        CIGAR encoded alignment string.
rNext        Ref sequence for next (mate) read. '=' if same as rName, '*' if no mate
pNext        Position (1-based) of next (mate) sequence. May be -1 or 0 if no mate
tLen         Size of DNA template for mated pairs. -size for one of mate pairs
seq          Query template sequence
qual         ASCII of Phred-scaled base QUALity+33. Just '*' if no quality scores
tagTypeVals  Tab-delimited list of tag:type:value optional extra fields
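
Because SAM is tab-delimited text, the field list above maps directly onto a small record type. The Python sketch below is illustrative only: SamRecord, parse_sam_line and is_reverse_strand are hypothetical names, not part of any library, and the example line is assembled from the first sample row on this page on the assumption that its fields split as shown.

```python
from typing import List, NamedTuple


class SamRecord(NamedTuple):
    """One SAM alignment line, using the field names from the schema."""
    qName: str
    flag: int
    rName: str
    pos: int                 # 1-based position
    mapQ: int
    cigar: str
    rNext: str
    pNext: int
    tLen: int
    seq: str
    qual: str
    tagTypeVals: List[str]   # optional tag:type:value fields


def parse_sam_line(line: str) -> SamRecord:
    """Split one tab-delimited alignment line into typed fields."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 11:
        raise ValueError("expected at least 11 fields, got %d" % len(cols))
    return SamRecord(
        qName=cols[0], flag=int(cols[1]), rName=cols[2], pos=int(cols[3]),
        mapQ=int(cols[4]), cigar=cols[5], rNext=cols[6], pNext=int(cols[7]),
        tLen=int(cols[8]), seq=cols[9], qual=cols[10], tagTypeVals=cols[11:],
    )


def is_reverse_strand(rec: SamRecord) -> bool:
    """True when flag bit 0x10 (reverse complement) is set."""
    return bool(rec.flag & 0x10)


# Example line: values taken from the first sample row. The field
# boundaries are an assumption, since the page shows them run together.
example_line = "\t".join([
    "SOLEXA2_0001:1:9:9603:10862#0", "0", "chr1", "3001453", "255", "36M",
    "*", "0", "0",
    "ATGGGGTGTTTTTCTTTATGGGTCTGGGTTACTTCA",
    "CCCAAB=CBBCC@@@@@@CCBDCCBDDBBDDDDDDD",
    "NM:C:0",
])
record = parse_sam_line(example_line)
```

With this record, record.flag is 0, so is_reverse_strand(record) is False; a read with flag 16 (0x10) would be reported as aligned to the reverse strand.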

Sample Rows
 
qName                            flag  rName  pos      mapQ  cigar  rNext  pNext  tLen  seq                                   qual                                  tagTypeVals
SOLEXA2_0001:1:9:9603:10862#0    0     chr1   3001453  255   36M    *      0      0     ATGGGGTGTTTTTCTTTATGGGTCTGGGTTACTTCA  CCCAAB=CBBCC@@@@@@CCBDCCBDDBBDDDDDDD  NM:C:0
SOLEXA2_0001:1:67:12011:5748#0   0     chr1   3006757  0     36M    *      0      0     CTCGTTTTTGTGTTATTCCANATGAATCTGCCGATT  CCCCCCCCC=BBCCC??AAA#BBBBBBBBBBBBBBB  NM:C:2
SOLEXA2_0001:1:73:8501:3783#0    0     chr1   3008680  255   36M    *      0      0     CGCTTGTTTTGNGACCAATTATGTGGTAAATTTTGG  CCCCCCCCCBA#BBBCCCCCCCCCCCCCCCCCCCCC  NM:C:1
SOLEXA2_0001:1:58:14599:6722#0   16    chr1   3013910  0     36M    *      0      0     ACAGAAGGTACTCTACCCAATTCCTTCTACGAANCC  BBBBBBBBBBBBBBBBAAAAAAA'AABBDBB>>#>>  NM:C:2
SOLEXA2_0001:1:88:12300:10435#0  0     chr1   3016982  0     36M    *      0      0     CCATTCTGCTGGACCAGAGATTATTCGGCGGGAGTC  :0::::7;77<;8>>00////00//AAA?ABB4B?8  NM:C:2
SOLEXA2_0001:1:27:19128:11074#0  0     chr1   3017825  255   36M    *      0      0     TAATAGGTGTTGCTTGCATGTATATTGGTCCCATAG  CCCCCCCBCCCCCCCCCCCCCCCCCCCCCCCCCCCC  NM:C:0
SOLEXA2_0001:1:93:7958:21493#0   0     chr1   3024375  3     36M    *      0      0     NGTTGATTTCAGCCTTGAGTTTGGTTATTTCCTGCA  #*'*'22222AAAAAAAAAAAAAAAAAAAAAAAAAA  NM:C:1
SOLEXA2_0001:1:76:13399:19341#0  0     chr1   3029049  0     36M    *      0      0     ATGAGAATGAATACTTTCAACTCAAGGGACCAGCAA  CCCCCC@CCCB+BBBCCCCCCCCCC4CCCCCCCCCC  NM:C:2
SOLEXA2_0001:1:68:19007:13481#0  0     chr1   3035458  0     36M    *      0      0     GTAATTACAGCGAGTGGGTCCTCGTGTTTCTTTGGG  CCCCCCCCCCCCBCCCCCBCACCCBCDCCDDD@@BD  NM:C:0
SOLEXA2_0001:1:27:12089:8015#0   0     chr1   3036157  255   36M    *      0      0     GGCACATAGGATTTGTAGTATATGCAATTTTGAGCA  >;<;>////0AAABB>BBBABBBB@BBBBBBABB@B  NM:C:0
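
The qual and tagTypeVals columns in the sample rows are encoded values, so a short decoding sketch may help when reading them. The Python below assumes the Phred+33 encoding stated in the field list and treats NM as the edit-distance tag defined by the SAM specification. The function names are hypothetical, and the quality characters are taken from the sample row whose sequence begins with N, with field boundaries inferred from the 36-base read length rather than given by the page.

```python
from typing import List, Tuple, Union


def phred_scores(qual: str, offset: int = 33) -> List[int]:
    """Convert an ASCII quality string to Phred scores (ASCII code minus offset)."""
    if qual == "*":          # '*' means no quality scores were stored
        return []
    return [ord(ch) - offset for ch in qual]


def error_probability(q: int) -> float:
    """A Phred score Q corresponds to an error probability of 10^(-Q/10)."""
    return 10 ** (-q / 10)


def parse_tag(field: str) -> Tuple[str, str, Union[int, str]]:
    """Split a tag:type:value field; integer type codes are converted to int."""
    tag, typ, value = field.split(":", 2)
    if typ in ("c", "C", "s", "S", "i", "I"):   # BAM integer type codes
        return tag, typ, int(value)
    return tag, typ, value


# First ten quality characters of the read that starts with an N base call.
scores = phred_scores("#*'*'22222")
tag = parse_tag("NM:C:1")
```

Here the first score is 2, which lines up with the uncalled base N at the start of that read, and the tag decodes to an edit distance of 1 against the reference.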

LICR RNA-seq (wgEncodeLicrRnaSeq) Track Description
 

Description

Using RNA-seq (Mortazavi et al., 2008), high-resolution genome-wide maps of the mouse transcriptome were generated for various mouse (C57BL/6) tissues, primary cells, and cell lines at different developmental stages and ages.

Display Conventions and Configuration

This is a composite track that contains multiple data types (views). For each view, there are multiple subtracks that display individually on the browser. Instructions for configuring composite tracks are here. This track contains the following views:

Signal
  Density graph (wiggle) of signal enrichment based on processed data.
Alignments
  Mappings of short reads to the genome. See the SAM Format Specification for more information on the SAM/BAM file format.

Metadata for a particular subtrack can be found by clicking the down arrow in the list of subtracks.

Additional views are available on the Downloads page.

Methods

Cells were grown according to the approved ENCODE cell culture protocols.

RNA-seq
RNA was extracted from tissues and primary cells using Trizol® according to the manufacturer's protocol (Invitrogen). Long PolyA+ RNA was purified with the Dynabeads mRNA purification kit (Invitrogen). The mRNA libraries were prepared for strand-specific sequencing as described previously (Parkhomchuk et al., 2009).

Sequencing and Analysis
Samples were sequenced on Illumina Genome Analyzer II, Genome Analyzer IIx and HiSeq 2000 platforms for 36 cycles. Image analysis and base calling were performed using Illumina's RTA. Reads were aligned to the mouse genome (NCBI37/mm9) using TopHat (Trapnell et al., 2009). Wig files were generated by TopHat, and expression levels were calculated with Cufflinks (Trapnell et al., 2010).
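
Cufflinks reports expression in FPKM, fragments per kilobase of transcript per million mapped fragments (Trapnell et al., 2010). As a rough illustration of that unit only, and not of the statistical model Cufflinks uses to assign reads to isoforms, the basic arithmetic can be sketched in Python. The function name and all numbers are hypothetical.

```python
def fpkm(fragments: int, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments.

    Equivalent to fragments / (length in kb) / (mapped fragments in millions).
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total fragment count must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical numbers: 500 fragments on a 2,000 bp transcript,
# in a library with 20 million mapped fragments.
example_fpkm = fpkm(500, 2000, 20_000_000)
```

With these made-up inputs, 500 fragments over 2 kb gives 250 fragments per kilobase, and dividing by 20 million mapped fragments (in millions) gives 12.5 FPKM.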

Release Notes

This is Release 2 (Mar 2012). It contains 22 RNA-seq experiments in total, 12 of which are new in this release.

Credits

These data were generated and analyzed in Bing Ren's laboratory at the Ludwig Institute for Cancer Research.

Contact: Yin Shen

References

Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 2008 Jul;5(7):621-8. PMID: 18516045

Parkhomchuk D, Borodina T, Amstislavskiy V, Banaru M, Hallen L, Krobitsch S, Lehrach H, Soldatov A. Transcriptome analysis by strand-specific sequencing of complementary DNA. Nucleic Acids Res. 2009 Oct;37(18):e123. PMID: 19620212; PMC: PMC2764448

Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009 May 1;25(9):1105-11. PMID: 19289445; PMC: PMC2672628

Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL, Wold BJ, Pachter L. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010 May;28(5):511-5. PMID: 20436464; PMC: PMC3146043

Data Release Policy

Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the Restricted Until column, above. The full data release policy for ENCODE is available here.