About the ENCODE Data Coordination Center (DCC)
The Encyclopedia of
DNA Elements (ENCODE) Consortium is an international collaboration
of research groups funded by the National Human Genome Research
Institute (NHGRI).
The goal of the consortium is to build a comprehensive parts list of
the functional elements of the human genome, including elements that
act at the protein level (coding genes) and RNA level (non-coding
genes), and regulatory elements that control the cells and
circumstances in which a gene is active. The discovery and annotation
of gene elements is accomplished primarily by sequencing RNA from a
diverse range of sources, comparative genomics, integrative
bioinformatic methods, and human curation. Regulatory elements are
typically investigated through DNase hypersensitivity assays, assays of
DNA methylation, and chromatin immunoprecipitation (ChIP) of proteins
that interact with DNA, including modified histones and transcription
factors, followed by sequencing (ChIP-Seq). The results of ENCODE
experiments, collected in the ENCODE DCC database, are displayed on the
UCSC Genome Browser. The data can also be downloaded from the ENCODE
DCC website in text format.
To access ENCODE data, open the Genome Browser, select the
March 2006 assembly of the human genome, and go to your region of
interest. ENCODE tracks are marked with the NHGRI logo. The bulk of
the ENCODE data can be found in the Expression and Regulation
track groups, with a few in the Mapping, Genes, and
Variation groups. Although most
participating research groups have provided several tracks, generally
only selected data from each research group are displayed by default.
Click the hyperlinked name of a particular track to display a page
containing configuration options and details about the methods used to
generate the data. See the Genome Browser User's Guide for
further information about displaying tracks and navigating in the
Genome Browser. To receive notifications of ENCODE data releases and
related news by email, subscribe to the encode-announce mailing list.
Data from the earlier pilot phase of
the ENCODE project are available on the May 2004 and March 2006 human
assemblies. These datasets are generally available only in the initial
ENCODE-targeted regions that covered approximately 1% of the genome.
The ENCODE Pilot Project web pages provide
convenient browser access to these regions.
Before publishing research that uses
ENCODE data, please read the data release policy, which restricts
some uses of the data in publications for nine months following each
data release.
News
6 January 2010 - December 2009 ENCODE data releases
Initial release of the Caltech RNA-seq track: This track contains
sequence reads and RPKM transcript abundance measures for sequences
that map either to the genome or to known RNA splice sites. Results
from four different mapping algorithms are provided so that the
algorithms can be compared. Results are available for polyA+ and
total RNA for the two ENCODE Tier 1 cell lines.
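RPKM (reads per kilobase of transcript per million mapped reads) normalizes a raw read count by transcript length and by sequencing depth. The following is a minimal sketch of that standard definition, not the pipeline used to generate the Caltech track; the function name, parameter names, and example numbers are hypothetical and chosen only for illustration.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    All three arguments are illustrative names, not fields of any
    ENCODE file format.
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2,000 bp transcript
# in a library of 10 million mapped reads.
example_value = rpkm(500, 2000, 10_000_000)
print(example_value)
```

Because the count is divided by both length and library size, values computed this way can be compared across transcripts of different lengths and across libraries of different depths.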
Release 2 of the Broad Histone track: This track displays maps of
chromatin state generated using ChIP-seq. Release 2 adds data for the
ENCODE Tier 2 cell lines H1-hESC and HepG2, plus NHLF (normal human
lung fibroblasts) and HMEC (human mammary epithelial) cells. This
expands the track data to 9 cell lines and 11 antibodies, plus an
input control.
Release 2 of the CSHL Long RNA-seq track: This track depicts
sequencing of long RNAs (more than 200 nucleotides in length).
Release 2 adds data from strand-specific assays of total RNA for the
two ENCODE Tier 1 cell lines.
Release 2 of the ENCODE Open Chromatin track: This track displays
evidence of open chromatin as identified by two complementary
methods, DNaseI hypersensitivity and FAIRE, combined with ChIP
identification methods. Release 2 adds data from eight additional
cell types, expanding the track to 41 experiments in 13 cell lines.
7 November 2009 - October ENCODE News
Sep 2009 data freeze complete: The ENCODE Consortium has just
completed data submissions for the fourth production data freeze
(Sep 09). The first set of data from this freeze to complete quality
review is now available on the UCSC public server, in Release 2 of
the ENCODE Transcription Factor Binding Sites from
Yale/UC-Davis/Harvard track. Release 2 adds 59 ChIP-seq experiments
to this track.
Other October track releases: The Affymetrix/CSHL Subcellular RNA
Localization by Tiling Array track was expanded to include 4
additional experiments.
encodeproject.org: At the request of the ENCODE Consortium, the
domain encodeproject.org has been registered by the ENCODE Data
Coordination Center and redirects to the ENCODE portal at UCSC.
New grants funded: NHGRI has funded 5 new ENCODE grants as part of
the American Recovery and Reinvestment Act. The new grants include
expansion of ENCODE to the mouse genome and proteogenomics.
Job openings at UCSC: The UCSC Genome Browser and ENCODE projects are
currently accepting applications for Software Developer and
Biological Database Testing/User Support Technician positions. We are
looking for talented individuals who would like to use their skills
in computer science, biology, and bioinformatics on fast-paced
projects featuring the work of top genomics scientists worldwide.
24 September 2009 - ENCODE data releases since July 1
During this period a total of 10 new
ENCODE tracks were released to the UCSC public server. Functional
elements and region characterization in these tracks include:
For track names and file access, see
the Release Log and Downloads links listed in the
left menu bar.
We would like to thank the contributing ENCODE labs and the DCC team
at UCSC for their efforts in completing these tracks.
1 July 2009 - ENCODE data releases for the period April - June 2009
During this period, a total of seven new ENCODE tracks were released
to the UCSC public server. These tracks include high-quality gene
annotations; maps of transcription factor binding, histone
modifications, and open chromatin; RNA subcellular localization; and
RNA/protein binding sites.
The sequence and annotation data
displayed in the Genome Browser are freely available for academic,
nonprofit, and personal use with the following conditions: