Schema for T2T Encode - T2T Encode Reanalysis
  Database: hs1
  Primary Table: hub_3671779_T2T_Encode_Peaks_HL-60.H3K27ac
  Data last updated: 2022-04-26
Big Bed File Download: /gbdb/hs1/encode/peaks/HL-60.H3K27ac.chm13v2.0.bb
Item Count: 42,814
The data is stored in the binary BigBed format.

Format description: Browser Extensible Data
field       example                                 description
chrom       chr1                                    Reference sequence chromosome or scaffold
chromStart  165835834                               Start position in chromosome
chromEnd    165836267                               End position in chromosome
name        macs2/ENCSR919WLM.CHM13.v2.0_peak_2926  Name of item.
score       64                                      Score (0-1000)
strand      .                                       + or - for strand
field8      2.74211                                 Undocumented field
field9      8.35646                                 Undocumented field
field10     6.46109                                 Undocumented field
field11     193                                     Undocumented field

Sample Rows
 
chrom  chromStart  chromEnd   name                                    score  strand  field8    field9     field10    field11
chr1   165835834   165836267  macs2/ENCSR919WLM.CHM13.v2.0_peak_2926  64     .       2.74211   8.35646    6.46109    193
chr1   166184532   166185427  macs2/ENCSR919WLM.CHM13.v2.0_peak_2927  1000   .       9.46019   143.74631  141.13268  436
chr1   166185790   166187284  macs2/ENCSR919WLM.CHM13.v2.0_peak_2928  465    .       5.71591   48.80085   46.53594   253
chr1   166221130   166222935  macs2/ENCSR919WLM.CHM13.v2.0_peak_2929  1000   .       14.55735  250.30617  247.43587  858
chr1   166247789   166248228  macs2/ENCSR919WLM.CHM13.v2.0_peak_2930  203    .       4.05180   22.46308   20.38040   245
chr1   166321431   166321900  macs2/ENCSR919WLM.CHM13.v2.0_peak_2931  260    .       5.83530   28.18416   26.05043   317
chr1   166361298   166363556  macs2/ENCSR919WLM.CHM13.v2.0_peak_2932  785    .       7.63739   80.94123   78.53162   1392
chr1   166389535   166390413  macs2/ENCSR919WLM.CHM13.v2.0_peak_2933  619    .       6.36082   64.24420   61.90448   358
chr1   166406190   166406424  macs2/ENCSR919WLM.CHM13.v2.0_peak_2934  55     .       2.73501   7.47246    5.59687    139
chr1   166437307   166438250  macs2/ENCSR919WLM.CHM13.v2.0_peak_2935  217    .       4.73003   23.81850   21.72267   406
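
The ten-column rows above can be read into typed records with a few lines of Python. This is a minimal sketch, not part of the track's tooling: the names `Peak`, `parse_peak`, and `score_from_field10` are hypothetical, and the types chosen for field8 through field11 (three floats and an integer) are an assumption based on the sample values, since the schema lists these fields as undocumented. The input is assumed to be whitespace-separated text, as a BigBed-to-BED conversion would typically produce.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    chrom: str
    chrom_start: int
    chrom_end: int
    name: str
    score: int
    strand: str
    field8: float   # undocumented in the schema; float is an assumption
    field9: float   # undocumented in the schema; float is an assumption
    field10: float  # undocumented in the schema; float is an assumption
    field11: int    # undocumented in the schema; int is an assumption


def parse_peak(line: str) -> Peak:
    """Parse one whitespace-separated ten-column row into a Peak."""
    cols = line.split()
    if len(cols) != 10:
        raise ValueError(f"expected 10 columns, got {len(cols)}")
    return Peak(
        cols[0], int(cols[1]), int(cols[2]), cols[3], int(cols[4]), cols[5],
        float(cols[6]), float(cols[7]), float(cols[8]), int(cols[9]),
    )


def score_from_field10(value: float) -> int:
    """Truncate 10 * field10 and cap at 1000 (a pattern seen in the sample rows only)."""
    return min(1000, int(value * 10))


# Three of the sample rows shown above.
rows = [
    "chr1 165835834 165836267 macs2/ENCSR919WLM.CHM13.v2.0_peak_2926 64 . 2.74211 8.35646 6.46109 193",
    "chr1 166184532 166185427 macs2/ENCSR919WLM.CHM13.v2.0_peak_2927 1000 . 9.46019 143.74631 141.13268 436",
    "chr1 166185790 166187284 macs2/ENCSR919WLM.CHM13.v2.0_peak_2928 465 . 5.71591 48.80085 46.53594 253",
]
peaks = [parse_peak(row) for row in rows]
```

In these sample rows the score matches field10 multiplied by ten, truncated, and capped at 1000. That is an observation about the rows shown here, not a documented property of the table.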

HL-60 H3K27ac (hub_3671779_T2T_Encode_Peaks_HL-60_H3K27ac) Track Description
 

Description

These tracks represent a reanalysis of ENCODE data against the T2T chm13 genome. All ChIP-seq experiments with paired-end data and read lengths of 100 bp or greater are included.

Track types include:

  • Coverage pileups of mapped and filtered reads
  • Enrichment of mapped reads relative to a control
  • ChIP-seq peaks as called by MACS2
  • ChIP-seq peaks as called by MACS2 in GRCh38 and lifted over to chm13

Methods

Prior to mapping, reads originating from a single library were combined. Reads were mapped with Bowtie2 (v2.4.1) in paired-end mode with the arguments "--no-discordant --no-mixed --very-sensitive --no-unal --omit-sec-seq --xeq --reorder". Alignments were filtered with SAMtools (v1.10) using the arguments "-F 1804 -f 2 -q 2" to remove unmapped reads, reads whose mate did not map, and alignments with a mapping quality score below 2. PCR duplicates were identified and removed with the Picard MarkDuplicates command (v2.22.1) and the arguments "VALIDATION_STRINGENCY=LENIENT ASSUME_SORT_ORDER=queryname REMOVE_DUPLICATES=true".
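
The numeric SAMtools arguments are bit masks over the SAM FLAG field, so it can help to decode them. The sketch below is illustrative only: `decode_flag` and `passes_filter` are hypothetical helper names, while the bit meanings follow the SAM specification. It shows that 1804 is the sum of the unmapped (4), mate unmapped (8), secondary (256), QC-fail (512), and duplicate (1024) bits, and that "-f 2" requires the properly-paired bit.

```python
# SAM FLAG bits as defined in the SAM specification.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(value):
    """Return the names of all flag bits set in value."""
    return [name for bit, name in SAM_FLAGS.items() if value & bit]


def passes_filter(flag, mapq, exclude=1804, require=2, min_mapq=2):
    """Mimic the selection logic of "-F 1804 -f 2 -q 2" for a single alignment.

    -F drops alignments with any excluded bit set, -f keeps only alignments
    with all required bits set, and -q drops alignments with MAPQ below the
    threshold.
    """
    return (flag & exclude) == 0 and (flag & require) == require and mapq >= min_mapq


print(decode_flag(1804))
print(passes_filter(99, 30))    # properly paired, first in pair
print(passes_filter(1123, 30))  # same alignment with the duplicate bit set
```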

Alignments were then filtered for the presence of unique k-mers. Specifically, for each alignment, the reference sequences aligned with the template ends were compared to a database of minimum unique k-mer lengths. The size of the k-mers used in this filtering step depends on the length of the mapped reference sequence. Alignments were discarded if no unique k-mers occurred in either end of the read. The minimum unique k-mer length database was generated using scripts found here. Alignments from replicates were then pooled.
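
One way to picture the unique k-mer filter is sketched below. This is not the authors' implementation, and it rests on several assumptions that the text does not confirm: the database is modeled as a hypothetical list `min_unique_k`, where entry i holds the length of the shortest unique k-mer starting at reference position i (0 if none exists); a k-mer counts only if it lies entirely within the aligned interval; and an alignment is kept when at least one of its two template ends contains such a k-mer.

```python
from typing import Sequence, Tuple


def has_unique_kmer(min_unique_k: Sequence[int], start: int, end: int) -> bool:
    """Return True if some unique k-mer lies entirely within [start, end)."""
    for pos in range(start, end):
        k = min_unique_k[pos]
        # k == 0 means no unique k-mer starts here (assumed encoding).
        if k > 0 and pos + k <= end:
            return True
    return False


def keep_alignment(
    min_unique_k: Sequence[int],
    end1: Tuple[int, int],
    end2: Tuple[int, int],
) -> bool:
    """Keep the pair if either aligned end covers a unique k-mer (assumed reading)."""
    return has_unique_kmer(min_unique_k, *end1) or has_unique_kmer(min_unique_k, *end2)


# Toy database: a unique 3-mer starts at position 2 and a unique 4-mer at position 5.
toy = [0, 0, 3, 0, 0, 4, 0, 0, 0, 0]
print(keep_alignment(toy, (0, 5), (5, 8)))  # first end covers positions 2-4
print(keep_alignment(toy, (0, 4), (5, 8)))  # neither end fully covers a unique k-mer
```

Because the shortest unique k-mer at each position is compared against the aligned interval, longer aligned sequences can accommodate longer k-mers, which is one reading of why the k-mer size depends on the length of the mapped reference sequence.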

BigWig coverage tracks were created using deepTools bamCoverage (v3.4.3) with a bin size of 1 bp and default values for all other parameters. Enrichment tracks were created using deepTools bamCompare with a bin size of 50 bp, a pseudo-count of 1, and excluding bins with zero counts in both target and control tracks.
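
The per-bin arithmetic behind an enrichment track can be sketched as follows. This is a simplified illustration rather than deepTools code: the function name `enrichment` is hypothetical, the log2 ratio is assumed because it is bamCompare's default operation (the text does not state which operation was used), and any read-depth scaling applied before the comparison is left out.

```python
import math
from typing import List, Optional, Sequence


def enrichment(
    target: Sequence[float],
    control: Sequence[float],
    pseudocount: float = 1.0,
) -> List[Optional[float]]:
    """Per-bin log2((target + pseudocount) / (control + pseudocount)).

    Bins with zero counts in both tracks are skipped and reported as None.
    """
    values: List[Optional[float]] = []
    for t, c in zip(target, control):
        if t == 0 and c == 0:
            values.append(None)
        else:
            values.append(math.log2((t + pseudocount) / (c + pseudocount)))
    return values


# Four hypothetical 50 bp bins: read counts in the ChIP and control samples.
print(enrichment([3, 0, 0, 7], [1, 0, 1, 0]))
```

The pseudo-count keeps the ratio defined when only one of the two tracks is empty in a bin, while bins empty in both carry no information and are dropped.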

Peak calls were made using MACS2 (v2.2.7.1) with default parameters and estimated genome sizes 3.03e9 and 2.79e9 for chm13 and GRCh38, respectively. GRCh38 peak calls were lifted over to chm13 using the UCSC liftOver utility, the chain file created by the T2T consortium, and the parameter "-minMatch=0.2".

Credits

Data were processed by Michael Sauria at Johns Hopkins University. For inquiries, please contact us at the following address: msauria@jhu.edu

References

Gershman A, Sauria MEG, Guitart X, Vollger MR, Hook PW, Hoyt SJ, Jain M, Shumate A, Razaghi R, Koren S, Altemose N, Caldas GV, Logsdon GA, Rhie A, Eichler EE, Schatz MC, O'Neill RJ, Phillippy AM, Miga KH, Timp W. Epigenetic patterns in a complete human genome. Science. 2022 Apr;376(6588):eabj5089. doi: 10.1126/science.abj5089. Epub 2022 Apr 1. PMID: 35357915.