This page contains links to custom annotation tracks contributed by the UCSC
Genome Bioinformatics group and by the research community. Click on a track to
display it in the UCSC Genome Browser. Please check the Genome Browser standard
track set for additional contributed annotation tracks.
For information on how to create a custom annotation track, see
Displaying Your Own Annotations in the Genome Browser. If you would
like to submit your own custom tracks to this list, contact us.
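As a rough illustration of what a custom annotation track file can look like, the Python sketch below writes a small BED-format track with a `track` header line. The track name, description, and feature coordinates are made-up placeholders, not data from any track listed on this page, and the coordinates are assumed to be 0-based and half-open as in the BED convention.

```python
def bed_custom_track(name, description, features):
    """Return the text of a minimal BED custom track.

    `features` is a list of (chrom, start, end, label) tuples using
    0-based, half-open coordinates. All values are hypothetical.
    """
    lines = [f'track name="{name}" description="{description}" visibility=2']
    for chrom, start, end, label in features:
        if not 0 <= start < end:
            raise ValueError(f"bad interval for {label}: {start}-{end}")
        # One tab-separated BED line per feature: chrom, start, end, name.
        lines.append(f"{chrom}\t{start}\t{end}\t{label}")
    return "\n".join(lines) + "\n"


example = bed_custom_track(
    "exampleTrack",
    "Hypothetical example regions",
    [("chr1", 1000, 2000, "regionA"), ("chr1", 5000, 5600, "regionB")],
)
print(example)
```

The resulting text could be pasted into the Genome Browser's custom track input; the documentation referenced above remains the authoritative description of the format.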
Human Genome
Phased haplotypes
of 'Max Planck One' (MP1) genome in hg18 as described in
Suk EK et al.
A comprehensively molecular haplotype-resolved genome of a European individual.
Genome Res. 2011 Oct;21(10):1672-85.
RefSeq genes are shown in the first track for reference purposes.
The second track shows the extent of each molecularly phased segment within the genome of MP1.
The two haplotypes of MP1 are shown in two separate tracks (MP1_haplotype_1 and MP1_haplotype_2)
and are colored by base. Phased indels are also included in these haplotypes. All SNPs from MP1
are shown in the fifth track (MP1_all_SNPs). These SNPs are annotated with their dbSNP rs numbers
(or are annotated as novel). Non-synonymous SNPs are colored bright pink if they cause a
potentially damaging mutation and dark pink if they are not predicted to be damaging.
Thanks to the Max Planck Institute for Molecular Genetics for contributing these data.
DNA binding sites in hg18 for nuclear receptor HNF4alpha (NR2A1). The
PBM track
shows in vitro validated sites as determined by protein binding
microarrays (PBMs) (number after sequence indicates relative binding score). The
SVM track
shows sites predicted by support vector machine (SVM) analysis (number after sequence indicates
predicted relative binding score). For more information, see Bolotin E et al.
Integrated approach for the identification of human hepatocyte nuclear factor 4α
target genes using protein binding microarrays. Hepatology. 2010 Feb;51(2):642-653.
Thanks to the Sladek lab,
University of California Riverside for contributing these data.
Transcribed ultraconserved regions (T-UCRs) realigned to hg18 using BLAT.
The first track shows intragenic T-UCRs (red); the second one displays
intergenic T-UCRs (blue) (intragenic and intergenic relative to the RefSeq
Genes track). For more information, see
Mestdagh P et al.
An integrative genomics screen uncovers ncRNA T-UCR functions in neuroblastoma tumours.
Oncogene. 2010 Jun 17;29(24):3583-92.
Thanks to Erik Fredlund, Pieter Mestdagh, Filip
Pattyn and Jo Vandesompele of the Center for Medical Genetics, Ghent University
Hospital, Ghent, Belgium for contributing these tracks.
Vervet monkey gene expression data (hg18) providing
mean expression differences for 8-16 samples per tissue type (publication
pending). See the
UCLA vervet gene expression atlas project website for more
information. Thanks to Dmitriy Skvortsov from the laboratory of Stanley F.
Nelson in the Department of Human Genetics and Psychiatry at the David Geffen
School of Medicine, UCLA, for contributing this track. The work is a
collaboration with Zugen Chen, Barry Merriman, Lynn Fairbanks, Roger Woods,
and Nelson Freimer.
Nucleosome Exclusion Prediction data sets (hg18) accompanying the paper
Radwan A et al.
Prediction and analysis of nucleosome exclusion regions in the human genome.
BMC Genomics. 2008 Apr 22;9(1):186.
View the Nucleosome Regions tracks to
see the whole genome annotation for nucleosome exclusion regions. View the
Nucleosome Scores tracks to see the nucleosome exclusion scores which were
calculated individually for each nucleotide. This annotation was contributed
by Ahmed Radwan, Akmal Younis, Peter Luykx, and Sawsan Khuri,
at the University of Miami, Miami, FL, USA. Contact Sawsan Khuri at
skhuri@med.miami.edu.
Click on the chromosome you wish to display.
Nucleosome Regions: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, Y, M.
Nucleosome Scores: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, Y, M.
Results of a
genome-wide association study of bipolar disorder (hg17)
published in Baum AE et al.
A genome-wide association study implicates diacylglycerol kinase
eta (DGKH) and several other genes in the etiology of bipolar disorder.
Mol Psychiatry. 2008 Feb;13(2):197-207.
The track shows the results of a two-stage study performed using the
Illumina HumanHap 550K chip. SNPs that replicated in both of two
independent case-control samples are shown, filtered for p-value and odds ratio.
Many thanks for this contribution to Amber Baum, Francis McMahon, and the
Unit on the Genetic Basis of Mood and Anxiety Disorders, Mood and Anxiety Disorders
Program, U.S. Department of Health and Human Services, National Institute of
Mental Health, National Institutes of Health, Bethesda, MD, USA, and the
Central
Institute for Mental Health, Mannheim, Germany.
Compare data from locus-specific databases with the genotypic and functional
data in the Genome Browser using PhenCode, which consolidates variants from many
curated locus-specific databases and one genome-wide database. Click
here to access the PhenCode query page that lets you select
and display a filtered set of locus variants data in the Genome Browser.
Thanks to Belinda Giardine,
Ross Hardison, Webb Miller, and Cathy Riemer at the
Center for Comparative
Genomics and Bioinformatics, Penn State University, University Park,
PA, USA, for contributing these data.
DISCLAIMER: PhenCode is intended for research purposes only. Although the
data are freely available to all, users should treat the reported mutations with
extreme caution in clinical settings or for any diagnostic or population
screening purpose. This information requires expertise to interpret properly;
clinical diagnosis and/or treatment recommendations should be made only by
medical professionals.
Tracks providing CpG island strength predictions and mapping of bona fide
CpG islands for the human genome (hg17/hg18). The tracks are based on
large-scale epigenome predictions, which give rise to an improved and
quantitative annotation of CpG islands. Additional information on these tracks
is available from the
supplementary website and from the corresponding paper
Bock C et al.
CpG island mapping by epigenome prediction.
PLoS Comput Biol. 2007 Jun;3(6):e110. For prioritization of candidate regions,
the quantitative CpG island strength predictions are recommended (hg17/hg18).
For genome annotation, three maps of bona fide CpG islands are provided: (i) a
highly specific map (hg17/hg18), (ii) a balanced map recommended for most
applications (hg17/hg18), and (iii) a highly sensitive map (hg17/hg18).
Finally, all tracks can be viewed simultaneously (hg17/hg18), which may take
longer to load.
Three tracks (hg17) accompanying the paper Nakaya HI et al.
Genome mapping and
expression analyses of human intronic noncoding RNAs reveal tissue-specific
patterns and enrichment in genes related to regulation of transcription.
Genome Biol. 2007 Mar 26;8(3):R43.
The
TIN_RNAs
track shows the genomic mapping coordinates of all
55,139 Totally Intronic Noncoding RNA (TIN RNA) transcripts identified in the
human genome. The
PIN_RNAs
track shows the mapping coordinates of all 12,592
Partially Intronic Noncoding RNA (PIN RNA) transcripts. The
TIN_PIN_probes
track shows the genomic coordinates
of all TIN and PIN sense and antisense intronic probes plus the exonic
probes in a custom-designed 44K intron-exon oligoarray. This array
was used for gene expression experiments with human prostate, kidney and
liver tissues. Thanks to Sergio Verjovski-Almeida, Eduardo M. Reis, and
Helder I. Nakaya from Instituto de Quimica - Universidade de Sao Paulo for
contributing these data sets.
Copy-Number Variants (hg17) accompanying the paper
Wong K et al.
A comprehensive analysis of common copy-number variations in the human genome.
Am J Hum Genet. 2007 Jan;80(1):91-104.
The following color scheme is used to indicate the frequency
with which clones were seen: blue (1 or 2), red (3), green (4 or 5), black (6 or more).
Thanks to Kendy Wong and Ronald deLeeuw for contributing these data.
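The color scheme described for this track can be expressed as a small lookup. This is only a sketch of the stated mapping from clone frequency to color; the function name is hypothetical and is not taken from the contributors' code.

```python
def clone_frequency_color(count):
    """Map the number of times a clone was seen to the track color."""
    if count < 1:
        raise ValueError("count must be at least 1")
    if count <= 2:
        return "blue"   # seen 1 or 2 times
    if count == 3:
        return "red"    # seen 3 times
    if count <= 5:
        return "green"  # seen 4 or 5 times
    return "black"      # seen 6 or more times


for n in (1, 3, 5, 8):
    print(n, clone_frequency_color(n))
```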
Data sets (hg17) accompanying the paper
Carroll JS et al.
Genome-wide analysis of estrogen receptor binding sites.
Nat Genet. 2006 Nov;38(11):1289-97. The set of six custom tracks
shows ER and RNA Pol2 ChIP-chip data at two cutoffs (low and high),
upregulated genes, and downregulated genes. Thanks to the
Myles Brown
lab at the Dana-Farber Cancer Institute, Harvard Medical School, Boston,
MA, USA for contributing these data.
Sliding window analysis of Tajima's D across the human genome for
hg17 and
hg16.
This track identifies regions putatively subject to strong, recent selective
sweeps and identifies Contiguous Regions of Tajima's D Reduction (CRTRs) in each
of three populations. For details, see the Tajima's D SNPs track on the hg17
and hg16 Genome Browsers, as well as Carlson CS et al.
Genomic regions exhibiting positive selection identified from dense genotype data.
Genome Res. 2005 Nov;15(11):1553-65.
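For readers unfamiliar with the statistic, the following Python sketch computes Tajima's D from a set of aligned haplotype strings and evaluates it in sliding windows. It uses the standard textbook formula; the input representation (equal-length strings, one per chromosome), the window size, and the step are assumptions for illustration and do not reproduce the pipeline used to build this track.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for equal-length haplotype strings; None if no variation."""
    n = len(haplotypes)
    if n < 4:
        raise ValueError("need at least 4 haplotypes")
    length = len(haplotypes[0])
    # Segregating sites: columns with more than one allele.
    seg = [i for i in range(length) if len({h[i] for h in haplotypes}) > 1]
    s = len(seg)
    if s == 0:
        return None
    # Mean number of pairwise differences (pi).
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a[i] != b[i] for i in seg) for a, b in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


def sliding_tajimas_d(haplotypes, window, step):
    """Return (window_start, D) pairs over column windows of the alignment."""
    length = len(haplotypes[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [h[start:start + window] for h in haplotypes]
        results.append((start, tajimas_d(chunk)))
    return results


# Toy data: every variant is a singleton, which yields a negative D.
singletons = ["TAAA", "ATAA", "AATA", "AAAT"]
print(tajimas_d(singletons))
```

Strongly negative values in a window reflect an excess of rare variants, which is the pattern a recent selective sweep is expected to leave.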
Structural RNAs predicted by RNAz (hg17).
This track displays putative
functional RNA elements with exceptionally stable and/or evolutionary conserved
secondary structure. For a description of the RNAz program, see Washietl S et al.
Fast and reliable prediction of noncoding RNAs.
Proc Natl Acad Sci U S A. 2005 Feb 15;102(7):2454-9.
Additional information on how this track has been generated can be found
here.
Thanks to Stefan Washietl, Ivo Hofacker and Peter F. Stadler for contributing
this annotation.
Isochore track (hg17) generated using IsoFinder, a segmentation algorithm
developed by Grupo de Bioinformatica, Universidad de Granada, Spain.
Data on older human assemblies are also available (hg16, hg15, hg13, hg12).
Thanks to Dr. Jose L. Oliver for contributing this track.
Stanford Human Promoters (hg16, hg15, hg13).
The Stanford Human Promoters data sets were generated by the Richard M. Myers lab at
Stanford University and are described in Trinklein N et al.
Identification and functional analysis of human transcriptional promoters.
Genome Res. 2003 Feb;13(2):308-12.
Thanks to Nathan Trinklein at Stanford School of Medicine for contributing this track,
and to Daryl Thomas of UCSC for lifting the hg15 data to the hg16 assembly.
Mouse Ortholog (hg12).
Human and Mouse gene predictions based
on fgenesh++ clustered using a BLAT protein alignment and the reciprocal best
matches retained. Thanks to Robert Baertsch for creating this track.
Penn State University Known Regulatory Regions Set 1 (hg12).
This set contains a collection of known regulatory regions gathered from the
literature. Set 1 is limited to the smallest recognized segment containing full
function, and was used as the data set for Elnitski L et al.
Distinguishing regulatory DNA from neutral sites.
Genome Res. 2003 Jan;13(1):64-72.
For more information, see http://pipmaker.bx.psu.edu/mousegroup/Reg_annotations/. Thanks to
Robert Baertsch for creating this track.
Penn State University Known Regulatory Regions Set 2 (hg12).
This set of functional regions contains names and coordinates of an additional
set of regulatory regions that were not trimmed (as in Set 1) to show the
smallest possible functional element with maximum activity. The regions range in
size from 300-4000bp. For more information, see
http://pipmaker.bx.psu.edu/mousegroup/Reg_annotations/. Thanks to
Robert Baertsch for creating this track.
Mouse Genome
Transcriptome-wide monoallelic expression in CNS-derived stem cells for four clonal
hybrid (B6 x JF1) cell lines is displayed in mm9. The
track shows the allelic preference for cell lines 2A1, 2A5, 3A1 and 4A5 at JF1
cSNP locations. The allelic preference is denoted by the proportion of the B6 allele
vs JF1 allele. For more information, see: Li SM et al.
Transcriptome-wide survey of CNS-derived cells reveals monoallelic
expression within novel gene families. PLoS ONE. 2012 Feb;7(2):e31751.
Genome-wide DNase hypersensitivity in male and female mouse liver mapped
by DNase I treatment of pooled livers from male and female mice coupled
with high-throughput sequencing (DNase-seq). The tracks here are BED
files representing (1)
Liver_DHS_peaks: peaks identified using PeakSeq
(Rozowsky J et al.
PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls.
Nat Biotechnol. 2009;27(1):66-75), and (2)
Liver_DHS_regions:
broader regions of hypersensitivity identified using SICER (Zang C et al.
A clustering approach for identification of enriched domains from histone modification ChIP-Seq data.
Bioinformatics. 2009;25(15):1952-8), that are sex-independent (gray) and sex-specific
(blue for male-specific, pink for female-specific; darker shade for
higher stringency for sex-specificity). For more information, see
Ling G et al.
Unbiased, genome-wide in vivo mapping of transcriptional regulatory elements reveals sex differences
in chromatin structure associated with sex-specific liver gene expression.
Mol Cell Biol. 2010 Dec;30(23):5531-44.
DMRT1 is a transcription factor that is expressed in germ cells and Sertoli cells and
plays multiple roles in testis development. This study analyzed DMRT1 genome-wide
promoter occupancy in the mouse testis at postnatal day 9 as determined by ChIP-chip
on Nimblegen mouse promoter arrays. The three WIG traces [1], [2], [3] are from three
independent biological replicates and are displayed on mouse genome assembly mm8. The WIG
traces represent the enrichment for each probe on the array calculated as the
log-ratio of the intensities of the DMRT1 ChIP product (Cy5) to control input
chromatin (Cy3). More details and gene expression analysis can be found on
the associated interactive web site: www.dmrt1.umn.edu and in the publication:
Murphy MW et al.
Genome-wide analysis of DNA binding and transcriptional regulation by the mammalian Doublesex
homolog DMRT1 in the juvenile testis.
Proc Natl Acad Sci U S A. 2010 Jul 27;107(30):13360-5.
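The per-probe enrichment described above can be sketched as follows. The base of the logarithm is not stated on this page, so base 2 is assumed here; the probe positions, intensities, and the 50 bp span are hypothetical values chosen only to show the shape of a variableStep WIG track.

```python
from math import log2


def log_ratios(probes):
    """Compute log2(Cy5 / Cy3) per probe.

    `probes` is a list of (position, cy5_intensity, cy3_intensity) tuples,
    where Cy5 is the ChIP product and Cy3 is the control input chromatin.
    """
    return [(pos, log2(cy5 / cy3)) for pos, cy5, cy3 in probes]


def to_wig(chrom, span, values):
    """Format (position, value) pairs as a variableStep WIG block."""
    lines = [f"variableStep chrom={chrom} span={span}"]
    lines += [f"{pos} {value:.3f}" for pos, value in values]
    return "\n".join(lines)


# Hypothetical probes: one enriched fourfold, one depleted twofold.
ratios = log_ratios([(100, 800.0, 200.0), (400, 150.0, 300.0)])
print(to_wig("chr1", 50, ratios))
```

Positive values indicate probes where the ChIP signal exceeds the input, and negative values indicate the opposite.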
Farnesoid X receptor (FXR) is a bile acid-activated transcription factor
belonging to the nuclear receptor superfamily. FXR is highly expressed in liver
and intestine, and crosstalk mediated by FXR in these two organs is critical in
maintaining bile acid homeostasis. This study analyzed genome-wide FXR binding
in liver and intestine of mice treated with a synthetic FXR ligand (GW4064) by
chromatin immunoprecipitation coupled to massively parallel sequencing
(ChIP-seq). The
Fxr Liver
and
Fxr Intestine
tracks shown here are WIG files that represent the number of
times a particular 35bp fragment of DNA was sequenced in the reaction. More
details can be found in the publication
Thomas AM et al.
Genome-wide tissue-specific farnesoid X receptor binding in mouse liver and intestine.
Hepatology. 2010 Apr;51(4):1410-9.
Thanks to Ann Thomas and Steven Hart in the Department of Pharmacology,
Toxicology, and Therapeutics at the University of Kansas Medical Center,
Kansas City, KS, for contributing these tracks.
This experiment examined mouse liver at four different ages to observe
how DNA methylation and the histone marks H3K4me2 and H3K27 change
across postnatal development (mm9).
A ChIP-on-chip tiling array for three mouse chromosomes (chr5, chr12, chr15)
was used. The tracks show three types of data: 1) a genomic region with a
sequence of >800bp and an average signal increase greater than the
threshold, defined as an interval, 2) a genomic region with one or more
enriched intervals in close proximity to each other (at least one base overlap)
at any given age, defined as an active region, and 3) peak values for each
interval. For more information, see Li Y et al.
Dynamic patterns of histone methylation are associated with
ontogenic expression of the Cyp3a genes during mouse liver maturation.
Mol Pharmacol. 2009 May;75(5):1171-1179.
Thanks to Steven Hart in the Department of Pharmacology, Toxicology, and
Therapeutics at the University of Kansas Medical Center, Kansas City, KS, for
contributing these tracks.
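The relationship between intervals and active regions described above amounts to merging intervals that overlap by at least one base. The sketch below illustrates that merge under the assumption of 0-based, half-open coordinates; the coordinates are invented, and the 800 bp length and signal threshold filters are assumed to have been applied already.

```python
def merge_into_active_regions(intervals):
    """Merge (start, end) intervals that share at least one base."""
    regions = []
    for start, end in sorted(intervals):
        # With half-open coordinates, start < previous end means overlap.
        if regions and start < regions[-1][1]:
            regions[-1][1] = max(regions[-1][1], end)
        else:
            regions.append([start, end])
    return [tuple(region) for region in regions]


# Hypothetical enriched intervals pooled across ages.
intervals = [(900, 2000), (100, 1000), (2000, 3000), (5000, 6000)]
print(merge_into_active_regions(intervals))
```

Intervals that merely touch, such as one ending at 2000 and another starting at 2000, stay separate here because they share no base.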
Locations of known, suspected, and imputed SNPs generated by BLAT alignment
of 3 million Celera associated sequences to the May 2004 mouse genome
assembly (mm5), provided by The GeneNetwork and WebQTL. Only those SNPs that
distinguish strains C57BL/6J from DBA/2J (1.75 million) or that distinguish
C57BL/6J from A/J (1.80 million) are displayed in the custom track.
Due to the proprietary nature of these data, only low resolution
position data (SNP density per 100,000 to 300,000bp) are currently provided.
This custom track is available on any PHYSICAL and GENETIC maps in WebQTL for
the BXD and AXB/BXA genetic reference panels simply by clicking on
interval maps.
The mm5 assembly has been archived; however, the data are still available here:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, X.
Thanks to Celera Genomics (Richard Mural and
Paul Thomas) for this level of access to CDS data and to Christopher
Vincent (Georgia Tech), Alex G. Williams (UCSC); Robert Crowell
(UTHSC and MIT), Gary Churchill and Natalie Blades (The Jackson
Laboratory), and the WebQTL group at UTHSC (Jintao Wang, Yanhua Qu,
Yan Cui, Robert Williams, and Kenneth Manly) for contributing this
track.
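Low-resolution density data of the kind described for this track can be derived by counting SNPs in fixed-size windows. The sketch below assumes 0-based positions on a single chromosome and a 100,000 bp window; the positions are invented, and the actual binning used for the track may differ.

```python
from collections import Counter


def snp_density(positions, window=100_000):
    """Count SNPs per fixed-size window, keyed by window start coordinate."""
    counts = Counter(pos // window for pos in positions)
    return {index * window: n for index, n in sorted(counts.items())}


# Hypothetical SNP positions on one chromosome.
print(snp_density([10, 99_999, 100_000, 250_000]))
```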
Isochore track (mm5) generated using IsoFinder, a segmentation algorithm
developed by Grupo de Bioinformatica, Universidad de Granada, Spain.
The mm5 assembly has been archived; however, the data are still available here.
Data are also available for the mm3 assembly.
Click here for more information about this annotation.
Thanks to Dr. Jose L. Oliver for contributing this track.
Rat Genome
Isochore track (rn3) generated using IsoFinder, a segmentation algorithm
developed by Grupo de Bioinformatica, Universidad de Granada, Spain.
Data are also available for the rn2 and rn1 assemblies.
Click here for more information about this annotation.
Thanks to Dr. Jose L. Oliver for contributing this track.
Tetraodon Genome
CAGE transcription start sites for the tetraodon (tetNig2) genome. Thank you to Chirag Nepal for creating these
custom annotation tracks. For a full list of those individuals and institutions involved in the creation of the
data included in these tracks, please refer to the following paper: Nepal C et al.
Dynamic regulation of the transcription initiation landscape at single nucleotide resolution during
vertebrate embryogenesis.
Genome Res. 2013 Nov;23(11):1938-50.
To view these tracks, click
here.
Zebrafish Genome
A suite of tracks for the zebrafish (danRer7) genome that include CAGE transcription start sites, plus H3K4me3
and RNAseq coverage. Thank you to Chirag Nepal for creating these custom annotation tracks. For a full list of
those individuals and institutions involved in the creation of the data included in these tracks,
please refer to the following paper: Nepal C et al.
Dynamic regulation of the transcription initiation landscape at single nucleotide resolution during
vertebrate embryogenesis.
Genome Res. 2013 Nov;23(11):1938-50.
This paper includes information on the methods used to produce the data included in these tracks.
To view these tracks, click
here.
Multi-Species Annotations
A Hidden Markov Model (HMM) based method was used to look for CpG
islands (CGIs) in DNA sequences. Two HMMs were fitted, one for GC content and
one for observed-to-expected ratios of CpG counts. The CGIs were detected by jointly
thresholding the resulting posterior probabilities. Unlike the current CGI
definition which was derived from studying promoters of known human genes,
this method is data-driven and can be applied to species with different
sequence compositions. For details please see Wu H et al.
Redefining CpG islands using hidden Markov models.
Biostatistics. 2010 Jul;11(3):499-514.
H. sapiens (human): hg19
H. sapiens (human): hg18
P. troglodytes (chimpanzee): panTro2
P. abelii (orangutan): ponAbe2
M. mulatta (rhesus macaque): rheMac2
M. musculus (mouse): mm9
M. musculus (mouse): mm8
C. familiaris (dog): canFam2
E. caballus (horse): equCab2
D. melanogaster (fruit fly): dm3
C. elegans (worm): ce2
Lists of CpG islands and an R software package can be downloaded from
http://rafalab.jhsph.edu/CGI/.
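The two quantities modeled by the HMMs, GC content and the observed-to-expected CpG ratio, can be computed for a single sequence window as sketched below. This uses the conventional definition of the ratio (CpG count times window length, divided by the product of the C and G counts), which is an assumption here and may differ in detail from the definition in the paper; the HMM fitting and posterior thresholding are not shown.

```python
def cpg_stats(seq):
    """Return (GC content, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides on the given strand
    gc_content = (c + g) / n if n else 0.0
    # Expected CpG count under independence is c * g / n.
    obs_exp = cpg * n / (c * g) if c and g else 0.0
    return gc_content, obs_exp


print(cpg_stats("ACGT"))
```

In a genome-wide scan these values would be computed over many consecutive windows before any model is fitted.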