Using UCSC Genome Browser Track Hubs
Contents
Additional resources
Questions and feedback are welcome.
What Are Track Hubs?
Track hubs are web-accessible directories of genomic data that can be viewed on the UCSC Genome
Browser (please note that hosting hub files on HTTP tends to work even better than FTP and local
hubs can be displayed on GBiB).
Track hubs can be displayed on genomes that UCSC directly supports, or on your own sequence. Hubs
are a useful tool for visualizing a large number of genome-wide data sets. For example, a project
that has produced several wiggle plots of data can use the hub utility to organize the tracks into
composite and super-tracks, making it possible to show the data for a large collection of tissues
and experimental conditions in a visually elegant way, similar to how the ENCODE native data tracks
are displayed in the browser.
The track hub utility allows efficient access to data sets from around the world through the
familiar Genome Browser interface. Browser users can display tracks from any public track hub that
has been registered with UCSC. Additionally, users can import data from unlisted hubs or can set up,
display, and share their own track hubs. Genome assemblies that UCSC does not support can be loaded
and viewed with associated data.
The data underlying the tracks and optional sequence in a hub reside on the remote server of the
data provider rather than at UCSC. Genomic annotations are stored in compressed binary indexed
files in bigBed, bigBarChart, bigGenePred, bigPsl, bigChain, bigMaf, bigWig, BAM, CRAM, HAL or VCF
format that contain the data at several resolutions. In the case of assemblies that UCSC does not
support, genomic sequence is stored in the efficient twoBit format. When a hub track is displayed
in the Genome Browser, only the relevant data needed to support the view of the current genomic
region are transmitted rather than the entire file. The transmitted data are cached on the UCSC
server to expedite future access. This on-demand transfer mechanism eliminates the need to transmit
large data sets across the Internet, thereby minimizing upload time into the browser.
The track hub utility offers a convenient way to view and share very large sets of data. Individuals
wishing to display only a few small data sets may find it easier to use the Genome Browser
custom track utility. As with hub tracks, custom tracks
can be uploaded to the UCSC Genome Browser and viewed alongside the native annotation tracks. Custom
tracks can be constructed from a wide range of data types; hub tracks are limited to compressed
binary indexed formats that can be remotely hosted. However, the custom tracks utility does not
offer the data persistence and track configurability provided by the track hub mechanism: hub tracks
can be grouped into composite or super-tracks and configured to display the data using a wide
variety of options. There is no way to create a browser on your own sequence with custom tracks.
In general, for users who have large data sets that would be prohibitive to upload, need to ensure
the persistence of their data, or would like to take full advantage of track functionality, or
create a browser on sequence not natively supported by UCSC or a genome browser mirror, track hubs
are a better solution. Both mechanisms give data providers the flexibility to directly add, update,
and remove data from their display as needed.
What Are Assembly Hubs?
Assembly Data Hubs extend the functionality of Track Data Hubs to assemblies that are not hosted
natively on the Browser. Assembly Data Hubs were developed to address the increasing need for
researchers to annotate sequence for which UCSC does not provide an annotation database. They allow
researchers to include the underlying reference sequence, as well as data tracks that annotate that
sequence. Sequence is stored in the UCSC twoBit format, and the annotation tracks are stored in the
same manner as Track Data Hubs. For more information on how to setup your own Assembly Data Hub,
please refer to the Assembly Hub
wiki and see the Quick Start Guide to
Assembly Hubs.
Viewing Track Hubs in the browser
Public hubs
The Genome Browser provides links to a collection of public track hubs that have been registered
with UCSC. To view a list of the public track hubs available for the currently selected assembly,
click the "track hubs" button on the Genome Browser gateway or annotation tracks page. The
Public Hubs tab on the Track Hubs page lists the hubs that are available for display in the browser.
To add a hub to your display, click the "Connect" button next to the hub. You can also
click "Description" and "Hub Name" to see more details about some Public Hubs.
After connecting to the hub, click the "Genome Browser" link from the top blue bar. The
default assembly for the selected hub will be displayed with hub tracks in a separate track group
below the browser image. The tracks can be configured and
manipulated in the same fashion as native browser tracks. As with any very large track in the
Genome Browser, exercise caution when viewing a broad genomic region that requires the Genome
Browser to display a large number of track features: the browser display may time out.
Unlisted hubs (located in the My
Hubs tab)
In addition to the publicly available hubs listed on the Public Hubs tab, it is possible to load
your own unlisted track hub or one created by a colleague as long as you know its URL. To add an
unlisted hub, open the Track Hubs page and click the My Hubs tab. This tab lists the unlisted track hubs that you have loaded into
your browser. To import a new hub, type its URL into the text box, then click the Add Hub button.
If the track hub is imported successfully, it will be added to the list. It is also possible to add
a track hub by directly adding the hub's URL to the browser URL. If you add
hubUrl=[URL] to your hgTracks URL line, it will add the hub directly into the
browser (e.g.,
http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&hubUrl=http://vizhub.wustl.edu/VizHub/RoadmapRelease3.txt).
If the hub you've selected is an assembly hub, the supported assemblies will be selectable from the
gateway page by selecting the name of the track hub in the group pulldown.
If the browser is unable to load the track hub, it will display an error message. Some common causes
for an import to fail include typos in the URL, attempting to add a track hub from a different
assembly, a hub server that is offline, or errors in the track hub files. Once you have successfully
loaded a hub, you can view it in the browser by clicking the "Connect" button. Please note
that unlisted hubs are in no way secure. The URL helps to obfuscate the location of the data; it is
a simple barrier to casual users. Please also know that hubs can be loaded from a local directory
when using GBiB.
To remove a track hub from your Genome Browser display, click the "Disconnect" button on
the Track Data Hubs page.
Occasionally, remote track hubs may be missing, off-line, or otherwise unavailable. If a user is
already browsing data from the remote hub when it disconnects, a yellow error message will be
displayed instead of the expected data.
The track sets in hubs are genome assembly specific. The Track Hubs page lists each hub and the
assemblies it supports. To switch to a different assembly, click the Genomes link in the top menu
bar, then select the new assembly from the Gateway page. If the hub supports assemblies that are not
natively supported by UCSC or the mirror you're visiting, the assembly can be selected by choosing
the track hub name under the group menu.
Tracks accessed through a hub can be used in Genome Browser
sessions and custom
tracks in the same manner as other tracks. The data underlying data hub tracks can be viewed,
manipulated, and downloaded using the UCSC Table Browser.
Setting up your own Track Hub
This section provides a step-by-step description of the process used to set up a track hub on your
own server.
To create your own hub you will need:
-
one or more data sets formatted in one of the compressed binary index formats supported by the
Genome Browser: bigBed, bigBarChart, bigGenePred, bigPsl, bigChain, bigMaf, bigWig, BAM, CRAM,
HAL or VCF
-
a set of text files that specify properties for the track hub and for each of the data tracks
within it
-
a twoBit file with your sequence if you are setting up an assembly hub.
-
an Internet-enabled web server or ftp server
The files are placed on the server in a file hierarchy like the one shown in Example 1.
Users experienced in setting up Genome Browser mirrors that contain their own data will find that
setting up a track hub is similar, but is usually much easier. Depending on the number and
complexity of the data sets, a track hub can typically be set up in a day or two. It is generally
easiest to run the command-line data formatting programs in a Linux programming environment,
although it's possible to manipulate smaller data sets using Mac OS-X as well.
Example 1: Directory hierarchy for a hub containing DNase and RNAseq data
for the hg18 and hg19 human genome assemblies. The hg18/ and hg19/ subdirectories contain the
assembly-specific data files.
myHub/ - directory containing track hub files
hub.txt - a short description of hub properties
genomes.txt - list of genome assemblies included in the hub data
hg19/ - directory of data for the hg19 (GRCh37) human assembly
trackDb.txt - display properties for tracks in this directory
dnase.html - description text for a DNase track
dnaseLiver.bigWig - wiggle plot of DNase in liver
dnaseLiver.bigBed - regions of active DNase
liverGenes.bigGenePred - gene annotations of genes over-expressed in liver tissue
dnaseLung.bigWig - wiggle plot of DNase in lung
dnaseLung.bigWig - regions of active DNase
...
rnaSeq.html - description text for an RNAseq track
rnaSeqLiver.bigWig - wiggle plot of RNAseq data in liver
rnaSeqLiver.bigBed - intron/exon lists for liver
rnaSeqLung.bigWig - wiggle plot of RNAseq data in lung
rnaSeqLung.bigBed - intron/exon lists for lung
hg18/ - directory of data for the hg18 (Build 36) human assembly
trackDb.txt - display properties for tracks in this directory
dnase.html - description text for a DNase track
dnaseLiver.bigWig - wiggle plot of DNase data in liver
dnaseLiver.bigBed - regions of active DNase
dnaseLung.bigWig - wiggle plot of DNase data in lung
dnaseLung.bigWig - regions of active DNase
...
rnaSeq.html - description text for an RNAseq track
rnaSeqLiver.bigWig - wiggle plot of RNAseq data in liver
rnaSeqLiver.bigBed - intron/exon lists for liver
rnaSeqLung.bigWig - wiggle plot of RNAseq data in lung
rnaSeqLung.bigBed - intron/exon lists for lung
Step 1. Format the data
The data tracks provided by a hub must be formatted in one of the compressed binary index formats
supported by the Genome Browser:
bigWig,
bigBed,
bigGenePred,
bigChain,
bigBarChart,
bigPsl,
bigMaf,
bigWig,
BAM,
CRAM, HAL or
VCF.
bigWig - The bigWig format is best for displaying continuous value plot data, such as
read depths from short read sequencing projects or levels of conservation observed in a
multiple-species alignment. A bigWig file contains a list of chromosome segments, each of which is
associated with a floating point value. When graphed, the segments may appear as a big
"wiggle". Although each bigWig file can contain only a single value for any given base,
bigWig tracks are often combined into "container multiWig" or "compositeTrack
on" tagged tracks. For information on creating and configuring bigWig tracks, see the
bigWig Track Format help page.
bigBed - BigBed files are binary indexed versions of Browser Extensible Data
(BED) files. BED format is useful for associating a
name and (optionally) a color and a score with one or more related regions on the same chromosome,
such as all the exons of a gene. See the bigBed Track Format help page for
information on creating and configuring bigBed tracks.
bigGenePred - BigGenePred files are binary indexed versions of Browser Extensible Data
(BED) files with an extra eight fields that are
useful for describing gene predictions that are modeled after the fields in
genePred files. BigGenePred format is useful for
associating a name and (optionally) a color and a score with one or more related regions on the same
chromosome, such as all the exons of a gene. See the bigGenePred Track
Format help page for information on creating and configuring bigGenePred tracks.
bigBarChart - BigBarChart files are binary indexed versions of
barChart files. BigBarChart format is useful for
bringing barChart display into track hubs, and supports schema customization and label configuration
that is not supported for regular barChart format. See the barChart Track
Format help page for information on creating and configuring bigBarChart tracks.
bigPsl - BigPsl files are binary indexed versions of
PSL files. BigPsl format is useful for large data
sets created by BLAT or other tools. See the bigPsl Track Format help page
for more information on creating and configuring bigPsl tracks.
bigChain - BigChain files are binary indexed versions of
chain files. BigChain format is useful for large pairwise alignment data
sets. See the bigChain Track Format help page for more information on
creating and configuring bigChain tracks.
bigMaf - BigMaf files are binary indexed versions of
MAF files. BigMaf format is useful for large multiple
alignment data sets. See the bigMaf Track Format help page for more
information on creating and configuring bigMaf tracks.
BAM - BAM files contain alignments of (generally short) DNA reads to a reference
sequence, usually a complete genome. BAM files are binary versions of Sequence Alignment/Map
(SAM) format files. Unlike bigWig and
bigBed formats, the index for a BAM file is in a separate file, which the track hub expects to be in
the same directory with the same root name as the BAM file with the addition of a .bai
suffix. See the BAM Track Format help page for more information.
CRAM - The CRAM file format is a more dense form of BAM files with
the benefit of saving much disk space. While BAM files contain all sequence data within a file, CRAM
files are smaller by taking advantage of an additional external "reference sequence" file.
This file is needed to both compress and decompress the read information. See the
CRAM Track Format help page for more information.
HAL - HAL (Hierarchical Alignment Format) is a graph-based structure to efficiently store
and index multiple genome alignments and ancestral reconstructions.
HAL files are
represented in HDF5 format, an open
standard for storing and indexing large, compressed scientific data sets. HAL is the native output
format of the Progressive Cactus alignment pipeline, and is included in the
Progressive Cactus
installation package.
VCF - VCF (Variant Call Format) files can contain annotations of single nucleotide
variants, insertions/deletions, copy number variants, structural variants and other types of genomic
variation. When a VCF file is compressed and indexed using
tabix (available
here), it can be
used as a data track file. Unlike bigWig and bigBed formats, the tabix index is in a separate file,
which the track hub expects to be in the same directory with the same root name as the VCF file with
the addition of a .tbi suffix. See the VCF Track Format help page
for more information.
Step 2. Create the track hub directory
Create a track hub directory in an Internet-accessible location on your web or ftp server. This
directory will contain the hub.txt and genomes.txt files that define properties of the track hub and
a subdirectory for each of the genome assemblies covered by the hub track data.
Step 3. Place the track data files in an Internet-accessible location
The data files underlying a track in a hub do not have to reside in the track hub directory or even
on the same server, but they must be accessible via the Internet. The track hub utility supports
Internet protocols such as http://, https://, and ftp://, as well as file paths relative to the hub
directory hierarchy. The location of a track file is defined by its bigDataUrl tag in the
associated trackDb.txt file (Step 7).
Step 4. Create the hub.txt file
Within the hub directory, create a hub.txt file containing a single stanza with up to six fields
that define properties of the track hub:
hub hub_name
shortLabel hub_short_label
longLabel hub_long_label
genomesFile genomes_filelist
email email_address
descriptionUrl descriptionUrl
hub - a single-word name of the directory containing the track hub files. Not displayed to
hub users. This must be the first line in the hub.txt file.
shortLabel - the short name for the track hub. Suggested maximum length is 17 characters.
Displayed as the hub name on the Track Hubs page and the track group name on the browser tracks
page.
longLabel - a longer descriptive label for the track hub. Suggested maximum length is 80
characters. Displayed in the description field on the Track Hubs page.
genomesFile - the relative path of the genomes.txt file, which contains the list of genome
assemblies covered by the track data and the names of their associated configuration files. By
convention the genomes.txt file is located in the same directory as the hub.txt file.
email - the contact to whom questions regarding the track hub should be directed.
descriptionUrl - URL to HTML page with a description of the hub's contents. This can be
relative to the directory which holds hub.txt. This file is assumed to be HTML, and if the hub is a
UCSC public hub, this HTML will be crawled nightly by UCSC to build an index with which public hubs
can be searched. If present, clicks on the shortLabel will open this HTML in a new tab. This field
is optional.
Example 2: Sample hub.txt file defining attributes for the track hub
shown in Example 1.
hub UCSCHub
shortLabel UCSC Hub
longLabel UCSC Genome Informatics Hub for human DNase and RNAseq data
genomesFile genomes.txt
email [email protected]
descriptionUrl ucscHub.html
Step 5. Create the genomes.txt file
Create a genomes.txt file within the track hub directory that contains a two-line stanza that must
be separated by a line for each genome assembly that is supported by the hub data. Each stanza shows
the location of the trackDb file that defines display properties for each track in that
assembly.
genome assembly_database_1
trackDb assembly_1_path/trackDb.txt
genome assembly_database_2
trackDb assembly_2_path/trackDb.txt
genome - a valid UCSC database name. Each stanza must begin with this tag and each stanza
must be separated by an empty line.
trackDb - the relative path of the trackDb file for the assembly designated by the
genome tag. By convention, the trackDb file is located in a subdirectory of the hub
directory. However, the trackDb tag may also specify a complete URL.
If this genomes.txt file is for an assembly that does not have native support in the browser, the
following fields must also be present:
twoBitPath - refers to the .2bit file containing the sequence for this assembly. Typically
this file is constructed from the original fasta files for the sequence using the kent program
faToTwoBit. See here for instructions on how to build a 2bit file.
groups - a file which defines the track groups on this Genome Browser. Track groups are the
sections of related tracks grouped together under the primary genome browser graphics display image.
The groups.txt file defines the grouping of track controls under the primary Genome Browser image
display. The example referenced here has the usual definitions as found in the UCSC Genome Browser.
Each group is defined, for example the Mapping group:
name map
label Mapping
priority 2
defaultIsClosed 0
The name is used in the trackDb.txt track definition group, to assign a particular track to this
group. The label is displayed on the genome browser as the title of this group of track controls
The priority orders this track group with the other track groups. The defaultIsClosed determines if
this track group is expanded or closed by default. Values to use are 0 or 1.
description - will be displayed for user information on the gateway page and most title
pages of this genome assembly browser. It is the name displayed in the assembly pull-down menu on
the browser gateway page.
organism - the string which is displayed along with the description on most title pages in the Genome Browser. Adjust your names in organism and description until they are appropriate. This
organism name is the name that appears in the genome pull-down menu on the browser gateway page.
defaultPos - specifies the default position the genome browser will open when a user first
views this assembly. This is usually selected to highlight a popular gene or region of interest in
the genome assembly.
orderKey - used with other genome definitions at this hub to order the pull-down menu
ordering the genome pull-down menu.
htmlPath - refers to an html file that is used on the gateway page to display information
about the assembly.
Example 3: Sample genomes.txt file defining attributes for the hub shown
in Example 1.
genome hg18
trackDb hg18/trackDb.txt
genome hg19
trackDb hg19/trackDb.txt
genome newOrg1
trackDb newOrg1/trackDb.txt
twoBitPath newOrg1/newOrg1.2bit
groups newOrg1/groups.txt
description Big Foot V4
organism BigFoot
defaultPos chr21:33031596-33033258
orderKey 4800
scientificName Biggus Footus
htmlPath newOrg1/description.html
Step 6. Create the genome assembly subdirectories
Within the track hub directory, create a subdirectory for each of the genome assemblies that have
track data in the hub. The subdirectory names must have a 1:1 correspondence with the database names
defined by the genome tags in the genomes.txt file.
Step 7. Create the trackDb.txt files
The trackDb.txt file, which is based on the Genome Browser .ra format, is the most complicated of
the text files in the hub directory. It contains a stanza for each of the data files for the given
assembly that defines display and configuration properties for the track. If the tracks are grouped
into larger entities, such as composite or super-tracks, the larger entities will have a stanza in
the file as well.
The Track Database Definition Document will help you
understand how to create a trackDb.txt file. This document describes how to declare dataset display
settings and values, and indicates the support level for each setting. While there are over 100
track settings supported at UCSC, other sites that display hubs have more limited settings support.
To further portability of hubs, we have used input from other sites to identify a "base"
subset of the "full" settings list, and the document has been assigned a version number.
See the document introduction for a fuller explanation.
At a minimum, each track in the trackDb.txt file must contain the "required" settings:
track track_name
bigDataUrl track_data_URL
shortLabel short_label
longLabel long_label
type track_type
track - the symbolic name of the track. The first character must be a letter, and the
remaining characters must be letters, numbers, or under-bar ("_"). Each track must have a
unique name. This tag pair must be the first entry in the trackDb.txt file.
bigDataUrl - the file name, path, or Web location of the track's data file. The bigDataUrl
can be a full URL. If it is not prefaced by a protocol, such as http://, https://
or ftp://, then it is considered to be a path relative to the trackDb.txt file.
shortLabel - the short name for the track displayed in the track list, in the configuration
and track settings, and on the details pages. Suggested maximum length is 17 characters.
longLabel - the longer description label for the track that is displayed in the
configuration and track settings, and on the details pages. Suggested maximum length is 80
characters.
type - the format of the file specified by bigDataUrl. Must be either bigWig,
bigBed, bigBarChart, bigGenePred, bigChain, bigPsl,
bigMaf, bam, halSnake or vcfTabix (Note: use type bam
for CRAM files). If the type is bigBed, it may be followed by an optional number denoting
the number of fields in the bigBed file (e.g., "type bigBed 12" for a file with 12 fields).
If no number is given, a default value of 3 is assumed (a very limited display that omits names, strand
information, and exon boundaries).
Example 4: Sample trackDb.txt file containing two simple tracks.
track dnaseSignal
bigDataUrl dnaseSignal.bigWig
shortLabel DNAse Signal
longLabel Depth of alignments of DNAse reads
type bigWig
track dnaseReads
bigDataUrl dnaseReads.bam
shortLabel DNAse Reads
longLabel DNAse reads mapped with MAQ
type bam
Suggestions:
Default subtracks for composite For each composite, it is recommended that
a subset of subtracks are "selected" (on) by default. This way, when a user turns the
composite from hide to another visibility, they will see tracks displayed in the browser.
Default composites within a super-track: For super-tracks that you don't
want displaying by default when your track hub is turned on, it is recommended that some (or all)
composites within the super-track be set to dense (or some visibility other than hide) by default
and that the super-track be set to hide by default. This way, if a user changes a super-track from
hide to show from the controls under the browser image, tracks are displayed. To implement, change
the visibility line in trackDb of the super-tracks to hide and the visibility lines of all or some
of the composite tracks within to dense (or some visibility other than hide).
hgTrackUi controls: In addition to the controls for each view (click on
the title of the view drop-down), there is often another set of controls above the view drop-downs
(just under the "Overall display mode"). This set of controls is not associated with a
particular view and clashes with the view controls. It is recommended to remove the controls that
are not associated with a particular view.
Step 8. Create track description files
Each track in the hub may have an associated description file that describes the track to viewers.
The file provides detailed information about the data displayed in the track, including methods used
to produce and validate the data, background information, display conventions, acknowledgments, and
reference publications. The description file, which must be in HTML format, is inserted into the
track configuration page that displays when the user clicks on the track's short label. It also
displays on the track details page that is shown when the user clicks on a feature in the track
image.
The track description file must have the same name as the symbolic name for the track (defined by
the track tag in the trackDb.txt file) with a suffix of .html. For instance, a
description file associated with the track named "dnaseSignal" in Example 4 would
be named "dnaseSignal.html". The description file must reside in the same directory as the
trackDb.txt file.
Both parent and child tracks within a super-track can have their own description files. If the
description file is not present, the corresponding sections of the track settings and details pages
are left blank. Only one description page can be associated with composite and multiWig tracks; the
file name should correspond to the symbolic name of the top-level track in the composite.
Debugging and updating Track Hubs
Changing udcTimeout to debug hub display
As part of the track hub mechanism, UCSC caches data from the hub on the local server. The hub
utility periodically checks the time stamps on the hub files, and downloads them again only if they
have a time stamp newer than the UCSC one. For performance reasons, UCSC checks the time stamps
every 300 seconds, which can result in a 5-minute delay between the time a hub file is updated and
the change appears on the Genome Browser. Hub providers can work around this delay by inserting the
CGI variable udcTimeout=1 into the Genome Browser URL, which will reduce
the delay to one second. To add this variable, open the Genome Browser tracks page and zoom or
scroll the image to display a full browser URL in which the CGI variables visible. Insert the CGI
variable just after the "hgTracks" portion of the URL so that it reads
http://genome.ucsc.edu/cgi-bin/hgTracks?udcTimeout=1& (with the remainder of the URL
following the ampersand). To restore the default timeout, a warning message will appear on hgTracks
with a link to clear the udcTimeout variable.
Check hub settings using hubCheck utility
It is a good practice to run the command-line utility hubCheck on your track hub when you
first bring it online and whenever you make significant changes. This utility by default checks
that the files in the hub are correctly formatted, but it can also be configured to check a few
other things including that various trackDb settings are correctly spelled and that they are
supported by the UCSC Genome Browser. You can read more about using hubCheck to check the
compatibility of your hub with other genome browsers below.
Here is the usage statement for the hubCheck utility:
hubCheck - Check a track data hub for integrity.
usage:
hubCheck http://yourHost/yourDir/hub.txt
options:
-checkSettings - check trackDb settings to spec
-version=[v?|url] - version to validate settings against
(defaults to version in hub.txt, or current standard)
-extra=[file|url] - accept settings in this file (or url)
-level=base|required - reject settings below this support level
-settings - just list settings with support level
Will create this directory if not existing
-noTracks - don't check remote files for tracks, just trackDb (faster)
-udcDir=/dir/to/cache - place to put cache for remote bigBeds and bigWigs
Note that you will have to use the udcDir if /tmp/udcCache is not writable on your machine.
The hubCheck program is available from the UCSC downloads server at
http://hgdownload.soe.ucsc.edu/admin/exe/.
Setting up track item search
The Genome Browser supports searching for items within bigBed tracks in track data hubs. To support
this behavior you have to add an index to the bigBed file when you initially create the the bigBed
file from the bed file input. Indices are usually created on the name field of the bed, but can be
created on any field of the bed. Free-text searches can also be enabled by creating a TRIX index
file that maps id's in the track to free-text metadata.
See the searchIndex and searchTrix fields in the Hub Track Database Definition document for
information on how to set up your bigBed to enable searching. See an example
searchable hub.
Displaying Track Hubs by URL and in sessions
Once you have successfully loaded your hub, by pasting the URL to the location of your hub.txt file
into the My Hubs tab of the Track Data Hubs
page, you may want to consider building URLs to directly load the hub along with session
settings.
To build a URL that will load the hub directly, add "&hubUrl=" to the hgTracks
CGI followed by the address of the hub.txt file. You also need to include the UCSC assembly you
are displaying the hub upon such as "db=hg19". For example, here is a working link that
will visualize the ENCODE AWG hub:
http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&hubUrl=http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hub.txt
If you also want to load your hub at specific browser coordinates and with a specific set of other
browser tracks (you can also set the visibility of each track in your hub), you can save your
hub in a session, which are in essence "View Settings" collected in a text file (example
session text files here). Sessions can be
shared in different ways,
please see the instructions for
creating a session and saving it to a file.
By making your session file available over the Internet, you can build a URL that will load the
session automatically by adding "&hgS_loadUrlName=" to the hgTracks CGI followed by
the URL location of the saved session file. The location of the saved session file should be in the
same directory that holds your hub.txt file. Finally add "&hgS_doLoadUrl=submit"
to the URL to inform the browser to load the session.
There are four required variables for your URL to load a session with a hub and an example URL:
db - name of the assembly (e.g. hg19 or mm10)
hubUrl - URL to your track hub
hgS_loadUrlName - URL to your session file
hgS_doLoadUrl - value should be "submit"
http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&hubUrl=http://myLab.org/myHub.txt&hgS_loadUrlName=http://mySession&hgS_doLoadUrl=submit
Another feature one can use in place of "&hubUrl=" is
"&hubClear=", which will load a hub while simultaneously disconnecting
or clearing, hubs located at the same location. For example adding
&hubClear=http://university.edu/lab/folder/hub10.txt
would connect the
referenced hub10.txt while simultaneously disconnecting any hubs that might be displayed from
the same http://university.edu/lab/folder/
directory (for example, hub1.txt, hub2.txt,
ect.). This feature can be useful for dynamically generated hubs that might collect in the browser
otherwise.
Beyond the URL options of "&hubUrl=" and "&hubClear="
there are many other ways to link to the Browser
including a list of URL optional parameters described in the the Custom Tracks User's Guide.
Registering a Track Hub with UCSC
If you would like to share your track hub with other Genome Browser users, you can register your
hub with UCSC by contacting the Genome Browser technical support mailing list at
[email protected]. Please include the URL of your
hub.txt file in the message. Once registered, your hub will appear as a link on the Public Hubs tab
on the Track Hubs page. To assist developers of Public Hubs, there is a
Public Hubs Guidelines
page. The page shares pointers and preferred style approaches, such as the need for creating
description html pages for your data that display any available references and an email contact for
further data questions.
Alternatively, you can share your track hub with selected colleagues by providing them with the URL
needed to load your hub via the My Hubs tab.
Testing hub compatibility between genome browsers
Due to the growth in popularity of the track hub format, other genome browsers have
begun supporting the UCSC track hub format. The hubCheck utility can be used to check
the compatibility of your hub with the UCSC Genome Browser and other genome browsers.
The following examples describe various settings that may be useful to you as you test
this compatibility.
Example 1: Listing the trackDb settings and their support levels
and filtering them by support level.
You can see all of the currently supported trackDb settings and their support level in
the UCSC Genome Browser by running hubCheck with the "-settings" option:
$ hubCheck -settings
You can filter the displayed settings by support level by using the "-level="
option followed by the maximum level you wanted displayed. For example,
to show only the required settings:
$ hubCheck -settings -level=required
Or, to show settings at both the required and base support levels:
$ hubCheck -settings -level=base
When you use the "-level" option to declare a support level, it includes
the level that you define plus every level above it. This hierarchy of the different support
levels is defined at the beginning of the
Track Database Definition document.
Example 2: Checking your trackDb settings against the settings and support levels
provided by the UCSC Genome Browser.
The "-checkSettings" option can be used check the settings in your hub's trackDb file
against those provided by the UCSC Genome Browser on the Track Database definition page.
This does not check to see that all of these settings are properly used, but checks to see
that they are supported by the Genome Browser. For example, to fully run hubCheck on your hub:
$ hubCheck -checkSettings http://genome.ucsc.edu/goldenPath/help/examples/hubDirectory/hub.txt
If you are just looking to check the compatibility of your hub and not the integrity of your files,
you can use the "-noTracks" option to just check setting compatibility. Skipping these
file integrity checks will also speed up hubCheck.
Here is an example of some of the errors you might see if your hub includes
unsupported settings:
$ hubCheck -checkSettings http://genome.ucsc.edu/goldenPath/help/examples/hubExamples/hubCheckUnsupportedSettings/hub.txt
Found 1 problem:
Setting 'ensemblAssemblyName' is unknown/unsupported
If you want to check the settings used in your hub against those at a particular support level
and higher, you can also use the "-level=" setting here as well. For example:
$ hubCheck -checkSettings -level=base http://genome.ucsc.edu/goldenPath/help/examples/hubExamples/hubGroupings/hub.txt
The resulting list of problems reported by hubCheck are settings beyond the base support level and are
less likely to be supported at other genome browsers, as not all external browsers support
the full list of UCSC settings.
We will periodically increment the version number for our trackDb settings
as large numbers of settings are added, updated or (rarely) removed. Using just the
"-checkSettings" option will check your trackDb settings against those defined in
the most recent version of the Track Database definition page. However, if you want to
check your settings against an older version of these settings, you can use the
"-version=" option. For example, if you wanted to check your hub against version one
of our trackDb settings:
$ hubCheck -checkSettings -version=v1 http://genome.ucsc.edu/goldenPath/help/examples/hubExamples/hubGroupings/hub.txt
Example 3: Checking your settings against those provided by UCSC and another
source, such as Ensembl.
If you want to check the settings in your hub against those supported by other genome browsers,
you will first need to create a single-column file that lists each non-UCSC setting and then
use the "-extra=" option to specify this file when running hubCheck. For example,
if you knew that a setting called "ensemblAssemblyName" was supported for use in track
hubs by Ensembl, you could create a single line file that included the setting
"ensemblAssemblyName". Then, when you want to check a hub that includes these extra
trackDb settings, you would then specify this extra settings file on the command line:
$ hubCheck -checkSettings -extra=http://genome.ucsc.edu/goldenPath/help/examples/hubExamples/hubCheckUnsupportedSettings/myExtraSettings.txt http://genome.ucsc.edu/goldenPath/help/examples/hubExamples/hubCheckUnsupportedSettings/hub.txt
(Note: The settings listed here in the "extra" file are
just examples and do not represent real trackDb variables for hubs at Ensembl.)