4b3dd9e844b2a6c502e5405e70f7976e30df7225 braney Thu Apr 25 13:13:51 2024 -0700
knownGeneV45lift37 on hg19

diff --git src/hg/makeDb/trackDb/human/hg19/knownGeneV45lift37.html src/hg/makeDb/trackDb/human/hg19/knownGeneV45lift37.html
new file mode 100644
index 0000000..a18605f
--- /dev/null
+++ src/hg/makeDb/trackDb/human/hg19/knownGeneV45lift37.html
@@ -0,0 +1,178 @@

Description

The GENCODE Genes track (version 45, January 2024) shows high-quality manual annotations merged with evidence-based automated annotations across the entire human genome generated by the GENCODE project. By default, only the basic gene set is displayed, which is a subset of the comprehensive gene set. The basic set represents transcripts that GENCODE believes will be useful to the majority of users.

The track includes protein-coding genes, non-coding RNA genes, and pseudogenes, though pseudogenes are not displayed by default. It contains annotations on the reference chromosomes as well as assembly patches and alternative loci (haplotypes).

The following table provides statistics for the v45 release derived from the GTF file that contains annotations only on the main chromosomes. More information on how these statistics were generated can be found on the GENCODE site.

GENCODE v45 Release Stats

Genes | Observed | Transcripts | Observed
Protein-coding genes | 19,395 | Protein-coding transcripts | 89,110
Long non-coding RNA genes | 20,424 | - full length protein-coding | 64,028
Small non-coding RNA genes | 7,565 | - partial length protein-coding | 25,082
Pseudogenes | 14,719 | Nonsense mediated decay transcripts | 21,427
Immunoglobulin/T-cell receptor gene segments | 648 | Long non-coding RNA loci transcripts | 59,719
Total No of distinct translations | 65,357 | Genes that have more than one distinct translations | 13,600
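As a quick consistency check on the table, the full-length and partial-length counts should add up to the protein-coding transcript total. The sketch below copies the three counts from the table; the function name `subtotals_match` is made up for this illustration.

```python
# Counts copied from the GENCODE v45 release statistics table.
protein_coding_transcripts = 89_110
full_length_protein_coding = 64_028
partial_length_protein_coding = 25_082


def subtotals_match(total, parts):
    """Return True when the listed parts sum exactly to the stated total."""
    return sum(parts) == total


print(subtotals_match(
    protein_coding_transcripts,
    [full_length_protein_coding, partial_length_protein_coding],
))
```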

For more information on the different gene tracks, see our Genes FAQ.

Display Conventions and Configuration

By default, this track displays only the basic GENCODE set, splice variants, and non-coding genes. It includes options to display the entire GENCODE set and pseudogenes. To customize these options, the respective boxes can be checked or unchecked at the top of this description page.

This track also includes a variety of labels which identify the transcripts when visibility is set to "full" or "pack". Gene symbols (e.g. NIPA1) are displayed by default, but additional options include GENCODE Transcript ID (ENST00000561183.5), UCSC Known Gene ID (uc001yve.4), and UniProt Display ID (Q7RTP0). Additional information about gene and transcript names can be found in our FAQ.

This track, in general, follows the display conventions for gene prediction tracks. The exons for putative non-coding genes and untranslated regions are represented by relatively thin blocks, while those for coding open reading frames are thicker.
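The thin/thick convention can be illustrated with a small sketch. It assumes a genePred-style record in which each exon has a start and end and the transcript has one coding region (`cds_start`, `cds_end`), all in half-open coordinates. The function name and the "thin"/"thick" labels are hypothetical and are not taken from the browser's drawing code.

```python
def split_exon(exon_start, exon_end, cds_start, cds_end):
    """Split one exon into (start, end, style) segments.

    Parts of the exon inside the coding region are "thick"; untranslated
    parts, and whole exons of non-coding transcripts, are "thin".
    Coordinates are assumed to be half-open: [start, end).
    """
    # An empty coding region (cds_start >= cds_end) marks a non-coding transcript.
    if cds_start >= cds_end:
        return [(exon_start, exon_end, "thin")]

    thick_start = max(exon_start, cds_start)
    thick_end = min(exon_end, cds_end)
    # The exon does not overlap the coding region at all: pure UTR.
    if thick_start >= thick_end:
        return [(exon_start, exon_end, "thin")]

    segments = []
    if exon_start < thick_start:
        segments.append((exon_start, thick_start, "thin"))
    segments.append((thick_start, thick_end, "thick"))
    if thick_end < exon_end:
        segments.append((thick_end, exon_end, "thin"))
    return segments


# An exon that starts in the 5' UTR and runs into the coding region.
print(split_exon(100, 200, 150, 500))
```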

Coloring for the gene annotations is based on the annotation type:

+ + +

This track contains an optional codon coloring feature that allows users to quickly validate and compare gene predictions. There is also an option to display the data as a density graph, which can be helpful for visualizing the distribution of items over a region.

Squishy-pack Display

Within a gene using the pack display mode, transcripts below a specified rank will be condensed into a view similar to squish mode. The transcript ranking approach is preliminary and will change in future releases. The transcript rankings are defined by the following criteria for protein-coding and non-coding genes:

Protein-coding genes
  1. MANE or Ensembl canonical
  2. Coding biotypes
  3. Completeness
  4. CARS score (only for coding transcripts)
  5. Transcript genomic span and length (only for non-coding transcripts)

Non-coding genes
  1. Transcript biotype
  2. Ensembl canonical
  3. GENCODE basic
  4. Transcript genomic span
  5. Transcript length
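One way to read an ordered list of criteria like the ones above is as a lexicographic sort: an earlier criterion always outweighs a later one. The sketch below is only an illustration of that idea for the first four protein-coding criteria. The `Transcript` fields, the boolean simplification of each criterion, and the assumption that a higher CARS score ranks first are all hypothetical; the actual ranking code is not described on this page.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    # All fields are hypothetical stand-ins for the criteria in the list.
    name: str
    is_mane_or_canonical: bool
    is_coding_biotype: bool
    is_complete: bool
    cars_score: float


def rank_key(t):
    """Build a sort key; tuples compare left to right, so earlier criteria dominate.

    False sorts before True, so each boolean is negated to put preferred
    transcripts first. The score is negated on the assumption that a
    higher CARS score is better.
    """
    return (
        not t.is_mane_or_canonical,
        not t.is_coding_biotype,
        not t.is_complete,
        -t.cars_score,
    )


def rank_transcripts(transcripts):
    """Return the transcripts ordered from highest to lowest rank."""
    return sorted(transcripts, key=rank_key)


demo = [
    Transcript("tx_low_score", False, True, True, 0.90),
    Transcript("tx_canonical", True, True, True, 0.10),
    Transcript("tx_high_score", False, True, True, 0.95),
]
print([t.name for t in rank_transcripts(demo)])
```

With a ranking like this in hand, transcripts past a chosen cutoff position would be the ones drawn in the condensed, squish-like style.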

Methods

The GENCODE v45 track was built from the GENCODE downloads file gencode.v45.chr_patch_hapl_scaff.annotation.gff3.gz. Data from other sources were correlated with the GENCODE data to build association tables.
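For readers unfamiliar with GFF3, the sketch below shows how one feature line of such a file can be split into its nine tab-separated columns and its `key=value` attributes. It is a minimal illustration of the file format, not the pipeline used to build the track; the sample line and its attribute values are invented for the example.

```python
from urllib.parse import unquote


def parse_gff3_line(line):
    """Parse one GFF3 feature line into a dictionary.

    GFF3 has nine tab-separated columns; the ninth holds semicolon-separated
    key=value pairs whose values may be percent-encoded. Multi-valued
    attributes (comma-separated) are left as a single string here.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, found {len(fields)}")
    seqid, source, feature_type, start, end, score, strand, phase, attrs = fields

    attributes = {}
    for pair in attrs.split(";"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        attributes[key] = unquote(value)

    return {
        "seqid": seqid,
        "source": source,
        "type": feature_type,
        "start": int(start),  # GFF3 coordinates are 1-based, inclusive
        "end": int(end),
        "strand": strand,
        "attributes": attributes,
    }


# Invented sample line; real GENCODE records carry more attributes.
sample = (
    "chr1\tHAVANA\ttranscript\t100\t200\t.\t+\t.\t"
    "ID=EXAMPLE_TX.1;gene_name=EXAMPLE;transcript_type=protein_coding"
)
record = parse_gff3_line(sample)
print(record["type"], record["attributes"]["gene_name"])
```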

Related Data

The GENCODE Genes transcripts are annotated in numerous tables, each of which is also available as a downloadable file.

One can see a full list of the associated tables in the Table Browser by selecting GENCODE Genes from the track menu; this list is then available on the table menu.

Data access

GENCODE Genes and its associated tables can be explored interactively using the REST API, the Table Browser or the Data Integrator. The genePred format files for hg38 are available from our downloads directory or in our GTF download directory. All the tables can also be queried directly from our public MySQL servers, with more information available on our help page as well as on our blog.
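As a rough sketch of REST API access, the snippet below builds a request URL for the `getData/track` endpoint of the UCSC REST API. The track name `knownGeneV45lift37` is an assumption taken from this page's file name, and the region is arbitrary; check the API's track listing for the name actually used on hg19 before relying on it.

```python
from urllib.parse import urlencode

API_BASE = "https://api.genome.ucsc.edu"


def track_data_url(genome, track, chrom, start, end):
    """Build a getData/track URL for one genomic region."""
    query = urlencode({
        "genome": genome,
        "track": track,
        "chrom": chrom,
        "start": start,
        "end": end,
    })
    return f"{API_BASE}/getData/track?{query}"


# Assumed track name and an arbitrary region on chr15.
url = track_data_url("hg19", "knownGeneV45lift37", "chr15", 23000000, 23100000)
print(url)
```

The resulting URL can be fetched with any HTTP client, for example `urllib.request.urlopen(url)`, and the response is JSON.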

Credits

The GENCODE Genes track was produced at UCSC from the GENCODE comprehensive gene set using a computational pipeline developed by Jim Kent and Brian Raney. This version of the track was generated by Jonathan Casper.

References

Frankish A, Carbonell-Sala S, Diekhans M, Jungreis I, Loveland JE, Mudge JM, Sisu C, Wright JC, Arnan C, Barnes I et al. GENCODE: reference annotation for the human and mouse genomes in 2023. Nucleic Acids Res. 2023 Jan 6;51(D1):D942-D949. PMID: 36420896; PMC: PMC9825462

A full list of GENCODE publications is available at The GENCODE Project web site.

Data Release Policy


GENCODE data are available for use without restrictions.