4b3dd9e844b2a6c502e5405e70f7976e30df7225 braney Thu Apr 25 13:13:51 2024 -0700 knownGeneV45lift37 on hg19 diff --git src/hg/makeDb/trackDb/human/hg19/knownGeneV45lift37.html src/hg/makeDb/trackDb/human/hg19/knownGeneV45lift37.html new file mode 100644 index 0000000..a18605f --- /dev/null +++ src/hg/makeDb/trackDb/human/hg19/knownGeneV45lift37.html @@ -0,0 +1,178 @@ +
+The GENCODE Genes track (version 45, January 2024) shows high-quality manual +annotations merged with evidence-based automated annotations across the entire +human genome generated by the +GENCODE project. +By default, only the basic gene set is +displayed, which is a subset of the comprehensive gene set. The basic set represents transcripts +that GENCODE believes will be useful to the majority of users.
+ ++The track includes protein-coding genes, non-coding RNA genes, and pseudo-genes, though pseudo-genes +are not displayed by default. It contains annotations on the reference chromosomes as well as +assembly patches and alternative loci (haplotypes).
+ ++The following table provides statistics for the v45 release derived from the GTF file that contains +annotations only on the main chromosomes. More information on how they were generated can be found +in the GENCODE site.
+ ++
+ ++
+ GENCODE v45 Release Stats
+ Genes                                          Observed   Transcripts                                           Observed
+ Protein-coding genes                           19,395     Protein-coding transcripts                            89,110
+ Long non-coding RNA genes                      20,424     - full length protein-coding                          64,028
+ Small non-coding RNA genes                     7,565      - partial length protein-coding                       25,082
+ Pseudogenes                                    14,719     Nonsense mediated decay transcripts                   21,427
+ Immunoglobulin/T-cell receptor gene segments   648        Long non-coding RNA loci transcripts                  59,719
+ Total No of distinct translations              65,357     Genes that have more than one distinct translations   13,600
+
+For more information on the different gene tracks, see our Genes FAQ.
+ ++By default, this track displays only the basic GENCODE set, splice variants, and non-coding genes. +It includes options to display the entire GENCODE set and pseudogenes. To customize these +options, the respective boxes can be checked or unchecked at the top of this description page. + +
+This track also includes a variety of labels which identify the transcripts when visibility is set +to "full" or "pack". Gene symbols (e.g. NIPA1) are displayed by default, but +additional options include GENCODE Transcript ID (ENST00000561183.5), UCSC Known Gene ID +(uc001yve.4), and UniProt Display ID (Q7RTP0). Additional information about gene +and transcript names can be found in our +FAQ.
+ ++This track, in general, follows the display conventions for gene prediction tracks. The exons for +putative non-coding genes and untranslated regions are represented by relatively thin blocks, while +those for coding open reading frames are thicker. +
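The thin/thick convention described above can be illustrated with a minimal sketch. It assumes genePred-style half-open coordinates, where the coding region is given by `cdsStart` and `cdsEnd` and a non-coding transcript has `cdsStart == cdsEnd`. The function name `split_exons` and the sample coordinates are hypothetical, and this is not the Genome Browser's own drawing code.

```python
from typing import List, Tuple

# (start, end, style), where style is "thin" (UTR / non-coding) or "thick" (coding)
Block = Tuple[int, int, str]


def split_exons(exons: List[Tuple[int, int]], cds_start: int, cds_end: int) -> List[Block]:
    """Split exons into thin and thick display blocks.

    Hypothetical helper: coordinates are assumed to be half-open,
    and cds_start == cds_end is taken to mean "no coding region".
    """
    blocks: List[Block] = []
    for start, end in exons:
        # Exon lies entirely outside the coding region, or there is no coding region
        if cds_start >= cds_end or end <= cds_start or start >= cds_end:
            blocks.append((start, end, "thin"))
            continue
        # Untranslated part before the coding region
        if start < cds_start:
            blocks.append((start, cds_start, "thin"))
        # Coding part of the exon
        blocks.append((max(start, cds_start), min(end, cds_end), "thick"))
        # Untranslated part after the coding region
        if end > cds_end:
            blocks.append((cds_end, end, "thin"))
    return blocks


# Made-up three-exon transcript whose coding region starts and ends mid-exon
coding_blocks = split_exons([(100, 200), (300, 400), (500, 600)], 150, 550)
# The same exons for a transcript with no coding region
noncoding_blocks = split_exons([(100, 200), (300, 400)], 100, 100)
```

In this sketch the first and last exons are each split into one thin and one thick block, while every exon of the non-coding transcript stays thin.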
Coloring for the gene annotations is based on the annotation type:
++This track contains an optional codon coloring feature that allows users to +quickly validate and compare gene predictions. There is also an option to display the data as +a density graph, which +can be helpful for visualizing the distribution of items over a region.
+ + ++In pack display mode, transcripts within a gene that fall below a specified rank are +condensed into a view similar to squish mode. The transcript ranking approach is +preliminary and will change in future releases. The transcript rankings are defined by the +following criteria for protein-coding and non-coding genes:
+Protein_coding genes +
+The GENCODE v45 track was built from the GENCODE downloads file +gencode.v45.chr_patch_hapl_scaff.annotation.gff3.gz. Data from other sources +were correlated with the GENCODE data to build association tables.
+The GENCODE Genes transcripts are annotated in numerous tables, each of which is also available as a +downloadable +file. + +
+One can see a full list of the associated tables in the Table Browser by selecting GENCODE Genes from the track menu; this list +is then available on the table menu. + + +
+GENCODE Genes and its associated tables can be explored interactively using the +REST API, the +Table Browser or the +Data Integrator. +The genePred format files for hg38 are available from our + +downloads directory or in our + +GTF download directory. +All the tables can also be queried directly from our public MySQL +servers, with more information available on our +help page as well as on +our blog.
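As a rough illustration of the REST API access mentioned above, the sketch below only builds a request URL for the `/getData/track` endpoint and does not contact the server. The track name `knownGeneV45lift37` is an assumption taken from this page's file name, the coordinates are arbitrary, and `track_data_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

API_BASE = "https://api.genome.ucsc.edu"


def track_data_url(genome: str, track: str, chrom: str, start: int, end: int) -> str:
    """Build a UCSC REST API URL for the items of one track in a region.

    Hypothetical helper; the track name passed in should be checked
    against the API's track listing for the assembly before use.
    """
    query = urlencode(
        {"genome": genome, "track": track, "chrom": chrom, "start": start, "end": end}
    )
    return f"{API_BASE}/getData/track?{query}"


# Assumed track name and an arbitrary region on chr15
url = track_data_url("hg19", "knownGeneV45lift37", "chr15", 23000000, 23100000)
print(url)
```

The resulting URL can be opened in a browser or fetched with any HTTP client; the response is JSON.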
+ ++The GENCODE Genes track was produced at UCSC from the GENCODE comprehensive gene set using a +computational pipeline developed by Jim Kent and Brian Raney. This version of the track was +generated by Jonathan Casper.
+ ++Frankish A, Carbonell-Sala S, Diekhans M, Jungreis I, Loveland JE, Mudge JM, Sisu C, Wright JC, +Arnan C, Barnes I et al. + +GENCODE: reference annotation for the human and mouse genomes in 2023. +Nucleic Acids Res. 2023 Jan 6;51(D1):D942-D949. +PMID: 36420896; PMC: PMC9825462 +
+ +A full list of GENCODE publications is available +at The GENCODE +Project web site. +
+ +GENCODE data are available for use without restrictions.