[Genome] Known Genes and Refseq genes
Rachel Harte
hartera at soe.ucsc.edu
Mon Sep 3 11:23:06 PDT 2007
Hello Zhang,
Known Genes has recently been updated to UCSC Genes which now includes
both protein-coding and noncoding genes. The ID that you provide below
(uc001dbx.1) is a UCSC stable ID from this new track on the human hg18
assembly.
If you go to the latest human assembly, hg18, and click on the blue/gray bar
at the left side of the UCSC Genes track (or on the hyperlink above the
track control), you will see the description for this track where it
is stated that it includes gene predictions for noncoding genes:
"The UCSC Genes track shows gene predictions based on data from RefSeq,
Genbank, and UniProt. This is a moderately conservative set of
predictions, requiring the support of one GenBank RNA sequence plus at
least one additional line of evidence. The RefSeq RNAs are an exception to
this, requiring no additional evidence. The track includes both
protein-coding and putative non-coding transcripts. Some of these
non-coding transcripts may actually code for protein, but the evidence for
the associated protein is weak at best."
Our Table Browser will allow you to select only the protein-coding genes
in the UCSC Genes set. If you click on the "Tables" link on the top blue
menu bar, you will be taken to the Table Browser interface. Select the
following:
genome: Human
assembly: Mar. 2006
group: Genes and Gene Prediction Tracks
track: UCSC Genes
table: kgTxInfo
Then click on the "create" button next to "filter" and set the
category field to "coding".
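If you would rather download the whole kgTxInfo table and filter it yourself, the same selection can be sketched in a few lines of Python. This is only an illustration: it assumes the Table Browser output is tab-separated with a header row starting with "#" and containing "name" and "category" columns, and every row in the sample except the uc001dbx.1 ID is made up.

```python
import csv
import io


def ids_in_category(tsv_text, category="coding"):
    """Return the 'name' values of rows whose 'category' column matches."""
    lines = tsv_text.splitlines()
    # Assumption: the first line is a header row prefixed with '#'.
    header = lines[0].lstrip("#").split("\t")
    body = io.StringIO("\n".join(lines[1:]))
    reader = csv.DictReader(body, fieldnames=header, delimiter="\t")
    return [row["name"] for row in reader if row["category"] == category]


# Illustrative rows only: the two ucEXAMPLE IDs are placeholders, and the
# category shown for uc001dbx.1 is assumed from its being a microRNA.
sample = (
    "#name\tcategory\n"
    "uc001dbx.1\tnoncoding\n"
    "ucEXAMPLE1.1\tcoding\n"
    "ucEXAMPLE2.1\tcoding\n"
)

print(ids_in_category(sample))
```

Changing the category argument to "noncoding" would pull out the other group in the same way.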
Alternatively, you can use our public MySQL server to query our database
tables directly:
http://genome.ucsc.edu/FAQ/FAQdownloads#download29
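As a rough sketch of such a query, the Python below only builds the mysql client command line and prints it; it does not connect to anything. The host name and user are assumptions on my part, so please check the FAQ page above for the current connection details. The "name" and "category" column names are assumed to match the filter described for the Table Browser.

```python
import shlex


def build_mysql_command(db="hg18", category="coding",
                        host="genome-mysql.soe.ucsc.edu", user="genome"):
    """Build a mysql client invocation listing IDs in one kgTxInfo category."""
    # Assumption: kgTxInfo has 'name' and 'category' columns.
    query = f"SELECT name FROM kgTxInfo WHERE category = '{category}'"
    return [
        "mysql",
        f"--user={user}",   # assumed public read-only account
        f"--host={host}",   # assumed host name; confirm on the FAQ page
        "-A",               # skip automatic rehashing of table names
        "-N",               # omit the column-name header from the output
        "-D", db,           # database to use (hg18 = Mar. 2006 assembly)
        "-e", query,        # run this statement and exit
    ]


command = build_mysql_command()
# Print a shell-quoted command that could be pasted into a terminal.
print(shlex.join(command))
```

Redirecting the output of the printed command to a file would give one UCSC Genes ID per line.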
I hope that this will help you. Please let us know if you have further
questions.
Rachel
Rachel Harte
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu
On Sun, 2 Sep 2007, zhangrui wrote:
> Hi,
>
> I found that the known genes set of human contains some noncoding genes, such as microRNAs. For example, "uc001dbx.1 - 65296704 65296779 chr1", which is correspond to "hsa-mir-101-1".
> However, in the website, it is said that the Known Genes track shows known protein coding genes based on proteins from SWISS-PROT, TrEMBL, and TrEMBL-NEW and their corresponding mRNAs from GenBank. Could you tell me the possible reason and how can I get the known protein coding genes in the human genomes?
>
> whether the Refseq genes only contain protein coding genes?
>
> Thanks,
> Zhang
>
> _______________________________________________
> Genome maillist - Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome
>