[Genome] known gene IDs (kgId) refer to wrong Ensembl transcript ID (ENST) in knownToEnsembl table
Kayla Smith
kayla at soe.ucsc.edu
Thu Nov 8 17:24:41 PST 2007
Paco,
The Ensembl Gene Predictions track, ensGene, is a prediction track.
Notice that ACCN3 and ABCB8 are neighbors on chr7. There are a few
predicted ENSTs that span both of those genes. This is why you are
seeing ENSG00000197150 correspond to both genes.
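
As a rough illustration, the many-to-one mapping can be detected rather
than treated as a fatal duplicate. The sketch below is only an example,
not part of any UCSC tool: the function names are hypothetical, and the
rows are copied by hand from the table in the quoted message rather than
queried from the database.

```python
from collections import defaultdict

# (kgId, gene symbol, ENST, ENSG) rows copied from the quoted table.
# Hard-coded here for illustration; a real script would query the tables.
ROWS = [
    ("uc003win.1", "ACCN3", "ENST00000356058", "ENSG00000197150"),
    ("uc003wio.1", "ACCN3", "ENST00000356058", "ENSG00000197150"),
    ("uc003wip.1", "ACCN3", "ENST00000356058", "ENSG00000197150"),
    ("uc003wik.1", "ABCB8", "ENST00000358849", "ENSG00000197150"),
    ("uc003wiq.1", "ASIC3", "ENST00000356058", "ENSG00000197150"),
    ("uc003wil.1", "ABCB8", "ENST00000297504", "ENSG00000197150"),
    ("uc003wim.1", "ABCB8", "ENST00000358849", "ENSG00000197150"),
    ("uc003wij.1", "ABCB8", "ENST00000356058", "ENSG00000197150"),
    ("uc003wii.1", "ABCB8", "ENST00000356058", "ENSG00000197150"),
]


def symbols_per_ensg(rows):
    """Collect the set of gene symbols that resolve to each ENSG ID."""
    grouped = defaultdict(set)
    for _kg_id, symbol, _enst, ensg in rows:
        grouped[ensg].add(symbol)
    return grouped


def ambiguous_genes(rows):
    """Return only the ENSG IDs reached from more than one gene symbol."""
    return {
        ensg: sorted(symbols)
        for ensg, symbols in symbols_per_ensg(rows).items()
        if len(symbols) > 1
    }


if __name__ == "__main__":
    for ensg, symbols in ambiguous_genes(ROWS).items():
        print(ensg, ",".join(symbols))
```

A script could log such cases and continue instead of stopping on the
first duplicate.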
Our ensGene and ensGtp tables are out of date, and updating them is on
our list of things to do. I can let you know when this information is
updated.
I hope this information is helpful to you. Please don't hesitate to
contact us again if you require further assistance.
Kayla Smith
UCSC Genome Bioinformatics Group
Paco Hulpiau wrote:
> Hi,
>
> I want to get transcripts (both RefSeq NM_ accessions and Ensembl ENSTs)
> for every gene by using the approved symbols from HGNC. The script to do
> the job stopped because it found a duplicate entry for an hgncId, which
> was apparently caused by the entries below.
>
> The HGNC symbols are searched in the kgXref table to get the RefSeqs and
> I use the known gene ID (kgId) to get the Ensembl transcripts. In the
> knownToEnsembl table I get one ENST id using the kgId and then use this
> ENST to get the ENSG id in the ensGtp table. If I have the ENSG I can
> get all ENSTs for the gene again in the ensGtp table.
>
> Both ACCN3 and ABCB8 lead to ENSG00000197150 (ABCB8). I think the kgIds
> for ACCN3 are referring to the wrong ENST ids in the knownToEnsembl
> table. Same for ASIC3. In the Ensembl genome browser ENST00000356058 is
> a transcript for ABCB8 and not for ACCN3 or for ASIC3. Is this correct
> or am I missing something here?
>
> kgXref table                   knownToEnsembl table  ensGtp table
> kgId        accession  symbol  ENST (using kgId)     ENSG (using ENST)
> uc003win.1  NM_004769  ACCN3   ENST00000356058       ENSG00000197150
> uc003wio.1  NM_020321  ACCN3   ENST00000356058       ENSG00000197150
> uc003wip.1  NM_020322  ACCN3   ENST00000356058       ENSG00000197150
> uc003wik.1  NM_007188  ABCB8   ENST00000358849       ENSG00000197150
> uc003wiq.1  AB209421   ASIC3   ENST00000356058       ENSG00000197150
> uc003wil.1  AK002018   ABCB8   ENST00000297504       ENSG00000197150
> uc003wim.1  AK094005   ABCB8   ENST00000358849       ENSG00000197150
> uc003wij.1  CR599833   ABCB8   ENST00000356058       ENSG00000197150
> uc003wii.1  AK128129   ABCB8   ENST00000356058       ENSG00000197150
>
> Another thing I've noticed (e.g. for PCDH11X) is that some ENSTs are
> deprecated in Ensembl but are still in the ensGtp table. Now I also use
> the Ensembl API to look if all ENSTs found for a certain gene have a
> stable ID or not.
>
> Maybe there is another way to do the job? Thanks for any help or comment.
>
> Regards,
>
> Paco
> _______________________________________________
> Genome maillist - Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome