[Genome] protein ID / align ID
Fan Hsu
fanhsu at soe.ucsc.edu
Wed Mar 14 11:16:55 PDT 2007
Hi Anton,
If you look closely, NM_001001486 and NM_001001487 differ slightly in
their CDS structure: their second-to-last exons have different lengths.
That exon ends at 132202886 in NM_001001486 but at 132202856 in
NM_001001487.
The Known Gene NM_001001487 is a better match to the UniProt splice variant
P98194-3, which is a splice isoform of P98194 (AT2C1_HUMAN).
The UCSC Known Genes track tries to include different isoforms of a gene
when there is supporting data.
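
One way to spot a difference like this is to compare the exonStarts and
exonEnds lists of the two records exon by exon. The following is only a
minimal Python sketch, not part of any UCSC tool: the function names
parse_coords and diff_exons are made up for illustration, and to keep it
short it uses just the last three exons (26-28) of the rows quoted below.

```python
def parse_coords(field):
    """Turn a knownGene-style comma-separated list (with trailing comma) into ints."""
    return [int(x) for x in field.split(",") if x]


def diff_exons(starts_a, ends_a, starts_b, ends_b):
    """Return (index, (start_a, end_a), (start_b, end_b)) for each exon that differs."""
    exons_a = list(zip(parse_coords(starts_a), parse_coords(ends_a)))
    exons_b = list(zip(parse_coords(starts_b), parse_coords(ends_b)))
    if len(exons_a) != len(exons_b):
        raise ValueError("exon counts differ; compare the structures by hand")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(exons_a, exons_b), start=1)
        if a != b
    ]


# Excerpt: exons 26-28 of each record, copied from the quoted rows.
starts = "132201059,132202761,132217693,"
ends_486 = "132201201,132202886,132218251,"  # NM_001001486
ends_487 = "132201201,132202856,132218251,"  # NM_001001487

differences = diff_exons(starts, ends_486, starts, ends_487)
for i, a, b in differences:
    # i counts within the excerpt, so add 25 to get the exon number in the gene.
    print(f"exon {i + 25}: {a[0]}-{a[1]} ({a[1] - a[0]} bp) "
          f"vs {b[0]}-{b[1]} ({b[1] - b[0]} bp)")
```

On this excerpt the only exon reported is exon 27, which is 125 bp long in
NM_001001486 and 95 bp long in NM_001001487.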
Fan.
-----Original Message-----
From: genome-bounces at soe.ucsc.edu [mailto:genome-bounces at soe.ucsc.edu] On Behalf Of Anton Kratz
Sent: Tuesday, March 13, 2007 11:10 PM
To: genome at soe.ucsc.edu
Subject: [Genome] protein ID / align ID
Hi,
In the UCSC KnownGene (May 2004) dataset, why are there sometimes genes
which differ only in the protein ID and align ID fields?
For example NM_001001486 and NM_001001487 (exact description below) look
identical in structure, only protein ID and align ID are different.
What makes them different genes? Could you briefly explain the meaning of
the protein ID and align ID fields (how to interpret them)?
Regarding the differing align IDs, I thought they meant that the KnownGene
maps to different locations on the genome with the same quality. But the
start and end positions are identical here.
regards,
Anton
name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount,
exonStarts, exonEnds, proteinID, alignID
NM_001001486 chr3 + 132096131 132218251 132096311 132217789 28 132096131,132131957,132133563,132136163,132138969,132142171,132143132,132155362,132156553,132157645,132160816,132165512,132166489,132168686,132168871,132170833,132176873,132180790,132182123,132194492,132195473,132197584,132198221,132199147,132199835,132201059,132202761,132217693, 132096317,132132068,132133680,132136253,132139005,132142233,132143241,132155518,132156622,132157721,132160883,132165637,132166587,132168782,132168961,132170938,132177030,132180961,132182221,132194543,132195640,132197653,132198338,132199295,132199931,132201201,132202886,132218251, AT2C1_HUMAN R9596
NM_001001487 chr3 + 132096131 132218251 132096311 132217789 28 132096131,132131957,132133563,132136163,132138969,132142171,132143132,132155362,132156553,132157645,132160816,132165512,132166489,132168686,132168871,132170833,132176873,132180790,132182123,132194492,132195473,132197584,132198221,132199147,132199835,132201059,132202761,132217693, 132096317,132132068,132133680,132136253,132139005,132142233,132143241,132155518,132156622,132157721,132160883,132165637,132166587,132168782,132168961,132170938,132177030,132180961,132182221,132194543,132195640,132197653,132198338,132199295,132199931,132201201,132202856,132218251, P98194-3 R17233