[Genome] protein ID / align ID

Fan Hsu fanhsu at soe.ucsc.edu
Wed Mar 14 11:16:55 PDT 2007


Hi Anton,

If you look closely, NM_001001486 and NM_001001487 differ slightly in their
CDS structure.  Their second-to-last exons have different lengths: exon 27
ends at 132202886 in NM_001001486 but at 132202856 in NM_001001487.
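
A difference like this is easy to miss by eye, so it can help to compare the
exon coordinates programmatically. Below is a minimal sketch in Python, not
an official UCSC tool. The coordinates are copied from the two records quoted
later in this message. The helper names parse_coords and exon_differences are
made up for illustration. Exon length is computed as end minus start, which
assumes the usual knownGene convention of 0-based starts and exclusive ends.

```python
# Exon coordinates copied from the two knownGene rows quoted in this thread.
# Both transcripts list the same exonStarts, so one string is shared.
EXON_STARTS = (
    "132096131,132131957,132133563,132136163,132138969,132142171,132143132,"
    "132155362,132156553,132157645,132160816,132165512,132166489,132168686,"
    "132168871,132170833,132176873,132180790,132182123,132194492,132195473,"
    "132197584,132198221,132199147,132199835,132201059,132202761,132217693,"
)
EXON_ENDS_486 = (  # NM_001001486
    "132096317,132132068,132133680,132136253,132139005,132142233,132143241,"
    "132155518,132156622,132157721,132160883,132165637,132166587,132168782,"
    "132168961,132170938,132177030,132180961,132182221,132194543,132195640,"
    "132197653,132198338,132199295,132199931,132201201,132202886,132218251,"
)
EXON_ENDS_487 = (  # NM_001001487
    "132096317,132132068,132133680,132136253,132139005,132142233,132143241,"
    "132155518,132156622,132157721,132160883,132165637,132166587,132168782,"
    "132168961,132170938,132177030,132180961,132182221,132194543,132195640,"
    "132197653,132198338,132199295,132199931,132201201,132202856,132218251,"
)


def parse_coords(field):
    """Turn a comma-separated coordinate field (trailing comma allowed) into ints."""
    return [int(x) for x in field.split(",") if x.strip()]


def exon_differences(starts_a, ends_a, starts_b, ends_b):
    """Return (exon_number, (start_a, end_a), (start_b, end_b)) for each differing exon.

    Exons are numbered from 1 in the order they appear in the record.
    """
    exons_a = zip(starts_a, ends_a)
    exons_b = zip(starts_b, ends_b)
    return [
        (number, a, b)
        for number, (a, b) in enumerate(zip(exons_a, exons_b), start=1)
        if a != b
    ]


starts = parse_coords(EXON_STARTS)
diffs = exon_differences(
    starts, parse_coords(EXON_ENDS_486),
    starts, parse_coords(EXON_ENDS_487),
)

for number, a, b in diffs:
    # Length is end - start under the assumed half-open coordinate convention.
    print(f"exon {number}: {a[0]}-{a[1]} ({a[1] - a[0]} bp) "
          f"vs {b[0]}-{b[1]} ({b[1] - b[0]} bp)")
```

Run against these two records, the only exon reported is number 27 of 28,
which is 125 bp long in NM_001001486 and 95 bp long in NM_001001487.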

The Known Gene entry NM_001001487 is a better match for the UniProt splice
variant P98194-3, an isoform of P98194 (AT2C1_HUMAN).

The UCSC Known Genes track tries to include different isoforms of a gene when
there is supporting data.

Fan.
-----Original Message-----
From: genome-bounces at soe.ucsc.edu [mailto:genome-bounces at soe.ucsc.edu]
On Behalf Of Anton Kratz
Sent: Tuesday, March 13, 2007 11:10 PM
To: genome at soe.ucsc.edu
Subject: [Genome] protein ID / align ID


Hi,



In the UCSC KnownGene (May 2004) dataset, why are there sometimes genes
that differ only in the protein ID and align ID fields?

For example, NM_001001486 and NM_001001487 (full records below) look
identical in structure; only the protein ID and align ID differ.
What makes them different genes? Could you briefly explain the meaning
of the protein ID and align ID fields (how to interpret them)?



Regarding the different align IDs, I thought this meant that the KnownGene
maps to different locations on the genome with the same quality. But the
start and end positions are identical here.



regards,

Anton





name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount,
exonStarts, exonEnds, proteinID, alignID



NM_001001486    chr3    +       132096131       132218251       132096311       132217789       28
exonStarts:
132096131,132131957,132133563,132136163,132138969,132142171,132143132,
132155362,132156553,132157645,132160816,132165512,132166489,132168686,
132168871,132170833,132176873,132180790,132182123,132194492,132195473,
132197584,132198221,132199147,132199835,132201059,132202761,132217693,
exonEnds:
132096317,132132068,132133680,132136253,132139005,132142233,132143241,
132155518,132156622,132157721,132160883,132165637,132166587,132168782,
132168961,132170938,132177030,132180961,132182221,132194543,132195640,
132197653,132198338,132199295,132199931,132201201,132202886,132218251,
proteinID: AT2C1_HUMAN     alignID: R9596



NM_001001487    chr3    +       132096131       132218251       132096311       132217789       28
exonStarts:
132096131,132131957,132133563,132136163,132138969,132142171,132143132,
132155362,132156553,132157645,132160816,132165512,132166489,132168686,
132168871,132170833,132176873,132180790,132182123,132194492,132195473,
132197584,132198221,132199147,132199835,132201059,132202761,132217693,
exonEnds:
132096317,132132068,132133680,132136253,132139005,132142233,132143241,
132155518,132156622,132157721,132160883,132165637,132166587,132168782,
132168961,132170938,132177030,132180961,132182221,132194543,132195640,
132197653,132198338,132199295,132199931,132201201,132202856,132218251,
proteinID: P98194-3        alignID: R17233


