[Genome] BAT2D1 protein

John Darlow John.Darlow at ucd.ie
Mon Dec 3 08:32:14 PST 2007


On the page for Human BAT2D1, chr1:169,721,290-169,829,273, Mar 2006 
Assembly, 

http://genome.ucsc.edu/cgi-bin/hgGene?hgg_gene=uc001ghs.1&hgg_prot=NP_055987&hgg_chrom=chr1&hgg_start=169721289&hgg_end=169829273&hgg_type=knownGene&db=hg18&hgsid=100350226

In the table under 'Sequence and Links to Tools and Databases' you 
have 'Protein (2817 aa)', but the link to the Proteome Browser leads 
to a page about a protein (Q9Y520) of 2701 amino acids, not 2817. All 
the other links I followed to find out about the domain structure of 
the protein also indicated that it has only 2701 amino acids.

So I downloaded your genomic sequence of the BAT2D1 gene, first with 
exons in capitals and then with the CDS in capitals to identify the 
start and stop codons. I then worked out exactly which amino acids 
are encoded by each exon (since your diagram of the protein in the 
Proteome Browser does not show the exons) and labelled this on my 
copy. I found that the CDS apparently encodes 2816 amino acids, not 
2817 (perhaps you counted the stop codon as an amino acid?). I then 
translated it and aligned the 2816-amino-acid sequence with the 
2701-amino-acid sequence from the Proteome Browser. Having found the 
place where the two sequences diverge, I looked at my annotated 
genomic sequence to see where the difference arises.
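
The suspected off-by-one (2817 versus 2816) can be illustrated with a 
small Python sketch. It is only an illustration of the guess above, 
not a description of how the browser computes its figure: the toy CDS 
and the translate() helper are hypothetical, and only the standard 
genetic code is assumed. Dividing the CDS length by three counts the 
stop codon, so it gives one more than the number of amino acids.

```python
# Standard genetic code, laid out in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def translate(cds):
    """Translate from the first base, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":  # stop codon: not an amino acid
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy CDS (not BAT2D1): ATG GCT AAA TGA
toy_cds = "ATGGCTAAATGA"
codon_count = len(toy_cds) // 3   # counts the stop codon as well
protein = translate(toy_cds)

print(codon_count, len(protein))  # 4 3
```

By the same arithmetic, a CDS of 2817 codons including the stop would 
encode 2816 amino acids.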

You have exon 32 (chr1:169,823,411-169,823,506) and then just two 
nucleotides, cc, before a 5-nt exon 33 (chr1:169,823,509-169,823,513). 
The 2701-amino-acid protein is made by translating through the cc, 
which shifts the reading frame and gives an earlier stop codon than 
in the frame where the cc is spliced out of the RNA.
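
The effect can be sketched in Python. The coordinates are the ones 
quoted above, read as 1-based inclusive positions; the short 
sequences and the translate() helper are hypothetical toy examples, 
not the real BAT2D1 sequence. The sketch checks the size of the gap 
and of exon 33, then shows how keeping two extra nucleotides shifts 
the frame and can bring a stop codon forward.

```python
# Standard genetic code, laid out in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def translate(seq):
    """Translate from the first base, stopping at the first stop codon."""
    seq = seq.upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Coordinates from the message (1-based, inclusive).
EXON32_END = 169823506
EXON33_START = 169823509
EXON33_END = 169823513

gap_nt = EXON33_START - EXON32_END - 1       # nucleotides between the exons
exon33_len = EXON33_END - EXON33_START + 1   # length of exon 33
print(gap_nt, exon33_len)  # 2 5

# Hypothetical toy sequences, chosen only to show the frame shift.
upstream = "ATGGCT"             # stands in for the end of exon 32
gap = "cc"                      # the two nucleotides in question
downstream = "GTAGCTGGAAAGTAA"  # stands in for exon 33 onwards

spliced = translate(upstream + downstream)         # cc removed
retained = translate(upstream + gap + downstream)  # cc translated

print(spliced)   # MAVAGK
print(retained)  # MAP
```

In the toy case, retaining the two nucleotides moves the reading 
frame by two positions, so a TAG that was out of frame becomes an 
in-frame stop and the product is shorter.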

My question is: why does your page say 'Protein (2817 aa)' (which 
should be 2816) and then link to a protein of 2701 aa, and which is 
the right answer?

John Darlow
National Centre for Medical Genetics
Dublin
Ireland



