[Genome] BAT2D1 protein
John Darlow
John.Darlow at ucd.ie
Mon Dec 3 08:32:14 PST 2007
On the page for Human BAT2D1, chr1:169,721,290-169,829,273, Mar 2006
Assembly,
http://genome.ucsc.edu/cgi-bin/hgGene?hgg_gene=uc001ghs.1&hgg_prot=NP_055987&hgg_chrom=chr1&hgg_start=169721289&hgg_end=169829273&hgg_type=knownGene&db=hg18&hgsid=100350226
in the table under 'Sequence and Links to Tools and Databases' you
have 'Protein (2817 aa)', but clicking on the link to Proteome Browser
brings one to a page about a protein (Q9Y520) of 2701 amino-acids, not
2817, and all the other links I tried in order to find out about the
domain structure of the protein also indicated that it has only 2701
amino-acids.
So, I downloaded your genomic sequence of the BAT2D1 gene, first with
exons in capitals, and then with CDS in capitals to identify the start
and stop, then identified exactly which amino-acids were in each exon
(since your diagram of the protein in the Protein-Browser does not
show the exons), and labelled this on my copy. I found that it
apparently has 2816 amino-acids, not 2817 (perhaps you counted the
stop-codon as an amino-acid?). Then I translated it and aligned the
2816-amino-acid sequence with the 2701-amino-acid sequence from the
Proteome Browser. Having found the place where the two sequences
diverge, I looked at my annotated genomic sequence to see where the
difference arises.
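
As a rough illustration of the suspected off-by-one, the short Python
sketch below converts a CDS length into a protein length with and
without the stop-codon being counted. The function name and the
8451-nt length are assumptions chosen only so that the arithmetic
matches the 2817 and 2816 figures above; the length was not measured
from the actual sequence.

```python
def protein_length(cds_length_nt, includes_stop=True):
    """Return the number of amino-acids encoded by a CDS of the given length.

    Hypothetical helper: assumes the CDS is a whole number of codons and,
    when includes_stop is True, that its last codon is the stop-codon.
    """
    if cds_length_nt % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = cds_length_nt // 3
    # The stop-codon terminates translation and encodes no amino-acid.
    return codons - 1 if includes_stop else codons


# Assumed length for illustration only: 2817 codons * 3 nt = 8451 nt.
ASSUMED_CDS_LENGTH = 2817 * 3

aa_excluding_stop = protein_length(ASSUMED_CDS_LENGTH)
aa_counting_stop = protein_length(ASSUMED_CDS_LENGTH, includes_stop=False)
print(aa_excluding_stop, aa_counting_stop)
```

If the stop-codon is counted as a residue, the same CDS appears to be
one amino-acid longer, which would account for 2817 versus 2816.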
You have exon 32 (chr1:169,823,411-169,823,506) and then just two
nucleotides, cc, before a 5-nt exon 33 (chr1:169,823,509-169,823,513).
The 2701-amino-acid protein is produced when the cc is translated,
which shifts the reading frame and gives an earlier stop-codon than
the frame in which the cc is spliced out of the RNA.
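
The effect of retaining a 2-nt gap can be sketched in Python as
follows. The coordinates are the ones quoted above; the short
sequences are invented toy data, not BAT2D1 sequence, and the helper
names are hypothetical. The point is only that a gap whose length is
not a multiple of 3 shifts the frame and can expose an earlier
stop-codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gap_length(upstream_end, downstream_start):
    """Number of nucleotides strictly between two 1-based, inclusive exons."""
    return downstream_start - upstream_end - 1


def codons_before_stop(seq):
    """Count codons read from position 0 before the first in-frame stop-codon.

    Returns None if no in-frame stop-codon is found.
    """
    seq = seq.upper()
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


# Coordinates quoted in the message above.
gap = gap_length(169823506, 169823509)      # the two nucleotides "cc"
exon33_len = 169823513 - 169823509 + 1      # the 5-nt exon
print(gap, exon33_len, gap % 3 != 0)

# Toy sequences (assumed, for illustration only).
upstream = "ATGGCT"
downstream = "GTAAGCTGCTGA"
spliced = upstream + downstream             # gap removed from the RNA
retained = upstream + "cc" + downstream     # gap translated

print(codons_before_stop(spliced), codons_before_stop(retained))
```

In this toy case the spliced reading runs for more codons than the
one retaining the cc, which is the same pattern as the 2816 and 2701
amino-acid products.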
The question is: why does your page say 'Protein (2817 aa)' (which
should be 2816) and then give a link to a protein of 2701 aa, and
which is the right answer?
John Darlow
National Centre for Medical Genetics
Dublin
Ireland