[Genome] blat not finding entire protein sequence

Marcos H Woehrmann marcosw at ucsc.edu
Sun May 20 11:57:33 PDT 2007


I'm using blat to search for the following protein 
sequence in the March 2006 assembly of the Human Genome:

SCDDFLGQLPHGRVLLPLNLQLGAKVSFVCDEGFRLKGRSASHCVLAGMKALWNSSVPVCEQI

The UCSC Genome Browser returns a 100% identity hit, but only 
for the first 61 of the 63 residues:

182  1 61  63 100.0%  1  ++  205851912 205854456  2545

However, when I look at the sequence in the browser the 
final two amino acids, Q and I, are there.  There is an 
intron located within the Q codon, but there is another 
intron in this sequence which is handled correctly.
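
For anyone who wants to double-check the numbers, here is a small 
Python sketch. It assumes the columns in the hit row are score, 
query start, query end, query size, identity, chromosome, strand, 
target start, target end and span; the dictionary keys are just my 
own labels, not anything blat emits.

```python
# Query protein exactly as submitted to blat.
query = "SCDDFLGQLPHGRVLLPLNLQLGAKVSFVCDEGFRLKGRSASHCVLAGMKALWNSSVPVCEQI"

# Values copied from the hit row; the key names are my own labels
# (an assumption about what each column means).
hit = {
    "q_start": 1, "q_end": 61, "q_size": 63,
    "t_start": 205851912, "t_end": 205854456, "span": 2545,
}

# The reported query size should match the sequence length.
assert len(query) == hit["q_size"]

# Residues after 1-based position q_end are the ones blat left out.
unaligned_tail = query[hit["q_end"]:]

# Inclusive genomic span covered by the hit.
genomic_span = hit["t_end"] - hit["t_start"] + 1

print(unaligned_tail, genomic_span)
```

The unaligned tail comes out as the two residues Q and I, and the 
computed span agrees with the last column of the hit row.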

Are introns near the end of the query sequence a problem 
for blat?

marcos

