[Genome] parameters for blat with 25mers
Yueming Ding
yueming.ding at jax.org
Thu May 10 06:55:49 PDT 2007
Hi Kayla,
I need your help with another problem. I am running standard BLAT with default parameters to search 25-mer queries against the mouse genome, but I get hits for only half of the query sequences. Could you please tell me which parameters I should set so that BLAT works with 25-mers? Thanks.
Yueming
-----Original Message-----
From: Kayla Smith [mailto:kayla at soe.ucsc.edu]
Sent: Tuesday, May 08, 2007 5:16 PM
To: Yueming Ding
Cc: genome at soe.ucsc.edu
Subject: Re: [Genome] blat
Yueming,
I've asked one of our developers about your question and here is what he
has to say:
Please see "Replicating web-based Blat percent identity and score
calculations" http://genome.ucsc.edu/FAQ/FAQblat.html#blat4
which has all details needed to calculate our hgBlat score.
Indels and mismatches are not treated the same,
both in how BLAT builds alignments and in
how it calculates the final score.
BLAT builds the exons as alignments with
matches/mismatches extending from the seed
position until the alignment cannot be extended.
Then the parts are chained together giving
a final alignment that has exons and introns. All the
details of the score calculation are given above.
In general huge introns do not carry a huge
penalty. It's not subtracting one for
each base of the intron gap. It actually only
subtracts one for the entire gap or insert.
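As a rough illustration of the gap penalty described above, here is a minimal Python sketch. It assumes a simplified nucleotide score of matches minus mismatches minus one point per gap block; the function name and the example numbers are hypothetical, repeat matches are left out, and the FAQ linked above remains the authoritative description of the real calculation.

```python
def simplified_score(matches, mismatches, gap_lengths):
    """Simplified score for a nucleotide alignment (illustrative only).

    gap_lengths holds one entry per gap/insert block, on either the
    query or the target side. Only the number of blocks matters here,
    not the length of any individual block.
    """
    gap_penalty = len(gap_lengths)  # one point per gap, whatever its size
    return matches - mismatches - gap_penalty


# Hypothetical numbers: the same alignment with a short and a huge intron.
short_intron = simplified_score(200, 3, [90])
huge_intron = simplified_score(200, 3, [250000])
print(short_intron, huge_intron)
```

Under this assumption both calls give the same score, because the intron length never enters the calculation.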
Also, in general, a mismatch consumes a base from the query
side whereas a gap on the query side does not, e.g.
mismatch T/C (T in query is consumed)
query: ACTGACTG
target: ACCGACTG
gap example (gap on query side, nothing in query is consumed)
query: ACT---------GACTG
target: ACTCGCCGGCCCGACTG
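The two examples above can be checked with a small Python sketch that walks a gapped alignment column by column and tallies what is consumed on each side. This is only an illustration of the bookkeeping, not BLAT's own code; the function name and the dictionary keys are made up for this example.

```python
def tally_alignment(query, target):
    """Count matches, mismatches, consumed bases and gap blocks.

    Both strings must be the same length, with '-' marking a gap.
    """
    if len(query) != len(target):
        raise ValueError("aligned strings must have equal length")
    stats = {"matches": 0, "mismatches": 0,
             "query_consumed": 0, "target_consumed": 0, "gap_blocks": 0}
    prev_gap = None  # side that had a gap in the previous column, if any
    for q, t in zip(query, target):
        if q == "-" and t == "-":
            raise ValueError("a column cannot be a gap on both sides")
        gap = "query" if q == "-" else "target" if t == "-" else None
        if gap is not None and gap != prev_gap:
            stats["gap_blocks"] += 1  # a new gap starts in this column
        if q != "-":
            stats["query_consumed"] += 1
        if t != "-":
            stats["target_consumed"] += 1
        if gap is None:
            key = "matches" if q == t else "mismatches"
            stats[key] += 1
        prev_gap = gap
    return stats


mismatch_case = tally_alignment("ACTGACTG", "ACCGACTG")
gap_case = tally_alignment("ACT---------GACTG", "ACTCGCCGGCCCGACTG")
print(mismatch_case)
print(gap_case)
```

In the mismatch example all 8 query bases are consumed (7 matches, 1 mismatch). In the gap example the 9 gapped columns consume target bases only, so the query still contributes just 8 bases and the whole run counts as a single gap block.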
Note on repeatMatches:
Despite the documentation, the repeatMatches feature
of the psls is basically not used, so you won't see
anything in that column. Instead, a match
in a repeated area will just be a regular match.
I hope this information is helpful to you. Please don't hesitate to
contact us again if you require further assistance.
Kayla Smith
UCSC Genome Bioinformatics Group
Yueming Ding wrote:
> Hi, is anyone able to tell me how Jim's BLAT handles mismatches and indels? Does BLAT treat mismatches and indels equally (by assigning the same penalty scores)? Thanks.
>
> Yueming Ding
> The Jackson Laboratory
> _______________________________________________
> Genome maillist - Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome