[Genome] blat - stepSize vs. maxGap

Galt Barber galt at soe.ucsc.edu
Tue Jan 22 17:18:58 PST 2008


stepSize is the distance between the start positions of
consecutive tiles.  Traditionally it was the same as tileSize, 11,
which creates non-overlapping tiles.
For greater sensitivity to support PCR, we use
stepSize=5 for DNA or RNA queries.  Since 5 is smaller than
the tileSize of 11, this creates overlapping tiles.
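
As a rough illustration of the tiling described above (this is not
BLAT's code; the function name tile_starts and the 33-base sequence
length are made up for the example), the tile start offsets for the
two stepSize values could be computed like this:

```python
def tile_starts(seq_len, tile_size=11, step_size=11):
    """Return the start offset of every full tile in a sequence.

    Hypothetical helper for illustration only; it is not part of BLAT.
    Tiles are tile_size bases long and start every step_size bases.
    """
    return list(range(0, seq_len - tile_size + 1, step_size))


# stepSize == tileSize: each tile starts where the previous one ends.
non_overlapping = tile_starts(33, tile_size=11, step_size=11)

# stepSize < tileSize: consecutive tiles share 11 - 5 = 6 bases.
overlapping = tile_starts(33, tile_size=11, step_size=5)

print(non_overlapping)  # [0, 11, 22]
print(overlapping)      # [0, 5, 10, 15, 20]
```

With stepSize=5 the same 33 bases are covered by five tiles instead
of three, which is where the extra sensitivity comes from.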

maxGap > 0 means that two tile hits can be on nearby
diagonals rather than on the same diagonal and still be considered
a clump/match to seed an alignment trial.
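
A minimal sketch of that idea, assuming a tile hit is recorded as a
(query position, target position) pair and its diagonal is the
difference of the two.  The names diagonal and can_clump are
hypothetical, and BLAT's real clumping logic is more involved than
this single comparison:

```python
def diagonal(hit):
    """Diagonal of a tile hit given as (query_pos, target_pos)."""
    query_pos, target_pos = hit
    return target_pos - query_pos


def can_clump(hit_a, hit_b, max_gap=0):
    """True if two hits lie on diagonals at most max_gap apart.

    Simplified assumption for illustration; not BLAT's implementation.
    """
    return abs(diagonal(hit_a) - diagonal(hit_b)) <= max_gap


same_diag = can_clump((0, 100), (11, 111), max_gap=0)   # both on diagonal 100
shifted = can_clump((0, 100), (11, 113), max_gap=0)     # diagonals 100 and 102
shifted_ok = can_clump((0, 100), (11, 113), max_gap=2)  # 2-base shift tolerated
print(same_diag, shifted, shifted_ok)
```

In this sketch a 2-base shift between the hits (a small indel) is
rejected when max_gap is 0 and accepted when max_gap is 2.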

There is no BLAT command-line parameter that corresponds to W.

In the paper, the authors seem to use W as if it is the width of the
query.  The letters a,b,c,d seem to represent tile hits, each equal in
size to tileSize.  W allows them to estimate the specificity and
sensitivity of BLAT mathematically for the case
"Searching With Multiple Perfect Matches".

I suppose if the math is too difficult to get right,
one could use Monte-Carlo simulations to measure
the actual statistics of BLAT performance.
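
A toy version of such a simulation might look like the sketch below.
It does not run BLAT at all.  It assumes a homologous region where
each base matches independently with probability "identity", no
indels, and non-overlapping target tiles, and it estimates how often
at least min_hits tiles match perfectly.  All names and default
values here are made up for the example:

```python
import random


def estimate_sensitivity(region_len=100, identity=0.9, tile_size=11,
                         step_size=11, min_hits=2, trials=2000, seed=0):
    """Monte Carlo estimate of P(at least min_hits perfect tile hits).

    Toy model, not BLAT: bases match independently with probability
    `identity`, and indels and diagonals are ignored entirely.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    starts = range(0, region_len - tile_size + 1, step_size)
    found = 0
    for _ in range(trials):
        # True where the query base equals the target base.
        matches = [rng.random() < identity for _ in range(region_len)]
        # A tile is a hit only if every base inside it matches.
        hits = sum(all(matches[s:s + tile_size]) for s in starts)
        if hits >= min_hits:
            found += 1
    return found / trials


for ident in (0.80, 0.90, 0.95):
    print(ident, estimate_sensitivity(identity=ident))
```

To measure actual BLAT behaviour one would replace the toy model
with real BLAT runs on simulated sequences and count the hits found.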

-Galt


On Tue, 22 Jan 2008, Isaac Ho wrote:

> Hi--
>
> I wanted to clarify the definitions for a few of the terms in the Blat
> documentation.
>
> What is the difference between stepSize and maxGap?   The explanation in
> the documentation that stepSize equals "spacing between tiles" is
> ambiguous to me.
>
> Also, which setting represents the variable "W" shown in Figure 1 of the
> BLAT paper published in Genome Research?  ( In that figure, W determines
> the maximum distance allowed between two tile hits s.t. they can be
> clumped together;  these tile hits must be located on the same diagonal,
> so no indels between them. )
>
> Thanks,
>
> Isaac
> _______________________________________________
> Genome maillist  -  Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome
>

