[Genome] blat - stepSize vs. maxGap
Galt Barber
galt at soe.ucsc.edu
Tue Jan 22 17:18:58 PST 2008
stepSize was traditionally the same as tileSize, 11,
which creates non-overlapping tiles.
For greater sensitivity, to support PCR, we use
stepSize=5 for dna or rna queries, which creates
overlapping tiles.
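
As a rough illustration of the spacing only (this is not BLAT's indexing code, and the function name tile_starts is made up for the example), the start offsets of 11-base tiles for the two stepSize values can be listed like this:

```python
def tile_starts(seq_len, tile_size=11, step_size=11):
    """Start offsets of every full-length tile in a sequence of seq_len bases.

    Hypothetical helper for illustration; it only shows how stepSize
    spaces the tiles, not how BLAT builds its index.
    """
    return list(range(0, seq_len - tile_size + 1, step_size))


# stepSize == tileSize: each tile starts where the previous one ended.
non_overlapping = tile_starts(40, tile_size=11, step_size=11)

# stepSize < tileSize: consecutive tiles share tileSize - stepSize = 6 bases.
overlapping = tile_starts(40, tile_size=11, step_size=5)

print(non_overlapping)
print(overlapping)
```

With the smaller step there are roughly twice as many tiles over the same stretch, which is where the extra sensitivity comes from.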
maxGap > 0 means that two tiles can be on nearby diagonals,
rather than on the same diagonal, and still be considered
a clump/match to seed an alignment trial.
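
A small sketch of the diagonal idea, under the assumption that "nearby" means the two diagonals differ by at most maxGap (the exact rule inside BLAT may differ; diagonal and can_clump are hypothetical names, not BLAT functions):

```python
def diagonal(query_pos, target_pos):
    """Diagonal of a tile hit: target offset minus query offset."""
    return target_pos - query_pos


def can_clump(hit_a, hit_b, max_gap=0):
    """True if two (query_pos, target_pos) hits lie on close enough diagonals.

    ASSUMPTION: closeness is modeled as |diagonal difference| <= max_gap.
    With max_gap=0 the hits must share a diagonal (no indel between them).
    """
    return abs(diagonal(*hit_a) - diagonal(*hit_b)) <= max_gap


same_diag = can_clump((0, 100), (11, 111))                # both on diagonal 100
shifted_strict = can_clump((0, 100), (11, 113))           # diagonals 100 and 102
shifted_relaxed = can_clump((0, 100), (11, 113), max_gap=2)
print(same_diag, shifted_strict, shifted_relaxed)
```

A shift between diagonals corresponds to a small insertion or deletion between the two tiles, so maxGap > 0 lets such pairs still seed an alignment.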
There is no BLAT command-line parameter that corresponds to W.
In the paper, they seem to use W as if it were the width of the query.
The letters a,b,c,d seem to represent tile hits, equal in size to
tileSize. W lets them estimate the specificity and sensitivity
of BLAT mathematically for the case
"Searching With Multiple Perfect Matches".
I suppose if the math is too difficult to get right,
one could use Monte Carlo simulations to measure
the actual statistics of BLAT performance.
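
To give an idea of what such a simulation might look like, here is a toy sketch. It does not run BLAT. It assumes each base matches independently with a fixed identity, uses non-overlapping tiles, and ignores stepSize overlap, maxGap and false positives. All names and default values are made up for the example. For the single-hit case the estimate can be checked against the simple closed form 1 - (1 - identity^tileSize)^nTiles, which holds under the same independence assumption.

```python
import random


def simulate_hit_rate(region_len=100, tile_size=11, identity=0.9,
                      min_hits=1, trials=20000, seed=0):
    """Estimate how often a homologous region yields >= min_hits perfect tiles.

    ASSUMPTIONS (toy model, not BLAT itself): bases match independently
    with probability `identity`, and tiles are non-overlapping.
    """
    rng = random.Random(seed)
    n_tiles = region_len // tile_size
    successes = 0
    for _ in range(trials):
        perfect = 0
        for _ in range(n_tiles):
            # A tile is a perfect hit only if every base in it matches.
            if all(rng.random() < identity for _ in range(tile_size)):
                perfect += 1
        if perfect >= min_hits:
            successes += 1
    return successes / trials


def analytic_single_hit(region_len=100, tile_size=11, identity=0.9):
    """Closed form for min_hits=1 under the same independence assumption."""
    n_tiles = region_len // tile_size
    return 1.0 - (1.0 - identity ** tile_size) ** n_tiles


estimate = simulate_hit_rate()
expected = analytic_single_hit()
print(estimate, expected)
```

Raising min_hits to 2 gives a rough feel for the "multiple perfect matches" case, where the closed-form math gets harder and simulation becomes more attractive. Measuring real BLAT performance would of course mean running BLAT itself on simulated or known sequences.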
-Galt
On Tue, 22 Jan 2008, Isaac Ho wrote:
> Hi--
>
> I wanted to clarify the definitions for a few of the terms in the Blat
> documentation.
>
> What is the difference between stepSize and maxGap? The explanation in
> the documentation that stepSize equals "spacing between tiles" is
> ambiguous to me.
>
> Also, which setting represents the variable "W" shown in Figure 1 of the
> BLAT paper published in Genome Research? ( In that figure, W determines
> the maximum distance allowed between two tile hits s.t. they can be
> clumped together; these tile hits must be located on the same diagonal,
> so no indels between them. )
>
> Thanks,
>
> Isaac
> _______________________________________________
> Genome maillist - Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome
>