[Genome] BLAT minScore help
Galt Barber
galt at soe.ucsc.edu
Tue Jul 31 17:50:21 PDT 2007
I have just looked at the blat source code. The routine that
saves the output does the following. (To read this right: the
value of minScore has been passed down to this routine, and the
local parameter is named minMatch here.)
if (minMatch > qSeq->size/2) minMatch = qSeq->size/2;
if (minMatch < 1) minMatch = 1;
(minMatch here is really minScore)
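So that puts a ceiling on how high you can set minScore:
anything over half the query size is clamped down to half
the query size. For a 33 bp read the ceiling is 33/2 = 16
(integer division), so 20 bp matches still pass even with
-minScore=33. You can, however, lower it.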
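To make the clamping concrete, here is a minimal sketch that applies the same two lines to a few requested values. The function names effectiveMinScore and showEffectiveMinScore are made up for illustration and do not exist in blat; querySize stands in for qSeq->size.

```c
#include <stdio.h>

/* Hypothetical helper, not part of blat: mirrors the two clamping
 * lines quoted above. querySize stands in for qSeq->size. */
int effectiveMinScore(int minScore, int querySize)
{
    if (minScore > querySize / 2)
        minScore = querySize / 2;   /* cap at half the query length */
    if (minScore < 1)
        minScore = 1;               /* never below 1 */
    return minScore;
}

/* Print the requested and the effective threshold for one query size. */
void showEffectiveMinScore(int minScore, int querySize)
{
    printf("requested %d, query %d bp -> effective %d\n",
           minScore, querySize, effectiveMinScore(minScore, querySize));
}
```

Calling showEffectiveMinScore(33, 33) reports an effective threshold of 16, and any larger request against a 33 bp query gives the same result.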
There is another routine, ssStitch, that uses minScore,
but it caps its copy of the minScore variable at 20
before proceeding, so minScore can't be used to filter
tightly there either.
void ssStitch
/* The score may improve when we stitch together more alignments,
* so don't let minScore be too harsh at this stage. */
if (minScore > 20)
    minScore = 20;
So that pretty much means you can't do the kind of filtering
you want with minScore and minIdentity. You would probably
be better off post-processing your PSL output with pslReps and
pslCDnaFilter.
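If all you need is a hard score cutoff, a small post-filter over the PSL output is another option. What follows is only a rough sketch and is not pslReps or pslCDnaFilter. It assumes the score is matches + repMatches/2 - misMatches - qNumInsert - tNumInsert for nucleotide alignments, and that the first eight PSL columns are matches, misMatches, repMatches, nCount, qNumInsert, qBaseInsert, tNumInsert and tBaseInsert. The function names are hypothetical.

```c
#include <stdio.h>

/* Assumed scoring formula (nucleotide alignments only):
 * matches + repMatches/2 - misMatches - qNumInsert - tNumInsert.
 * Returns 1 and sets *score if the line parses as an alignment row. */
static int pslLineScore(const char *line, int *score)
{
    int match, misMatch, repMatch, nCount;
    int qNumInsert, qBaseInsert, tNumInsert, tBaseInsert;

    /* The first eight PSL columns are integers; header lines fail here. */
    if (sscanf(line, "%d %d %d %d %d %d %d %d",
               &match, &misMatch, &repMatch, &nCount,
               &qNumInsert, &qBaseInsert, &tNumInsert, &tBaseInsert) != 8)
        return 0;
    *score = match + repMatch / 2 - misMatch - qNumInsert - tNumInsert;
    return 1;
}

/* Hypothetical filter: 1 if the line is an alignment scoring >= minScore. */
int pslLinePasses(const char *line, int minScore)
{
    int score;
    return pslLineScore(line, &score) && score >= minScore;
}

/* Copy passing alignment lines from in to out and return how many were
 * kept. Assumes every line fits in the buffer; header lines are dropped. */
int filterPsl(FILE *in, FILE *out, int minScore)
{
    char line[65536];
    int kept = 0;

    while (fgets(line, sizeof line, in) != NULL)
        {
        if (pslLinePasses(line, minScore))
            {
            fputs(line, out);
            kept++;
            }
        }
    return kept;
}
```

Under those assumptions, filterPsl(stdin, stdout, 33) would keep only perfect full-length hits for 33 bp reads.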
-Galt
On Tue, 31 Jul 2007, Chris Smillie wrote:
> Hello,
>
> I have read the BLAT documentation and it seems that the score of an
> alignment is calculated (roughly) as follows:
>
> (number matches) - (number mismatches) - (some gap penalty).
>
> I am trying to map 33 bp solexa reads onto a set of contigs, and I was
> hoping to filter the results with the minScore option.
>
> With 33 bp reads, it seems that the highest score possible would be 33 (33
> matches - 0 mismatches - 0 gaps). However, after experimenting with
> different cut-off scores, I have found that this is not the case.
>
> When I use the -minScore=33 option, I still get matches that are only 20 bp
> in length. Even more, if I use -minScore=9999999999999, I still manage to
> get hits!
>
> This is confusing to me and I feel that I must be doing something wrong. Can
> you suggest anything? Thank you,
>
> Chris
>
> P.S. Here is an example of a command that I have tried to use:
>
> ../../blatSuite.34/blat -t=dna -q=dna -tileSize=6 -stepSize=6 -minMatch=2
> -minScore=99999999999999 -minIdentity=93 -out=pslx ../454LargeContigs.fna
> ../salmonella.fasta out.pslx
>
> But this still manages to return hits...
> _______________________________________________
> Genome maillist - Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome
>