[Genome] BLAT client/server and parameters
Vikram Agarwal
vagarwal at mail.utexas.edu
Sat Jul 28 12:38:24 PDT 2007
Hello,
I just have a few general questions about BLAT:
1) Is it possible to use the fastMap, 11.ooc, and masking options of a
2bit database file with gfClient, or are these options restricted to
command-line BLAT? Are there any speed advantages in using nib vs.
2bit databases?
2) If I use a 2bit database (~800 MB) of the human genome, why does
BLAT take up >3.1 GB of memory when sending a query of 1000 sequences,
while with gfServer it only uses ~1 GB of memory? If I load the database
from the command line, it reports that I have 33 sequences in the
database, while in reality I generated the 2bit file using the
faToTwoBit command with only my 24 chromosomes in FASTA format.
3) Why are the results different when I output in blast8 vs. psl
format? For most input sequences, I get many more hits with blast8. If
I change minScore and minIdentity while outputting in blast8 format,
the output remains exactly the same. Why is this so? For blast8
format, why do I receive sequences with ~70% homology when the default
for minIdentity is 90%? I also get a lot of spurious results of reads
matching <10 bases with 100% homology. How do I block out results
below a certain length?
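
As a stopgap for the short-hit problem in question 3, the blast8 output could be post-filtered outside BLAT. The sketch below is a minimal example, not a BLAT feature: it assumes blast8 follows the standard 12-column BLAST tabular layout (percent identity in column 3, alignment length in column 4), and the function name filter_blast8 and its default thresholds are hypothetical.

```python
def filter_blast8(lines, min_length=20, min_identity=90.0):
    """Yield blast8 rows (as lists of fields) that pass both thresholds.

    Assumed column layout (standard BLAST tabular output):
    query, subject, %identity, alignment length, mismatches, gap opens,
    q.start, q.end, s.start, s.end, e-value, bit score.
    """
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Skip rows that do not have the expected 12 columns.
        if len(fields) < 12:
            continue
        identity = float(fields[2])
        aln_length = int(fields[3])
        if aln_length >= min_length and identity >= min_identity:
            yield fields
```

Passing an open file handle for the blast8 output with min_length=20, for example, would keep only hits whose alignment spans at least 20 columns and whose identity is at least 90%.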
Your help is much appreciated,
Vikram Agarwal