[Genome] local gfClient / gfServer vs. web BLAT on short sequences
Julien Lagarde
jlagarde at imim.es
Thu Nov 16 03:24:02 PST 2006
Hi Genome,
I am a bit puzzled by the alignments I get with my local gfClient /
gfServer compared to those given by the online UCSC BLAT server.
My input sequence is:
>chr22_primer
TTGCCTTCTCCCTCATCGAGGGTTA
Online BLAT result (hg17) is:
match mismatch repMatch Ns qGapCount qGapBases tGapCount tGapBases strand qName qSize qStart qEnd tName tSize tStart tEnd blockCount blockSizes qStarts tStarts
25 0 0 0 0 0 0 0 - chr22_primer 25 0 25 chr22 49554710 29006841 29006866 1 25, 0, 29006841,
21 1 0 0 0 0 0 0 + chr22_primer 25 3 25 chr7 158628139 73878910 73878932 1 22, 3, 73878910,
Local gfClient/gfServer (v.32) output is:
11 0 0 0 0 0 0 0 + chr22_primer 25 0 11 chr1 245522847 7292025 7292036 1 11, 0, 7292025,
11 0 0 0 0 0 0 0 + chr22_primer 25 0 11 chr1 245522847 14727360 14727371 1 11, 0, 14727360,
11 0 0 0 0 0 0 0 + chr22_primer 25 0 11 chr1 245522847 17882525 17882536 1 11, 0, 17882525,
11 0 0 0 0 0 0 0 + chr22_primer 25 0 11 chr1 245522847 18867990 18868001 1 11, 0, 18867990,
11 0 0 0 0 0 0 0 + chr22_primer 25 0 11 chr1 245522847 19615590 19615601 1 11, 0, 19615590,
11 0 0 0 0 0 0 0 + chr22_primer 25 0 11 chr1 245522847 24596870 24596881 1 11, 0, 24596870,
12 0 0 0 0 0 0 0 + chr22_primer 25 0 12 chr1 245522847 26453344 26453356 1 12, 0, 26453344,
15 0 0 0 0 0 0 0 + chr22_primer 25 0 15 chr12 132449811 88185840 88185855 1 15, 0, 88185840,
15 0 0 0 0 0 0 0 + chr22_primer 25 0 15 chr12 132449811 131134400 131134415 1 15, 0, 131134400,
14 0 0 0 0 0 0 0 - chr22_primer 25 2 16 chr1 245522847 10657190 10657204 1 14, 9, 10657190,
13 0 0 0 0 0 0 0 - chr22_primer 25 2 15 chr1 245522847 19442182 19442195 1 13, 10, 19442182,
12 0 0 0 0 0 0 0 - chr22_primer 25 3 15 chr1 245522847 22864170 22864182 1 12, 10, 22864170,
15 0 0 0 0 0 0 0 - chr22_primer 25 0 15 chr11 134452384 73282830 73282845 1 15, 10, 73282830,
15 0 0 0 0 0 0 0 - chr22_primer 25 0 15 chr2 243018229 60887140 60887155 1 15, 10, 60887140,
25 0 0 0 0 0 0 0 - chr22_primer 25 0 25 chr22 49554710 29006841 29006866 1 25, 0, 29006841,
15 0 0 0 0 0 0 0 - chr22_primer 25 1 16 chr4 191411218 48118575 48118590 1 15, 9, 48118575,
16 0 0 0 0 0 0 0 - chr22_primer 25 1 17 chr7 158628139 4776264 4776280 1 16, 8, 4776264,
The parameters I use on my local installation are:
# convert hg17 to 2bit, no mask:
$ faToTwoBit -noMask /seq/genomes/H.sapiens/golden_path_200405/chromFa/*fa complete_hg17_noMask.2bit
# start gfServer:
$ gfServer -tileSize=10 -stepSize=5 -canStop start localhost 3500 complete_hg17_noMask.2bit
# query with gfClient:
$ gfClient -minScore=0 -nohead -minIdentity=0 localhost 3500 / chr22_primer.fa chr22_primer.psl
I'm trying to make my local BLAT as sensitive as possible for short
sequences, following the recommendations posted on this list.
This results in many spurious hits, as expected. I can deal with those,
but what bothers me is that my local BLAT misses an obvious,
near-perfect match on chr7 (tStart=73878910) that the online BLAT finds.
Do you have any idea why?
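
For reference, the spurious short hits can be dropped after the fact with a small awk filter. This is only a minimal sketch, assuming headerless PSL on stdin (as produced with -nohead). The helper names psl_filter and psl_sample are hypothetical and not part of the BLAT suite, and the 0.8 threshold is an arbitrary choice for illustration.

```shell
# Keep PSL rows whose matching bases (match + repMatch, columns 1 and 3)
# cover at least min_frac of the query size (column 11).
# psl_filter is a hypothetical helper name, not a BLAT tool.
psl_filter() {
    awk -v min_frac="$1" '$1 + $3 >= min_frac * $11'
}

# Three rows copied from the local gfClient output above.
# psl_sample is also a hypothetical helper, used only for this demo.
psl_sample() {
cat <<'EOF'
11 0 0 0 0 0 0 0 + chr22_primer 25 0 11 chr1 245522847 7292025 7292036 1 11, 0, 7292025,
25 0 0 0 0 0 0 0 - chr22_primer 25 0 25 chr22 49554710 29006841 29006866 1 25, 0, 29006841,
16 0 0 0 0 0 0 0 - chr22_primer 25 1 17 chr7 158628139 4776264 4776280 1 16, 8, 4776264,
EOF
}

# Keep only hits covering at least 80% of the 25-base primer.
psl_sample | psl_filter 0.8
```

On a real run the same filter would be applied to the output file, e.g. psl_filter 0.8 < chr22_primer.psl. It only removes hits that were reported, so it does not help with the missing chr7 match.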
Thanks in advance,
j.
--
-----------------------------------------------------
Julien Lagarde
Genome Bioinformatics Research Group
Centre de Regulacio Genomica
Grup de Recerca en Informatica Biomedica (IMIM)
Dr. Aiguader, 88 (+34) 93 3160166 ph
E-08003 Barcelona (+34) 93 3160099 fax
http://genome.imim.es