[Genome] blat: same results with and without -q=rna
Ann Zweig
ann at soe.ucsc.edu
Thu Oct 5 09:07:16 PDT 2006
Hello Adnan,
When you blat with -q=rna, it assumes you know that the sequence given
is on the coding strand. So it tries the query against both strands of
the target and stops.
With -q=dna it continues, retrying the reverse complement of the query
against both strands, too. Often the results from the reverse
complement are nearly identical, so you get (near-)dupes of everything.
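
To make the difference concrete, here is a minimal Python sketch of the search plan as described above. It is only an illustration of this explanation, not blat's actual code; the function names `reverse_complement` and `search_plan` are made up for the example.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (T is used even for RNA)."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


def search_plan(query, query_type):
    """List the (label, query sequence, target strand) searches described above.

    query_type "rna": the query as given is tried against both target strands.
    query_type "dna": the reverse complement of the query is tried as well.
    """
    if query_type not in ("rna", "dna"):
        raise ValueError("query_type must be 'rna' or 'dna'")
    plan = [
        ("query as given", query, "+"),
        ("query as given", query, "-"),
    ]
    if query_type == "dna":
        rc = reverse_complement(query)
        plan.append(("reverse complement", rc, "+"))
        plan.append(("reverse complement", rc, "-"))
    return plan


if __name__ == "__main__":
    for label, seq, strand in search_plan("ATGC", "dna"):
        print(label, seq, "vs target strand", strand)
```

In this sketch the dna plan is simply the rna plan plus two more searches, which is where the extra (near-)duplicate hits come from.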
In certain cases -q=dna matters, but when misused it just produces more
dupes that have to be filtered out.
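
If you do end up with such dupes, one possible way to filter them is sketched below in Python. It assumes tab-separated PSL output and treats two hits as near-duplicates when they share a query name and target name and their target spans overlap; that definition, and the helper names `parse_psl_line` and `drop_near_duplicates`, are assumptions for illustration rather than anything blat provides.

```python
def parse_psl_line(line):
    """Pull the fields needed for filtering out of one PSL data line."""
    fields = line.rstrip("\n").split("\t")
    return {
        "matches": int(fields[0]),   # column 1: number of matching bases
        "strand": fields[8],         # column 9: strand
        "qName": fields[9],          # column 10: query name
        "tName": fields[13],         # column 14: target name
        "tStart": int(fields[15]),   # column 16: target start
        "tEnd": int(fields[16]),     # column 17: target end
    }


def drop_near_duplicates(hits):
    """Keep the best-matching hit among hits that overlap on the target.

    Assumption: hits with the same query and target whose target
    intervals overlap are near-duplicates of each other.
    """
    kept = []
    for hit in sorted(hits, key=lambda h: h["matches"], reverse=True):
        is_duplicate = any(
            k["qName"] == hit["qName"]
            and k["tName"] == hit["tName"]
            and hit["tStart"] < k["tEnd"]
            and k["tStart"] < hit["tEnd"]
            for k in kept
        )
        if not is_duplicate:
            kept.append(hit)
    return kept
```

You may well want a stricter rule (for example, requiring nearly identical start and end coordinates) depending on how your hits are distributed.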
ESTs have introns, but their orientation is often unknown, so using
-q=dna sometimes makes sense for them. For high-quality RNA sequence
with known strand orientation, use -q=rna.
If you want no intron processing, specify -fastMap. If you want finer
intron processing for full-length mRNAs, use -fine.
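
As a rough summary of the advice above, here is a small Python sketch that assembles a blat command line from those choices. The helper name `blat_command` and the file names are hypothetical; the flags used (-q=rna, -q=dna, -fastMap, -fine) are blat's own, but please check the blat usage message for restrictions on combining them before relying on this.

```python
def blat_command(database, query, output, strand_known, introns="default"):
    """Build an argument list for blat following the advice above.

    strand_known: True if the query orientation is known (use -q=rna),
                  False if it is not (use -q=dna).
    introns:      "default", "none" (adds -fastMap) or "fine" (adds -fine).
    """
    args = ["blat", database, query]
    args.append("-q=rna" if strand_known else "-q=dna")
    if introns == "none":
        args.append("-fastMap")
    elif introns == "fine":
        args.append("-fine")
    elif introns != "default":
        raise ValueError("introns must be 'default', 'none' or 'fine'")
    args.append(output)
    return args


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(blat_command("chr1.2bit", "mrna.fa", "out.psl",
                                strand_known=True, introns="fine")))
```

The returned list can be passed to `subprocess.run` if you want to launch blat from a script.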
Here is a previously-answered mail list question that may be of help to
you:
Question:
What does the option "-q rna" really do besides trimming the polyA tail?
It seems blat still works if I use "T" not "U" in the sequence. Is it
necessary to use this option for those genes in NCBI?
Answer:
We use T for everything, even RNA sequences. The -q rna option means that
your query is already on the correct strand, so only the query as given is
tried against both strands of the target. If you specify -q dna instead,
blat assumes that you don't know whether the sequence given (e.g. a cDNA)
is on the transcribed strand or not. In that case it will also take the
reverse complement of the query and try that against both strands of the
target. This usually creates a lot of nearly duplicate hits of dubious value.
If you know your sequence is from the active transcribed strand,
specify -q rna.
I hope this is helpful to you in understanding how to fine-tune the
blat parameters. Please feel free to contact the list again if you have
more questions.
Regards,
----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu
Adnan Derti wrote:
> Hi.
>
> I accidentally aligned a bunch of ESTs to a genome with blat without
> specifying -q=rna. I then did a very short and simple test to see if the
> results would be different (align a single Refseq mRNA and two ESTs to
> their respective chromosomes, with and without specifying -q=rna).
> Except for a small difference in the score for one exon, the alignments
> are identical, meaning that blat appears to heed splice sites even
> without -q=rna specified. Is that the case? Functionally, what's the
> difference between -q=dna and -q=rna?
>
> Thanks.
>
> Adnan Derti
> _______________________________________________
> Genome maillist - Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome