[Genome] blat didn't work on 33mers
Archana Thakkapallayil
archanat at soe.ucsc.edu
Tue Jan 9 15:50:27 PST 2007
Hello Yueming,
One of our developers has the following to say regarding the information
that you sent:
The input that you are using has to be in regular FASTA format:
>NES17510657
TGTTGGGAGAACATTC[T/C]AGAACAGAAGATTAAA
Also, BLAT doesn't know what the section "[T/C]" means. We are guessing
that it means either T or C can appear at that position. BLAT will strip
out the non-ACGT characters and search for this:
TGTTGGGAGAACATTCTCAGAACAGAAGATTAAA
That probably means it will try to align the left and right ends with a
small gap in the middle.
If we don't have perfect matches, BLAT has a harder time. Perhaps you
should arbitrarily pick the first letter, e.g. "T" in the first example,
and use that. Or duplicate the sequence, make two entries, and try to
align them both:
>NES17510657.T
TGTTGGGAGAACATTCTAGAACAGAAGATTAAA
>NES17510657.C
TGTTGGGAGAACATTCCAGAACAGAAGATTAAA
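With 725000 sequences you will probably want to script the duplication. Here is a minimal Python sketch, assuming each sequence contains at most one bracketed site such as "[T/C]" and only the letters A, C, G and T. The names expand_snp and to_fasta are our own illustrations and are not part of BLAT:

```python
import re

# Matches one bracketed variant site such as "[T/C]" (assumed notation).
SNP_SITE = re.compile(r"\[([ACGT](?:/[ACGT])+)\]")

def expand_snp(name, seq):
    """Return one (header, sequence) pair per allele of the bracketed site.

    A sequence without a bracketed site is returned unchanged.
    """
    match = SNP_SITE.search(seq)
    if match is None:
        return [(name, seq)]
    left, right = seq[:match.start()], seq[match.end():]
    return [(f"{name}.{allele}", left + allele + right)
            for allele in match.group(1).split("/")]

def to_fasta(records):
    """Format (header, sequence) pairs as FASTA text."""
    return "".join(f">{header}\n{sequence}\n" for header, sequence in records)

records = expand_snp("NES17510657", "TGTTGGGAGAACATTC[T/C]AGAACAGAAGATTAAA")
print(to_fasta(records), end="")
```

For the example above this prints the two 33-base entries NES17510657.T and NES17510657.C, each with a ">" header line, which can then be given to BLAT as the query file.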
I hope that this helps you. Please let us know if you have further
questions.
Regards,
Archana
UCSC Genome Bioinformatics Group
Jim Kent wrote:
> Hi - you probably want to use these blat parameters:
> -stepSize=5 -minScore=0
> 33mers are on the short side for accurate mapping though.
> Are these mouse sequences? From the same strain as the
> genome?
>
> On Jan 8, 2007, at 9:03 AM, Yueming Ding wrote:
>
>
>> Hi Jim,
>>
>>
>>
>> I tried to use Blat to align 33mers to mouse genome. But it didn’t
>> work. I blatted 725000 33mers. After I ran pslReps, I only got 1031
>> lines. About half of the 1031 lines are mapped to wrong
>> chromosomes. Could you please tell me if 33mers is too short to run
>> blat? What is the minimum sequence length for blat? Thanks.
>>
>>
>>
>> Yueming Ding
>>
>> The Jackson Laboratory
>>
>> 600 Main Street
>>
>> Bar Harbor, ME 04609
>>
>>
>>
>
> _______________________________________________
> Genome maillist - Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome
>