[Genome] maf2fasta
Angie Hinrichs
angie at soe.ucsc.edu
Tue Jun 19 10:29:11 PDT 2007
Hi Navin,
maf2fasta is actually from Webb Miller's group at PSU, so I suggest
you contact them -- http://www.bx.psu.edu/miller_lab/ has a contacts
link.
I turned the illustrative test case in your email into actual files
(attached) and ran maf2fasta, but the output appeared correct:
% maf2fasta /tmp/tmpRef.fa /tmp/tmp.maf fasta
>A
AAATTTGGG
>B
AAATTTGGG
>C
AAA---GGG
>D
AAATTTGGG
-- so unfortunately, those inputs are too straightforward to provoke
the bug you are seeing. In order to help the Miller group debug, I
would suggest isolating the smallest input sequence where you see the
bug, and sending that (including reference fasta) to them, so they can
reproduce your bug and step through the code locally.
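While narrowing down the input, it may help to compute the expected
output independently and compare it with what maf2fasta prints. Below is
a minimal Python sketch of that cross-check. It is not maf2fasta's
algorithm; the function names (parse_maf_blocks, expected_fasta) are
hypothetical, and it assumes the blocks are contiguous, appear in
reference order, and are all on the + strand. A species absent from a
block is padded with '-' characters.

```python
def parse_maf_blocks(maf_text):
    """Return a list of blocks; each block maps species name -> aligned text."""
    blocks = []
    current = None
    for line in maf_text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and the ##maf header
        if fields[0] == "a":
            # An 'a' line starts a new alignment block.
            current = {}
            blocks.append(current)
        elif fields[0] == "s" and current is not None:
            # s <src> <start> <size> <strand> <srcSize> <text>
            current[fields[1]] = fields[6]
    return blocks


def expected_fasta(maf_text, species):
    """Concatenate blocks per species, padding missing species with gaps.

    Assumption: blocks are contiguous and already in reference order.
    """
    rows = {name: [] for name in species}
    for block in parse_maf_blocks(maf_text):
        if not block:
            continue
        # All rows in one block have the same aligned width.
        width = len(next(iter(block.values())))
        for name in species:
            rows[name].append(block.get(name, "-" * width))
    return {name: "".join(parts) for name, parts in rows.items()}


# The attached test case: species C is missing from the second block.
SAMPLE_MAF = """##maf version=1 scoring=blastz
a score=12.0
s A 0 3 + 9 AAA
s B 0 3 + 9 AAA
s C 0 3 + 6 AAA
s D 0 3 + 9 AAA
a score=9.0
s A 3 3 + 9 TTT
s B 3 3 + 9 TTT
s D 3 3 + 9 TTT
a score=12.0
s A 6 3 + 9 GGG
s B 6 3 + 9 GGG
s C 3 3 + 6 GGG
s D 6 3 + 9 GGG
"""

if __name__ == "__main__":
    for name, seq in expected_fasta(SAMPLE_MAF, ["A", "B", "C", "D"]).items():
        print(">" + name)
        print(seq)
```

For the attached files this puts the gap in C (AAA---GGG), matching the
maf2fasta output above. Running the same check on your real input
should show which block and species first disagree with maf2fasta.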
Hope that helps,
Angie
On Tue, 19 Jun 2007, Navin Elango wrote:
> Dear UCSC members,
>
> I tried to use maf2fasta on a four species alignment and found the following
> problem.
>
> Let's say that the four species are A, B, C, D and the MAF file looks like the
> following (I might be messing up the one-based or zero-based index, but the
> point is that in the second block species C is missing).
>
> a score=1234
> s A 1 3 + 9 AAA
> s B 1 3 + 9 AAA
> s C 1 3 + 6 AAA
> s D 1 3 + 9 AAA
>
> a score=2345
> s A 4 3 + 9 TTT
> s B 4 3 + 9 TTT
> s D 4 3 + 9 TTT
>
> s A 7 3 + 9 GGG
> s B 7 3 + 9 GGG
> s C 7 3 + 6 GGG
> s D 7 3 + 9 GGG
>
> When I do maf2fasta, the output is
>
> >A
> AAATTTGGG
> >B
> AAATTTGGG
> >C
> AAATTTGGG
> >D
> AAA---GGG
>
> The gap which is supposed to be in the third species is actually found in the
> fourth species.
>
> Am I missing something? Do I have to process the maf file before I send it
> into maf2fasta? Could you please let me know how to fix this.
>
> Any help will be greatly appreciated!
>
> Thanks,
> Navin.
>
> _______________________________________________
> Genome maillist - Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome
>
-------------- next part --------------
>A
AAATTTGGG
-------------- next part --------------
##maf version=1 scoring=blastz
a score=12.0
s A 0 3 + 9 AAA
s B 0 3 + 9 AAA
s C 0 3 + 6 AAA
s D 0 3 + 9 AAA
a score=9.0
s A 3 3 + 9 TTT
s B 3 3 + 9 TTT
s D 3 3 + 9 TTT
a score=12.0
s A 6 3 + 9 GGG
s B 6 3 + 9 GGG
s C 3 3 + 6 GGG
s D 6 3 + 9 GGG