[Genome] formula to locate sequence in the fasta files
ZHOU Jiangtao
zhouj at gis.a-star.edu.sg
Mon Oct 8 00:44:52 PDT 2007
Hi,
To get the genome sequence for a given gene location, say (chr1, +,
txStart, txEnd), I downloaded the FASTA files from
goldenPath/hg18/chromosomes/ and use this formula:
Starting position in file chr1.fa:
strlen("chr1") + 2 + ($txStart / 50) * 51 + $txStart % 50;
Length:
$txEnd - $txStart + 1;
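
To make the arithmetic concrete, here is a minimal sketch of the same
offset calculation in Python. It is only an illustration under stated
assumptions: the file holds a single record whose header line is exactly
">chr1", every sequence line has 50 bases, each newline is one byte, and
the division is integer division. The names fasta_offset and read_bases
are hypothetical helpers, not part of any UCSC tool. The position passed
in is taken to be zero-based, so a 1-based coordinate would need 1
subtracted first.

```python
def fasta_offset(pos0, name="chr1", line_len=50, newline_len=1):
    """Byte offset of the zero-based base pos0 in a single-record FASTA file."""
    # Header is ">" + name + newline, i.e. strlen("chr1") + 2 for a 1-byte newline.
    header_len = len(">" + name) + newline_len
    # Integer division gives the number of complete sequence lines before pos0.
    full_lines, within_line = divmod(pos0, line_len)
    return header_len + full_lines * (line_len + newline_len) + within_line


def read_bases(path, pos0, length, name="chr1", line_len=50):
    """Read `length` bases starting at zero-based pos0, dropping newlines."""
    if length <= 0:
        return ""
    start = fasta_offset(pos0, name, line_len)
    # Offset just past the last wanted base, so embedded newlines are included.
    end = fasta_offset(pos0 + length - 1, name, line_len) + 1
    with open(path, "rb") as handle:
        handle.seek(start)
        raw = handle.read(end - start)
    return raw.replace(b"\n", b"").decode("ascii")
```

With these assumptions, fasta_offset(0) is 6 and fasta_offset(50) is 57,
because the first sequence line occupies bytes 6 to 55 and its newline
sits at byte 56. Note that a region crossing a line boundary spans more
bytes than bases, which is why read_bases computes an end offset instead
of reading exactly `length` bytes.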
Are these formulas correct? I have found that the result is sometimes
one position earlier than the one I get from the genome browser. Is
txStart 0-based or 1-based?
Regards,
Zhou Jiangtao