[Genome] formula to locate sequence in the fasta files

ZHOU Jiangtao zhouj at gis.a-star.edu.sg
Mon Oct 8 00:44:52 PDT 2007


Hi,

 

To get the genome sequence for a given gene location, say (chr1, +,
txStart, txEnd), I downloaded the FASTA files from
goldenPath/hg18/chromosomes/ and used this formula:

 

Starting position in file chr1.fa:

strlen("chr1")+2+($txStart/50)*51+$txStart%50;

length:$txEnd-$txStart+1;
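
To make the arithmetic concrete, here is a minimal sketch of the same
calculation in Python. It rests on several assumptions that are mine and
not confirmed anywhere: the file holds a single record whose header line
is exactly ">chr1", every sequence line has 50 bases, each line ends in a
one-byte "\n", the division is integer division, and the start position
passed in is 0-based. The names fasta_offset and fetch are hypothetical
helpers made up for this illustration.

```python
import io

LINE_BASES = 50   # bases per sequence line (assumed for these files)
NEWLINE = 1       # bytes per line terminator, assuming "\n"

def fasta_offset(name, pos0, line_bases=LINE_BASES, newline=NEWLINE):
    """Byte offset of the 0-based base pos0 in a single-record FASTA file."""
    # Header is ">" + name + newline, i.e. strlen("chr1") + 2 for "\n".
    header = len(">" + name) + newline
    # Full lines before pos0 each occupy line_bases + newline bytes
    # (integer division), plus the column within the current line.
    return header + (pos0 // line_bases) * (line_bases + newline) + pos0 % line_bases

def fetch(handle, name, start0, length):
    """Read `length` bases starting at 0-based start0, dropping newlines."""
    handle.seek(fasta_offset(name, start0))
    # Over-read by enough bytes to cover every newline inside the span.
    span = length + (length // LINE_BASES + 2) * NEWLINE
    raw = handle.read(span)
    return raw.replace(b"\n", b"")[:length]

# Small in-memory stand-in for chr1.fa: 120 bases wrapped at 50 per line.
seq = b"ACGT" * 30
lines = [seq[i:i + LINE_BASES] for i in range(0, len(seq), LINE_BASES)]
fake_file = io.BytesIO(b">chr1\n" + b"\n".join(lines) + b"\n")
```

With this layout, fetch(fake_file, "chr1", 48, 5) returns the bases at
0-based positions 48 to 52, reading across the line break. If txStart
turned out to be 1-based, txStart - 1 would have to be passed instead,
and for a 0-based, half-open interval the length would be
txEnd - txStart rather than txEnd - txStart + 1.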

 

Are these formulas correct? I found that the result is sometimes one
position earlier than what I get from the genome browser. Is txStart
0-based or 1-based?

 

Regards,

 

Zhou Jiangtao


