[Genome] Promoter Zones

Archana Thakkapallayil archanat at soe.ucsc.edu
Mon Dec 18 10:46:05 PST 2006


Hello Tiago,

You can get the start and end positions of the promoter zones for the
RefSeq Genes from the headers of the upstream sequence files that we
provide on our download server at:

http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/

The files are:
upstream1000.zip
upstream2000.zip
upstream5000.zip

An example from the upstream1000.zip file:

>NM_198943_up_1000_chr1_14755_r chr1:14755-15754
gcattttaaacccaagtgaaatctcctaggcccttcatgccacactcatc
catccctacctacttgtgttgcaaccaagggccccactgtagtgcctagg
........

The chromosome range, chr1:14755-15754, gives you the start and end 
positions of 1000 base pairs upstream of the RefSeq gene NM_198943.
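
Since only the positions are wanted and not the sequences, the header
lines can be pulled out with a few lines of script. The following is a
minimal sketch in Python; the header layout is assumed from the single
example above, and the function name upstream_ranges is just an
illustrative choice.

```python
import re

# Header layout assumed from the one example shown above:
#   >RECORD_ID CHROM:START-END
HEADER_RE = re.compile(r"^>(\S+)\s+(\S+):(\d+)-(\d+)\s*$")

def upstream_ranges(lines):
    """Yield (record_id, chrom, start, end) for each FASTA header line."""
    for line in lines:
        m = HEADER_RE.match(line.strip())
        if m:  # sequence lines do not start with '>' and are skipped
            rec_id, chrom, start, end = m.groups()
            yield rec_id, chrom, int(start), int(end)

# The example record from upstream1000.zip
sample = [
    ">NM_198943_up_1000_chr1_14755_r chr1:14755-15754",
    "gcattttaaacccaagtgaaatctcctaggcccttcatgccacactcatc",
]
for rec in upstream_ranges(sample):
    print(*rec, sep="\t")
```

For a real run, pass the open file object of the unzipped upstream file
instead of the sample list.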

You could also get this information for the Known Genes using the Table 
Browser. To do this, click on the "Tables" link in the blue bar at the 
top of the Genome Browser page.  Then make the following selections:

clade: vertebrate
genome: Human
assembly: Mar. 2006
group: Genes and Gene Prediction Tracks
track: Known Genes
table: knownGene
region: genome
output format: sequence
click "get output"

Then choose "genomic" and hit "submit". On the next page, check only the
box "Promoter/Upstream by -- bases" and enter the number of upstream
bases in its text box (the default is 1000). Then hit "get sequence".

This gives you the FASTA sequences of the promoter/upstream regions for
all the Known Genes, with the start and end positions of each region in
the header line of its sequence.
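
If you already have the knownGene table, the same positions can also be
derived from its strand, txStart and txEnd columns. Below is a rough
sketch in Python, not necessarily how the Table Browser itself does it.
It assumes the table coordinates are 0-based with an exclusive end and
reports 1-based inclusive ranges like the headers above. The function
name upstream_region and the sample coordinates are hypothetical; the
txEnd value was picked only to reproduce the chr1:14755-15754 example.

```python
def upstream_region(chrom, strand, tx_start, tx_end, size=1000):
    """Return (chrom, start, end), 1-based inclusive, for `size` bases upstream.

    tx_start and tx_end are assumed to be 0-based, end-exclusive
    coordinates as stored in the table.
    """
    if strand == "+":
        # Upstream lies just before the transcription start.
        start, end = tx_start - size + 1, tx_start
    elif strand == "-":
        # On the reverse strand, upstream lies just after txEnd.
        start, end = tx_end + 1, tx_end + size
    else:
        raise ValueError("strand must be '+' or '-'")
    # Clip at the chromosome start; clipping at the chromosome end would
    # need chromosome sizes, which this sketch does not handle.
    return chrom, max(start, 1), end

# Hypothetical reverse-strand gene whose txEnd is 14754
print(upstream_region("chr1", "-", 4000, 14754))
# Hypothetical forward-strand gene whose txStart is 5000
print(upstream_region("chr1", "+", 5000, 9000))
```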

I hope this information is helpful to you. If this doesn't answer your 
question completely, please feel free to write back.

Regards,

Archana
UCSC Genome Bioinformatics Group


Tiago Antão wrote:
> Hi all,
>
> I am trying to get (in bulk, from the download files) promoter zones
> (I want the start and end positions only, not the sequences) for
> genes. I was able to get all the exon positions from knownGenes, but I
> haven't got a clue where I can find promoter zones. I have the feeling
> that I am forgetting something pretty obvious while looking at data...
> What would be the best way to find promoter zones start and end
> points?
>
> Many thanks for any help,
> Tiago

