[Genome] problem on mouse genome for promoter region prediction

Brooke Rhead rhead at soe.ucsc.edu
Mon Nov 26 12:16:34 PST 2007


Hello again Xu Jerry,

There has been some further discussion among our engineers about your 
question, which I would like to pass along to you.

First, it was pointed out that turnover in these regions is to be 
expected.  Here is one paper on the topic:

Evolutionary turnover of mammalian transcription start sites.
Genome Res. 2006 Jun;16(6):713-22. Epub 2006 May 10.
http://www.genome.org/cgi/content/abstract/16/6/713

Also, here is some input regarding liftOver vs. pslMap, and more on 
turnover:
~~~
pslMap would be a lot better for this application.  liftOver might also 
just drop some promoters, even though a few elements may be 
conserved.  Generally you do get a fair bit of turnover in regulatory 
elements and transcription start sites between human and mouse.  You 
don't expect to see the level of conservation you'd get with coding 
regions, even though the functionality may be as complex, and also 
conserved.  Regulatory protein binding elements can phase in and out of 
existence pretty easily, and one phasing out can be compensated for by 
another phasing in.  The 12-fly paper (see Kellis M, Kent WJ on PubMed) 
has some info on this too.
~~~

The paper referenced above is:

Discovery of functional elements in 12 Drosophila genomes using 
evolutionary signatures. Nature. 2007 Nov 8;450(7167):219-32
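
To make the liftOver vs. pslMap point more concrete, here is a small toy 
sketch in Python.  It is not the actual algorithm of either program; it 
only contrasts the two ideas described in the note in my earlier message 
below: mapping just the start and end of a region, versus mapping each 
gapless block that overlaps it.  The function names (map_point, 
map_coarse, map_blocks) and the coordinates are made up for illustration.

```python
# Toy illustration only: NOT the real liftOver or pslMap code.
# A "chain" here is a list of gapless blocks (src_start, dst_start, size),
# all on the + strand.  Coordinates are 0-based, half-open, and invented.

def map_point(blocks, pos):
    """Map one source position, or return None if it falls in a gap."""
    for src, dst, size in blocks:
        if src <= pos < src + size:
            return dst + (pos - src)
    return None

def map_coarse(blocks, start, end):
    """Map only the endpoints; drop the region if either is unaligned."""
    new_start = map_point(blocks, start)
    new_last = map_point(blocks, end - 1)  # end is exclusive: map last base
    if new_start is None or new_last is None:
        return None
    return (new_start, new_last + 1)

def map_blocks(blocks, start, end):
    """Map every gapless block overlapping the region; keep the pieces."""
    pieces = []
    for src, dst, size in blocks:
        lo = max(start, src)
        hi = min(end, src + size)
        if lo < hi:
            pieces.append((dst + (lo - src), dst + (hi - src)))
    return pieces

# Two aligned blocks with an unaligned gap between them.
chain = [(100, 5000, 50), (200, 5060, 40)]
# A "promoter" that starts in the first block and ends in the gap.
region = (120, 180)

coarse = map_coarse(chain, *region)
fine = map_blocks(chain, *region)
print(coarse)
print(fine)
```

In this toy case the endpoint-only mapping returns nothing because the 
region ends in an unaligned gap, while the block-level mapping still 
reports the part that is aligned.  The real programs have more logic 
than this, so please treat it only as an illustration of the idea.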

I hope this is helpful.

--
Brooke Rhead
UCSC Genome Bioinformatics Group


Brooke Rhead wrote:
> Hello Xu Jerry,
> 
> Here is a similar previously-answered question:
> 
> http://www.soe.ucsc.edu/pipermail/genome/2007-November/014963.html
> 
> (This is specific to TFBS, but you could also use the liftOver utility 
> for the Eponine TSS or FirstEF tracks.  However, please see my note 
> below about using liftOver -- you might want to use pslMap instead.)
> 
> Alternatively, you could intersect one of the human promoter tracks with 
> the human Conservation track to get the corresponding mouse regions, as 
> described in this answer:
> 
> http://www.soe.ucsc.edu/pipermail/genome/2007-June/013989.html
> 
> (Note that multiz17way has been replaced by multiz28way in hg18.  Also, 
> there is another tool run by Penn State University called Galaxy that is 
> very useful for working with MAF data, located here: 
> http://main.g2.bx.psu.edu/ -- look under the "Fetch Alignments" tool on 
> the left-hand side of the page.)
> 
> In either case, you could make a custom track in the mouse browser and 
> intersect it with the mouse regions you found.
> 
> Another option would be to go in the opposite direction: use your 
> compiled list of mouse coordinates to create a custom track in the human 
> browser (using liftOver, pslMap, or the Conservation track to 
> convert to human coordinates), then intersect that with one of the 
> promoter tracks on the human browser.
> 
> 
> NOTE about using liftOver:
> 
> liftOver is fairly coarse (just maps start and end), which makes it more 
> useful for same-species mapping than cross-species mapping.  The program 
> pslMap is much more detailed (it breaks the mapping down to the 
> chainLink/gapless-block level), so it is better than liftOver for 
> mapping gene-sized regions.
> 
> The pslMap program is available in our source tree, which is free for 
> academic, nonprofit and personal use:
> http://hgdownload.cse.ucsc.edu/downloads.html#source_downloads
> 
> Typical pslMap options for running with one of our .over.chain files 
> (for example, converting mm8 to hg18 coordinates) are:
> 
> pslMap -chainMapFile -swapMap mm8.input.psl mm8ToHg18.over.chain.gz hg18.output.psl
> 
> Note that the input to pslMap needs to be in PSL format, described here: 
> http://genome.ucsc.edu/FAQ/FAQformat#format2 .  If you already have 
> coordinates in BED format, our program genePredToPsl -bedFormat can be used.
> 
> The output of pslMap can be uploaded as a custom track, and/or 
> translated back into BED format using the program pslToBed.
> 
> I hope this information is helpful.  If you have further questions, 
> please feel free to contact us again at the genome mailing list address.
> 
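
As a rough idea of what the BED-to-PSL step mentioned above involves, 
here is a minimal Python sketch that writes one single-block PSL line 
per BED line, using the 21-column PSL layout from the format FAQ.  It is 
not a replacement for our conversion program and does not handle blocked 
(BED12) records.  The function name bed_to_psl, the chromosome name 
"chrT" and its size are hypothetical, chosen only for the example.

```python
# Minimal sketch, assuming simple (unblocked) BED input.
# bed_to_psl is a hypothetical helper, not a UCSC program.

def bed_to_psl(bed_line, chrom_sizes):
    """Turn one BED line into a one-block PSL line (tab-separated)."""
    f = bed_line.split()
    chrom, start, end = f[0], int(f[1]), int(f[2])
    # Fall back to a position-based name and + strand if columns are absent.
    name = f[3] if len(f) > 3 else f"{chrom}:{start}-{end}"
    strand = f[5] if len(f) > 5 else "+"
    size = end - start
    psl = [
        size, 0, 0, 0,                          # matches, misMatches, repMatches, nCount
        0, 0, 0, 0,                             # qNumInsert, qBaseInsert, tNumInsert, tBaseInsert
        strand,                                 # strand
        name, size, 0, size,                    # qName, qSize, qStart, qEnd
        chrom, chrom_sizes[chrom], start, end,  # tName, tSize, tStart, tEnd
        1,                                      # blockCount
        f"{size},", "0,", f"{start},",          # blockSizes, qStarts, tStarts
    ]
    return "\t".join(str(x) for x in psl)

# Made-up chromosome and size, for illustration only.
sizes = {"chrT": 1000}
psl_line = bed_to_psl("chrT\t100\t160\tpromoter1\t0\t-", sizes)
print(psl_line)
```

The whole BED interval is treated as one perfectly matching block, so 
the "query" is just the region itself.  The resulting file would play 
the role of mm8.input.psl in the pslMap command above.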

