[Genome] liftOver using over.chain

Brooke Rhead rhead at soe.ucsc.edu
Mon Nov 12 13:44:26 PST 2007


Hello Sen Kwan,

I think you are referring to this previously-answered question:
http://www.soe.ucsc.edu/pipermail/genome/2005-December/009336.html

Let me clarify that answer a little bit.  The over.chain file is 
filtered so that there are no duplicate chains in any particular region 
(unlike the all.chain files, which can have multiple chains in a single 
region).

The all.chain files are the chained blastz alignments that correspond to 
what is displayed in the Chain tracks in the Genome Browser.  The 
over.chain files consist of chained and netted alignments, stored in the 
chain file format (see this previously-answered question: 
http://www.soe.ucsc.edu/pipermail/genome/2006-February/009717.html).

So, the over.chain files have been filtered to remove multiple chains, 
not to remove duplicate regions (repeats).
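
If you would like to check this on a file yourself, here is a rough 
sketch in Python (not a UCSC tool).  It reads chain-format text and 
reports, for each target sequence, the largest number of aligned blocks 
that cover the same target base.  The function names and the two toy 
chain records are made up for illustration, and the sketch assumes the 
target strand is "+".  If the filtering works as described above, an 
over.chain file should give a depth of 1, while an all.chain file may 
give more.

```python
from collections import defaultdict

# Toy data (hypothetical): two chains whose aligned blocks overlap on chr1.
ALL_LIKE = """\
chain 1000 chr1 1000 + 100 200 chrA 1000 + 100 200 1
50 10 10
40

chain 500 chr1 1000 + 150 190 chrB 1000 + 0 40 2
40
"""

# Toy data (hypothetical): only the first chain is kept.
OVER_LIKE = """\
chain 1000 chr1 1000 + 100 200 chrA 1000 + 100 200 1
50 10 10
40
"""


def target_blocks(chain_text):
    """Yield (tName, start, end, chain_id) for each ungapped aligned block."""
    t_name = chain_id = None
    t_pos = 0
    for line in chain_text.splitlines():
        fields = line.split()
        if not fields:
            continue  # blank line between chains
        if fields[0] == "chain":
            # chain score tName tSize tStrand tStart tEnd
            #       qName qSize qStrand qStart qEnd id
            t_name = fields[2]
            t_pos = int(fields[5])
            chain_id = fields[12]
        else:
            size = int(fields[0])
            yield t_name, t_pos, t_pos + size, chain_id
            t_pos += size
            if len(fields) == 3:
                t_pos += int(fields[1])  # dt: gap on the target side


def max_target_depth(chain_text):
    """Return {tName: maximum number of blocks covering one target base}."""
    events = defaultdict(list)
    for name, start, end, _ in target_blocks(chain_text):
        events[name].append((start, 1))
        events[name].append((end, -1))
    depth_by_seq = {}
    for name, evs in events.items():
        depth = best = 0
        # Sorting puts block ends before block starts at the same
        # position, which is right for half-open intervals.
        for _, delta in sorted(evs):
            depth += delta
            best = max(best, depth)
        depth_by_seq[name] = best
    return depth_by_seq


if __name__ == "__main__":
    print("all-like: ", max_target_depth(ALL_LIKE))
    print("over-like:", max_target_depth(OVER_LIKE))
```

Note that the check is done on aligned blocks rather than on the start 
and end in each chain header line, because a netted chain can sit inside 
a gap of a larger chain, so the header ranges alone may overlap even 
when the aligned bases do not.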

Repeats are considered during the chaining process.  Please see the 
chain description page for more information, including this paper listed 
in the references section:

Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's 
cauldron: Duplication, deletion, and rearrangement in the mouse and 
human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9.
http://www.pnas.org/cgi/content/abstract/1932072100v1

You also might find this general description of chains and nets on our 
genomewiki useful: http://genomewiki.ucsc.edu/index.php/Chains_Nets .

I hope this information helps.  If you have further questions, please 
feel free to contact us again.

--
Brooke Rhead
UCSC Genome Bioinformatics Group


Tay Sen Kwan wrote:
> Hello,
> 
> I understand from past e-mail queries that the difference between the 
> all.chain and the over.chain liftover files is that the latter contain 
> "filtered results and that there are no duplicates."  I am using the 
> hg18To....over.chain files to perform liftover of individual exons of 
> single-copy, human cDNAs to other genomes.  I would like to find out 
> whether certain families of repeats are not filtered out in these 
> files or, alternatively, which families of repeats are filtered out.  
> Many thanks.
> 
> Regards,
> 
> Sen Kwan
> 
> _______________________________________________
> Genome maillist  -  Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome

