[Genome] RepeatMasker track on Lizard genome (Feb 2007)

Angie Hinrichs angie at soe.ucsc.edu
Fri Jan 25 11:05:57 PST 2008


Hi Vladimir,

repeatmasker.org is running the latest version (open-3-1-9), with the 
latest library from RepBase Update (20071204).  The lizard 
RepeatMasker track was generated back in early 2007 with the latest 
versions at the time: open-3-1-6 and 20061006.  So it is possible that 
recent enhancements in the library and/or RepeatMasker program could 
explain the difference.  

Also, RepeatMasker results can vary based on the sequence boundaries 
because it factors in things like GC%.  We chop up the sequence into 
500,000-base chunks and then run RepeatMasker on each chunk in a 
compute cluster.  Your range falls in the middle of such a chunk, so 
the LINE is not split across a chunk boundary; it was simply not 
recognized by the late-2006 versions of the program and library.  
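
As a rough illustration of the chunk arithmetic, here is a minimal 
Python sketch.  It assumes the chunks are non-overlapping and start at 
the first base of each scaffold, which may not match the actual 
pipeline exactly; the function names are made up for this example.

```python
CHUNK_SIZE = 500_000  # chunk length mentioned above


def chunk_index(pos, chunk_size=CHUNK_SIZE):
    """0-based index of the chunk holding a 1-based position.

    Assumption: chunks are non-overlapping and the first chunk
    starts at base 1 of the scaffold.
    """
    return (pos - 1) // chunk_size


def chunk_bounds(index, chunk_size=CHUNK_SIZE):
    """1-based, inclusive (start, end) of a chunk under the same assumption."""
    start = index * chunk_size + 1
    return start, start + chunk_size - 1


def spans_boundary(start, end, chunk_size=CHUNK_SIZE):
    """True if the 1-based range [start, end] touches more than one chunk."""
    return chunk_index(start, chunk_size) != chunk_index(end, chunk_size)


if __name__ == "__main__":
    # Range from the question: scaffold_68:1,284,500-1,285,440
    start, end = 1_284_500, 1_285_440
    idx = chunk_index(start)
    print(idx)                         # 2
    print(chunk_bounds(idx))           # (1000001, 1500000)
    print(spans_boundary(start, end))  # False
```

Under that assumption the range sits more than 200,000 bases from 
either end of its chunk.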

Arian Smit and Robert Hubley (of repeatmasker.org / ISB) are the 
creators of RepeatMasker, so they could give a more authoritative 
answer.

Angie


On Fri, 25 Jan 2008, Vladimir Kuryshev wrote:

> Dear UCSC gurus,
> 
> Would you check pls why RepeatMasker track doesn't show relevant 
> information on the lizard genome?
> 
> E.g., take a look at a region:
> scaffold_68:1,284,500-1,285,440
> 
> contains clear part of LINE:
> sequences:             1
> total length:       1941 bp  (1941 bp excl N/X-runs)
> GC level:         50.90 %
> bases masked:       1061 bp ( 54.66 %)
> ==================================================
>                 number of      length   percentage
>                 elements*    occupied  of sequence
> --------------------------------------------------
> Retroelements            1         1094 bp   56.36 %
>     SINEs:                0            0 bp    0.00 %
>     Penelope              0            0 bp    0.00 %
>     LINEs:                1         1094 bp   56.36 %
>      CRE/SLACS            0            0 bp    0.00 %
>       L2/CR1/Rex          1         1094 bp   56.36 %
> ..
> This is a Repeatmasker output (http://www.repeatmasker.org/).
> 
> I would appreciate your feedback with some  explanations.
> 
> wbw,
> Vladimir
> 

