[Genome] Question about % of exons represented by repeats?

Brooke Rhead rhead at soe.ucsc.edu
Thu Apr 5 12:07:35 PDT 2007


Hello Sarah,

You can use the Table Browser to intersect the RepeatMasker table (rmsk) 
with a custom track containing human exons.

To do this, first make the custom track.  Go to the Table Browser (the 
blue "Tables" link at the top of the page) and select the March 2006 human 
assembly.  Select "group: genes and gene prediction tracks".  Here you 
will need to decide which of the gene tracks you wish to use for this 
calculation (you can see a description of each track by clicking on 
the track name back on the Genome Browser page).  Once you have decided 
on a gene track, choose "region: genome" and "output format: custom 
track".  Hit "get output" and choose the option to make one BED record 
per coding exon.  (If you would like to include the untranslated exons, 
you will need to do these steps again and choose the 5' UTR or 3' UTR 
exon options.)  Hit the "get custom track in Table Browser" button.  You 
should now have a custom track that contains only the coding exons from 
the gene track.
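
If you would also like to work with the exons outside the browser, the same 
selection can be saved as a BED file.  Below is a minimal sketch in Python of 
reading such a file.  It assumes the standard BED layout (chrom, start, end in 
the first three columns, with optional "track" and "browser" header lines); 
the function name parse_bed_lines and the example records are hypothetical, 
made up only for illustration.

```python
def parse_bed_lines(lines):
    """Yield (chrom, start, end) tuples from BED text.

    Blank lines, comments, and "track"/"browser" header lines are skipped.
    BED coordinates are 0-based and half-open, so length is end - start.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split()
        yield fields[0], int(fields[1]), int(fields[2])


# Hypothetical example records, not real annotation data.
example = [
    'track name="coding_exons" description="hypothetical example"',
    "chr1\t100\t200\texon_a\t0\t+",
    "chr1\t300\t400\texon_b\t0\t+",
]
exons = list(parse_bed_lines(example))
print(len(exons), "exons,", sum(end - start for _, start, end in exons), "bases")
```

For a real file you would pass an open file handle instead of the example 
list, e.g. parse_bed_lines(open("exons.bed")), where exons.bed is whatever 
name you saved the output under.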

Now choose your newly-made custom track in the Table Browser.  Be sure 
"region: genome" is still selected.  Hit the "intersection: create" 
button.  In the drop-down menus on the next page, select the rmsk table 
as the table with which you would like to intersect.  This table is under:

group: variation and repeats
track: RepeatMasker
table: rmsk

Here you will need to decide how much of an overlap between the exons 
and the repeats you wish to count in your calculation.  For instance, if 
you want to consider an exon blocked if ANY of it overlaps the 
RepeatMasker track, choose "All [custom track] records that have any 
overlap with RepeatMasker".  Hit "submit".  Now you can use the 
"summary/statistics" button to get a count of how many of the exons 
intersect with the RepeatMasker track and the number of bases that are 
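covered by those exons.  You can compare these numbers with statistics 
from your custom track (just clear the intersection with the rmsk table 
and hit the "summary/statistics" button again).  Dividing the item count 
from the intersection by the item count for the whole custom track gives 
the percentage of exons that overlap a repeat.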
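
As a cross-check on the Table Browser numbers, the "any overlap" calculation 
can be sketched offline.  This is only an illustration, assuming you already 
have the exons and the rmsk records as (chrom, start, end) tuples in BED-style 
0-based, half-open coordinates; the function names and the toy intervals are 
hypothetical.  Note that it simply sums exon lengths, so if your gene track 
has overlapping exons the base counts may differ from what the browser reports.

```python
from bisect import bisect_left
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def overlaps_any(merged, starts, start, end):
    """True if [start, end) shares at least one base with a merged interval."""
    # Last merged interval that begins before the query ends.
    i = bisect_left(starts, end) - 1
    return i >= 0 and merged[i][1] > start


def exon_repeat_summary(exons, repeats):
    """Count exons (and their bases) that have ANY overlap with a repeat."""
    by_chrom = defaultdict(list)
    for chrom, start, end in repeats:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        merged = merge_intervals(intervals)
        index[chrom] = (merged, [s for s, _ in merged])

    total = hit = total_bases = hit_bases = 0
    for chrom, start, end in exons:
        total += 1
        total_bases += end - start
        if chrom in index and overlaps_any(*index[chrom], start, end):
            hit += 1
            hit_bases += end - start
    return {
        "exons_total": total,
        "exons_overlapping": hit,
        "percent_exons_overlapping": 100.0 * hit / total if total else 0.0,
        "bases_total": total_bases,
        "bases_in_overlapping_exons": hit_bases,
    }


# Toy intervals for illustration only, not real annotation data.
toy_exons = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 0, 50)]
toy_repeats = [("chr1", 150, 160), ("chr1", 400, 500), ("chr3", 0, 10)]
summary = exon_repeat_summary(toy_exons, toy_repeats)
print(summary)
```

In the toy data the second exon ends exactly where a repeat begins, so it 
does not count as overlapping; only the first exon does.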

I hope these instructions help you get the information you need.  Please 
let us know if you have further questions.

--
Brooke Rhead
UCSC Genome Bioinformatics Group



Sarah Finch wrote:
> Good afternoon,
>
> I need to determine the % of exons in the human genome (March 2006
> build) that are represented by repeats.  Or in other words what % of all
> the human exons is blocked by the RepeatMasker program?
>
> I would really appreciate any help you could give me.
>
> Thanks,
>
> Sarah Finch
>
> Sarah E.W. Finch, Ph.D.
> Postdoctoral Research Scientist
> The Rothberg Institute for Childhood Diseases
> 530 Whitfield Street
> Guilford, CT  06437
> 203.458.7100  ext. 217
> sfinch at childhooddiseases.org
>
> _______________________________________________
> Genome maillist  -  Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome

