[Genome] Question about % of exons represented by repeats?
Brooke Rhead
rhead at soe.ucsc.edu
Thu Apr 5 12:07:35 PDT 2007
Hello Sarah,
You can use the Table Browser to intersect the RepeatMasker table (rmsk)
with a custom track containing human exons.
To do this, first make the custom track. Go to the Table Browser (the
blue "Tables" link at the top of the page) and select the March 2006
human assembly (hg18). Select "group: genes and gene prediction tracks". Here you
will need to decide which of the gene tracks you wish to use for this
calculation (you can see a description of each track by clicking on
the track name back on the Genome Browser page). Once you have decided
on a gene track, choose "region: genome" and "output format: custom
track". Hit "get output" and choose the option to make one BED record
per coding exon. (If you would like to include the untranslated exons,
you will need to do these steps again and choose the 5' UTR or 3' UTR
exon options.) Hit the "get custom track in Table Browser" button. You
should now have a custom track that contains only the coding exons from
the gene track.
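
If you would rather work with the exons outside the Table Browser, you can
also save the same output as a BED file. Keep in mind that a gene track
usually holds several transcripts per gene, so the same exon can appear in
more than one BED record. Below is a minimal Python sketch, not an official
UCSC tool, that reads BED lines and merges overlapping exon records so that
each base is counted only once. The function names parse_bed and
merge_intervals are made up for this illustration.

```python
from collections import defaultdict

def parse_bed(lines):
    """Yield (chrom, start, end) from BED lines, skipping header lines."""
    for line in lines:
        line = line.strip()
        # Custom track output may begin with "track" or "browser" lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split()
        # BED coordinates are 0-based, half-open: [start, end)
        yield fields[0], int(fields[1]), int(fields[2])

def merge_intervals(records):
    """Merge overlapping or touching intervals, chromosome by chromosome."""
    by_chrom = defaultdict(list)
    for chrom, start, end in records:
        by_chrom[chrom].append((start, end))
    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        out = []
        for start, end in intervals:
            if out and start <= out[-1][1]:
                # Overlaps (or touches) the previous interval: extend it.
                out[-1] = (out[-1][0], max(out[-1][1], end))
            else:
                out.append((start, end))
        merged[chrom] = out
    return merged

# Hypothetical example records; real input would come from your BED file.
sample = [
    "track name=exons",
    "chr1\t100\t200\texonA",
    "chr1\t150\t250\texonB",
    "chr1\t300\t400\texonC",
    "chr2\t10\t20\texonD",
]
merged = merge_intervals(parse_bed(sample))
```

Whether you want merged exons or one record per transcript exon depends on
how you define "% of exons", so treat the merging step as optional.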
Now choose your newly-made custom track in the Table Browser. Be sure
"region: genome" is still selected. Hit the "intersection: create"
button. In the drop-down menus on the next page, select the rmsk table
as the table with which you would like to intersect. This table is under:
group: variation and repeats
track: RepeatMasker
table: rmsk
Here you will need to decide how much of an overlap between the exons
and the repeats you wish to count in your calculation. For instance, if
you want to consider an exon blocked if ANY of it overlaps the
RepeatMasker track, choose "All [custom track] records that have any
overlap with RepeatMasker". Hit "submit". Now you can use the
"summary/statistics" button to get a count of how many of the exons
intersect with the RepeatMasker track and the number of bases that are
covered by those exons. Note that this base count is the total length of
the intersecting exons, not just the bases that fall inside repeats. To
get a percentage, compare these numbers with the statistics from your
full custom track (just clear the intersection with the rmsk table and
hit the "summary/statistics" button again): the intersected exon count
divided by the total exon count is the fraction of exons that overlap a
repeat.
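
To make the two possible percentages concrete, here is a rough Python
sketch of the same calculation done by hand on one chromosome. It is only
an illustration of the arithmetic, not what the Table Browser runs
internally; the function names and the example coordinates are made up.
It reports both the percentage of exons with any repeat overlap and the
percentage of exon bases that actually lie inside repeats.

```python
from bisect import bisect_right

def merge(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    out = []
    for start, end in sorted(intervals):
        if out and start <= out[-1][1]:
            out[-1] = (out[-1][0], max(out[-1][1], end))
        else:
            out.append((start, end))
    return out

def exon_repeat_stats(exons, repeats):
    """Compare exons against repeats on a single chromosome.

    Each exon record is counted separately, so duplicate exon records
    (for example from several transcripts) are counted more than once.
    """
    repeats = merge(repeats)
    starts = [s for s, _ in repeats]
    exons_hit = exon_bases = covered_bases = 0
    for start, end in exons:
        exon_bases += end - start
        # Last repeat starting at or before the exon start; earlier
        # repeats cannot overlap because merged repeats are disjoint.
        i = max(bisect_right(starts, start) - 1, 0)
        covered = 0
        while i < len(repeats) and repeats[i][0] < end:
            r_start, r_end = repeats[i]
            covered += max(0, min(end, r_end) - max(start, r_start))
            i += 1
        if covered > 0:
            exons_hit += 1
        covered_bases += covered
    n = len(exons)
    return {
        "exons_hit": exons_hit,
        "covered_bases": covered_bases,
        "pct_exons_hit": 100.0 * exons_hit / n if n else 0.0,
        "pct_bases_covered": 100.0 * covered_bases / exon_bases if exon_bases else 0.0,
    }

# Hypothetical coordinates, for illustration only.
exons = [(100, 200), (300, 400), (500, 600)]
repeats = [(150, 180), (170, 190), (390, 450)]
stats = exon_repeat_stats(exons, repeats)
```

The "any overlap" option described above corresponds to pct_exons_hit;
pct_bases_covered is the stricter, base-level figure.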
I hope these instructions help you get the information you need. Please
let us know if you have further questions.
--
Brooke Rhead
UCSC Genome Bioinformatics Group
Sarah Finch wrote:
> Good afternoon,
>
> I need to determine the % of exons in the human genome (March 2006
> build) that are represented by repeats. Or in other words what % of all
> the human exons is blocked by the RepeatMasker program?
>
> I would really appreciate any help you could give me.
>
> Thanks,
> Sarah Finch
>
> Sarah E.W. Finch, Ph.D.
> Postdoctoral Research Scientist
> The Rothberg Institute for Childhood Diseases
> 530 Whitfield Street
> Guilford, CT 06437
> 203.458.7100 ext. 217
> sfinch at childhooddiseases.org
>
> _______________________________________________
> Genome maillist - Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome