[Genome] Build 36.1 Repeat Files
Ann Zweig
ann at soe.ucsc.edu
Wed May 7 11:56:57 PDT 2008
Hello David,
Each of the annotation tracks in the genome browser has one or more
underlying database tables. To find the correct name of the table,
simply hold your mouse over the 'mini-button' to the left of the actual
track display, or over the hyperlinked track name in the track controls
(below the display). The URL to which this item is linked will appear
at the bottom of your browser. At the end of the URL, you will see
&g=AAAA. The "AAAA" is the name of the table that holds the data. You
can download any database table from the download server here:
http://hgdownload.cse.ucsc.edu/downloads.html
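As a rough illustration of the step above, the table name can be pulled out of a copied track link by reading its g= parameter. This is only a sketch: the example URL and the function name table_name_from_track_url are made up for illustration, and a real link may carry different parameters.

```python
from urllib.parse import urlparse, parse_qs


def table_name_from_track_url(url):
    """Return the value of the g= parameter, or None if the URL has none."""
    query = parse_qs(urlparse(url).query)
    values = query.get("g")
    return values[0] if values else None


# Hypothetical link of the kind shown at the bottom of the web browser.
example = "http://genome.ucsc.edu/cgi-bin/hgTrackUi?hgsid=12345&c=chr1&g=rmsk"
print(table_name_from_track_url(example))
```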
For the human Build 36.1 (a.k.a. hg18), navigate the download page like
so: Human --> Mar. 2006 (hg18) --> Annotation Database. On that page,
download the AAAA.txt.gz file. This is the AAAA database table in
tab-delimited format, compressed with gzip.
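A minimal sketch of reading such a file once it has been downloaded, assuming the Annotation Database directory for hg18 lives at the BASE_URL shown below (an assumption about the server layout, not something stated in this message). The helper names table_url and read_table are hypothetical, and the demonstration uses a tiny in-memory stand-in rather than a real table.

```python
import gzip

# Assumed location of the hg18 Annotation Database files on the download server.
BASE_URL = "http://hgdownload.cse.ucsc.edu/goldenPath/hg18/database/"


def table_url(table):
    """Build the expected download URL for a table name such as AAAA."""
    return BASE_URL + table + ".txt.gz"


def read_table(gz_bytes):
    """Decompress a gzip'd tab-delimited table into a list of rows (lists of strings)."""
    text = gzip.decompress(gz_bytes).decode("utf-8")
    return [line.split("\t") for line in text.splitlines()]


# Made-up stand-in for a downloaded AAAA.txt.gz file.
sample = gzip.compress(b"chr1\t100\t200\nchr2\t300\t400\n")
rows = read_table(sample)
print(table_url("genomicSuperDups"))
print(rows)
```

The bytes passed to read_table would normally come from the downloaded file, for example via open(path, "rb").read().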
That said, here are the tables you need:
Track name               Table name
----------               ----------
Repeat Masker            rmsk
Segmental Dups           genomicSuperDups
Simple Tandem Repeats    simpleRepeat
You can read about the details behind a track (description, methods,
display, credits, references) by clicking the 'mini-button' to the
left of the actual track display, or by clicking the hyperlinked
track name in the track controls (below the display).
The Repeat Masker track displays the following types of repeats:
# Short interspersed nuclear elements (SINE), which include ALUs
# Long interspersed nuclear elements (LINE)
# Long terminal repeat elements (LTR), which include retroposons
# DNA repeat elements (DNA)
# Simple repeats (micro-satellites)
# Low complexity repeats
# Satellite repeats
# RNA repeats (including RNA, tRNA, rRNA, snRNA, scRNA, srpRNA)
# Other repeats, which include class RC (Rolling Circle)
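Since the Repeat Masker table mixes all of these classes, one way to keep only some of them (for instance simple, low complexity, and satellite repeats) is to filter rows on the repeat class column. The sketch below is hedged: the column positions and the class labels "Simple_repeat", "Low_complexity", and "Satellite" are assumptions about the rmsk table layout that should be checked against the table's schema, and fake_row only builds synthetic rows for the demonstration.

```python
# Assumed 0-based column positions in an rmsk row; verify against the table schema.
CHROM, START, END, REP_CLASS = 5, 6, 7, 11

# Assumed class labels for the repeat types of interest.
WANTED = {"Simple_repeat", "Low_complexity", "Satellite"}


def select_repeats(lines, wanted=WANTED):
    """Yield (chrom, start, end, repClass) for rows whose class is in `wanted`."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > REP_CLASS and fields[REP_CLASS] in wanted:
            yield fields[CHROM], int(fields[START]), int(fields[END]), fields[REP_CLASS]


def fake_row(chrom, start, end, rep_class):
    """Build a synthetic 17-column row; every other field is a placeholder."""
    fields = ["0"] * 17
    fields[CHROM], fields[START], fields[END] = chrom, str(start), str(end)
    fields[REP_CLASS] = rep_class
    return "\t".join(fields)


rows = [fake_row("chr1", 100, 200, "SINE"), fake_row("chr1", 300, 350, "Satellite")]
hits = list(select_repeats(rows))
print(hits)
```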
This should be enough to get you started. If you need more detailed
information, don't hesitate to write back to the list.
Regards,
----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu
Please feel free to search the Genome mailing list archives by visiting
our home page, clicking on "Contact Us", then typing a word or phrase
into the search box. On that same page
(http://genome.ucsc.edu/contacts.html), you can subscribe to the Genome
mailing list.
> Subject:
> Build 36.1 Repeat Files
> From:
> David L Klinkebiel <dklinkebiel at unmc.edu>
> Date:
> Wed, 7 May 2008 12:58:43 -0500
> To:
> genome at soe.ucsc.edu
>
>
> I am looking for the data files which contain the locations of Segmental Duplications, Simple repeats, Satellite repeats, and Low Complexity repeats for NCBI Build 36.1. Does the chromFa.zip file have all other repeats masked? Thanks.
>
> chromFa.zip - The assembly sequence in one file per chromosome. Repeats from RepeatMasker and Tandem Repeats Finder (with period of 12 or less) are shown in lower case; non-repeating sequence is shown in upper case.
>
>
> David Klinkebiel, Ph.D.
> Department of Biochemistry and Molecular Biology
> University of Nebraska Medical Center
> 985870 Nebraska Medical Center
> Omaha, NE 68198-5870
> Office Phone: 402-559-3842
> Lab Phone: 402-559-9303
> FAX: 402-559-6650
>
> ***The University of Nebraska Medical Center E-Mail Confidentiality
> Disclaimer***
> The information in this e-mail may be privileged and confidential, intended
> only for the use of the addressee(s) above.
> Any unauthorized use or disclosure of this information is prohibited. If
> you have received this email by mistake, please delete and immediately
> contact the sender.