[Genome] uncertain annotations in UCSC

Angie Hinrichs angie at soe.ucsc.edu
Fri Mar 7 11:21:56 PST 2008


Hi Na,

The single FB4_DM in the library file is the consensus repeat 
sequence.  RepeatMasker aligns it to the genome using a sensitive 
alignment tool called cross_match, and indeed the sequence aligns to 
hundreds of places in the genome -- that is why it is considered a 
repetitive sequence.
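
As a rough illustration only (this is not part of RepeatMasker or of any UCSC tool), the short Python sketch below counts the annotations for one repeat name in a Table Browser export and lists their genomic lengths. It assumes the output was saved as tab-separated text whose first line is a '#'-prefixed header containing the rmsk column names genoStart, genoEnd and repName. The function name summarize_repeat, the abbreviated column set and the sample rows are all made up for the example.

```python
import csv

# Made-up, abbreviated sample of a tab-separated rmsk export (not real data).
SAMPLE = (
    "#genoName\tgenoStart\tgenoEnd\tstrand\trepName\n"
    "chr2L\t1000\t1035\t+\tFB4_DM\n"
    "chr3R\t5000\t6500\t-\tFB4_DM\n"
    "chrX\t200\t450\t+\tOTHER_REPEAT\n"
)


def summarize_repeat(table_text, rep_name):
    """Return (count, sorted genomic lengths) for rows whose repName is rep_name."""
    lines = table_text.splitlines()
    # Take the column names from the header line, dropping the leading '#'.
    header = lines[0].lstrip("#").split("\t")
    reader = csv.DictReader(lines[1:], fieldnames=header, delimiter="\t")
    lengths = sorted(
        int(row["genoEnd"]) - int(row["genoStart"])
        for row in reader
        if row["repName"] == rep_name
    )
    return len(lengths), lengths


if __name__ == "__main__":
    count, lengths = summarize_repeat(SAMPLE, "FB4_DM")
    print(count, lengths)
```

Each row counted this way is one place where the single consensus sequence aligned, so one library entry can yield many annotations, and an alignment may cover only a short piece of the consensus.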

For details about how RepeatMasker annotates repeats on the genome 
using those consensus sequences, the best people to answer your 
questions are the authors of RepeatMasker (see www.repeatmasker.org).  
For details about how the consensus sequences are generated, the GIRI 
scientists are probably the most knowledgeable.  

Hope that helps, and hope you have a good weekend too,

Angie


On Fri, 7 Mar 2008, Na Liu wrote:

> Hi, Angie,
> 
> I logged in to GIRI and obtained this file:
> repeatmaskerlibraries-20071204.tar.gz
> 
> The procedure for obtaining it was:
> 1. I clicked the link: http://www.girinst.org/repbase/update/index.html
> 2. At the bottom of that page I noticed Drosophila and clicked
> drorep.ref, then downloaded the above file
> (repeatmaskerlibraries-20071204.tar.gz).
> 3. When I searched for 'FB4' in the file, I found only one FB4_DM.  How do
> you get the results listed in UCSC? (By my count, there are 389 entries
> in total annotated as FB4_DM. How did you get the 389 entries? What
> criteria did you use to annotate them as FB4_DM?)
> 
> 
> I look forward to your reply, and have a good weekend.
> 
> Na
> 
> 
> On Mar 7, 2008, at 1:06 PM, Angie Hinrichs wrote:
> 
> > Hi Na,
> > 
> > UCSC does not assign the names; we simply run RepeatMasker on the
> > genome and display its results.  RepeatMasker works by aligning
> > consensus sequences from a library file to the genome.  The library
> > file is the source of the repeat name, class and family annotated by
> > RepeatMasker.  The library file is owned by RepBase Update (GIRI), but
> > can be viewed after completing a registration process.  To retrieve
> > the library file, visit this web page:
> > 
> > http://www.girinst.org/repbase/index.html
> > 
> > On the left there is a "Free registration" link.  After you have
> > completed the registration process, the information in RepBase Update
> > and/or RepBase Reports may be helpful.
> > 
> > Best wishes for your research,
> > Angie
> > 
> > 
> > On Fri, 7 Mar 2008, Na Liu wrote:
> > 
> > > Dear professors,
> > > 
> > > I want to obtain information on all FB elements of Drosophila
> > > melanogaster from UCSC. I am not sure whether my extraction method
> > > is correct, because the results look suspect:
> > > 
> > > First, I chose 'Variation and Repeats' in the group box of the
> > > Table Browser.  Then, in the output format box, I chose "all
> > > fields from selected table".
> > > 
> > > This gave me a long list.  I noticed that some entries are
> > > annotated as FB4_DM. Are they meant to be FB elements? Why do you
> > > name them FB4, not FB?  What does the number '4' mean?  They look
> > > suspect because some of them are very short (about 30 nt, 40 nt,
> > > 50 nt, ...) and cannot form a hairpin structure (by definition,
> > > an FB element has long inverted terminal repeats).
> > > 
> > > If my extraction method is not correct, could you please tell me the
> > > correct one?
> > > 
> > > I look forward to your reply.
> > > 
> > > Best
> > > Na
> > > 
> > > 
> > > _______________________________________________
> > > Genome maillist  -  Genome at soe.ucsc.edu
> > > http://www.soe.ucsc.edu/mailman/listinfo/genome
> > 
> 

-- 
angie at soe.ucsc.edu
Software Developer, UCSC CBSE / Genome Bioinformatics Group

