[Genome] Ensembl and UCSC Browsers and their differences...

Emma Grant Emma.Grant at stgeorges.nhs.uk
Thu Jun 28 07:18:46 PDT 2007


Hi there,
 
I've been in contact with the Ensembl helpdesk regarding differences
between the UCSC and Ensembl genome browsers in the clone sizes and bp
positions given for some BAC clones I am working with at the moment.
Below is the information I sent to them. Some issues have been resolved,
and others they are looking into on their side (CTA-299D3 and
CTA-397C4), but there are also some that I hope you can help me with,
namely CTA-268H5, CTB-109B5, and RP3-477J10.
 
CTA-397C4:
 
UCSC 2006:                  Start:    43212592
                                    End:     43260940
                                    Size:    48349bp
                        
Ensembl 45:                 Start:    43212696
                                    End:     43260941
                                    Size:    48245bp
 
NCBI clone registry:      Size:    48349bp (agrees with UCSC)
 
 
CTA-268H5:
 
UCSC 2006:                  Start:    43952888
                                    End:     44175871          
                                    Size:    222983bp          
                        
Ensembl 45:                 Start:    43952888
                                    End:     44141720
                                    Size:    188833bp
 
 
NCBI clone registry:      Size:    188833bp (agrees with Ensembl)
 
 
 
CTB-109B5:
 
UCSC 2006:                  Start:    46192149
                                    End:     46273725
                                    Size:    81576bp
                        
Ensembl 45:                 Start:    46236441
                                    End:     46248283
                                    Size:    11843bp
 
NCBI clone registry:      Size:    11843bp (agrees with Ensembl)
 
 
CTA-299D3:
 
UCSC 2006:                  Start:    47248203
                                    End:     47338938
                                    Size:    90736bp
                        
Ensembl 45:                 Start:    47248304
                                    End:     47338939
                                    Size:    90636bp
 
NCBI clone registry:      Size:    90736bp (agrees with UCSC)
 
 
 
 
RP3-477J10:
 
UCSC 2006:                  Start:    45853159
                                    End:     46001750
                                    Size:    148591bp
                        
Ensembl 45:                 Start:    45853159
                                    End:     45949836
                                    Size:    96678bp
 
NCBI clone registry:
 
Accession   G. Center   State        Seqlen(bp)
AL021686    SC          unfinished   204155
AL096755    SC          finished     96678 (agrees with Ensembl)
AL096756    SC          finished     12666
 
AL021686:
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=nucleotide&list_uids=6981739&dopt=GenBank>
AL096755:
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=nucleotide&list_uids=5420192&dopt=GenBank>
AL096756:
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=nucleotide&list_uids=5420194&dopt=GenBank>
 
 
I hope to hear from you soon!
 
Thanks,
Emma
 
Emma Grant
Trainee Cytogeneticist
St. George's Hospital
London
 
 
-----Original Message-----
From: Giulietta Spudich via RT [mailto:helpdesk at ensembl.org] 
Sent: 28 June 2007 14:33
To: Emma Grant
Subject: [Sanger #37424] F.A.O. Bert Overduin, Re: BAC clone positions
and size on the UCSC and Ensembl genome browsers. 
 
Dear Emma,
 
Thank you for looking at your list again.  First, the only numbers to
look at for the clones in Ensembl are the ones that come up in the
pop-up window once you click on the clone.  For example for CTA-397C4,
look at the window (once you click on the gold bar that represents the
clone) that says:
 
Clone- CTA-397C4
bp: 43212696-43260941
length: 48246 bps
 
These are the numbers I will take into account.  The lengths are
different than the ones you have written down... I hope you didn't
calculate them as you can just read them in the pop-up window!
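 
The one-base difference between a pop-up length and a hand-calculated
one can be illustrated with a minimal sketch. It assumes that both the
start and end positions shown by the browser are 1-based and inclusive,
which is consistent with the CTA-397C4 pop-up quoted above. The function
name is hypothetical and chosen only for this illustration.

```python
def inclusive_length(start: int, end: int) -> int:
    """Length of a feature whose start and end are both 1-based and inclusive."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    # Both endpoints belong to the feature, so add 1 to the plain difference.
    return end - start + 1


# CTA-397C4 coordinates as shown in the Ensembl pop-up quoted above.
start, end = 43212696, 43260941

print(inclusive_length(start, end))  # 48246, the length the pop-up reports
print(end - start)                   # 48245, the plain difference
```

A plain subtraction of start from end therefore comes out one base
shorter than the pop-up length.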
 
I hope this is clear so far!
 
I) For CTA-268H5, CTB-109B5, and RP3-477J10:
Ensembl looks fine.  The clone length in Ensembl matches the clone
accession number.  (This clone accession number should be the same in
the NCBI and EMBL repositories, and both Ensembl and UCSC should be
accessing the same clone sequence if the same accession number is
given.)
Only UCSC does not match the accession number entry in these cases.
The UCSC clones are much longer than the original entries.  Please
email them to ask about it.
 
II) For CTA-397C4 and CTA-299D3:
Ensembl has shortened clones for these accession numbers.  They only
match part of the accession number sequence.  The reason for these short
clones is not clear.  We will investigate these two clones further.  
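 
Groups I and II can be checked against the sizes listed at the top of
this thread. The sketch below is only an illustration: the figures are
copied from that list as written (they are not fetched from either
browser), the RP3-477J10 registry value is the one for AL096755, and the
names `sizes`, `matching_browsers` and `agreement` are hypothetical.

```python
# Sizes in bp as listed at the top of this thread:
# (UCSC 2006, Ensembl 45, NCBI clone registry).
sizes = {
    "CTA-397C4": (48349, 48245, 48349),
    "CTA-268H5": (222983, 188833, 188833),
    "CTB-109B5": (81576, 11843, 11843),
    "CTA-299D3": (90736, 90636, 90736),
    "RP3-477J10": (148591, 96678, 96678),  # registry value for AL096755
}


def matching_browsers(ucsc: int, ensembl: int, registry: int) -> list[str]:
    """Return the browsers whose listed size equals the registry size."""
    matches = []
    if ucsc == registry:
        matches.append("UCSC")
    if ensembl == registry:
        matches.append("Ensembl")
    return matches


agreement = {clone: matching_browsers(*triple) for clone, triple in sizes.items()}

for clone, browsers in agreement.items():
    print(f"{clone}: registry size matches {', '.join(browsers) or 'neither'}")
```

With these figures, Ensembl matches the registry for the three clones in
group I and UCSC matches it for the two clones in group II.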
 
III) RP3-477J10:
This clone looks OK to me:
http://www.ebi.ac.uk/cgi-bin/dbfetch?db=emblsva;id=AL096755.1
(This entry should match what you find in the NCBI clone registry).
Different accession numbers are listed in the entry, as there are
synonyms, but they should all bring you back to this one entry
(RP3-477J10).  (Over time, clones can be found to be the same thing, and
two names are merged).
 
OK, I hope that's a bit clearer now!  Let us know if you have more
questions.  
 
UCSC has a good helpdesk: genome at soe.ucsc.edu
Email them and ask about those three clones (CTA-268H5, CTB-109B5, and
RP3-477J10) asking why they are so much longer than the entry in the
NCBI clone registry.
 
Best Wishes,
Giulietta (Ensembl Helpdesk)
 
 

