[Genome] KnownToLocusLink.txt missing IDs

Hiram Clawson hiram at soe.ucsc.edu
Wed Jan 30 16:13:00 PST 2008


Good Afternoon Gábor:

When you follow the NR_003364 link to NCBI from the UCSC gene
links, the NCBI record contains all the information about
that sequence.  The gene record BC018473 is mentioned in
that GenBank entry.  You can follow the links from the NCBI
record:
http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&id=126090534
to GeneID 193217, the new record that corresponds to BC018473:
http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene&cmd=Retrieve&dopt=full_report&list_uids=193217

You can follow all these links through the NCBI Entrez system.
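
If you need this lookup for many accessions rather than one, clicking through Entrez gets tedious. One possible approach is to load a tab-separated table that lists RefSeq accessions next to GeneIDs and look them up locally. The sketch below is only an illustration: NCBI's gene2refseq file is one candidate source, but the column positions used here (GeneID in the second column, RNA accession in the fourth) and the ".1" version suffix in the sample row are assumptions that should be checked against the header of whatever file you download. The function name is hypothetical.

```python
def refseq_to_gene_id(table_text, gene_id_col=1, accession_col=3):
    """Build a {RefSeq accession (no version): GeneID} map from tab-separated text.

    Column positions are assumptions; verify them against the real file header.
    """
    mapping = {}
    for line in table_text.splitlines():
        # Skip blank lines and the '#'-prefixed header line.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) <= max(gene_id_col, accession_col):
            continue  # malformed or truncated row
        # Drop the version suffix so NR_003364.1 matches NR_003364.
        accession = fields[accession_col].split(".")[0]
        if accession and accession != "-":
            mapping[accession] = fields[gene_id_col]
    return mapping


# Illustrative row only: the IDs come from this thread, the layout is assumed.
sample = (
    "#tax_id\tGeneID\tstatus\tRNA_nucleotide_accession.version\n"
    "10090\t193217\tVALIDATED\tNR_003364.1\n"
)
mapping = refseq_to_gene_id(sample)
print(mapping.get("NR_003364"))
```

The resulting dictionary can then be joined against the RefSeq column of kgXref to fill in GeneIDs for rows such as uc007mmb.1 that have no entry in knownToLocusLink.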

--Hiram

Gabor Bartha wrote:
> Hi Ann,
> 
> Thank you for the explanation.  While the kgXref file is missing some of the fields, the uc007mmb.1 row is associated with NR_003364.  Are you aware of a way to associate NR_003364 with a LocusLink ID, perhaps by using a source from NCBI?  There are more recent updates to the UCSC annotations (refLink, for example), but I couldn't find what I was looking for there.
> 
> Gábor
> 
> 
> -----Original Message-----
> From: Ann Zweig [mailto:ann at soe.ucsc.edu] 
> Sent: Wednesday, January 30, 2008 11:45 AM
> To: Gabor Bartha
> Cc: genome at soe.ucsc.edu
> Subject: Re: [Genome] KnownToLocusLink.txt missing IDs
> 
> Hello Gabor,
> 
> The mouse assembly is based on NCBI Build 37, which was released in July 
> of 2007.  The UCSC Known Gene annotation track and associated tables 
> (including knownToLocusLink and kgXref) were built at that time and have 
> not been updated since then.  We typically only build the UCSC Known 
> Gene track with the initial release of an assembly.  The Entrez Gene 
> record you are looking at (GeneID: 193217) was entered into the NCBI 
> Entrez Gene database on 09-Jan-2008.  This explains why this one example 
> does not appear in our knownToLocusLink table.
> 
> 
> Regards,
> 
> ----------
> Ann Zweig
> UCSC Genome Bioinformatics Group
> http://genome.ucsc.edu
> 
> 
> Gabor Bartha wrote:
>> There seem to be a fair number of cross-references to the uc IDs that are missing from the KnownToLocusLink.txt file.  One example is uc007mmb.1, which is identified by the Genome Browser as NR_003364.  Doing a nucleotide lookup on NCBI gives BC018473, which in turn has the Gene ID 193217 (this should be the mapped LocusLink ID).  This example has a validated RefSeq status, so it's not something unknown.  It is also missing from the kgXref file.  So why are these entries missing?  Is there a workaround for getting these cross-references?
>>
>>  
>>
>> Gábor
>>
>>  
>>
>> _______________________________________________
>> Genome maillist  -  Genome at soe.ucsc.edu
>> http://www.soe.ucsc.edu/mailman/listinfo/genome
> 


