[Genome] questions for knowngene to refseq

mirrian@iis.sinica.edu.tw mirrian at iis.sinica.edu.tw
Wed Nov 1 02:40:28 PST 2006


Dear Sir,
     I am trying to link two kinds of data: RefSeq records downloaded
from NCBI and Known Genes from UCSC.  I have downloaded two UCSC
tables, kgXref and knownToRefSeq.  Both contain Known Gene and RefSeq
information, but their contents differ.  For example, kgXref contains
32750 records related to RefSeq from NCBI human build 36.1, while
knownToRefSeq contains 33961 such records.  Which of the two is more
accurate, and what causes this difference?
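
The comparison between the two tables can be sketched roughly as follows. This is only an illustration, assuming tab-separated dumps in which kgXref holds the known gene ID in column 0 and the RefSeq accession in column 5, and knownToRefSeq holds them in columns 0 and 1. The column positions, the helper names load_pairs and compare, and the sample IDs are assumptions and should be checked against the actual table schemas.

```python
def load_pairs(lines, kg_col, refseq_col):
    """Collect (known gene ID, RefSeq accession) pairs from tab-separated lines.

    Rows that are too short or have an empty ID or accession are skipped.
    Column positions are assumptions about the table layout.
    """
    pairs = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= max(kg_col, refseq_col):
            continue
        kg_id = fields[kg_col].strip()
        refseq = fields[refseq_col].strip()
        if kg_id and refseq:
            pairs.add((kg_id, refseq))
    return pairs


def compare(xref_pairs, to_refseq_pairs):
    """Split the pairs into those shared by both tables and those unique to each."""
    return {
        "shared": xref_pairs & to_refseq_pairs,
        "only_kgXref": xref_pairs - to_refseq_pairs,
        "only_knownToRefSeq": to_refseq_pairs - xref_pairs,
    }


# Made-up sample rows; real input would be the lines of the downloaded files.
kgxref_lines = [
    "kgA\tmrnaA\t\t\tGENE1\tNM_000001\tNP_000001\tdescription\n",
    "kgB\tmrnaB\t\t\tGENE2\t\t\tdescription\n",  # no RefSeq in kgXref
]
known_to_refseq_lines = [
    "kgA\tNM_000001\n",
    "kgB\tNM_000002\n",
]

result = compare(
    load_pairs(kgxref_lines, 0, 5),
    load_pairs(known_to_refseq_lines, 0, 1),
)
for name, pairs in result.items():
    print(name, len(pairs))
```

Listing the pairs found in only one table may help show whether the extra knownToRefSeq records follow a pattern.
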
     Furthermore, according to "The UCSC Known Genes" (Genome
analysis, published Feb. 24, 2006), if an mRNA has multiple proteins,
the best one is chosen in the order PDB, Swiss-Prot, TrEMBL.
Conversely, if one protein has multiple mRNAs, the best one is chosen
in favor of the longer and newer mRNA with fewer mismatches.  Does
that mean the Known Genes DB contains only one-to-one relationships
between protein and mRNA?  However, in the knownToRefSeq table, the
relationship between known gene and RefSeq is not one-to-one.  About
22747 records show one RefSeq with multiple known genes.  Could you
tell me what causes this?
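
A count of this kind can be obtained by grouping the table on the RefSeq accession. The sketch below is a minimal illustration, assuming each knownToRefSeq row has already been parsed into a (known gene ID, RefSeq accession) pair. The function names and the sample IDs are made up for the example.

```python
from collections import defaultdict


def group_by_refseq(rows):
    """Map each RefSeq accession to the set of known gene IDs linked to it."""
    groups = defaultdict(set)
    for kg_id, refseq in rows:
        groups[refseq].add(kg_id)
    return groups


def multi_mapped(groups):
    """Keep only the RefSeq accessions linked to more than one known gene."""
    return {refseq: ids for refseq, ids in groups.items() if len(ids) > 1}


# Made-up rows standing in for the parsed knownToRefSeq table.
rows = [
    ("kgA", "NM_000001"),
    ("kgB", "NM_000001"),
    ("kgC", "NM_000002"),
]

multi = multi_mapped(group_by_refseq(rows))
records_in_multi = sum(len(ids) for ids in multi.values())
print(len(multi), "RefSeq accessions map to several known genes,")
print(records_in_multi, "records are involved")
```

The two printed numbers differ in meaning: the first counts distinct RefSeq accessions, while the second counts table records, which is the kind of figure quoted above.
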

Thanks for your help; I look forward to your response.

Best Regards,
MengRu



