[Genome] C. elegans refGene table and sangerGene table
Hiram Clawson
hiram at soe.ucsc.edu
Thu Nov 2 12:08:08 PST 2006
Good Morning Andrew:
The Genbank refSeq tracks on the C. elegans genome are simply
alignments of the refSeq sequences to the genome. If you take the sequence
for NM_061506, you will find that blat matches it at almost 100% identity
to locations on chrII, chrIV and chrV. Our daily Genbank alignments
are produced by running blat against the genome. Every alignment that passes
our criteria is marked on the genome, so a single accession can appear at
more than one location.
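To check this in a downloaded copy of the table, the rows can be grouped by accession. What follows is only a rough sketch: it assumes a tab-separated dump, and the column positions (name_col, chrom_col) are placeholders that should be checked against the table's schema. The function name and the sample rows are made up for illustration; the sample only mirrors the row counts reported for NM_061506.

```python
import csv
import io
from collections import defaultdict


def loci_by_accession(lines, name_col, chrom_col):
    """Map each accession to the chromosomes of its rows (one entry per row)."""
    hits = defaultdict(list)
    for row in csv.reader(lines, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        hits[row[name_col]].append(row[chrom_col])
    return hits


# Hypothetical two-column sample (name, chrom), NOT real table contents.
# A real dump has more columns; adjust name_col/chrom_col accordingly.
sample = io.StringIO(
    "NM_061506\tchrII\n"
    "NM_061506\tchrII\n"
    "NM_061506\tchrIV\n"
    "NM_061506\tchrIV\n"
    "NM_061506\tchrV\n"
    "NM_EXAMPLE\tchrI\n"  # placeholder accession for contrast
)

hits = loci_by_accession(sample, name_col=0, chrom_col=1)

# Report accessions whose alignments fall on more than one chromosome.
multi = {acc: sorted(set(chroms)) for acc, chroms in hits.items()
         if len(set(chroms)) > 1}
for acc, chroms in multi.items():
    print(acc, len(hits[acc]), "rows on", ", ".join(chroms))
```

For a real file, pass an open file handle in place of `sample`.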
As for your query about the discrepancy between the Genbank refGene alignment
tables and the sangerToRefSeq table: this is a matter of coordination between
updates in Genbank and our original build of the C. elegans browser.
Our sangerToRefSeq table was created 07 June 2004 with the refSeq data
available at that time. The newer refGene tables are rebuilt daily from
current refSeq data. Over time the contents of these two tables will diverge
as Genbank incorporates new data.
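One way to see how far the two tables have drifted apart is to compare the sets of accessions they contain. This is a minimal sketch, assuming the accessions have already been read out of each file. The function name is hypothetical, and the sample values are just the ZK686 identifiers from the quoted question, not a dump of either table.

```python
def accessions_only_in_one(ref_gene_names, sanger_to_refseq_pairs):
    """Return (only_in_refGene, only_in_sangerToRefSeq) as two sets.

    ref_gene_names: iterable of refSeq accessions from the refGene table.
    sanger_to_refseq_pairs: iterable of (sanger_name, refseq_accession).
    """
    ref = set(ref_gene_names)
    mapped = {acc for _sanger, acc in sanger_to_refseq_pairs}
    return ref - mapped, mapped - ref


# Illustrative values taken from the question, not full table contents.
ref_gene_names = ["NM_066289", "NM_001027859"]
sanger_pairs = [("ZK686.5", "NM_066289")]

only_ref, only_mapped = accessions_only_in_one(ref_gene_names, sanger_pairs)
print("in refGene only:", sorted(only_ref))
print("in sangerToRefSeq only:", sorted(only_mapped))
```

Accessions that show up only on the refGene side are ones Genbank has added or changed since the sangerToRefSeq table was built.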
--Hiram
Andrew Kwon wrote:
> From the C. elegans refGene table, searching for rows with name =
> 'NM_061506' returns 5 rows, 2 of which are from chrII, 2 from chrIV, and one
> from chrV. Searching for 'NM_061506' in sangerToRefSeq table returns 4
> rows, including K02E7.3, K02B7.1, W03G1.4, R09E12.6.
>
> I don't understand why this is happening. How can the same refseq id refer
> to genes from different chromosomes?
>
> Andrew T. Kwon
> Wasserman Lab
> CMMT UBC
> I downloaded the C. elegans refGene, sangerGene and sangerToRefSeq tables
> from UCSC ftp site. While going through the records, I ran into trouble with
> some of the records. More specifically, when you view the C. elegans genome
> browser, ZK686.2 is given the refseq id NM_066289, and ZK686.5 is given
> NM_001027859. However, from the tables I downloaded, ZK686.5 is associated
> with NM_066289 in sangerToRefSeq table. NM_001027859 is present in refGene
> table, but not in sangerToRefSeq table. Is this a faulty annotation, or am
> I making a mistake somewhere?