[Genome] discrepancy

Brooke Rhead rhead at soe.ucsc.edu
Wed May 30 19:49:19 PDT 2007


Hi Gabriel,

The refSeqAli table contains results of the alignment (by BLAT) of 
RefSeq mRNAs to the genome.  These alignments are then used to make the 
refGene table.  When gene prediction tracks (like RefSeq Genes) are 
made, small gaps in the alignments are closed.  The gaps can be due to 
mRNA sequence errors or, more interestingly, polymorphisms between the 
reference sequence and the RefSeq mRNA.  In this particular case, the 
first two refSeqAli blocks (tStarts 576068 and 576262) are separated by 
a gap of only 3 bases in the genome, in the 3' UTR, so refGene merges 
them into a single exon (576068-576684).
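
As a rough illustration of the gap closing (a sketch only, not the code 
the track-building pipeline actually runs), the Python below merges 
adjacent alignment blocks whose genomic gap is at most MAX_GAP bases.  
MAX_GAP = 5 and the function names are assumptions made for this 
example; the blockSizes and tStarts values are copied from the 
refSeqAli row for NM_178099.

```python
# Sketch: merge PSL alignment blocks separated by small genomic gaps.
# MAX_GAP is an assumed threshold for illustration only; it is not
# taken from the UCSC pipeline.
MAX_GAP = 5


def parse_list(field):
    """Turn a comma-terminated list such as '1,2,3,' into [1, 2, 3]."""
    return [int(x) for x in field.split(",") if x]


def close_small_gaps(t_starts, block_sizes, max_gap=MAX_GAP):
    """Return (exon_starts, exon_ends) after merging neighbouring blocks
    whose gap on the genome is at most max_gap bases."""
    exon_starts, exon_ends = [], []
    for start, size in zip(t_starts, block_sizes):
        end = start + size
        if exon_ends and start - exon_ends[-1] <= max_gap:
            # Small gap: extend the previous exon instead of starting a new one.
            exon_ends[-1] = end
        else:
            exon_starts.append(start)
            exon_ends.append(end)
    return exon_starts, exon_ends


# blockSizes and tStarts from the refSeqAli row for NM_178099.
block_sizes = parse_list(
    "191,422,92,102,131,146,124,155,169,151,134,179,193,135,309,269,"
    "118,135,114,201,60,114,80,"
)
t_starts = parse_list(
    "576068,576262,581925,582111,582354,582591,582812,583023,583253,"
    "583513,583741,583956,584223,584526,586099,586489,588542,588746,"
    "588969,589300,590688,591069,591316,"
)

starts, ends = close_small_gaps(t_starts, block_sizes)
print(len(t_starts), "blocks ->", len(starts), "exons")
print("first exon:", starts[0], ends[0])
```

The first block ends at 576068 + 191 = 576259 and the second starts at 
576262, so the two are joined; every other gap is an intron of 75 bases 
or more and is left alone.  The result is the 22 exonStarts and 
exonEnds listed in the refGene row.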

I hope this explanation helps.  Please let us know if you have further 
questions by writing back to this list.

--
Brooke Rhead
UCSC Genome Bioinformatics Group

We invite you to give us your feedback on the UCSC Genome Browser
through May 31, 2007: http://www.surveymonkey.com/s.asp?U=881163743177



Gabriel Renaud wrote:
> Hi,
>     I am wondering why there is a discrepancy between the blocks in 
> refSeqAli and the exons in refGene.
> 
> For instance, using the Zebrafish 2006 build with the following 
> coordinate: chr1:576,069-591,396
> 
> Using refGene you get:
> 
> #bin	name	chrom	strand	txStart	txEnd	cdsStart	cdsEnd	exonCount	exonStarts	exonEnds	id	name2	cdsStartStat	cdsEndStat	exonFrames
> 589	NM_178099	chr1	-	576068	591396	576655	591328	22	576068,581925,582111,582354,582591,582812,583023,583253,583513,583741,583956,584223,584526,586099,586489,588542,588746,588969,589300,590688,591069,591316,	576684,582017,582213,582485,582737,582936,583178,583422,583664,583875,584135,584416,584661,586408,586758,588660,588881,589083,589501,590748,591183,591396,	0	atp1a1a.5	cmpl	cmpl	1,2,2,0,1,0,1,0,2,0,1,0,0,0,1,0,0,0,0,0,0,0,
> 
> using refSeqAli:
> 
> #bin	matches	misMatches	repMatches	nCount	qNumInsert	qBaseInsert	tNumInsert	tBaseInsert	strand	qName	qSize	qStart	qEnd	tName	tSize	tStart	tEnd	blockCount	blockSizes	qStarts	tStarts
> 589	3422	54	247	1	1	2	22	11604	-	NM_178099	3838	5	3731	chr1	70589895	576068	591396	23	191,422,92,102,131,146,124,155,169,151,134,179,193,135,309,269,118,135,114,201,60,114,80,	107,300,722,814,916,1047,1193,1317,1472,1641,1792,1926,2105,2298,2433,2742,3011,3129,3264,3378,3579,3639,3753,	576068,576262,581925,582111,582354,582591,582812,583023,583253,583513,583741,583956,584223,584526,586099,586489,588542,588746,588969,589300,590688,591069,591316,
> 
> 
> If you look at the exon starts, you see that refSeqAli contains 23 
> blocks but refGene contains only 22 of them:
> 
> refGene:  576068,       581925,582111,582354,582591,582812,583023,583253,583513,583741,583956,584223,584526,586099,586489,588542,588746,588969,589300,590688,591069,591316,
> refSeqAli:576068,576262,581925,582111,582354,582591,582812,583023,583253,583513,583741,583956,584223,584526,586099,586489,588542,588746,588969,589300,590688,591069,591316,
> 
> 
> You see the block corresponding to 576262 is missing in refGene. Why is 
> that?
> 
> Thank you,
> 
> 

