[Genome] discrepancy
Brooke Rhead
rhead at soe.ucsc.edu
Wed May 30 19:49:19 PDT 2007
Hi Gabriel,
The refSeqAli table contains results of the alignment (by BLAT) of
RefSeq mRNAs to the genome. These alignments are then used to make the
refGene table. When gene prediction tracks (like RefSeq Genes) are
made, small gaps in the alignments are closed. The gaps can be due to mRNA
sequence errors or, more interestingly, polymorphisms between the
reference sequence and the RefSeq mRNA. In this particular case, the
first two refSeqAli blocks are separated by a gap of only 3 bases in the
genome (576259 to 576262), in the 3' UTR, so they appear as a single
exon (576068-576684) in refGene.
I hope this explanation helps. Please let us know if you have further
questions by writing back to this list.
--
Brooke Rhead
UCSC Genome Bioinformatics Group
Gabriel Renaud wrote:
> Hi,
> I am wondering why there is a discrepancy between the blocks in
> refSeqAli and the exons in refGene.
>
> For instance, using the Zebrafish 2006 build with the following
> coordinate: chr1:576,069-591,396
>
> Using refGene you get:
>
> #bin name chrom strand txStart txEnd cdsStart cdsEnd exonCount exonStarts exonEnds id name2 cdsStartStat cdsEndStat exonFrames
> 589 NM_178099 chr1 - 576068 591396 576655 591328 22 576068,581925,582111,582354,582591,582812,583023,583253,583513,583741,583956,584223,584526,586099,586489,588542,588746,588969,589300,590688,591069,591316, 576684,582017,582213,582485,582737,582936,583178,583422,583664,583875,584135,584416,584661,586408,586758,588660,588881,589083,589501,590748,591183,591396, 0 atp1a1a.5 cmpl cmpl 1,2,2,0,1,0,1,0,2,0,1,0,0,0,1,0,0,0,0,0,0,0,
>
> using refSeqAli:
>
> #bin matches misMatches repMatches nCount qNumInsert qBaseInsert tNumInsert tBaseInsert strand qName qSize qStart qEnd tName tSize tStart tEnd blockCount blockSizes qStarts tStarts
> 589 3422 54 247 1 1 2 22 11604 - NM_178099 3838 5 3731 chr1 70589895 576068 591396 23 191,422,92,102,131,146,124,155,169,151,134,179,193,135,309,269,118,135,114,201,60,114,80, 107,300,722,814,916,1047,1193,1317,1472,1641,1792,1926,2105,2298,2433,2742,3011,3129,3264,3378,3579,3639,3753, 576068,576262,581925,582111,582354,582591,582812,583023,583253,583513,583741,583956,584223,584526,586099,586489,588542,588746,588969,589300,590688,591069,591316,
>
>
> If you look at the exon starts, you see that refSeqAli contains 23
> blocks but refGene contains only 22 of them:
>
> refGene: 576068, 581925,582111,582354,582591,582812,583023,583253,583513,583741,583956,584223,584526,586099,586489,588542,588746,588969,589300,590688,591069,591316,
> refSeqAli:576068,576262,581925,582111,582354,582591,582812,583023,583253,583513,583741,583956,584223,584526,586099,586489,588542,588746,588969,589300,590688,591069,591316,
>
>
> As you can see, the block starting at 576262 is missing from refGene.
> Why is that?
>
> Thank you,
>
>