[Genome] multiple entries for a gene in refFlat table

Anil Jegga Anil.Jegga at cchmc.org
Mon Jun 11 16:35:06 PDT 2007


Hi Archana

This may not be just a BLAT issue. If you look at this region,
there are several sequencing gaps (see the attached pdf). Most probably
there are errors in the contig assembly for this part of the genome
(with several segments repeating), and that could be leading to these
multiple hits. If you BLAT the mRNA sequence, all the high-scoring hits
are on chr15 (identity ranging from 94.7% to 100%) within
chr15:19,027,714-26,577,158 (about 7.5 Mbp). Or are these "real"
segmental duplications?
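
As a quick sanity check on the "about 7.5 Mbp" figure, here is a minimal
Python sketch that parses a browser-style position string, computes the
span, and tests which of the GOLGA8G placements reported in this thread
fall inside it. The helper names (parse_position, contains) are made up
for illustration, and the positions are assumed to be 1-based and
inclusive, as the Genome Browser displays them.

```python
def parse_position(pos):
    """Split 'chrom:start-end' (commas allowed) into (chrom, start, end)."""
    chrom, coords = pos.split(":")
    start, end = (int(x.replace(",", "")) for x in coords.split("-"))
    return chrom, start, end


def contains(region, hit):
    """True if hit lies on the same chromosome and entirely inside region."""
    r_chrom, r_start, r_end = region
    h_chrom, h_start, h_end = hit
    return r_chrom == h_chrom and r_start <= h_start and h_end <= r_end


# Region quoted in this message (assumed 1-based, inclusive).
REGION = parse_position("chr15:19,027,714-26,577,158")
span_bp = REGION[2] - REGION[1] + 1

# Browser positions of the three GOLGA8G hits listed in this thread.
HITS = [
    "chr15_random:258251-271612",
    "chr15:26563798-26577158",
    "chr15:26297405-26310766",
]
inside = [contains(REGION, parse_position(h)) for h in HITS]

print(f"span: {span_bp} bp ({span_bp / 1e6:.2f} Mbp)")
for hit, ok in zip(HITS, inside):
    print(hit, "inside region" if ok else "outside region")
```

The chr15_random placement is on a separate sequence, so it can never
fall inside a chr15 interval regardless of its coordinates.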

Thanks
Anil
Anil Jegga
Assistant Professor
Department of Pediatrics and Division of Biomedical Informatics
Cincinnati Children's Hospital Medical Center and University of
Cincinnati
Tel: (513)-636-0261
Fax: (513)-636-2056
http://anil.cchmc.org 


>>> Archana Thakkapallayil <archanat at soe.ucsc.edu> 06/11/07 7:01 PM >>>
Hello Amit,

When I search for 'GOLGA8G' in the Genome Browser, I get these three
hits:

GOLGA8G at chr15_random:258251-271612 - (NM_001012420) golgi
autoantigen, golgin subfamily a, 8G
GOLGA8G at chr15:26563798-26577158 - (NM_001012420) golgi autoantigen,
golgin subfamily a, 8G
GOLGA8G at chr15:26297405-26310766 - (NM_001012420) golgi autoantigen,
golgin subfamily a, 8G
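
Note that these start positions are one higher than the txStart values
in the refFlat rows quoted further down, while the end positions match.
UCSC database tables store a 0-based start with a 1-based end, whereas
the browser displays 1-based positions. A minimal Python sketch of the
conversion follows; the function name to_browser_position is made up for
illustration, and the rows are copied from the refFlat excerpt in this
thread.

```python
# Rows copied from the refFlat excerpt quoted in this thread:
# (geneName, name, chrom, strand, txStart, txEnd)
REFFLAT_ROWS = [
    ("GOLGA8G", "NM_001012420", "chr15", "+", 26297404, 26310766),
    ("GOLGA8G", "NM_001012420", "chr15", "-", 26563797, 26577158),
    ("GOLGA8G", "NM_001012420", "chr15_random", "+", 258250, 271612),
]


def to_browser_position(chrom, tx_start, tx_end):
    """Format a 0-based-start table interval as a 1-based browser position."""
    return f"{chrom}:{tx_start + 1}-{tx_end}"


positions = [
    to_browser_position(chrom, tx_start, tx_end)
    for _gene, _name, chrom, _strand, tx_start, tx_end in REFFLAT_ROWS
]
for position in positions:
    print(position)
```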

Here is a previously answered mailing list question which is similar to
yours:

http://www.soe.ucsc.edu/pipermail/genome/2007-May/013623.html

Hope this is helpful to you. Please don't hesitate to contact us again
if you require further assistance.

Regards,

Archana
UCSC Genome Bioinformatics Group


Amit U Sinha wrote:
> For human build hg18, the refFlat table has multiple entries for some
> genes. E.g., for gene GOLGA8G (NCBI GeneID: 283768), the following
> entries exist:
>
> geneName        name    chrom   strand  txStart txEnd
> GOLGA8G NM_001012420    chr15   +       26297404        26310766
> GOLGA8G NM_001012420    chr15   -       26563797        26577158
> GOLGA8G NM_001012420    chr15_random    +       258250  271612
>
> 1. What does chr15_random indicate when its physical location is
> known?
> 2. NCBI website shows only a single transcript, why is the transcript
> NM_001012420 repeated twice, with different start positions?
>
> Thanks,
> -- amit
> ____________________________________________________________________
> Amit U Sinha
> Graduate Student
> Univ of Cincinnati
> _______________________________________________
> Genome maillist  -  Genome at soe.ucsc.edu 
> http://www.soe.ucsc.edu/mailman/listinfo/genome 
>   
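
To see how many transcripts share this pattern, one could group refFlat
rows by gene name and accession and keep those with more than one
placement. The Python sketch below is only an illustration: it assumes
tab-separated input whose first six columns follow the order shown in
the excerpt above (geneName, name, chrom, strand, txStart, txEnd), and
the function name multi_placement_transcripts is made up. The sample
text contains only the three rows quoted in this thread.

```python
from collections import defaultdict

# The three rows quoted in this thread, as tab-separated text.
SAMPLE = """\
GOLGA8G\tNM_001012420\tchr15\t+\t26297404\t26310766
GOLGA8G\tNM_001012420\tchr15\t-\t26563797\t26577158
GOLGA8G\tNM_001012420\tchr15_random\t+\t258250\t271612
"""


def multi_placement_transcripts(lines):
    """Map (geneName, name) to its placements, keeping only keys seen more than once."""
    placements = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        gene, name, chrom, strand, tx_start, tx_end = fields[:6]
        placements[(gene, name)].append(
            (chrom, strand, int(tx_start), int(tx_end))
        )
    return {key: rows for key, rows in placements.items() if len(rows) > 1}


result = multi_placement_transcripts(SAMPLE.splitlines())
for (gene, name), rows in result.items():
    print(f"{gene} {name}: {len(rows)} placements")
    for chrom, strand, tx_start, tx_end in rows:
        print(f"  {chrom} {strand} {tx_start} {tx_end}")
```

For a full table, the same function could be fed an open file handle
for a locally downloaded copy of the refFlat dump instead of
SAMPLE.splitlines().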


