[Genome] A question about SNPs
Brooke Rhead
rhead at soe.ucsc.edu
Tue May 6 18:05:00 PDT 2008
Hi Shan,
We get all of our SNP data directly from dbSNP, including position
information. Sometimes the different pieces of data for a single SNP
contradict each other, and we try to annotate these cases. For
rs4030808 on chr1, I see this information:
dbSNP: rs4030808
Position: chr1:36267-36266
Genomic Size: 0
Observed: C/T
When the start position is one greater than the end position (with a
genomic size of zero), as in this case, that indicates an insertion
between the two bases. But when a SNP is an insertion, we expect to see
an allele of "-" among the observed alleles from dbSNP. In this case we
do not see a "-" (dbSNP just reports C and T). This is an inconsistency
in dbSNP's data, which we annotate with these notes (on the SNP details
page):
Annotations:
All observed alleles are single-base, but the annotation spans 0 bases.
(UCSC's re-alignment of flanking sequences to the genome may be
informative -- see below.)
UCSC reference allele does not match any observed allele from dbSNP.
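
As a rough illustration of the check described above, here is a minimal Python sketch. It is not the code UCSC uses to build the SNP track; the function names (parse_position, check_snp) are made up for this example. It assumes the position string is in the form shown on the details page, with a 1-based start and an inclusive end, so a zero-length insertion point appears as start = end + 1.

```python
def parse_position(position):
    """Split a position string such as 'chr1:36267-36266' (hypothetical helper)."""
    chrom, span = position.split(":")
    start, end = (int(x) for x in span.split("-"))
    return chrom, start, end


def check_snp(position, observed):
    """Return (genomic_size, notes) for one SNP record (hypothetical helper).

    Assumption: start is 1-based and end is inclusive, as displayed,
    so the genomic size is end - start + 1.
    """
    _, start, end = parse_position(position)
    size = end - start + 1
    alleles = observed.split("/")
    notes = []
    # A zero-length position means an insertion between two bases, so a
    # "-" allele is expected among the observed alleles.
    if size == 0 and "-" not in alleles:
        notes.append("position spans 0 bases but no '-' allele is observed")
    return size, notes


size, notes = check_snp("chr1:36267-36266", "C/T")
print(size, notes)
```

For rs4030808 this reports a size of 0 together with the note, while a record with the same position and observed alleles of -/T would produce no note.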
In cases like this, it is often helpful to look at the alignments shown
on our SNP details pages. We re-align the flanking sequences provided
by dbSNP to the genomic sequence with the idea that it will be easier to
see how the two sequences relate.
--
Brooke Rhead
UCSC Genome Bioinformatics Group
Shan Yang wrote:
> Hi,
>
> I used the snp128 data for my research, and I found many cases where the start of a SNP is greater than its end. For an indel that might make some sense, but many of these cases are single-nucleotide changes, like the one shown here:
> http://genome.ucsc.edu/cgi-bin/hgc?hgsid=107011994&o=36266&t=36266&g=snp128&i=rs4030808&c=chr1&l=36264&r=36268&db=hg18&pix=1200
>
> What is the explanation for this?
>
> My guess is that the dbSNP data came from various sources, some of which use 0-based, half-open regions and some of which use 1-based, closed regions. When you put them on the Genome Browser, you treat them all as 0-based, half-open; then, when you convert them to 1-based, closed regions, you add 1 to the start coordinate and keep the end coordinate. Thus, if start = end in the original coordinates, they become start = end + 1 in the Genome Browser.
>
> I don't know if this is right, but all the cases I've seen have this start = end + 1 problem.
>
> Thanks!
>
> Shan