[Genome] cds table

Galt Barber galt at soe.ucsc.edu
Tue Dec 18 14:48:42 PST 2007


Our cds table uses the same format as GenBank-style CDS records,
which show the span as cdsStart..cdsEnd,
e.g. NM_123456 34..305

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html#BaseSpanB

-------------
Base span of the biological feature indicated to the left, in this case, a
CDS feature. (The CDS feature is described above, and its base span
includes the start and stop codons.) Features can be complete, partial on
the 5' end, partial on the 3' end, and/or on the complementary strand.
Examples:

   1. complete feature is simply written as n..m

      Example:    687..3158
      The feature extends from base 687 through base 3158 in the
      sequence shown

   2. <     indicates partial on the 5' end

      Example:    <1..206
      The feature extends from base 1 through base 206 in the sequence
      shown, and is partial on the 5' end

   3. >     indicates partial on the 3' end

      Example:    4821..5028>
      The feature extends from base 4821 through base 5028 and is
      partial on the 3' end

   4. (complement)  indicates that the feature is on the complementary
      strand

      Example:    complement(3300..4037)
      The feature extends from base 3300 through base 4037 but is
      actually on the complementary strand. It is therefore read in the
      opposite direction on the reverse complement sequence. (For an
      example, see the third CDS feature in the sample record shown on
      this page. In this case, the amino acid translation is generated
      by taking the reverse complement of bases 3300 to 4037 and reading
      that reverse complement sequence in its 5' to 3' direction.)

-----------------
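
If you want to pull the start point out of these strings in a script,
something like the rough Python sketch below may help. It is only an
illustration, not code we ship: the names parse_cds and CdsSpan are made
up here, and it assumes the value is one of the shapes listed in the
question (plain n..m, < or > markers, an outer complement(...), a
join(...), or n/a). Anything else raises an error rather than guessing.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class CdsSpan(NamedTuple):
    """Hypothetical container for one parsed cds string."""
    start: int                       # first base of first segment (1-based)
    end: int                         # last base of last segment (1-based)
    start_partial: bool              # a '<' marker was present
    end_partial: bool                # a '>' marker was present
    complement: bool                 # wrapped in complement(...)
    segments: List[Tuple[int, int]]  # every n..m piece, in written order


# One n..m piece; '>' is accepted before or after the end coordinate
# because both spellings appear in this thread.
_SEGMENT = re.compile(r"(<?)(\d+)\.\.(>?)(\d+)(>?)")


def parse_cds(text: str) -> Optional[CdsSpan]:
    """Parse a cds string; return None for 'n/a' or an empty value."""
    text = text.strip()
    if not text or text.lower() == "n/a":
        return None

    complement = False
    if text.startswith("complement(") and text.endswith(")"):
        complement = True
        text = text[len("complement("):-1]
    if text.startswith("join(") and text.endswith(")"):
        text = text[len("join("):-1]

    segments = []
    start_partial = end_partial = False
    for part in text.split(","):
        match = _SEGMENT.fullmatch(part.strip())
        if match is None:
            raise ValueError("unrecognized cds segment: %r" % part)
        lt, start, gt_before, end, gt_after = match.groups()
        start_partial = start_partial or bool(lt)
        end_partial = end_partial or bool(gt_before or gt_after)
        segments.append((int(start), int(end)))

    return CdsSpan(segments[0][0], segments[-1][1],
                   start_partial, end_partial, complement, segments)
```

For example, parse_cds("join(<1..306,307..1065)") gives start 1, end
1065, start_partial set, and two segments. The coordinates are returned
exactly as written, so for a complement(...) entry you still have to
decide which end counts as the biological start.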

-Galt


On Tue, 18 Dec 2007, Ann Zweig wrote:

> Hello,
>
> 	I suggest that instead of the cds table, you use the all_mrna table to find
> what you're looking for.  The fields you will be interested in will be these:
>
> qName = name of mRNA
>
> strand = + or - for strand. First character query (mRNA), second character
> target (genome)
>
> tName = name of chromosome in genome
> tStart = start location in genome
> tEnd = end location in genome
>
> 	Note that if the mRNA aligns to the - strand of the genome, then the 'start' of
> the mRNA will be found in the tEnd field (not the tStart field).
>
> 	I hope this information is helpful to you.  Please don't hesitate to contact
> the mail list again if you require further assistance.
>
>
> Regards,
>
> ----------
> Ann Zweig
> UCSC Genome Bioinformatics Group
> http://genome.ucsc.edu
>
> Please feel free to search the Genome mailing list archives by visiting our home
> page, clicking on "Contact Us", then typing a word or phrase into the search
> box.  On that same page
> (http://genome.ucsc.edu/contacts.html), you can subscribe to the Genome mailing
> list.
>
>
>
>
> st kumer wrote:
> > Hi,
> > I downloaded the cds table in order to find the start
> > point of the coding region in each mrna.
> > In the field 'name' I see that the format is not
> > always start..end and can be more complex, like:
> > <start..end
> > <start..end>
> > start..>end
> > complement(<start..>end)
> > join(<1..306,307..1065)
> > n/a
> >
> > Where can I find what is the meaning of each format?
> > Thanks!
> >
> >
> > _______________________________________________
> > Genome maillist  -  Genome at soe.ucsc.edu
> > http://www.soe.ucsc.edu/mailman/listinfo/genome
>
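
For the all_mrna route quoted above, the rule about tStart and tEnd can
be written as a small Python helper. This is only a sketch: the function
name is made up, and treating a one-character strand as "target is +"
and a two-character strand as "query then target" is an assumption based
on the field description in the quoted message.

```python
def mrna_start_in_genome(strand: str, t_start: int, t_end: int) -> int:
    """Return the all_mrna field value where the mRNA's start falls.

    strand  -- the strand field, e.g. '+', '-', or two characters
    t_start -- the tStart field
    t_end   -- the tEnd field
    """
    query = strand[0]
    # Assumption: a missing second character means the genome '+' strand.
    target = strand[1] if len(strand) > 1 else "+"
    if query == target:
        # Same orientation: the mRNA starts at the low genomic coordinate.
        return t_start
    # Opposite orientation: the mRNA starts at the high genomic coordinate.
    return t_end
```

The helper returns the field value unchanged. UCSC tables generally
store starts as 0-based and ends as exclusive, so if you need a 1-based
base position you would likely add 1 to a tStart result and use a tEnd
result as is.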

