[Genome] GO annotation
Brooke Rhead
rhead at soe.ucsc.edu
Thu Jul 12 21:26:56 PDT 2007
Hi Jay,
When there is an "n/a" in the results list in the hg18.kgXref.geneSymbol
field, it means there is no hg18 UCSC Genes 'geneSymbol' associated with
that GO ID. That is, not every protein with a GO ID in the goaPart
table is associated with an annotated gene that is part of the UCSC
Genes set.
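
To illustrate why "n/a" shows up, here is a minimal Python sketch using an in-memory SQLite database. It assumes the Table Browser link between the two tables behaves like a left join on goaPart.dbObjectId = kgXref.spID. The table layouts are simplified, and the accessions ACC_1 and ACC_2 are made-up placeholders, not real UniProt IDs.

```python
import sqlite3

# Toy stand-ins for go.goaPart and hg18.kgXref (simplified columns).
# ACC_1 and ACC_2 are placeholder accessions invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE goaPart (dbObjectId TEXT, goId TEXT, aspect TEXT);
CREATE TABLE kgXref (spID TEXT, geneSymbol TEXT);
INSERT INTO goaPart VALUES
  ('ACC_1', 'GO:0005515', 'F'),
  ('ACC_2', 'GO:0006915', 'P');
-- Only ACC_1 has a matching UCSC Genes entry.
INSERT INTO kgXref VALUES ('ACC_1', 'ITGB1BP1');
""")

# A left join keeps every goaPart row; rows with no kgXref match get 'n/a'.
query = """
SELECT g.goId, g.aspect, g.dbObjectId, COALESCE(k.geneSymbol, 'n/a')
FROM goaPart AS g
LEFT JOIN kgXref AS k ON k.spID = g.dbObjectId
ORDER BY g.goId
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print("\t".join(row))
```

The second row prints with "n/a" because no kgXref row carries the accession ACC_2, which mirrors the situation described above.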
If you also include the field go.goaPart.dbObjectId in your Table
Browser query, you will at least get the UniProt accession number
associated with each GO ID. You can look up more information on UniProt
accessions in our Proteome Browser, here:
http://genome.ucsc.edu/cgi-bin/pbGateway
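
If you save the Table Browser output to a file, a short script can pull out the accessions for the unmatched rows. This is only a sketch: it assumes tab-separated output with a "#"-prefixed header line and the three field names shown. The function name unmatched_accessions and the ACC_* accessions are hypothetical.

```python
import csv
import io


def unmatched_accessions(handle):
    """Map each GO ID whose gene symbol is 'n/a' to its sorted accessions."""
    reader = csv.reader(handle, delimiter="\t")
    # Assumed header format: field names, with a leading '#' on the first one.
    header = [name.lstrip("#") for name in next(reader)]
    go_col = header.index("go.goaPart.goId")
    acc_col = header.index("go.goaPart.dbObjectId")
    sym_col = header.index("hg18.kgXref.geneSymbol")

    found = {}
    for row in reader:
        if len(row) < len(header):
            continue  # skip blank or truncated lines
        if row[sym_col] == "n/a":
            found.setdefault(row[go_col], set()).add(row[acc_col])
    return {go_id: sorted(accs) for go_id, accs in found.items()}


# Placeholder data standing in for a saved Table Browser result.
sample = (
    "#go.goaPart.goId\tgo.goaPart.dbObjectId\thg18.kgXref.geneSymbol\n"
    "GO:0006915\tACC_2\tn/a\n"
    "GO:0005515\tACC_1\tITGB1BP1\n"
    "GO:0006915\tACC_3\tn/a\n"
)
unmatched = unmatched_accessions(io.StringIO(sample))
print(unmatched)
```

To run it on a real file, pass an open file handle instead of the io.StringIO sample.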
You might also be interested in two outside tools for working with GO IDs:
AmiGO: http://amigo.geneontology.org/cgi-bin/amigo/go.cgi
QuickGO: http://www.ebi.ac.uk/ego/
I hope this information helps. Good luck with your work.
--
Brooke Rhead
UCSC Genome Bioinformatics Group
Jay an wrote:
> Hi Brooke,
>
> I got a table of GO:XXX IDs and gene symbols, but many of the gene
> symbols are "n/a". What does that mean?
>
> sample:
> #go.goaPart.goId go.goaPart.aspect go.term.id hg18.kgXref.geneSymbol
> GO:0006915 P 4835 n/a
> GO:0007050 P 4966 n/a
> GO:0008284 P 5785 n/a
> GO:0005515 F 3520 ITGB1BP1
> GO:0007155 P 5063 ITGB1BP1
> GO:0007160 P 5068 ITGB1BP1
> GO:0007243 P 5149 ITGB1BP1
> GO:0008022 F 5544 ITGB1BP1
> GO:0016020 C 8812 ITGB1BP1
> GO:0005643 C 3640 n/a
> GO:0008150 P 5662 n/a
>
> Jay
>
>
> Brooke Rhead <rhead at soe.ucsc.edu> wrote:
>
> Hi Jay,
>
> The go.goaPart.dbObjectId field is the UniProt accession number. (In the
> case of A0A000, though, the species is not human but Streptomyces
> ghanaensis [TaxID: 35758].)
>
> The go database contains dbObjectIds for all assemblies, not just hg18.
> However, it is possible to distinguish species in the goaPart table, as
> the species name is included as part of the dbObjectSymbol field. Here
> are some examples where "HUMAN" is included:
>
> mysql> select * from goaPart where dbObjectSymbol like '%_HUMAN' limit 5;
> +------------+----------------+-------+------------+--------+
> | dbObjectId | dbObjectSymbol | notId | goId       | aspect |
> +------------+----------------+-------+------------+--------+
> | A0A184     | A0A184_HUMAN   |       | GO:0005764 | C      |
> | A0A184     | A0A184_HUMAN   |       | GO:0006629 | P      |
> | A0A184     | A0A184_HUMAN   |       | GO:0006665 | P      |
> | A0A1K6     | A0A1K6_HUMAN   |       | GO:0004222 | F      |
> | A0A1K6     | A0A1K6_HUMAN   |       | GO:0006508 | P      |
> +------------+----------------+-------+------------+--------+
> 5 rows in set (0.00 sec)
>
> You can use the Table Browser to filter the goaPart table so that only
> human UniProt IDs are shown. To do this, hit the filter "create"
> button. In the free-form query box enter the text:
>
> dbObjectSymbol like '%HUMAN'
>
> and hit "submit". The output should be limited to only the UniProt
> symbols with "HUMAN" in the name.
>
> The filtered goaPart table may be all you need to map GO accessions to
> genes. But, as you have noticed, the goaPart table is linked to the
> hg18 kgXref table, too. If you would like to get the gene names
> corresponding to GO accessions from kgXref, you can do that, too, with
> the Table Browser:
>
> 1. You will likely want to leave the filter on the goaPart table in
> place (dbObjectSymbol like '%HUMAN').
>
> 2. Select the option for "output format: selected fields from primary
> and related tables" and hit "get output".
>
> 3. On the next screen, under the "Linked Tables" heading, select the box
> for the hg18 kgXref table. Scroll to the bottom of the page and hit
> "Allow selection from checked tables".
>
> 4. You should now see a section at the top of the page called
> "hg18.kgXref fields", where you can select any of the identifiers from
> the kgXref table (like gene symbol).
>
> 5. Hit "get output". You should get a list of GO identifiers with
> associated gene names from kgXref. Keep in mind that not every GO ID
> will be associated with a gene in the kgXref table.
>
> I hope this information helps.
>
> --
> Brooke Rhead
> UCSC Genome Bioinformatics Group
>
>
> Jay an wrote:
> > thanks Brooke,
> >
> > I followed your instructions, but I got the output below:
> >
> > #dbObjectId dbObjectSymbol notId goId aspect
> > A0A000 A0A000_9ACTO GO:0003870 F
> > A0A000 A0A000_9ACTO GO:0006783 P
> > A0A000 A0A000_9ACTO GO:0009058 P
> >
> > There is no proteinID.
> > I found "hg18.kgXref.spID (via goaPart.dbObjectId)".
> > How can I use "via goaPart.dbObjectId"?
> >
> > thank you
> > Jay
> >
> >
> >
> > Brooke Rhead wrote:
> >
> > Hello Jay,
> >
> > The GO accessions are linked to genes (that is, protein IDs) in the
> > table 'goaPart', which resides in our 'go' database.
> >
> > You can get to this table in the Table Browser by selecting "group: all
> > tables" and "database: go", then selecting "table: go.goaPart".
> >
> > I hope this information helps. If you have further questions, please
> > feel free to write back to this list.
> >
> > --
> > Brooke Rhead
> > UCSC Genome Bioinformatics Group
> >
> >
> > Jay an wrote:
> > > hello,
> > >
> > > Every GO:XXXXX has related genes. Can you tell me how to get a matrix
> > > (GO:XXXXX and genes)?
> > >
> > >
> > > thanks