[Genome] [Fwd: question UCSC Tables]
Ann Zweig
ann at soe.ucsc.edu
Tue Dec 11 12:00:26 PST 2007
Hello Katleen,
By slightly changing the order of your queries in the Table Browser, I think
you will be able to get what you want. I will explain by using an example:
Let's say this is your set of HUGO (HGNC) geneSymbols:
ATP2B3
ABCD1
CD99L2
GABRA3
Enter that set into the 'paste list' section of the Table Browser as before.
However, this time, choose the kgXref table (found in the 'table' drop-down list
of the UCSC Genes 'track' section of the Table Browser).
For 'output format' choose "selected fields from primary and related tables" as
before. To set the 'select fields' page up, start by choosing the geneSymbol
field of the hg18.kgXref table. Then choose the knownGene table from the list
of linked tables, and select the following fields:
chrom, txStart, txEnd (and whatever else you want)
Submit your query (get output). The output, in this example, looks like this:
#hg18.kgXref.geneSymbol hg18.knownGene.chrom hg18.knownGene.txStart hg18.knownGene.txEnd
ATP2B3 chrX 152454773 152501581
ATP2B3 chrX 152454773 152501581
ATP2B3 chrX 152480786 152501581
ABCD1 chrX 152643529 152663374
ABCD1 chrX 152655356 152663374
ABCD1 chrX 152661852 152663374
CD99L2 chrX 149685466 149817837
CD99L2 chrX 149685466 149817837
CD99L2 chrX 149685466 149817837
GABRA3 chrX 151087185 151370486
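The first column contains the HGNC geneSymbols; the remaining columns give the
chromosome and the transcription start and end positions. Each row is one UCSC
Genes transcript, so a symbol with several isoforms appears on several rows.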
You will be able to clearly see which of your symbols did not match, as they
will not be listed.
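
If the list is long, the comparison can be scripted. The following is only a
minimal sketch, assuming the Table Browser output has been saved as plain text
with the geneSymbol in the first column. The function name unmatched_symbols
and the extra symbol NOT_A_GENE are hypothetical, added for illustration.

```python
def unmatched_symbols(pasted, output_text):
    """Return the pasted symbols that never appear in column 1 of the output."""
    matched = set()
    for line in output_text.splitlines():
        line = line.strip()
        # Skip blank lines and the '#'-prefixed header line.
        if not line or line.startswith("#"):
            continue
        # Splitting on any whitespace handles tab- or space-separated output.
        matched.add(line.split()[0])
    # Preserve the order in which the symbols were pasted.
    return [symbol for symbol in pasted if symbol not in matched]


# NOT_A_GENE is a made-up symbol standing in for an identifier with no match.
pasted = ["ATP2B3", "ABCD1", "CD99L2", "GABRA3", "NOT_A_GENE"]

# Rows copied from the example output above (duplicates omitted).
output_text = """\
#hg18.kgXref.geneSymbol hg18.knownGene.chrom hg18.knownGene.txStart hg18.knownGene.txEnd
ATP2B3 chrX 152454773 152501581
ATP2B3 chrX 152480786 152501581
ABCD1 chrX 152643529 152663374
CD99L2 chrX 149685466 149817837
GABRA3 chrX 151087185 151370486
"""

missing = unmatched_symbols(pasted, output_text)
print(missing)
```

Note that the comparison is case-sensitive and exact, so a symbol typed with
different capitalization would be reported as unmatched.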
If you would like only the canonical gene for each of your geneSymbols (instead
of all of the isoforms), you will need to include the knownIsoforms and
knownCanonical tables in your Table Browser query.
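
To illustrate what restricting to the canonical gene does, here is a rough
sketch that builds toy copies of the tables in an in-memory SQLite database.
It is not the Table Browser's own method. The column names (kgXref.kgID,
knownGene.name, knownCanonical.transcript) are my assumption about the hg18
schema, the transcript IDs tx1 to tx3 are placeholders, and which transcript is
marked canonical here is an arbitrary choice. For simplicity the sketch joins
knownCanonical directly and leaves out knownIsoforms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Toy tables holding only the columns this sketch needs (assumed names).
conn.executescript("""
CREATE TABLE kgXref (kgID TEXT, geneSymbol TEXT);
CREATE TABLE knownGene (name TEXT, chrom TEXT, txStart INTEGER, txEnd INTEGER);
CREATE TABLE knownCanonical (transcript TEXT);
""")

# Placeholder transcript IDs; coordinates are taken from the example output.
conn.executemany("INSERT INTO kgXref VALUES (?, ?)", [
    ("tx1", "ATP2B3"),
    ("tx2", "ATP2B3"),
    ("tx3", "GABRA3"),
])
conn.executemany("INSERT INTO knownGene VALUES (?, ?, ?, ?)", [
    ("tx1", "chrX", 152454773, 152501581),
    ("tx2", "chrX", 152480786, 152501581),
    ("tx3", "chrX", 151087185, 151370486),
])
# Arbitrary choice for illustration: tx1 and tx3 are treated as canonical.
conn.executemany("INSERT INTO knownCanonical VALUES (?)", [("tx1",), ("tx3",)])

# Keep only transcripts that are listed in knownCanonical.
query = """
SELECT x.geneSymbol, g.chrom, g.txStart, g.txEnd
FROM kgXref AS x
JOIN knownGene AS g ON g.name = x.kgID
JOIN knownCanonical AS c ON c.transcript = g.name
WHERE x.geneSymbol IN (?, ?)
ORDER BY x.geneSymbol
"""
rows = conn.execute(query, ("ATP2B3", "GABRA3")).fetchall()
for row in rows:
    print(*row)
```

With this toy data, ATP2B3 comes back as one row instead of two, because only
one of its transcripts is listed in knownCanonical.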
Regards,
----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu
Please feel free to search the Genome mailing list archives by visiting our home
page, clicking on "Contact Us", then typing a word or phrase into the search
box. On that same page (http://genome.ucsc.edu/contacts.html), you can subscribe
to the Genome mailing list.
Katleen De Preter wrote:
> Hello,
>
> If I have understood your answer correctly, it is not possible to get
> the original search terms in the output table. As this is a mixture of
> HUGO and Alias and some other types of identifiers, it would be very
> helpful to check for which of the input terms Genome Browser has found a
> match...
>
> Best regards,
>
> Katleen De Preter
>
>
> Brooke Rhead wrote:
>> Hello Katleen,
>>
>> Are you by any chance using the "UCSC Genes" track or the "RefSeq
>> Genes" track? Both of these tracks have some extra functionality in
>> the Table Browser that allows you to use an identifier that is NOT the
>> main identifier in the table you have selected.
>>
>> To see what I am referring to, look at the text at the top of the page
>> when you hit the "paste list" button. For UCSC Genes, the text at the
>> top is:
>>
>> "Please paste in the identifiers you want to include. The items must
>> be values of the name field of the currently selected table,
>> knownGene, or the alias field of the alias table kgAlias."
>>
>> What is happening when you paste in HUGO identifiers is that the Table
>> Browser is going through the kgAlias table to make selections from the
>> knownGene table, but only the fields from the knownGene table are
>> returned.
>>
>> To include information from the kgAlias table in your output, choose
>> the output format "selected fields from primary and related tables".
>> Now hit "get output", and then from the Linked Tables section, check
>> the box for the kgAlias table and hit the "allow selection from
>> checked tables" button at the bottom of the page. Be sure the
>> kgAlias.alias field is selected (as well as the fields you want to
>> retrieve from knownGene), and then hit "get output".
>>
>> You should see a new column on the end of your output that contains
>> the alias field from kgAlias. Note that since one UCSC Gene ID
>> generally corresponds to several aliases, there are several names
>> listed in that column. The identifier you originally entered should
>> be included in the list.
>>
>> I hope this information is helpful. If you have further questions,
>> please do not hesitate to contact us again. However, please send
>> future questions to genome at soe.ucsc.edu, our moderated forum for user
>> questions and support. (Note that this is a public mailing list, see
>> http://genome.ucsc.edu/contacts.html for details.)
>>
>> --
>> Brooke Rhead
>> UCSC Genome Bioinformatics Group
>>
>>
>>> -------- Original Message --------
>>> Subject: question UCSC Tables
>>> Date: Mon, 10 Dec 2007 17:29:16 +0100
>>> From: Katleen De Preter <Katleen.DePreter at ugent.be>
>>> To: cbseweb at cbse.ucsc.edu
>>>
>>>
>>>
>>> Dear Mr/Mrs,
>>>
>>> I would like to search the positions of a list of gene symbols (HUGO
>>> names and aliases). When I perform a search using the Tables
>>> function, I get a large list of results. However, in this results
>>> file, I cannot find the original Gene Symbols I have searched for.
>>> How can I also obtain the original list in the output file?
>>>
>>> Thank you in advance,
>>> Best regards,
>>>
>>> Katleen De Preter
>