[Genome] annotation from CDS and mRNA coordinates
Rachel Harte
hartera at soe.ucsc.edu
Tue Oct 3 12:29:45 PDT 2006
Dear Ram Vinay,
There are several ways that you can search using gene symbol or Entrez
Gene ID in the Table Browser.
1) Select the assembly, then the track and table of interest. For the
region, select position, type the gene symbol into the box and press
lookup. This lets you choose the region that the gene aligns to in the
RefSeq, Known Genes or mRNA tracks. You can then select information from
tables in just this region.
2) If you are looking at the Known Genes track, you could also use the
filter to look for a single gene symbol. Select the Known Genes track,
then kgXref as the table. Select genome as the region, then press the
create button next to filter. Type a gene symbol into the box for the
geneSymbol field, then press submit. Alternatively, a short list of gene
symbols could be used by writing a MySQL query in the free-form query
box.
If you would like to get position data for these genes then you will need
to do a "join" to another table. So for the output format, choose
"selected fields from primary and related tables". Then you can select
fields from the kgXref table and related positional tables such as
knownGene and refSeqAli. Select table(s) and press the "Allow Selection
from Checked Tables" button at the bottom of the page and then you can
select the fields from each table that you want in your output.
3) In the RefSeq Genes track, the refLink table has a locusLinkId field,
which holds the Entrez Gene ID (formerly the LocusLink ID). You can filter
on this field in the same way as in (2) above. The same applies to the
name field of the refLink table, which contains the gene name.
4) A batch search may be done using gene names or identifiers, but only
when the identifier is in the primary field of the table. For the kgXref
table, this is the accession. However, if you select the RefSeq Genes
track and then refLink as the table, the primary field is the name field
of refLink, which contains gene names.
If you click on the "paste list" button next to the identifiers
(names/accessions) label, then you can paste in a list of gene symbols,
one per line. This is case-sensitive. Then press submit.
To link the output to positional tables then select the output format to
be "selected fields from primary and related tables" and choose related
tables and fields as in (2) above.
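
To make the "join" idea concrete, here is a minimal sketch in Python. It
does not talk to the Table Browser or to any UCSC server. It builds two
toy tables in an in-memory SQLite database, named after refLink and
refGene, and links them on refLink.mrnaAcc = refGene.name. Only a subset
of the real columns is included, the rows are invented for illustration,
and the positions_for helper is hypothetical.

```python
import sqlite3

# Toy stand-ins for two UCSC tables (subset of columns, made-up rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE refLink (name TEXT, mrnaAcc TEXT, locusLinkId INTEGER);
CREATE TABLE refGene (name TEXT, chrom TEXT, strand TEXT,
                      txStart INTEGER, txEnd INTEGER,
                      cdsStart INTEGER, cdsEnd INTEGER);
INSERT INTO refLink VALUES ('GENE1', 'NM_000001', 1001);
INSERT INTO refLink VALUES ('GENE2', 'NM_000002', 1002);
INSERT INTO refGene VALUES ('NM_000001', 'chr1', '+', 100, 900, 150, 850);
INSERT INTO refGene VALUES ('NM_000002', 'chr2', '-', 2000, 5000, 2100, 4900);
""")


def positions_for(column, values):
    """Return transcript and CDS coordinates for refLink rows whose
    `column` (name or locusLinkId) matches one of `values`."""
    if column not in ("name", "locusLinkId"):
        raise ValueError("can only filter on name or locusLinkId")
    values = list(values)
    if not values:
        return []
    marks = ", ".join("?" for _ in values)
    sql = (
        "SELECT l.name, l.locusLinkId, g.name, g.chrom, g.strand, "
        "g.txStart, g.txEnd, g.cdsStart, g.cdsEnd "
        "FROM refLink AS l JOIN refGene AS g ON g.name = l.mrnaAcc "
        f"WHERE l.{column} IN ({marks}) "
        "ORDER BY g.chrom, g.txStart"
    )
    return conn.execute(sql, values).fetchall()


# Look up by gene name, then by Entrez Gene ID (locusLinkId).
for row in positions_for("name", ["GENE1"]):
    print(row)
for row in positions_for("locusLinkId", [1002]):
    print(row)
```

As in the paste-list search, the name comparison here is case-sensitive,
so 'gene1' would not match 'GENE1'.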
To look for a short list of symbols or IDs, a free-form MySQL query
could be entered in the free-form query box on the filter page.
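
As a rough illustration, the Python sketch below tidies a pasted list of
symbols (one per line, case preserved) and builds an OR-chained condition
that could be tried in the free-form query box. The helper names are
hypothetical, and the exact syntax that the box accepts is an assumption
here, so please check the generated text against the filter page before
relying on it.

```python
def clean_symbols(raw):
    """Split pasted text into unique symbols, one per line, keeping
    the original case and order."""
    seen = set()
    out = []
    for line in raw.splitlines():
        sym = line.strip()
        if sym and sym not in seen:
            seen.add(sym)
            out.append(sym)
    return out


def filter_clause(field, symbols):
    """Build an OR-chained equality test on `field`, with single
    quotes inside a value doubled."""
    parts = []
    for sym in symbols:
        quoted = "'" + sym.replace("'", "''") + "'"
        parts.append(f"{field} = {quoted}")
    return " OR ".join(parts)


pasted = " TP53\nBRCA1\n\nTP53\n"
symbols = clean_symbols(pasted)

# Text suitable for the "paste list" box: one symbol per line.
print("\n".join(symbols))

# Candidate text for the free-form query box.
print(filter_clause("geneSymbol", symbols))
```

The cleaned list can also be pasted into the identifiers box as
described in (4), provided the chosen table has the symbol in its
primary field.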
When you select a table in the Table Browser, you can press the
"describe table schema" button to see the fields in that table and
examples of the values in those fields.
I hope that this helps you. Please let us know if you have further
questions.
Rachel
>
>> From: "ram vinay" <ramvinayp at gmail.com>
>> Date: October 2, 2006 11:55:18 PM PDT
>> To: "Brooke Rhead" <rhead at soe.ucsc.edu>
>> Subject: Re: [Genome] annotation from CDS and mRNA coordinates
>>
>> Dear sir,
>>
>> Can I search by Entrez Gene ID and Entrez Gene symbol in the Table
>> Browser to get annotation from CDS and mRNA coordinates along with gene
>> coordinates?
>>
>> thank you,
>> Ram Vinay Pandey
>>
>> On 10/3/06, Brooke Rhead <rhead at soe.ucsc.edu> wrote:
>> Hello Ram Vinay,
>>
>> The Table Browser is the tool to use to extract information from the
>> Genome Browser databases. To get to it, use the "Tables" link (in the
>> blue bar at the top of the web page).
>>
>> You can use the Table Browser to upload a list of genes of interest and
>> then download data for just those genes. To do this, choose the genome
>> and assembly you are interested in. Choose "group: Genes and gene
>> prediction tracks". The track you choose will depend on which gene
>> dataset you would like to get information from (e.g., RefSeq Genes or
>> Ensembl Genes). Then, for "table:", use the table at the top of the
>> drop-down menu list (e.g., refGene or ensGene). Be sure to choose
>> "region: genome".
>>
>> Now you can paste or upload your list of genes in the "identifiers
>> (names/accessions):" section. The gene names must be in the correct
>> format for the track/table you have chosen, usually an accession number.
>> To see an example of the format your selected table uses, click the
>> "describe table schema" button and look at the 'name' field of the table.
>>
>> The output format you choose depends on what information you are looking
>> for. If you are only interested in coordinates of the genes (including
>> the transcription start and stop coordinates and the CDS start and stop
>> coordinates), you can choose to output "all fields from selected table"
>> or "selected fields from primary and related tables". If you would like
>> the coordinate information in BED format (read more about BED format
>> here: http://genome.ucsc.edu/FAQ/FAQformat#format1 ), choose to output
>> the data as a custom track, and then use the "get custom track in file"
>> button. Alternatively, if you are interested in obtaining the sequence
>> for these regions, use the "sequence" output format. Note that if you
>> choose to output the genomic sequence, there are additional formatting
>> options available for the untranslated regions and CDS regions.
>>
>> I hope this information is useful to you. If you have further
>> questions, please do not hesitate to contact the list again.
>>
>> --
>> Brooke Rhead
>> UCSC Genome Bioinformatics Group
>>
>>
>> ram vinay wrote:
>>> Dear colleague,
>>>
>>> I am a research scholar from India. I want to get annotation from CDS
>>> and mRNA coordinates, along with gene coordinates, for a gene or
>>> multiple genes. Please let me know from which page of the UCSC
>>> Genome Browser I can get this.
>>>
>>> I shall be most grateful for this help, and look forward to your reply.
>>>
>>>
>>> thank you,
>>> Ram Vinay Pandey
>>> India
>>> _______________________________________________
>>> Genome maillist - Genome at soe.ucsc.edu
>>> http://www.soe.ucsc.edu/mailman/listinfo/genome
>>
>
--
Rachel Harte
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu