[Genome] Access of annotation for GenBank and RefSeq genes
Bushel, Pierre (NIH/NIEHS) [E]
bushel at niehs.nih.gov
Mon Mar 5 08:34:25 PST 2007
Yes it does. Thanks a lot. This is so helpful. You all are doing a
magnificent job!
Pierre
-----Original Message-----
From: Ann Zweig [mailto:ann at soe.ucsc.edu]
Sent: Friday, March 02, 2007 6:50 PM
To: Bushel, Pierre (NIH/NIEHS) [E]
Cc: genome at soe.ucsc.edu
Subject: Re: [Genome] Access of annotation for GenBank and RefSeq genes
Hello Pierre,
You can do this using our online Table Browser tool (press the 'Tables'
link in the top blue navigation bar of the browser). Configure the Table
Browser for the assembly you want, then set the rest of it up like so:
group: Genes and Gene Prediction Tracks
track: RefSeq Genes
table: refGene
region: genome
identifiers: [paste in your list of RefSeq IDs]
output format: selected fields from primary and related tables
output file: [name of output file to be saved on your machine]
Press the 'get output' button.
From the next screen, you will join the refGene table with the kgXref
table like so:
1. From the "Select Fields from hg18.refGene" section, check the
   following fields of the refGene table:
   name, chrom, strand, txStart, txEnd
2. From the "Linked Tables" section, check kgXref and press the 'Allow
   Selection From Checked Tables' button near the bottom of the screen.
3. From the "hg18.kgXref fields" section, check the description field.
4. Press the 'get output' button.
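Since the original question also asked about SQL, the join that the steps above set up can be sketched in Python with an in-memory SQLite database standing in for the real tables. This is only an illustration under stated assumptions: the function name is hypothetical, the toy tables contain just the columns needed here, the single sample row is the NM_000808 record quoted later in this message, and the kgXref join column `refseq` is an assumption that should be checked against the table schema for your assembly.

```python
import sqlite3


def join_refgene_kgxref(refseq_ids):
    """Return (name, chrom, strand, txStart, txEnd, description) rows.

    Hypothetical helper: toy stand-ins for refGene and kgXref are built
    in memory, so this shows the shape of the join, not the real database.
    """
    ids = list(refseq_ids)
    if not ids:
        return []

    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE refGene (
            name TEXT, chrom TEXT, strand TEXT,
            txStart INTEGER, txEnd INTEGER
        );
        -- Assumption: kgXref links to refGene through a 'refseq' column.
        CREATE TABLE kgXref (refseq TEXT, description TEXT);

        INSERT INTO refGene VALUES
            ('NM_000808', 'chrX', '-', 151087187, 151370486);
        INSERT INTO kgXref VALUES
            ('NM_000808', 'gamma-aminobutyric acid A receptor, alpha 3');
        """
    )

    # One '?' placeholder per ID keeps the query parameterized.
    placeholders = ",".join("?" for _ in ids)
    query = f"""
        SELECT r.name, r.chrom, r.strand, r.txStart, r.txEnd, x.description
        FROM refGene AS r
        JOIN kgXref AS x ON x.refseq = r.name
        WHERE r.name IN ({placeholders})
        ORDER BY r.name
    """
    rows = conn.execute(query, ids).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in join_refgene_kgxref(["NM_000808"]):
        print("\t".join(str(value) for value in row))
```

An inner join is used here, so an ID with no matching kgXref entry is dropped from the result; a LEFT JOIN would keep it with an empty description.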
The tab-delimited file will be created and downloaded to your machine.
For example, when I follow these instructions on the latest human
assembly (hg18) for the RefSeq Gene ID # NM_000808, I get the following
tab-delimited output:
#hg18.refGene.name	hg18.refGene.chrom	hg18.refGene.strand	hg18.refGene.txStart	hg18.refGene.txEnd	hg18.kgXref.description
NM_000808	chrX	-	151087187	151370486	gamma-aminobutyric acid A receptor, alpha 3
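If it helps to check the file before loading it into other software, output in this shape could be read with a few lines of Python. This is a minimal sketch, not part of the Table Browser itself: the function name and the SAMPLE constant are hypothetical, and it assumes one header line starting with '#' followed by tab-separated rows, as in the example above.

```python
import csv
import io

# The example output from this message, with real tab separators.
SAMPLE = (
    "#hg18.refGene.name\thg18.refGene.chrom\thg18.refGene.strand\t"
    "hg18.refGene.txStart\thg18.refGene.txEnd\thg18.kgXref.description\n"
    "NM_000808\tchrX\t-\t151087187\t151370486\t"
    "gamma-aminobutyric acid A receptor, alpha 3\n"
)


def parse_table_browser_output(text):
    """Parse tab-delimited output into a list of dicts (hypothetical helper).

    Header names such as '#hg18.refGene.txStart' are shortened to the final
    component ('txStart'). If two tables share a field name, the later
    column overwrites the earlier one in each record.
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    header = next(reader)
    fields = [column.lstrip("#").split(".")[-1] for column in header]

    records = []
    for row in reader:
        if not row:  # skip blank lines
            continue
        record = dict(zip(fields, row))
        # Coordinates arrive as text; convert them to integers.
        record["txStart"] = int(record["txStart"])
        record["txEnd"] = int(record["txEnd"])
        records.append(record)
    return records


if __name__ == "__main__":
    for gene in parse_table_browser_output(SAMPLE):
        print(gene["name"], gene["chrom"], gene["strand"],
              gene["txEnd"] - gene["txStart"], gene["description"])
```

To read a saved file instead, pass its contents, for example `parse_table_browser_output(open("output.txt").read())`, where `output.txt` is whatever name you entered in the 'output file' box.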
Hope this is helpful to you.
Regards,
----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu
Please feel free to search the Genome mailing list archives by visiting
our home page, clicking on "Contact Us", then typing a word or phrase
into the search box. On that same page
(http://genome.ucsc.edu/contacts.html), you can subscribe to the Genome
mailing list.
Bushel, Pierre (NIH/NIEHS) [E] wrote:
> Greetings,
>
> I'd like to use the UCSC Genome Browser annotation files or MySQL
> database to obtain annotation for GenBank and RefSeq genes. In
> particular, I'd like to query the files or the database for the gene
> symbol(s), start and end locations in the genome, the chromosome
> number, the DNA strand the gene lies on and gene description. This
> would be for human genes which I have a GenBank accession number or
> RefSeq ID. Could you possibly provide me with the database tables,
> relationships and SQL required to query this information? I'm trying
> to obtain this data in a tab-delimited format file to use with a
> commercial software.
>
> Thanks,
>
> Pierre Bushel
> _______________________________________________
> Genome maillist - Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome