[Genome] Access of annotation for GenBank and RefSeq genes

Bushel, Pierre (NIH/NIEHS) [E] bushel at niehs.nih.gov
Mon Mar 5 08:34:25 PST 2007


Yes it does.  Thanks a lot.  This is so helpful.  You all are doing a
magnificent job!

Pierre

-----Original Message-----
From: Ann Zweig [mailto:ann at soe.ucsc.edu] 
Sent: Friday, March 02, 2007 6:50 PM
To: Bushel, Pierre (NIH/NIEHS) [E]
Cc: genome at soe.ucsc.edu
Subject: Re: [Genome] Access of annotation for GenBank and RefSeq genes


Hello Pierre,

	You can do this using our online Table Browser tool (press the 'Tables'
link in the top blue navigation bar of the browser).  Configure the Table
Browser for the assembly you want, then set the rest of it up like so:

group: Genes and Gene Prediction Tracks
track: RefSeq Genes
table: refGene
region: genome
identifiers: [paste in your list of RefSeq IDs]
output format: selected fields from primary and related tables
output file: [name of output file to be saved on your machine]

Press the 'get output' button.

	From the next screen, you will join the refGene table with the kgXref
table like so:

1. From the "Select Fields from hg18.refGene" section, check the following
fields of the refGene table:

name, chrom, strand, txStart, txEnd

2. From the "Linked Tables" section, check kgXref and press the 'Allow
Selection From Checked Tables' button near the bottom of the screen.

3. From the "hg18.kgXref fields" section, check the description field.

4. Press the 'get output' button.

	The tab-delimited file will be created and downloaded to your machine.
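
	The original question also asked about SQL, so here is a minimal sketch
of the join the Table Browser performs in the steps above.  It runs against a
small in-memory SQLite stand-in rather than the real database, and it rests on
assumptions that are not stated in this message: that refGene.name links to a
kgXref column named refseq, and that the two tables can be reduced to the
columns shown.  The helper name annotate is hypothetical.  Please check the
table schemas for your assembly before relying on it.

```python
import sqlite3

# In-memory stand-in for the two tables; the real tables have more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE refGene (name TEXT, chrom TEXT, strand TEXT,
                      txStart INTEGER, txEnd INTEGER);
-- Assumption: kgXref carries the RefSeq ID in a column called refseq.
CREATE TABLE kgXref (refseq TEXT, description TEXT);
""")

# The single example row quoted in this message (hg18, NM_000808).
conn.execute(
    "INSERT INTO refGene VALUES (?, ?, ?, ?, ?)",
    ("NM_000808", "chrX", "-", 151087187, 151370486),
)
conn.execute(
    "INSERT INTO kgXref VALUES (?, ?)",
    ("NM_000808", "gamma-aminobutyric acid A receptor, alpha 3"),
)

QUERY = """
SELECT r.name, r.chrom, r.strand, r.txStart, r.txEnd, x.description
FROM refGene AS r
LEFT JOIN kgXref AS x ON x.refseq = r.name
WHERE r.name IN ({placeholders})
"""


def annotate(connection, refseq_ids):
    """Return one tuple per matching refGene row for the given RefSeq IDs."""
    ids = list(refseq_ids)
    if not ids:
        # "IN ()" is not valid SQL, so short-circuit on an empty list.
        return []
    placeholders = ", ".join("?" for _ in ids)
    sql = QUERY.format(placeholders=placeholders)
    return connection.execute(sql, ids).fetchall()


rows = annotate(conn, ["NM_000808"])
for row in rows:
    # Emit the same tab-delimited layout the Table Browser produces.
    print("\t".join(str(value) for value in row))
```

	The LEFT JOIN is a choice made for this sketch: it keeps a RefSeq ID in
the output even when no description is found for it, which may or may not
match what the Table Browser does.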


	For example, when I follow the Table Browser steps above on the latest
human assembly (hg18) for the RefSeq ID NM_000808, I get the following
tab-delimited output:

#hg18.refGene.name	hg18.refGene.chrom	hg18.refGene.strand	hg18.refGene.txStart	hg18.refGene.txEnd	hg18.kgXref.description
NM_000808	chrX	-	151087187	151370486	gamma-aminobutyric acid A receptor, alpha 3
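
	If you want to post-process the downloaded file before loading it into
other software, something along these lines may help.  It is only a sketch:
it assumes the file has exactly the layout shown above (one header line
starting with '#', then tab-separated rows), and the function name
read_table_browser_output is made up for illustration.  The sample text is
the output quoted above.

```python
import csv
import io

# The example output from this message, with real tab characters.
SAMPLE = (
    "#hg18.refGene.name\thg18.refGene.chrom\thg18.refGene.strand\t"
    "hg18.refGene.txStart\thg18.refGene.txEnd\thg18.kgXref.description\n"
    "NM_000808\tchrX\t-\t151087187\t151370486\t"
    "gamma-aminobutyric acid A receptor, alpha 3\n"
)


def read_table_browser_output(handle):
    """Parse tab-delimited output into a list of dicts keyed by field name."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    # "#hg18.refGene.name" -> "name": drop the '#' and the db.table prefix.
    fields = [column.lstrip("#").split(".")[-1] for column in header]
    records = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        record = dict(zip(fields, row))
        # Coordinates arrive as text; convert them to integers.
        record["txStart"] = int(record["txStart"])
        record["txEnd"] = int(record["txEnd"])
        records.append(record)
    return records


records = read_table_browser_output(io.StringIO(SAMPLE))
for record in records:
    print(record["name"], record["chrom"], record["strand"],
          record["txEnd"] - record["txStart"])
```

	For a real file, replace io.StringIO(SAMPLE) with an opened file handle,
for example open("output.txt", newline=""), where output.txt stands for
whatever name you entered in the 'output file' box.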


	Hope this is helpful to you.


Regards,

----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu


Please feel free to search the Genome mailing list archives by visiting
our home page, clicking on "Contact Us", then typing a word or phrase
into the search box.  On that same page
(http://genome.ucsc.edu/contacts.html), you can subscribe to the Genome
mailing list.


Bushel, Pierre (NIH/NIEHS) [E] wrote:
> Greetings,
> 
> I'd like to use the UCSC Genome Browser annotation files or MySQL 
> database to obtain annotation for GenBank and RefSeq genes.  In 
> particular, I'd like to query the files or the database for the gene 
> symbol(s), start and end locations in the genome, the chromosome 
> number, the DNA strand the gene lies on and gene description.  This 
> would be for human genes for which I have a GenBank accession number 
> or RefSeq ID.  Could you possibly provide me with the database tables, 
> relationships and SQL required to query this information?  I'm trying 
> to obtain this data in a tab-delimited format file to use with a 
> commercial software package.
> 
> Thanks,
> 
> Pierre Bushel
> _______________________________________________
> Genome maillist  -  Genome at soe.ucsc.edu 
> http://www.soe.ucsc.edu/mailman/listinfo/genome


