[Genome] Access of annotation for GenBank and RefSeq genes
Ann Zweig
ann at soe.ucsc.edu
Fri Mar 2 15:50:14 PST 2007
Hello Pierre,
You can do this using our online Table Browser tool (click the 'Tables' link
in the top blue navigation bar of the browser). Configure the Table Browser for
the assembly you want, then set the remaining options as follows:
group: Genes and Gene Prediction Tracks
track: RefSeq Genes
table: refGene
region: genome
identifiers: [paste in your list of RefSeq IDs]
output format: selected fields from primary and related tables
output file: [name of output file to be saved on your machine]
Press the 'get output' button.
From the next screen, you will join the refGene table with the kgXref table
like so:
1. From the "Select Fields from hg18.refGene" section, check the following
fields of the refGene table:
name, chrom, strand, txStart, txEnd
2. From the "Linked Tables" section, check kgXref and press the 'Allow Selection
From Checked Tables' button near the bottom of the screen.
3. From the "hg18.kgXref fields" section, check the description field.
4. Press the 'get output' button.
The tab-delimited file will be created and downloaded to your machine.
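
Since you also asked about SQL: the join the Table Browser performs above
corresponds roughly to the query built by the sketch below. Treat it as an
illustration only. It assumes that the kgXref table has a refseq column that
matches refGene.name (please check this with the 'describe table schema'
button in the Table Browser before relying on it). The function name
build_refgene_query is hypothetical, and the %s placeholders follow the
parameter style used by common Python MySQL drivers. The code only builds the
query text and does not connect to any database.

```python
def build_refgene_query(refseq_ids, db="hg18"):
    """Return (sql, params) for a refGene/kgXref join on the given RefSeq IDs.

    Assumption: kgXref.refseq holds the RefSeq accession that matches
    refGene.name. Verify this against the table schema before use.
    """
    if not refseq_ids:
        raise ValueError("need at least one RefSeq ID")
    if not db.isalnum():
        # The database name is interpolated into the SQL text, so only
        # accept simple assembly names such as "hg18".
        raise ValueError("unexpected database name: %r" % db)

    # One placeholder per ID; the IDs themselves are passed separately so
    # that the database driver can quote them.
    placeholders = ", ".join(["%s"] * len(refseq_ids))
    sql = (
        "SELECT r.name, r.chrom, r.strand, r.txStart, r.txEnd, x.description "
        f"FROM {db}.refGene AS r "
        f"LEFT JOIN {db}.kgXref AS x ON x.refseq = r.name "
        f"WHERE r.name IN ({placeholders})"
    )
    return sql, list(refseq_ids)


sql, params = build_refgene_query(["NM_000808"])
print(sql)
print(params)
```

The LEFT JOIN keeps a RefSeq entry in the result even when no kgXref row
matches it; in that case the description comes back empty.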
For example, when I follow these instructions on the latest human assembly
(hg18) for the RefSeq Gene ID # NM_000808, I get the following tab-delimited output:
#hg18.refGene.name	hg18.refGene.chrom	hg18.refGene.strand	hg18.refGene.txStart	hg18.refGene.txEnd	hg18.kgXref.description
NM_000808	chrX	-	151087187	151370486	gamma-aminobutyric acid A receptor, alpha 3
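
If you want to post-process the file before loading it into your software, a
minimal Python sketch along these lines should work. It assumes the file looks
like the example above: one header line starting with '#', then one
tab-separated row per gene. The SAMPLE string and the function name
read_table_browser_output are only for illustration; in practice you would
open the file you saved from the Table Browser.

```python
import csv
import io

# Sample text copied from the example output above. Replace this with
# open("your_output_file.txt") when working with a real download.
SAMPLE = (
    "#hg18.refGene.name\thg18.refGene.chrom\thg18.refGene.strand\t"
    "hg18.refGene.txStart\thg18.refGene.txEnd\thg18.kgXref.description\n"
    "NM_000808\tchrX\t-\t151087187\t151370486\t"
    "gamma-aminobutyric acid A receptor, alpha 3\n"
)


def read_table_browser_output(handle):
    """Return a list of dicts keyed by column name (leading '#' removed)."""
    # QUOTE_NONE keeps any quote characters in descriptions as literal text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    header[0] = header[0].lstrip("#")
    rows = []
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        rows.append(dict(zip(header, fields)))
    return rows


rows = read_table_browser_output(io.StringIO(SAMPLE))
for row in rows:
    # txStart is 0-based and txEnd is end-exclusive, so the difference
    # is the length of the transcribed region in bases.
    length = int(row["hg18.refGene.txEnd"]) - int(row["hg18.refGene.txStart"])
    print(row["hg18.refGene.name"], row["hg18.refGene.chrom"], length)
```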
Hope this is helpful to you.
Regards,
----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu
Please feel free to search the Genome mailing list archives by visiting
our home page, clicking on "Contact Us", then typing a word or phrase
into the search box. On that same page
(http://genome.ucsc.edu/contacts.html), you can subscribe to the Genome
mailing list.
Bushel, Pierre (NIH/NIEHS) [E] wrote:
> Greetings,
>
> I'd like to use the UCSC Genome Browser annotation files or MySQL
> database to obtain annotation for GenBank and RefSeq genes. In
> particular, I'd like to query the files or the database for the gene
> symbol(s), start and end locations in the genome, the chromosome number,
> the DNA strand the gene lies on and gene description. This would be for
> human genes which I have a GenBank accession number or RefSeq ID. Could
> you possibly provide me with the database tables, relationships and SQL
> required to query this information? I'm trying to obtain this data in a
> tab-delimited format file to use with a commercial software.
>
> Thanks,
>
> Pierre Bushel
> _______________________________________________
> Genome maillist - Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome