[Genome] Getting a list of genes and their details from coordinates

Rachel Harte hartera at soe.ucsc.edu
Fri Mar 30 11:37:54 PDT 2007


Hi Alice,

Would it help to do the intersection the other way round? You could make
a custom track of your regions. Then, in the Table Browser, start by
selecting the Known Genes or RefSeq Genes track and intersect it with
your regions. This will give you a list of the genes that overlap your
regions. If you get the output as a BED file, you will also get the
positions of the exons. You can also choose to get the positions of only
introns or only exons in BED format.
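
If you end up with the whole-gene BED output, a short script can tell you
whether each of your regions falls in an exon or an intron. The following
is only a rough sketch: it assumes 12-column BED lines (exon sizes in
column 11, exon offsets relative to the gene start in column 12, 0-based
half-open coordinates). The function names and the example gene line are
made up for illustration.

```python
def parse_bed12(line):
    """Parse one 12-column BED line into a dict with absolute exon coordinates."""
    f = line.rstrip("\n").split("\t")
    start = int(f[1])
    # Columns 11 and 12 are comma-separated lists, often with a trailing comma.
    sizes = [int(x) for x in f[10].split(",") if x]
    offsets = [int(x) for x in f[11].split(",") if x]
    exons = [(start + o, start + o + n) for o, n in zip(offsets, sizes)]
    return {"chrom": f[0], "start": start, "end": int(f[2]),
            "name": f[3], "exons": exons}


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def classify(region, gene):
    """Return 'exon', 'intron', or 'none' for a (chrom, start, end) region."""
    chrom, start, end = region
    if chrom != gene["chrom"] or not overlaps(start, end, gene["start"], gene["end"]):
        return "none"
    if any(overlaps(start, end, s, e) for s, e in gene["exons"]):
        return "exon"
    return "intron"


# Hypothetical gene with two exons: chr1:1000-1100 and chr1:1800-2000.
EXAMPLE = "chr1\t1000\t2000\tNM_000001\t0\t+\t1000\t2000\t0\t2\t100,200,\t0,800,"
gene = parse_bed12(EXAMPLE)
print(classify(("chr1", 1050, 1060), gene))
print(classify(("chr1", 1500, 1600), gene))
```

A region that touches any exon is reported as "exon" here, even if it
also extends into an intron; you may want a finer rule for your data.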

To get the gene symbols for these genes, select the Known Genes track and
the kgXref table, paste in the list of accessions from your intersection
query output, and retrieve the gene symbols from kgXref. With the
"selected fields from primary and related tables" output option, you can
also include information from the Known Genes table in your output (i.e.
gene and exon positions).

For RefSeqs, select the RefSeq Genes track and the refSeqAli table, and
paste in your list of RefSeq identifiers. For output format, select
"selected fields from primary and related tables"; you can then choose
fields from refSeqAli and from either refFlat or refLink, both of which
give you the gene symbol. refFlat and refLink can only be searched by
pasting in a list of gene symbols, which is why the query has to start
from refSeqAli.
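
Once you have saved the kgXref output to a file, attaching the symbols to
your accessions takes only a few lines. This is a sketch under
assumptions: it expects tab-separated text whose first line is a header
starting with "#", with column names that end in kgID and geneSymbol
(possibly prefixed by database and table name). The function name and the
sample rows are hypothetical.

```python
import csv
import io


def symbol_map(tsv_text, id_field="kgID", symbol_field="geneSymbol"):
    """Build {accession: gene symbol} from tab-separated text with a header line."""
    rows = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    # Drop a leading '#' and any 'db.table.' prefix from each header name.
    header = [h.lstrip("#").split(".")[-1] for h in next(rows)]
    i, j = header.index(id_field), header.index(symbol_field)
    # Skip blank or short lines rather than failing on them.
    return {r[i]: r[j] for r in rows if len(r) > max(i, j)}


# Made-up sample in the assumed layout.
SAMPLE = (
    "#hg18.kgXref.kgID\thg18.kgXref.geneSymbol\n"
    "uc000aaa.1\tGENE1\n"
    "uc000bbb.1\tGENE2\n"
)
symbols = symbol_map(SAMPLE)
print(symbols.get("uc000aaa.1", "unknown"))
```

For the RefSeq query, the same function should work if you pass the field
names that appear in your own output header as id_field and symbol_field.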

Finally, another tool that may be helpful to you is the Galaxy software at
Penn State University: http://g2.trac.bx.psu.edu/

They have an interface that is built on top of our Table Browser. It lets
you do a join that is like the Table Browser intersection, except that it
returns information from both of the intersected tables. The Table
Browser intersection loses the identifiers and positions from the second
table, which is a limitation.

Galaxy has some helpful tutorials, but, briefly:

1. Use the "Get Data" link on the side menu of Galaxy to open the Table
   Browser interface, and run one query to get the data from your custom
   track and a second to get the data from the RefSeq or Known Genes
   track. Links to the output of each query appear on the right side of
   Galaxy.
2. Click on the "Operate on Genomic Intervals" link on the menu at the
   left side. This will expand, and you can select "Join".
3. Do the join on the output of your two queries (your custom track and
   the gene track).
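
To make the difference from a plain intersection concrete, here is a
rough sketch of what such a join produces. It is not Galaxy's
implementation, just a simple nested loop that assumes each record is a
(chrom, start, end, name) tuple with 0-based half-open coordinates; the
function name and example records are made up.

```python
def interval_join(regions, genes):
    """Return one combined row per overlapping (region, gene) pair.

    Each output row keeps every field of both inputs, so the gene
    identifiers and positions are not lost.
    """
    joined = []
    for r in regions:
        for g in genes:
            same_chrom = r[0] == g[0]
            # Half-open intervals overlap when each starts before the other ends.
            if same_chrom and r[1] < g[2] and g[1] < r[2]:
                joined.append(r + g)
    return joined


# Hypothetical inputs: two query regions and two genes.
regions = [("chr1", 1050, 1060, "region1"), ("chr1", 5000, 5100, "region2")]
genes = [("chr1", 1000, 2000, "NM_000001"), ("chr2", 1000, 2000, "NM_000002")]
for row in interval_join(regions, genes):
    print("\t".join(str(x) for x in row))
```

The nested loop is fine for a few thousand rows; for larger lists Galaxy
or a sorted sweep would be a better choice.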

I hope that you find this helpful. Please let us know if you have further
questions about the Genome Browser or the Table Browser. If you have
questions about Galaxy, then please contact the developers at Penn State.
There is contact information on the Galaxy web pages.

Rachel

Rachel Harte
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu


On Thu, 29 Mar 2007 alice_yamada at agilent.com wrote:

> Hi,
>
> I was wondering if you could help me figure out the most efficient way to
> go from chr coordinates to gene symbols, intron/exon locations, and other
> details of the genes.  I have a list of coordinates that I am trying to
> identify whether they reside in genes or not and if so, whether they
> reside in exons or not.
>
> To start, I have uploaded my regions as custom tracks to find overlap with
> Known or RefSeq gene.  This gives me the regions that have genes, but not
> which ones in the Table output.  I can get the info I want from the
> Genome Browser links, but when I have so many of them, this gets painful.
>
> Any insight on how I can convert chr coordinates into annotated lists of
> genes and their exon/intron overlap structure will be much appreciated!
>
> Thanks,
>
> Alice
>
> _______________________________________________
> Genome maillist  -  Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome
>

