[Genome] [Fwd: intron question]

Rachel Harte hartera at soe.ucsc.edu
Tue Nov 14 16:32:40 PST 2006


Eric,
It is not possible to get intron sizes directly from the Browser.
Instead, for each organism you will need to select a table containing
the gene set that you wish to use, e.g. knownGene, which contains the
genes from the Known Genes track on the human, rat and mouse Browsers.
The RefSeq Genes track (table refGene) is available for most organisms.
Both of these tables have columns, exonStarts and exonEnds, that are
comma-separated lists of the exon starts and ends. You would need to
write a program to parse this information and get the intron sizes by
subtracting the first item in exonEnds from the second item in
exonStarts, then subtracting the second item in exonEnds from the third
item in exonStarts, etc. (Subtracting each exonStarts item from the
matching exonEnds item gives the exon sizes, not the intron sizes.) All
start positions in the tables are on a 0-based scale, so base 1 is
represented as 0. End positions are not shifted in this way, so the
differences above are the intron lengths in bases with no further
adjustment.
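
As a rough illustration of the parsing step described above, here is a
minimal Python sketch. The function names (intron_sizes,
introns_in_range) and the sample coordinates are made up for this
example and are not part of any UCSC tool; the sketch assumes you have
already pulled the exonStarts and exonEnds strings out of one row of the
table, and that the lists may end with a trailing comma.

```python
def parse_coords(field):
    """Turn a comma-separated list such as '100,300,600,' into a list of ints.

    Empty items (e.g. from a trailing comma) are skipped.
    """
    return [int(item) for item in field.split(",") if item]


def intron_sizes(exon_starts, exon_ends):
    """Return the intron sizes for one gene.

    Each intron lies between the end of one exon and the start of the
    next, so its size is exonStarts[i + 1] - exonEnds[i].
    """
    starts = parse_coords(exon_starts)
    ends = parse_coords(exon_ends)
    if len(starts) != len(ends):
        raise ValueError("exonStarts and exonEnds differ in length")
    return [starts[i + 1] - ends[i] for i in range(len(starts) - 1)]


def introns_in_range(exon_starts, exon_ends, min_size, max_size):
    """Return (intron_start, intron_end, size) for introns whose size
    falls within [min_size, max_size], inclusive.

    intron_start is 0-based, like the start positions in the tables.
    """
    starts = parse_coords(exon_starts)
    ends = parse_coords(exon_ends)
    if len(starts) != len(ends):
        raise ValueError("exonStarts and exonEnds differ in length")
    hits = []
    for i in range(len(starts) - 1):
        size = starts[i + 1] - ends[i]
        if min_size <= size <= max_size:
            hits.append((ends[i], starts[i + 1], size))
    return hits


if __name__ == "__main__":
    # Hypothetical gene with three exons: 100-200, 300-400, 600-700.
    print(intron_sizes("100,300,600,", "200,400,700,"))
    print(introns_in_range("100,300,600,", "200,400,700,", 150, 250))
```

To cover a whole gene set you would call these functions once per row
of the downloaded table, passing in that row's exonStarts and exonEnds
fields.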

The contents of all the tables in our databases are downloadable from our
Downloads server:

http://hgdownload.cse.ucsc.edu/downloads.html

If you click on an organism link, there is a list of assemblies. Each
assembly has an Annotation database link which will lead to the site from
which you can download the contents of tables for that assembly.

I hope that this helps you. Please let us know if you have further
questions. In future, please direct questions to the genome mailing list
at genome at soe.ucsc.edu. Thank you.

Rachel

Rachel Harte UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

> -------- Original Message --------
> Subject: 	intron question
> Date: 	Tue, 14 Nov 2006 13:46:03 -0500
> From: 	Eric Lai <laie at mskcc.org>
> To: 	cbseweb at cbse.ucsc.edu
>
>
>
> hi,
>
> i'm not sure if this goes to anyone in particular.
>
> we would like to extract the introns of a particular size range from all
> the sequenced species.
> is there a way to do that for the species that are loaded at UCSC,
> or would you have any advice on how to obtain these datasets?
>
> thanks,
> .eric
>
>
>
> ***************************
> Eric Lai
> Assistant Member, Sloan-Kettering Institute
> 521 Rockefeller Research Labs
> 1275 York Avenue, Box 252
> New York, NY 10021
>
> ph: 212-639-5578
> fax: 212 717-3604
> site: http://www.mskcc.org/lai
>
>
> --
> Branwyn Stewart Wagman
> Communications & Human Resources
> Center for Biomolecular Science and Engineering (CBSE)
> Institute for Quantitative Biomedical Research (QB3)
> 501C Engineering 2 Building
> UC Santa Cruz
> 1156 High Street, MS: CBSE/ITI
> Santa Cruz CA 95064
> Tel: (831) 459-3077
> Fax: (831) 459-1809
> bwagman at soe.ucsc.edu
> http://www.cbse.ucsc.edu
>

