[Genome] questions
Rachel Harte
hartera at soe.ucsc.edu
Thu May 24 12:14:38 PDT 2007
Hello Michelle,
When we run genscan we use the default for suboptimal exons: we pass the
"-subopt" option but we do not specify a cutoff value. Here is what
I found about the -subopt option in the genscan documentation:
-subopt cutoff (optional): displays suboptimal exons with P > cutoff, where
cutoff is the suboptimal exon probability cutoff (minimum: 0.01). The
default output of the program is the optimal "parse" of the sequence, i.e.
the highest probability gene structure(s) which is present: the exons in
this optimal parse are referred to as "optimal exons" and are always
printed out by GENSCAN.
Suboptimal exons, on the other hand, are defined as potential exons which
have probability above a certain threshold but which are not contained in
the optimal parse of the sequence. Suboptimal exons have a variety of
potential uses. First, suboptimal exons sometimes correspond to real exons
which were missed for whatever reason by the optimal parse of the
sequence. Second, regions of a prediction which contain multiple
overlapping and/or incompatible optimal and suboptimal exons may in some
cases indicate alternatively spliced regions of a gene (Burge & Karlin, in
preparation). The argument "cutoff" is the probability cutoff used to
determine which potential exons qualify as suboptimal exons. This argument
should be a number between 0.01 and 0.99. For most applications, a cutoff
value of about 0.10 is recommended. Setting the value much lower than 0.10
will often lead to an explosion in the number of suboptimal exons, most of
which will probably not be useful. On the other hand, if the value is set
much higher than 0.10, then potentially interesting suboptimal exons may
be missed.
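
To illustrate how the cutoff trades off the number of suboptimal exons
against their reliability, here is a minimal Python sketch. It is not part
of genscan: the Exon record, the suboptimal_exons function and the example
probabilities are all made up for illustration, and the 0.10 default simply
mirrors the value recommended in the documentation above.

```python
from typing import List, NamedTuple


class Exon(NamedTuple):
    """Hypothetical record for one candidate exon (not a genscan format)."""
    start: int
    end: int
    prob: float        # exon probability reported by the predictor
    in_optimal: bool   # True if the exon is part of the optimal parse


def suboptimal_exons(exons: List[Exon], cutoff: float = 0.10) -> List[Exon]:
    """Return exons outside the optimal parse whose probability exceeds cutoff."""
    # The documentation says the cutoff should lie between 0.01 and 0.99.
    if not 0.01 <= cutoff <= 0.99:
        raise ValueError("cutoff should be between 0.01 and 0.99")
    return [e for e in exons if not e.in_optimal and e.prob > cutoff]


# Made-up candidates, purely for illustration.
candidates = [
    Exon(100, 250, 0.97, True),    # optimal exon, never counted as suboptimal
    Exon(400, 520, 0.35, False),
    Exon(610, 700, 0.12, False),
    Exon(900, 980, 0.04, False),
]

# A lower cutoff keeps more (and less reliable) suboptimal exons.
for cutoff in (0.02, 0.10, 0.30):
    kept = suboptimal_exons(candidates, cutoff)
    print(cutoff, [(e.start, e.end) for e in kept])
```

With these invented values, lowering the cutoff from 0.30 to 0.02 raises
the number of reported suboptimal exons from one to three.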
We store the results for suboptimal exons in a table called genscanSubopt
but we do not display these exons. This table is not available on our
public web site but only on our development server:
http://genome-test.cse.ucsc.edu
On that server, choose the "Tables" link on the top blue menu bar. Using the Table
Browser, you can download the contents of the genscanSubopt table. To do
this, select the organism and assembly, select "Genes and Gene
Predictions" as the group, and select "Genscan" as the track; genscanSubopt
will then appear in the tables list. Please note that data on our test server
that is not on our public site has not been tested by our QA team, so it
could be incorrect.
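
Once the table has been downloaded as text, it can be read with a few lines
of Python. The sketch below assumes the download is tab-separated with a
first line of column names prefixed by "#". The column names in the sample
(chrom, chromStart, chromEnd, name, score) are an assumption, not a
confirmed description of genscanSubopt, so check the header of your own
download. The function name read_table_dump is hypothetical.

```python
import csv
import io


def read_table_dump(handle):
    """Parse tab-separated text whose first line is a '#'-prefixed header.

    Column names are taken from the file itself rather than hard-coded.
    All values are returned as strings.
    """
    header = handle.readline().rstrip("\n").lstrip("#").split("\t")
    reader = csv.DictReader(handle, fieldnames=header, delimiter="\t")
    return list(reader)


# Invented sample standing in for a downloaded file; real column names and
# values may differ.
sample = (
    "#chrom\tchromStart\tchromEnd\tname\tscore\n"
    "chr1\t100\t250\texonA\t120\n"
    "chr1\t400\t520\texonB\t350\n"
)

rows = read_table_dump(io.StringIO(sample))
for row in rows:
    print(row["chrom"], row["chromStart"], row["chromEnd"], row["score"])
```

For a real file, replace io.StringIO(sample) with an open file handle, for
example open("genscanSubopt.txt"), where the file name is whatever you
chose when saving the Table Browser output.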
For our human Genome Browsers, we have two tracks for more recent gene
prediction programs: Augustus and N-SCAN. Also, Exoniphy predicts coding
exons using conservation information from human, mouse, rat and dog. You
may want to check out these tracks too.
Finally, here is an article that compares a number of gene prediction
programs and assesses their performance:
http://genomebiology.com/2006/7/S1/S2
I hope that this helps you. Please let us know if you have further
questions.
Rachel
Rachel Harte
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu
On Thu, 24 May 2007, michelle wrote:
> hi,
> we downloaded the genscan genes from your table browser, and we want to use
> them to do some searching work.
> But I still have two questions about the data.
> One is, genscan gives exon probability values for its predictions, and as
> they suggested, exons with p-value above 0.99 can be used for other
> purposes with confidence. I do not know how you choose the genes you
> provide on your website;
> the other is, genscan also mentions optimal and suboptimal
> exons; do you have such a classification?
>
> thanks for your attention. we will be waiting for your reply.
>
> michelle