[Genome] Understanding alternative splice data tables
Clancy, Kevin
Kevin.Clancy at invitrogen.com
Mon Apr 21 16:26:56 PDT 2008
Brooke
Thank you very much for your reply. Both the paper and the previous
reply were most helpful. Looking at the paper I see a link to the Kent
lab source tree. Would it be possible to access this source tree purely
to audit the code for the altSplice and orthoSplice programs? If so, do
I need any specific account information?
Thanks
kevin
Kevin Clancy
Phone: x84401
Cell: (240) 417 8604
-----Original Message-----
From: Brooke Rhead [mailto:rhead at soe.ucsc.edu]
Sent: Wednesday, April 16, 2008 4:31 PM
To: Clancy, Kevin
Cc: genome at soe.ucsc.edu
Subject: Re: [Genome] Understanding alternative splice data tables
Hello Kevin,
The developer here who created the Sib Alt-Splice track is unavailable
at the moment, but maybe I can help you find the information you need.
The data for the track were contributed by Christian Iseli
(Christian.Iseli at licr.org); web site:
http://www.isrec.isb-sib.ch/tromer/ . I suggest contacting him for an
explanation of the Sib Alt-Splice data.
Another resource that might be helpful in understanding our table format
is a very similar track called "Alt-Splicing" (available on the hg17,
Human May 2004 browser). Here is the paper linked to on the details
page for that track:
Sugnet, C.W. et al., Transcriptome and genome conservation of
alternative splicing events in humans and mice. Pacific Symposium on
Biocomputing (PSB) 2004 Online Proceedings.
http://helix-web.stanford.edu/psb04/sugnet.pdf
There is also a bit of relevant information in this previously-answered
mailing list question:
http://www.soe.ucsc.edu/pipermail/genome/2005-September/008530.html
I hope this information helps. If I get a response from the developer
here who helped create this track, I will send you more information.
--
Brooke Rhead
UCSC Genome Bioinformatics Group
Clancy, Kevin wrote:
> Hello Ann
> Thank you very much for your reply. I appreciate you pointing me to the
> Spliced ESTs track - that is very useful indeed. However I am still
> interested in understanding better the data in the Sib Alt-Splicing
> track as well. I think I understand part of the Sib Alt-Splice table
> based upon the table schema descriptions. Could you please provide an
> explanation of the values for the vTypes digits in the table? They seem
> to have the values 0, 1, 2, and 3 but I don't know if there are more and
> what the values actually stand for.
>
> Secondly, could you please provide an explanation of the organization of
> the evidence field data? I can see there is some form of {} organized
> data there but I don't understand it.
>
> For the example you provided, the evidence field begins with:
> {11,{36,87,88,89,110,111,118,119,120,121,122,},}
> and continues on. What do these numbers refer to? At the moment I think
> this means that for the first edge, there are 11 ESTs that support this
> start site and they are elements 36, 87, 88, etc. of the mRNA Refs field.
> Is the evidence for the alternative splicing contained in a third level
> of bracketing within the evidence field?
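
A minimal Python sketch of the tentative reading in the question above, assuming each evidence record has the shape {count,{index,index,...},}. That layout is inferred only from the single example quoted in this thread, not from the table schema, and parse_evidence is a hypothetical helper name. Whether the indices really point into the mRNA Refs field is the open question being asked, so the code only checks that the leading count agrees with the number of indices that follow.

```python
import re

# One record: {count,{i,j,k,},}  (shape assumed from the quoted example only)
EVIDENCE_RE = re.compile(r"\{(\d+),\{((?:\d+,)*)\},\}")

def parse_evidence(field):
    """Return a list of (count, [indices]) tuples, one per record found."""
    records = []
    for count, body in EVIDENCE_RE.findall(field):
        # The index list carries a trailing comma, so drop empty pieces.
        indices = [int(x) for x in body.split(",") if x]
        records.append((int(count), indices))
    return records

example = "{11,{36,87,88,89,110,111,118,119,120,121,122,},}"
records = parse_evidence(example)
count, indices = records[0]
print(count, len(indices), indices[:3])
```

For the quoted record the leading 11 matches the number of indices listed, which is at least consistent with the "count, then list of supporting elements" interpretation.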
>
> Thanks
> kevin
>
> Kevin Clancy, PhD
> Senior Scientist, Informatic Sciences
> Invitrogen Corp
> Carlsbad CA 92008
> Phone: (240) 379 4401
> Cell: (240) 417 8604
> Email: kevin.clancy at invitrogen.com
>
>
> -----Original Message-----
> From: Ann Zweig [mailto:ann at soe.ucsc.edu]
> Sent: Saturday, April 12, 2008 4:09 PM
> To: Clancy, Kevin
> Cc: 'genome at soe.ucsc.edu'
> Subject: Re: [Genome] Understanding alternative splice data tables
>
> Hello Kevin,
>
> You can read about the details behind a track (description, methods,
> display, credits, references) by clicking the 'mini-button' to the
> left of the actual track display, or by clicking on the hyperlinked
> track name in the track controls (below the display). The track that I
> think will be the most helpful to you is the "Spliced ESTs" track.
>
> The answer to your question about the "sum of the block sizes adding up
> to the query EST size" is no. The sum of the block sizes includes only
> the parts of the EST that actually align. Take for example EST AA971065
> on the hg18 assembly. The first 7 bases of the EST do not align to the
> genome (qStart is 7). That accounts for the 'mismatch':
> (300 + 45) + 7 = 352.
>
> mysql> select * from chr21_intronEst where qName = 'AA971065'\G
> *************************** 1. row ***************************
> bin: 660
> matches: 281
> misMatches: 0
> repMatches: 64
> nCount: 0
> qNumInsert: 0
> qBaseInsert: 0
> tNumInsert: 1
> tBaseInsert: 1784
> strand: +
> qName: AA971065
> qSize: 352
> qStart: 7
> qEnd: 352
> tName: chr21
> tSize: 46944323
> tStart: 9928611
> tEnd: 9930740
> blockCount: 2
> blockSizes: 300,45,
> qStarts: 7,307,
> tStarts: 9928611,9930695,
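
The arithmetic in this record can be checked with a short Python sketch. The values are copied from the row above; the dictionary and the helper name parse_list are illustrative only, not part of any UCSC tool. The sketch assumes the usual PSL conventions (zero-based starts, comma-terminated lists).

```python
def parse_list(field):
    """Split a comma-terminated PSL list such as '300,45,' into ints."""
    return [int(x) for x in field.split(",") if x]

# Values copied from the chr21_intronEst row for AA971065 shown above.
row = {
    "matches": 281, "misMatches": 0, "repMatches": 64,
    "qBaseInsert": 0, "tBaseInsert": 1784,
    "qSize": 352, "qStart": 7, "qEnd": 352,
    "tStart": 9928611, "tEnd": 9930740,
    "blockSizes": "300,45,", "tStarts": "9928611,9930695,",
}

block_sizes = parse_list(row["blockSizes"])
t_starts = parse_list(row["tStarts"])

aligned = sum(block_sizes)                 # bases of the EST that align
leading = row["qStart"]                    # unaligned bases before the first block
trailing = row["qSize"] - row["qEnd"]      # unaligned bases after the last block

# Genomic gap between consecutive blocks (here, the single intron).
gaps = [t_starts[i + 1] - (t_starts[i] + block_sizes[i])
        for i in range(len(block_sizes) - 1)]

print(aligned, leading, trailing, gaps)
# Aligned bases plus unaligned ends plus query inserts give the EST length.
print(aligned + leading + trailing + row["qBaseInsert"] == row["qSize"])
```

The same row also shows that matches + misMatches + repMatches equals the sum of the block sizes, and that the gap between the two blocks equals tBaseInsert.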
>
> Note in the browser image, the blue vertical line at the beginning of
> this EST. This line denotes an "insertion at the beginning or end of the
> query" (as noted on the track description page).
>
> It definitely will take some work to extract what you want, but I think
> the data is there. You might look at some of the programs in our source
> tree that make use of the txGraph format. The Genome Browser and Blat
> software are free for academic, nonprofit, and personal use. A license
> is required for commercial use.
>
> How to download the software:
> http://genome.cse.ucsc.edu/FAQ/FAQlicense#license3
>
> You can obtain the source tree either via CVS:
> http://genome.ucsc.edu/admin/cvs.html
> or a zip file:
> http://hgdownload.cse.ucsc.edu/admin/jksrc.zip
>
> Please note the build instructions:
> http://genome.ucsc.edu/admin/jk-install.html
>
> All of the kent utilities print their usage message and command line
> options when run with no arguments.
>
>
> Regards,
>
> ----------
> Ann Zweig
> UCSC Genome Bioinformatics Group
> http://genome.ucsc.edu
>
>
>
>
>
> Clancy, Kevin wrote:
>> Dear Sirs
>> I am interested in using Tables to extract information on alternative
>> splice exons in human chromosome 21. For each exon seen by aligning
>> ESTs along the genome, I would like to have a measure of how many times
>> that exon is present in all the represented genomes and how many times
>> it binds to the upstream and downstream exons on either side of it.
>> Ideally I would like to use the information from the table to extract
>> all these alternative splice products from the genomic sequence with
>> some statistics on the prevalence of each exon triplet. Ultimately I'm
>> interested in generating a probability based tool to generate
>> alternative splice products.
>>
>> Looking at the Spliced ESTs/intronEST table, I can see that you have
>> nucleotide sizes, start positions and block sizes in both the query EST
>> and the chromosome. Should the sum of the block sizes add up to the
>> query EST size? If not, why is there a difference in the two?
>>
>> Secondly I have looked at the Sib Alt-Splicing/sib TX graph table and
>> you have a network representation of vertices and edges fields
>> corresponding to the EST but I don't quite understand how to use the
>> information you have there to tackle my problem. Any help or simple
>> example you can point me towards would be very appreciated.
>>
>> Finally are these the best tables for me to look at to try and generate
>> this type of information? If not, where would be a better table?
>> Thanks kevin
>>
>> Kevin Clancy, PhD
>> Senior Scientist, Informatic Sciences
>> Invitrogen Corp
>> Carlsbad, CA 92008
>> Phone: (240) 379 4401 x84401
>> Cell: (240) 417 8604
>> Email: kevin.clancy at invitrogen.com
>>
>> _______________________________________________
>> Genome maillist - Genome at soe.ucsc.edu
>> http://www.soe.ucsc.edu/mailman/listinfo/genome