[Genome] non-redundant set of refseq or knownGenes exons
Khrebtukova, Irina
ikhrebtukova at illumina.com
Thu Aug 9 06:05:20 PDT 2007
Hiram,
I get exactly the same result by intersecting a custom track of the
"exons only" table with itself using base-pair-wise intersection. No
genome extent track is necessary.
Irina
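
For readers who want to reproduce the effect of that self-intersection
outside the browser, here is a minimal sketch of collapsing overlapping
exons into non-redundant blocks. It assumes BED-style coordinates (0-based
start, exclusive end) and treats abutting exons as one block; the function
name merge_exons and the sample coordinates are made up for illustration.
Note that it returns the union of overlapping exons, not the largest exon
of each overlapping group.

```python
from collections import defaultdict


def merge_exons(exons):
    """Collapse (chrom, start, end) intervals into non-overlapping blocks.

    Assumes BED-style coordinates: 0-based start, exclusive end.
    Intervals that overlap or abut on the same chromosome are merged.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in exons:
        by_chrom[chrom].append((start, end))

    merged = []
    for chrom in sorted(by_chrom):
        cur_start, cur_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if cur_start is None:
                # First interval on this chromosome opens a block.
                cur_start, cur_end = start, end
            elif start <= cur_end:
                # Overlaps or touches the open block: extend it.
                cur_end = max(cur_end, end)
            else:
                # Gap found: close the open block and start a new one.
                merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        if cur_start is not None:
            merged.append((chrom, cur_start, cur_end))
    return merged


if __name__ == "__main__":
    # Made-up exon coordinates, for illustration only.
    sample = [
        ("chr1", 100, 200),
        ("chr1", 150, 300),
        ("chr1", 400, 500),
        ("chr2", 10, 20),
    ]
    for chrom, start, end in merge_exons(sample):
        print(f"{chrom}\t{start}\t{end}")
```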
-----Original Message-----
From: Hiram Clawson [mailto:hiram at soe.ucsc.edu]
Sent: Tuesday, August 07, 2007 9:15 PM
To: archanat at soe.ucsc.edu; Khrebtukova, Irina
Cc: genome at soe.ucsc.edu
Subject: Re: [Genome] non-redundant set of refseq or knownGenes exons
Good evening Irina:
If I understand your query correctly, what you would like to know are
the areas of the genome that are covered by any exon. This is an
intersection of the exons with the genome. The only item you do not
readily have is "the genome." For this, you need the information from
the chromInfo table. For example, the chrom extents of hg18 in the form
of a bed file to be used as a custom track can be created by:
echo 'track name=hg18_extent description="hg18 chromosome extents"' \
> hg18.extent.bed
mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -N \
  -e "select chrom,0,size,chrom from chromInfo;" hg18 >> hg18.extent.bed
Or, from the chromInfo file:
http://hgdownload.cse.ucsc.edu/goldenPath/hg18/database/chromInfo.txt.gz
Or, from the table browser, choose "all tables", then "chromInfo", and
output selected fields from that table: chrom and size. Insert a 0 as a
middle column in that output to obtain a bed file.
Or, from a copy I made from the mysql command above:
http://genome-test.cse.ucsc.edu/~hiram/hg18/hg18.extent.bed
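
If you start from the chromInfo file or the table browser output rather
than the mysql command, the "insert a 0 as a middle column" step might look
like the sketch below. It assumes tab-separated rows with chrom in the
first column and size in the second, and skips lines starting with "#";
the function name chrom_info_to_bed and the sample rows are made up for
illustration.

```python
def chrom_info_to_bed(lines):
    """Turn chromInfo-style rows into BED4 lines: chrom, 0, size, chrom.

    Assumes tab-separated input with chrom in column 1 and size in
    column 2; any further columns (such as a file name) are ignored.
    """
    bed = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            # Skip blank lines and header/comment lines.
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue
        chrom, size = fields[0], int(fields[1])
        bed.append(f"{chrom}\t0\t{size}\t{chrom}")
    return bed


if __name__ == "__main__":
    # Made-up rows, not real hg18 chromosome sizes.
    rows = ["#chrom\tsize\n", "chrA\t1000\t/some/path\n", "chrB\t250\n"]
    print('track name=extent description="chromosome extents"')
    print("\n".join(chrom_info_to_bed(rows)))
```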
With hg18.extent.bed in hand, load that as a custom track.
Then, create a custom track of "exons only" from your favorite gene set.
Then, with the hg18 extent track chosen as the first track, run a
base-pair-wise intersection of it with the exon custom track, and save
the result as a third custom track.
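
As a rough illustration of what a base-pair-wise intersection computes,
here is a sketch for intervals on a single chromosome. This is not the
table browser's own code; the function names and coordinates are made up,
and coordinates are assumed to be BED-style (0-based start, exclusive end).
Each input is first reduced to its covered bases, then the two coverages
are swept in parallel.

```python
def merge(intervals):
    """Reduce (start, end) intervals to sorted, non-overlapping coverage."""
    out = []
    for start, end in sorted(intervals):
        if out and start <= out[-1][1]:
            # Overlaps or abuts the previous block: extend it.
            out[-1] = (out[-1][0], max(out[-1][1], end))
        else:
            out.append((start, end))
    return out


def intersect(a, b):
    """Return the bases covered by both interval sets, as merged blocks."""
    a, b = merge(a), merge(b)
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever block ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


if __name__ == "__main__":
    extent = [(0, 1000)]                           # made-up chromosome extent
    exons = [(100, 200), (150, 300), (400, 500)]   # made-up exons
    print(intersect(extent, exons))
    print(intersect(exons, exons))
```

In this sketch, intersecting the exons with the full extent and
intersecting the exons with themselves give the same blocks, because the
extent covers every base an exon can occupy.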
You can view these tracks in the browser to verify they have the meaning
you want from this exercise.
Then, using the result of your intersection, obtain the fasta for those
areas via the table browser "sequence" output.
I have saved this exercise as a session "hg18 exon locus":
http://genome.ucsc.edu/cgi-bin/hgTracks?hgS_doOtherUser=submit&hgS_otherUserName=Hiram&hgS_otherUserSessionName=hg18%20exon%20locus
Note the three custom tracks.
--Hiram
> From: "Khrebtukova, Irina" <ikhrebtukova at illumina.com>
> Subject: Re: [Genome] non-redundant set of refseq or knownGenes exons
> Hi Archana,
> my problem is that I do know now how to make a custom track and how to
> get a sequence from custom track. I was wondering (and I guess it's
> now a FAQ for you guys) how to get a NON_REDUNDANT set of exons,
> preferably dealing with overlapping exons too, for example by
> selecting the largest exon of all the overlapping ones.
> OK, I guess there is still no easy solution. But my prediction is
> that more and more people will ask this question...
> Thanks! And of course I just LOVE your browser! Really, it's the best,
> despite my minor wishes for making it even better.
> Irina