[Genome] 12 drosophila genome intron alignments
Vanessa Bauer
vlb2 at cornell.edu
Mon Dec 11 08:50:44 PST 2006
Hello,
Below you will find a few e-mails that I have had with Angie Hinrichs
regarding intron alignments. She was very helpful and I now have a
better understanding of how your data is organized. That being said,
I still have questions. Basically, what we would like to have are
intron alignments specifically for the species in the melanogaster
subgroup. With Angie's help I can now get such alignments directly,
but they are based on 15 species. Will this lead to a bias toward
more conserved sequence being available? In other words, would there
be more aligned intron nucleotides if the alignments were based on a
smaller more closely related set of species?
thanks, Tessa Bauer DuMont
Hi Tessa,
I'm not sure if this will get you exactly what you're looking for, but
hopefully it will be a start: our Table Browser tool can make a track
of introns, and it can "intersect" (find the overlap between) tracks.
So it's possible to extract multi-species alignments from intronic
regions.
First, to make a custom track containing the introns of FlyBase genes:
0. Start on our test server, genome-test.cse.ucsc.edu, since the main
server does not yet have alignments that include sechellia.
1. Open the Table Browser (click the Tables link in the top blue bar).
2. Make the following selections on the Table Browser main page:
clade: insect
genome: D. melanogaster
assembly: Apr. 2004
group: Genes and Gene Predictions
track: FlyBase Genes
region: genome
output format: custom track
[If an intersection or filter was defined previously, clear it]
Click "get output".
3. If you like, you can change the name or description to something
more descriptive than the defaults, like "flyBaseIntrons".
Where it says "Create one BED record per:", choose Introns.
Click "get custom track in Table Browser".
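For anyone who wants to check the intron track against their own gene models, here is a minimal Python sketch of the idea behind the Introns option: an intron is the gap between two consecutive exons of one transcript. The function name and the (start, end) tuple layout are illustrative assumptions, not the Table Browser's actual code; coordinates are taken to be 0-based and half-open, as in BED.

```python
def introns_from_exons(exons):
    """Return the gaps between consecutive exons as (start, end) pairs.

    exons: list of (start, end) half-open intervals for ONE transcript.
    Hypothetical helper for illustration only.
    """
    ordered = sorted(exons)
    introns = []
    # Pair each exon with the next one and record the gap between them.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end:  # skip abutting or overlapping exons
            introns.append((prev_end, next_start))
    return introns
```

A transcript with a single exon yields an empty list, since there is no gap to report.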
Now, to get MAF alignments in all intronic regions:
4. Make the following selections on the Table Browser main page:
group: Comparative Genomics
track: Conservation (15way)
table: multiz15way
output format: MAF
intersection: Click "create".
5. Select your intron custom track as the secondary table:
group: Custom Tracks
track: flyBaseIntrons (or whatever name was used in step 3).
Choose "Base-pair-wise intersection (AND) of Conservation and ..."
as the method of finding overlap.
Click submit.
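To make "base-pair-wise intersection (AND)" concrete, the sketch below computes the overlap of two interval lists on a single chromosome, keeping only the bases covered by both. It is an illustration under stated assumptions, not the Table Browser's implementation: the function name is made up, intervals are half-open (start, end) pairs, and the intervals within each list are assumed not to overlap one another.

```python
def intersect_intervals(a, b):
    """Return the regions covered by both interval lists.

    a, b: lists of half-open (start, end) intervals on one chromosome,
    each list free of internal overlaps. Hypothetical helper.
    """
    a, b = sorted(a), sorted(b)
    i = j = 0
    shared = []
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:  # the two current intervals share at least one base
            shared.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared
```

With alignment blocks as one list and introns as the other, an alignment block that only partly overlaps an intron is trimmed to the overlapping bases rather than kept whole.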
6. Back on the Table Browser main page, click "get output".
[You may want to download directly to a file -- in that case,
before clicking "get output", enter a file name into the "output
file" box.]
That will return multi-species alignments for all intronic regions, in
the MAF format (see http://genome.ucsc.edu/FAQ/FAQformat#format5 for a
format description). If you are interested in only a subset of the
species, you can use "grep -v" or a similar program to remove the
species that you don't want.
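As an alternative to "grep -v", here is a short Python sketch that keeps only the rows for a chosen set of species. It relies on the MAF convention that each sequence row starts with "s" followed by a source name of the form database.sequence (for example dm2.chr2L); "i", "e" and "q" rows carry the same source name and are filtered the same way. The function name is hypothetical, and the database names in the usage note are assumptions that should be checked against the names that actually appear in your file.

```python
def filter_maf_species(lines, keep):
    """Yield MAF lines, dropping rows for species not in `keep`.

    lines: iterable of lines from a MAF file.
    keep:  set of database names (the part before the first '.').
    Hypothetical helper for illustration only.
    """
    for line in lines:
        fields = line.split()
        # Only s/i/e/q rows name a species; headers, "a" lines and
        # blank block separators pass through unchanged.
        if fields and fields[0] in ("s", "i", "e", "q"):
            species = fields[1].split(".", 1)[0]
            if species not in keep:
                continue
        yield line
```

Usage might look like the following, where the three names are assumed labels for D. melanogaster, D. sechellia and D. yakuba:

```python
with open("introns.maf") as src, open("introns.subset.maf", "w") as out:
    out.writelines(filter_maf_species(src, {"dm2", "droSec1", "droYak2"}))
```

Note that removing rows does not re-align anything: a block may be left with columns that are gaps in every remaining species, or with only the reference row.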
Hope that helps!
There is a really great email list to which questions about our data
and programs can be sent: genome at cse.ucsc.edu. Several of my
colleagues rotate responsibilities for answering questions sent there,
and are very diligent about responding to questions within a day.
Soon I will be taking some extended leave, and during that time you'll
most definitely hear back more promptly from them than from me! :)
Angie [not a Dr. but thanks]
On Wed, 6 Dec 2006, Vanessa Bauer wrote:
> Hello Dr Hinrichs
>
> My name is Tessa Bauer DuMont and I work with Prof. Chip Aquadro. We are
> working on a companion paper as part of the analysis of the 12 Drosophila
> genomes. Although it is not crucial for the completion of this paper, we
> would like to obtain intron alignments associated with each of the coding
> regions for which independent gene by gene alignments already exist. Our
> analysis focuses on the closely related species D.mel, D.sec and D.yak. At
> this point the whole genome alignments are as close as I have gotten to what
> we want (an alignment of the introns for each gene), but it is not clear to me
> how one would progress from here. Do you know of any other groups that may
> have already extracted the alignments for the introns, or do you have any
> suggestions as to the easiest way to pull the introns out of the database?
>
> Thank-you for your consideration,
> Tessa
>
--
angie at soe.ucsc.edu
Software Developer, UCSC CBSE / Genome Bioinformatics Group
--
"If a nation expects to be free and ignorant--it expects what never
was and never will be."
-- Thomas Jefferson