[Genome] possible reasons for sequence masking
Kayla Smith
kayla at soe.ucsc.edu
Fri Feb 9 15:14:36 PST 2007
Vanessa,
In the assembly sequence that we make available for download, we mask
repeats only, by putting them in lower case. However, if you click to
get sequence for a gene/transcript, you have the option of having
coding regions in upper
case and introns in lower case -- and in that case it's not masking,
it's just the use of case for a different purpose. And if there are
different splice forms of a gene, the sequences returned will have upper
and lower case in different places.
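
To make the second use of case concrete, here is a minimal Python sketch (not the Genome Browser's own code) of how the same region comes back with different casing for two splice forms. The function name `render_by_exons`, the 20-base region, and both sets of exon coordinates are made up for illustration; coordinates are assumed to be 0-based, half-open, and relative to the start of the region.

```python
def render_by_exons(seq, exons):
    """Return seq with exon bases in upper case and all other bases in lower case.

    exons: list of (start, end) intervals, 0-based and half-open,
    relative to the start of seq (an assumption of this sketch).
    """
    out = list(seq.lower())
    for start, end in exons:
        # Slice assignment with a same-length string keeps the list length intact.
        out[start:end] = seq[start:end].upper()
    return "".join(out)


# Hypothetical 20-base region and two made-up splice forms of one gene.
region = "ACGTACGTACGTACGTACGT"
form_a = [(0, 4), (12, 20)]
form_b = [(0, 8), (16, 20)]

print(render_by_exons(region, form_a))  # ACGTacgtacgtACGTACGT
print(render_by_exons(region, form_b))  # ACGTACGTacgtacgtACGT
```

Bases 4-7 are lower case in the first form and upper case in the second, even though nothing has been masked in either.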
I hope that helps to clear up the "masking" you might be seeing. Please
don't hesitate to contact us again if you require more assistance.
Kayla Smith
UCSC Genome Bioinformatics Group
Vanessa Bauer wrote:
> Hello,
>
> Sorry to bother you but I was unsuccessful answering the following
> question from browsing your web site. In short, I am curious if
> there are various reasons for sequences to be masked in an alignment.
> We have downloaded introns for a specific set of loci (roughly 8500)
> for Drosophila genomes from the Comparative Genomics "group"
> (multiz15way alignments). We are now attempting to get this data into
> the format that we want (i.e., each alignment block linked to its
> corresponding transcript, and any part of an intron that is also, at
> times, coding sequence masked) using the dm2 annotation. We have
> noticed upper and lower case letters in the alignments. While I did
> notice that repeats are masked on the web site I was also wondering
> if there is any other reason for masking. More specifically, have
> intron sequences that are also coding (due to alternative splicing or
> coding regions within introns of other coding regions) been masked?
>
> thanks, Vanessa
> _______________________________________________
> Genome maillist - Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome
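
For the goal described in the quoted message (masking any part of an intron that is also coding in another transcript), one possible approach is sketched below in Python. It is only an illustration under stated assumptions: the function name `mask_coding_in_intron`, the example sequence, and all coordinates are hypothetical; intervals are assumed to be 0-based, half-open chromosome coordinates taken from an annotation such as dm2; and the sequence is assumed to be ungapped. Alignment rows that contain gap characters would first need their columns mapped back to chromosome positions.

```python
def mask_coding_in_intron(intron_seq, intron_start, coding_intervals, mask_char="N"):
    """Replace intron bases that overlap any coding interval with mask_char.

    intron_start: chromosome coordinate (0-based) of intron_seq[0].
    coding_intervals: list of (start, end) half-open chromosome intervals.
    Assumes intron_seq has no alignment gaps.
    """
    out = list(intron_seq)
    intron_end = intron_start + len(intron_seq)
    for c_start, c_end in coding_intervals:
        # Clip each coding interval to the intron; no overlap gives an empty range.
        lo = max(c_start, intron_start)
        hi = min(c_end, intron_end)
        for pos in range(lo, hi):
            out[pos - intron_start] = mask_char
    return "".join(out)


# Made-up 16-base intron starting at chromosome position 100.
intron = "gtaagtccgatttcag"
masked = mask_coding_in_intron(intron, 100, [(104, 108), (114, 130)])
print(masked)  # gtaaNNNNgatttcNN
```

Using a distinct mask character rather than lower case keeps this masking separate from the repeat masking already present in the downloaded sequence.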