[Genome] motif quality assessment
Ann Zweig
ann at soe.ucsc.edu
Fri Jun 1 11:43:26 PDT 2007
Hello T. Joshi,
To get conserved sequence for the 5' and 3' regions, you can use the
Table Browser ('Tables' from the top blue navigation bar). First, you
will create a Custom Track made up of the 5' and 3' regions of the gene
set you are interested in. For more details on creating a Custom Track,
see this User's Guide:
http://genome.ucsc.edu/goldenPath/help/customTrack.html
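As a rough illustration of what such a Custom Track could look like, here is
a minimal Python sketch that writes a track line followed by BED lines. The
region coordinates, gene names, track name, and the helper to_custom_track
are all made up for this example; substitute the UTR coordinates of your own
gene set. BED start positions are 0-based and end positions are exclusive.

```python
# Hypothetical UTR regions: (chrom, start, end, name). Coordinates are made up.
utr_regions = [
    ("chr1", 1000, 1200, "geneA_5utr"),
    ("chr1", 5000, 5400, "geneA_3utr"),
]


def to_custom_track(regions, track_name="myUTRs",
                    description="5' and 3' UTR regions"):
    """Return custom track text: one track line, then one BED4 line per region."""
    lines = ['track name={} description="{}"'.format(track_name, description)]
    for chrom, start, end, name in regions:
        if start >= end:
            raise ValueError("empty or reversed region: {}".format(name))
        # BED fields: chrom, chromStart (0-based), chromEnd (exclusive), name
        lines.append("{}\t{}\t{}\t{}".format(chrom, start, end, name))
    return "\n".join(lines) + "\n"


track_text = to_custom_track(utr_regions)
print(track_text)
```

The printed text can be pasted into the Custom Track submission box or saved
to a file and uploaded.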
Then, you will intersect your Custom Track with the multiz17way table.
This is the table that underlies the Conservation track. For more
details on performing a table intersection using the Table Browser, see
this User's Guide:
http://hgw8.cse.ucsc.edu/goldenPath/help/hgTablesHelp.html#SimpleIntersection
As for your second question, you can find the most conserved sections
of your 5' and 3' UTRs by intersecting your original Custom Track with
the phastConsElements17way table. This is the table that underlies the
Most Conserved track.
See this very similar previously-answered mail list question for more
details:
http://www.soe.ucsc.edu/pipermail/genome/2006-May/010525.html
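To make the intersection step more concrete, here is a small offline Python
sketch of roughly what a base-pair-wise intersection produces. It is not the
Table Browser's own code. The function name intersect and all coordinates
are hypothetical, and intervals are assumed to be 0-based with exclusive
ends, as in BED.

```python
def intersect(regions, elements):
    """Return the overlapping pieces between two interval lists.

    Both inputs are lists of (chrom, start, end) tuples, 0-based, end exclusive.
    """
    overlaps = []
    for chrom, r_start, r_end in regions:
        for e_chrom, e_start, e_end in elements:
            if e_chrom != chrom:
                continue
            start = max(r_start, e_start)
            end = min(r_end, e_end)
            if start < end:  # keep only non-empty overlaps
                overlaps.append((chrom, start, end))
    return sorted(overlaps)


# Made-up UTR regions and made-up conserved elements.
utrs = [("chr1", 100, 200), ("chr2", 50, 80)]
conserved = [("chr1", 150, 250), ("chr1", 90, 110), ("chr2", 80, 120)]

conserved_in_utrs = intersect(utrs, conserved)
print(conserved_in_utrs)
```

The nested loop is fine for a handful of genes. For genome-scale inputs,
the Table Browser intersection is the more practical route.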
Regards,
----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu
Please feel free to search the Genome mailing list archives by visiting
our home page, clicking on "Contact Us", then typing a word or phrase
into the search box. On that same page
(http://genome.ucsc.edu/contacts.html), you can subscribe to the Genome
mailing list.
T Joshi wrote:
> Hi !
> Thanks Kayla for your reply.
>
> I have some more questions, I hope someone can help :
> 1) How can I get conserved sequences from the 5' and 3' UTR regions?
> I tried UCSC's Conservation track, which gives me the conserved
> sequence for the entire genome/chromosome with its position coordinates.
> But I am interested only in the conserved sequence within the 5' or 3'
> regions.
> Tools such as BLAST at NCBI allow querying conserved amino acid
> sequences, but not nucleotide sequences.
>
> 2) Provided I have a set of sequences from these UTR regions, is there
> any tool which lets me find only conserved sub-sequences from the
> input set of sequences?
>
> Thanks,
> T Joshi
>
> On 29/05/07, Kayla Smith <kayla at soe.ucsc.edu> wrote:
>> TJoshi,
>>
>> You may want to use the Improbizer:
>> http://genome-test.cse.ucsc.edu/Improbizer/
>>
>> This is a program that slowly crawls through DNA or RNA sequence
>> looking for consensus motifs that happen improbably often. Note
>> that this is on our test server and data/tools found here have not
>> gone through our rigorous QA process.
>>
>> I hope this is helpful to you. Please don't hesitate to contact us again
>> if you require further assistance.
>>
>> Kayla Smith
>> UCSC Genome Bioinformatics Group
>>
>> On Sun, 27 May 2007, T Joshi wrote:
>>
>>> Hello All !
>>> This is my first email to the list, and I am not sure if I am posting
>>> in the right place.
>>> Anyway, my problem concerns the quality assessment of motifs
>>> discovered by any of the motif discovery algorithms, such as MEME, YMF
>>> or Gemoda, applied to DNA sequence data. Given a set of known motifs, I
>>> want to compare them against the predicted motifs generated by one of
>>> these algorithms. I want to compute statistics such as sensitivity,
>>> specificity, false positives, false negatives, etc.
>>> I searched for a tool for this purpose, but the ones I found work on
>>> protein sequences and not on DNA sequences. Some of them compare
>>> the predicted motifs with their own set of known motifs, but not with
>>> a user-given set of known motifs.
>>>
>>> thoughts? pointers?
>>> Please help.
>>> Thanks,
>>> TJoshi
>>> _______________________________________________
>>> Genome maillist - Genome at soe.ucsc.edu
>>> http://www.soe.ucsc.edu/mailman/listinfo/genome
>>>