[Genome] MSI markers
BIJU JOSEPH
bjoseph5 at jhmi.edu
Sat Feb 24 21:28:04 PST 2007
How can I select a panel of microsatellite markers for MSI analysis if no previous reports are available?
Biju Joseph
Division of Endocrinology and Metabolism
Johns Hopkins School of Medicine
Suite 813, Hunterian building
1915, East Madison st.
Baltimore, MD 21287
Phone: 410-502-3046
----- Original Message -----
From: genome-request at soe.ucsc.edu
Date: Saturday, February 24, 2007 3:10 pm
Subject: Genome Digest, Vol 49, Issue 33
To: genome at soe.ucsc.edu
>
> Today's Topics:
>
> 1. Conservation scores (Goel, Manisha)
> 2. Re: Transcription Factors Binding sites (Matt Weirauch)
> 3. Re: Conservation scores (Ann Zweig)
> 4. hg18 chr3 strange characters (Jeltje van Baren)
> 5. addition hg18 chr3 (Jeltje van Baren)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 23 Feb 2007 15:11:00 -0600
> From: "Goel, Manisha" <MAG at stowers-institute.org>
> Subject: [Genome] Conservation scores
> To: <genome at soe.ucsc.edu>
> Message-ID:
> <C28BAF593DC3314E9C0F3A50191C2E7804D0FC2F at EXCHKC03.stowers-institute.org>
>
> Content-Type: text/plain; charset="us-ascii"
>
> Hello,
>
> I want to arrive at some kind of sequence conservation score between
> D. melanogaster and D. pseudoobscura.
> I have a list of coordinates for the regions of interest in the
> D. melanogaster genome.
> I tried using the Table Browser, but that gives me the multiz alignment
> (and score) for all 15 related species.
> Is it possible to somehow see the level of conservation for only the two
> species?
>
>
> Thanks for your advice,
> -Manisha
>
>
>
>
> ------------------------------
>
> Message: 2
> Date: Fri, 23 Feb 2007 13:49:05 -0800
> From: "Matt Weirauch" <weirauch at soe.ucsc.edu>
> Subject: Re: [Genome] Transcription Factors Binding sites
> To: "Pablo Minguez" <pminguez at cipf.es>
> Cc: genome at soe.ucsc.edu
> Message-ID:
> <ce8e152d0702231349x5c8fd7c6j5025e6124380928d at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Hi Pablo,
>
> There are several reasons for the differences. Here are a few that I
> can think of.
>
> 1) different versions of TransFac.
> We are using version 7.0, which is the publicly available version of
> transfac. Because of this, the matrices used to score matches might
> be different. In particular, two of the matrices that you found
> (V$GATA3_03 and V$VMYB_01) are not present in version 7.0, so there
> was no way of identifying them. Additionally, we only use matrices
> that encode TFs found in human, mouse and rat.
>
> 2) different algorithms
> The basic algorithms of MATCH and TFLOC are similar, but I am sure
> there might be some small differences that result in different scores
> for a given sequence.
>
> 3) different cutoffs
> MATCH provides several different cutoffs to consider a sequence a hit
> to a matrix (I am not sure which ones you are using.) TFLOC's cutoff
> is based on a Z-score, which estimates how much stronger a sequence
> match is to a matrix than what you would expect by chance in the
> upstream regions of all genes in the genome (see documentation page).
> You can see, for instance, that by lowering the Z-score cutoff in the
> track, more hits come up in the browser.
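
As a rough illustration of the Z-score cutoff described in point 3, here is a minimal sketch. It is not TFLOC's code, which is unpublished; the function names `z_score` and `passes_cutoff`, the background score list, and the cutoff values are all assumptions made for illustration. The idea is only that a match score is standardized against the scores seen by chance in a background set, and a hit is kept when the standardized value reaches the cutoff.

```python
from statistics import mean, pstdev


def z_score(score, background_scores):
    """Standardize a matrix match score against a background distribution.

    background_scores is a hypothetical list of scores the same matrix
    obtains on background sequence (for example, upstream regions).
    """
    mu = mean(background_scores)
    sigma = pstdev(background_scores)
    if sigma == 0:
        raise ValueError("background scores have zero variance")
    return (score - mu) / sigma


def passes_cutoff(score, background_scores, cutoff):
    """Return True when the match is at least `cutoff` standard deviations above background."""
    return z_score(score, background_scores) >= cutoff
```

With this definition, lowering `cutoff` can only let more matches through, which is consistent with more hits appearing in the browser when the track's Z-score cutoff is lowered.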
>
> I have never published the TFLOC method because its intention is to
> provide a simple but reasonable algorithm for identifying conserved
> binding sites. I have only used it to make this track, which is
> intended to be a tool for biologists to quickly identify strongly
> conserved matches to known binding sites. There are probably many
> methods that are fancier and might do better than this one; there have
> literally been hundreds of papers written on this sort of thing.
>
> You are welcome to the source code if you would like, but be
> forewarned that it is intended for our internal use only, so it is not
> well-documented, and is optimized for our own data types and this
> problem in particular, so it might not be the general sort of tool
> that would be useful to you.
>
> Matt
>
> On 2/22/07, Pablo Minguez <pminguez at cipf.es> wrote:
> > Hi,
> > I am interested in the location of TF binding sites within the promoter
> > regions of genes. I find the mapping you provide very useful. Before I
> > discovered it, I was using the Transfac web page to map the matrices to
> > the regions of interest. I found that the TFs that the Match program
> > (Transfac) displays don't match the ones the genome browser shows. I
> > understand that you only show binding sites conserved across human, mouse
> > and rat, and also the restrictions that you apply to the scores, but even
> > so, since you are using the same matrices, I don't understand why the
> > information from the two sources is totally different; at the least, both
> > resources should share the ones you show.
> > I am sure I am missing something. Have you ever compared your mapping with
> > Match (Transfac) or with mappings from other algorithms?
> > Could it be a matter of differences in the algorithms? If so, could you
> > explain to me how the mapping is improved with tfloc? Is it freely available?
> >
> > I attach an example of matrices found for a gene in the UCSC browser and
> > in Transfac.
> >
> > Many thanks for your help,
> > best regards,
> > Pablo.
> >
> > --------------------------------------------------------------------------------------------------------------------------
> > Gene: C14orf166
> > Region searching for binding sites: chr14:51,520,943-51,525,943
> >
> > * UCSC Genome browser:
> >
> >
> > Matrix ids found: V$HTF_01, V$P53_01, V$HFH1_01
> >
> > * Results using Match (Transfac):
> > Sequence: 5000 bases upstream of gene C14orf166 (chr14:51,520,943-51,525,943)
> >
> > matrix id (factor name)
> > V$OCT1_02 (Oct-1)
> > V$CREL_01 (c-Rel)
> > V$CEBP_C (C/EBP)
> > V$IK1_01 (Ik-1)
> > V$FOXJ2_02 (FOXJ2)
> > V$FOXJ2_02 (FOXJ2)
> > V$VMYB_01 (v-Myb)
> > V$HNF4_01 (HNF-4)
> > V$EVI1_04 (Evi-1)
> > V$HNF3B_01 (HNF-3beta)
> > V$FOXD3_01 (FOXD3)
> > V$NFY_Q6 (NF-Y)
> > V$NKX25_02 (Nkx2-5)
> > V$NKX25_01 (Nkx2-5)
> > V$HNF3B_01 (HNF-3beta)
> > V$OCT1_Q6 (Oct-1)
> > V$PAX6_01 (Pax-6)
> > V$GATA3_03 (GATA-3)
> >
> >
> > --
> > ---------------------------------------------
> > Pablo Mínguez Paniagua
> > PhD Student
> > Bioinformatics Department
> > Centro de Investigación Príncipe Felipe, CIPF.
> > Av. Autopista del Saler 16, 46013, Valencia, Spain
> > Phone: +34 96 328 96 80 (Ext: 1011)
> >
> >
> > _______________________________________________
> > Genome maillist - Genome at soe.ucsc.edu
> >
> >
>
>
>
> ------------------------------
>
> Message: 3
> Date: Fri, 23 Feb 2007 14:14:32 -0800
> From: Ann Zweig <ann at soe.ucsc.edu>
> Subject: Re: [Genome] Conservation scores
> To: "Goel, Manisha" <MAG at stowers-institute.org>
> Cc: genome at soe.ucsc.edu
> Message-ID: <45DF6748.10202 at soe.ucsc.edu>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Hello Manisha,
>
> The multiz alignment scores cover all 15 species, as you have noted. There is
> no way to extract a score for only a pair of species.
>
> However, depending on exactly what you are looking for, there is another set of
> tracks that may help you. In the dm browser, open the two tracks called Chains
> and Nets to dp. From the dp Chain track description page:
>
> This track shows alignments of D. pseudoobscura (dp3, Nov. 2004) to the D.
> melanogaster genome using a gap scoring system that allows longer gaps than
> traditional affine gap scoring systems. It can also tolerate gaps in both D.
> pseudoobscura and D. melanogaster simultaneously. These "double-sided" gaps can
> be caused by local inversions and overlapping deletions in both species.
>
> The dp Nets are the best D. pseudoobscura/D. melanogaster chain for every part
> of the D. melanogaster genome. It is useful for finding orthologous regions and
> for studying genome rearrangement.
>
> Perhaps these tracks will be helpful to you.
>
>
> Regards,
>
> ----------
> Ann Zweig
> UCSC Genome Bioinformatics Group
>
>
>
>
>
> ------------------------------
>
> Message: 4
> Date: Sat, 24 Feb 2007 10:39:22 -0600
> From: Jeltje van Baren <jeltje at cse.wustl.edu>
> Subject: [Genome] hg18 chr3 strange characters
> To: genome at soe.ucsc.edu
> Message-ID: <45E06A3A.3090200 at cse.wustl.edu>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Hi Browserians
>
> One of my programs died because of strange characters in hg18 chr3:
>
> [jeltje at ardor chr3]$ grep -n R chr3.fa
> 1216118:CCRRGCTTGGTTCTAACAATGAATTTAATAAGAATTGTATTTAATCAATG
> [jeltje at ardor chr3]$ grep -n M chr3.fa
> 1216113:TCTTCATTAGCGCTACATAGCTGMCTTATTATTCGTGGTCCCCTATGACC
>
> In the Browser, these are represented as N.
> I did not check the other chromosomes.
>
> -Jeltje
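
The two grep commands above each look for one letter. A more general check, sketched below, reports every sequence line that contains anything other than A, C, G, T or N. R and M are standard IUPAC ambiguity codes (R = A or G, M = A or C). The function name `find_non_acgtn` and the default allowed alphabet are assumptions made for this illustration, not part of any UCSC tool.

```python
def find_non_acgtn(lines, allowed="ACGTNacgtn"):
    """Yield (line_number, unexpected_chars, line) for FASTA sequence lines
    containing characters outside the allowed alphabet.

    Line numbers are 1-based and count header lines too, like `grep -n`.
    """
    allowed_set = set(allowed)
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if line.startswith(">"):
            # FASTA header line, not sequence
            continue
        unexpected = sorted(set(line) - allowed_set)
        if unexpected:
            yield number, "".join(unexpected), line
```

To run it on a downloaded chromosome, open the file and pass the handle, for example `with open("chr3.fa") as handle:` followed by a loop over `find_non_acgtn(handle)` that prints each hit. The file is read line by line, so a whole chromosome does not need to fit in memory.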
>
>
> ------------------------------
>
> Message: 5
> Date: Sat, 24 Feb 2007 10:41:09 -0600
> From: Jeltje van Baren <jeltje at cse.wustl.edu>
> Subject: [Genome] addition hg18 chr3
> To: genome at soe.ucsc.edu
> Message-ID: <45E06AA5.1030503 at cse.wustl.edu>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> I realize that information was incomplete:
>
> The chromosome was downloaded today from hgdownload hg18/chromosomes.
>
> -Jeltje
>
>
> ------------------------------
>
>
> End of Genome Digest, Vol 49, Issue 33
> **************************************