[Genome] MSI markers

BIJU JOSEPH bjoseph5 at jhmi.edu
Sat Feb 24 21:28:04 PST 2007


How can I select a panel of microsatellite markers for MSI analysis if there are no previous reports available?
Biju Joseph
Division of Endocrinology and Metabolism
Johns Hopkins School of Medicine
Suite 813, Hunterian building
1915, East Madison st.
Baltimore, MD   21287
Phone:  410-502-3046

----- Original Message -----
From: genome-request at soe.ucsc.edu
Date: Saturday, February 24, 2007 3:10 pm
Subject: Genome Digest, Vol 49, Issue 33
To: genome at soe.ucsc.edu


>  Today's Topics:
>  
>     1. Conservation scores (Goel, Manisha)
>     2. Re: Transcription Factors Binding sites (Matt Weirauch)
>     3. Re: Conservation scores (Ann Zweig)
>     4. hg18 chr3 strange characters (Jeltje van Baren)
>     5. addition hg18 chr3 (Jeltje van Baren)
>  
>  
>  ----------------------------------------------------------------------
>  
>  Message: 1
>  Date: Fri, 23 Feb 2007 15:11:00 -0600
>  From: "Goel, Manisha" <MAG at stowers-institute.org>
>  Subject: [Genome] Conservation scores
>  To: <genome at soe.ucsc.edu>
>  Message-ID:
>  	<C28BAF593DC3314E9C0F3A50191C2E7804D0FC2F at EXCHKC03.stowers-institute.org>
>  	
>  Content-Type: text/plain;	charset="us-ascii"
>  
>  Hello,
>  
>  I want to arrive at some kind of sequence conservation score between
>  D. melanogaster and D. pseudoobscura.
>  I have a list of coordinates for the regions of interest in the
>  D. melanogaster genome.
>  I tried using the Table Browser, but that gives me the multiz alignment
>  (and score) for all 15 related species.
>  Is it possible to somehow see the level of conservation for only the two
>  species?
>  
>  
>  Thanks for your advice,
>  -Manisha
>  
>   
>  
>  
>  ------------------------------
>  
>  Message: 2
>  Date: Fri, 23 Feb 2007 13:49:05 -0800
>  From: "Matt Weirauch" <weirauch at soe.ucsc.edu>
>  Subject: Re: [Genome] Transcription Factors Binding sites
>  To: "Pablo Minguez" <pminguez at cipf.es>
>  Cc: genome at soe.ucsc.edu
>  Message-ID:
>  	<ce8e152d0702231349x5c8fd7c6j5025e6124380928d at mail.gmail.com>
>  Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>  
>  Hi Pablo,
>  
>  There are several reasons for the differences.  Here are a few that I
>  can think of.
>  
>  1) different versions of TransFac.
>  We are using version 7.0, which is the publicly available version of
>  transfac.  Because of this, the matrices used to score matches might
>  be different.  In particular, two of the matrices that you found
>  (V$GATA3_03 and V$VMYB_01) are not present in version 7.0, so there
>  was no way of identifying them.  Additionally, we only use matrices
>  that encode TFs found in human, mouse and rat.
>  
>  2) different algorithms
>  The basic algorithms of MATCH and TFLOC are similar, but I am sure
>  there might be some small differences that result in different scores
>  for a given sequence.
>  
>  3) different cutoffs
>  MATCH provides several different cutoffs to consider a sequence a hit
>  to a matrix (I am not sure which ones you are using).  TFLOC's cutoff
>  is based on a Z-score, which estimates how much stronger a sequence
>  match is to a matrix than what you would expect by chance in the
>  upstream regions of all genes in the genome (see documentation page).
>  You can see, for instance, that by lowering the Z-score cutoff in the
>  track, more hits come up in the browser.
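
As a rough illustration of the kind of Z-score cutoff described in point 3, the Python sketch below scores each window of a sequence against a position weight matrix and expresses that score in standard deviations above the mean score of all windows in a set of background sequences. This is not the TFLOC code, which is unpublished; the matrix, the background sequences, and every function name here are invented for illustration.

```python
from statistics import mean, pstdev

# Hypothetical 4-column log-odds matrix favouring "GATA".
# This is NOT a real TRANSFAC matrix; the values are made up.
PWM = [
    {"A": -0.5, "C": -1.0, "G": 1.5, "T": -1.0},
    {"A": 1.5, "C": -1.0, "G": -1.0, "T": -0.5},
    {"A": -1.0, "C": -1.0, "G": -0.5, "T": 1.5},
    {"A": 1.5, "C": -0.5, "G": -1.0, "T": -1.0},
]

# Invented stand-ins for "the upstream regions of all genes in the genome".
BACKGROUND = ["ACGTTGCAACGTCCATGGTA", "TTGACCGTTAACGGCATCAG"]


def pwm_score(site, pwm):
    """Sum the matrix entries selected by each base of the site."""
    return sum(column[base] for column, base in zip(pwm, site))


def window_scores(seq, pwm):
    """Score every window of the matrix width along seq (uppercase ACGT assumed)."""
    width = len(pwm)
    return [pwm_score(seq[i:i + width], pwm) for i in range(len(seq) - width + 1)]


def background_stats(background_seqs, pwm):
    """Mean and population standard deviation of all background window scores."""
    scores = [s for seq in background_seqs for s in window_scores(seq, pwm)]
    sigma = pstdev(scores)
    if sigma == 0:
        raise ValueError("background scores have no spread")
    return mean(scores), sigma


def z_score(site, pwm, background_seqs):
    """How many standard deviations the site's score lies above the background mean."""
    mu, sigma = background_stats(background_seqs, pwm)
    return (pwm_score(site, pwm) - mu) / sigma


def find_hits(seq, pwm, background_seqs, cutoff):
    """Return (position, site, z) for every window whose Z-score reaches the cutoff."""
    mu, sigma = background_stats(background_seqs, pwm)
    width = len(pwm)
    hits = []
    for i, score in enumerate(window_scores(seq, pwm)):
        z = (score - mu) / sigma
        if z >= cutoff:
            hits.append((i, seq[i:i + width], z))
    return hits
```

With this definition, calling `find_hits` with a lower cutoff can only return the same hits or more, which is the behaviour described for the track's Z-score filter.
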
>  
>  I have never published the TFLOC method because its intention is to
>  provide a simple but reasonable algorithm for identifying conserved
>  binding sites.  I have only used it to make this track, which is
>  intended to be a tool for biologists to quickly identify strongly
>  conserved matches to known binding sites.  There are probably many
>  methods that are fancier and might do better than this one; there have
>  literally been hundreds of papers written on this sort of thing.
>  
>  You are welcome to the source code if you would like, but be
>  forewarned that it is intended for our internal use only, so it is not
>  well-documented, and is optimized for our own data types and this
>  problem in particular, so it might not be the general sort of tool
>  that would be useful to you.
>  
>  Matt
>  
>  On 2/22/07, Pablo Minguez <pminguez at cipf.es> wrote:
>  > Hi,
>  > I am interested in the location of TF binding sites within the promoter
>  > regions of genes. I find the mapping you provide very useful. Before I
>  > discovered it, I was using the Transfac web page to map the matrices to
>  > my regions of interest. I found that the TFs that the Match program
>  > (Transfac) displays don't match the ones the genome browser shows. I
>  > understand that you only show the binding sites conserved across human,
>  > mouse and rat, and that you also apply restrictions to the scores. Even
>  > so, since you are using the same matrices, I don't understand why the
>  > information from the two sources is totally different; at the least,
>  > both resources should share the ones you show.
>  > I am sure I am missing something. Have you ever compared your mapping
>  > with Match (Transfac) or other mapping algorithms?
>  > Could it be a matter of differences in the algorithms? If so, could you
>  > explain to me how the mapping is improved with tfloc? Is it freely
>  > available?
>  >
>  > I attach an example of matrices found for a gene in the UCSC browser
>  > and in Transfac.
>  >
>  > Many thanks for your help,
>  > best regards,
>  > Pablo.
>  >
>  > --------------------------------------------------------------------------------------------------------------------------
>  > Gene: C14orf166
>  > Region searching for binding sites: chr14:51,520,943-51,525,943
>  >
>  > * UCSC Genome browser:
>  > 
>  >
>  > Matrix ids found: V$HTF_01, V$P53_01, V$HFH1_01
>  >
>  > * Results using Match (Transfac):
>  > Sequence: 5000 bases upstream of gene C14orf166 (chr14:51,520,943-51,525,943)
>  >
>  > matrix id       (factor name)
>  > V$OCT1_02       (Oct-1)
>  > V$CREL_01       (c-Rel)
>  > V$CEBP_C        (C/EBP)
>  > V$IK1_01        (Ik-1)
>  > V$FOXJ2_02      (FOXJ2)
>  > V$FOXJ2_02      (FOXJ2)
>  > V$VMYB_01       (v-Myb)
>  > V$HNF4_01       (HNF-4)
>  > V$EVI1_04       (Evi-1)
>  > V$HNF3B_01      (HNF-3beta)
>  > V$FOXD3_01      (FOXD3)
>  > V$NFY_Q6        (NF-Y)
>  > V$NKX25_02      (Nkx2-5)
>  > V$NKX25_01      (Nkx2-5)
>  > V$HNF3B_01      (HNF-3beta)
>  > V$OCT1_Q6       (Oct-1)
>  > V$PAX6_01       (Pax-6)
>  > V$GATA3_03      (GATA-3)
>  >
>  >
>  > --
>  > ---------------------------------------------
>  > Pablo Mínguez Paniagua
>  > PhD Student
>  > Bioinformatics Departament
>  > Centro de Investigación Príncipe Felipe, CIPF.
>  > Av. Autopista del Saler 16, 46013, Valencia, Spain
>  > Phone: +34 96 328 96 80 (Ext: 1011)
>  > 
>  >
>  > _______________________________________________
>  > Genome maillist  -  Genome at soe.ucsc.edu
>  > 
>  >
>  
>  
>  
>  ------------------------------
>  
>  Message: 3
>  Date: Fri, 23 Feb 2007 14:14:32 -0800
>  From: Ann Zweig <ann at soe.ucsc.edu>
>  Subject: Re: [Genome] Conservation scores
>  To: "Goel, Manisha" <MAG at stowers-institute.org>
>  Cc: genome at soe.ucsc.edu
>  Message-ID: <45DF6748.10202 at soe.ucsc.edu>
>  Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>  
>  Hello Manisha,
>  
>  The multiz alignment scores cover all 15 species, as you have noted.
>  There is no way to extract a score for only a pair of species.
>  
>  However, depending on exactly what you are looking for, there is
>  another set of tracks that may help you.  In the dm browser, open the
>  two tracks called Chains and Nets to dp.  From the dp Chain track
>  description page:
>  
>  This track shows alignments of D. pseudoobscura (dp3, Nov. 2004) to
>  the D. melanogaster genome using a gap scoring system that allows
>  longer gaps than traditional affine gap scoring systems. It can also
>  tolerate gaps in both D. pseudoobscura and D. melanogaster
>  simultaneously. These "double-sided" gaps can be caused by local
>  inversions and overlapping deletions in both species.
>  
>  The dp Nets are the best D. pseudoobscura/D. melanogaster chain for
>  every part of the D. melanogaster genome. It is useful for finding
>  orthologous regions and for studying genome rearrangement.
>  
>  Perhaps these tracks will be helpful to you.
>  
>  
>  Regards,
>  
>  ----------
>  Ann Zweig
>  UCSC Genome Bioinformatics Group
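
If a single two-species number is still wanted, one simple option is percent identity over the aligned columns of a D. melanogaster row and a D. pseudoobscura row taken from a pairwise alignment. The Python sketch below is a generic measure under that assumption, not a score produced by the browser or by multiz; the function name and the example rows in the usage note are invented.

```python
def percent_identity(ref_row, other_row):
    """Percent of aligned (non-gap) columns in which both rows carry the same base.

    Both rows must come from the same alignment block, so they have equal
    length and use '-' for gaps. Columns with a gap in either row are skipped;
    a column where both rows are 'N' is counted as aligned but not as a match.
    """
    if len(ref_row) != len(other_row):
        raise ValueError("alignment rows must have the same length")
    aligned = 0
    matches = 0
    for a, b in zip(ref_row.upper(), other_row.upper()):
        if a == "-" or b == "-":
            continue  # gap in one species: not an aligned column
        aligned += 1
        if a == b and a != "N":
            matches += 1
    return 100.0 * matches / aligned if aligned else 0.0
```

For example, `percent_identity("ACGT-ACGT", "ACGTTAC-A")` skips the two gapped columns and compares the remaining seven. Whether gaps should instead count as mismatches depends on what "conservation" is meant to capture for the regions of interest.
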
>  
>  
>  
>  Goel, Manisha wrote:
>  > Hello,
>  > 
>  > I want to arrive at some kind of sequence conservation score between
>  > D. melanogaster and D. pseudoobscura.
>  > I have a list of coordinates for the regions of interest in the
>  > D. melanogaster genome.
>  > I tried using the Table Browser, but that gives me the multiz alignment
>  > (and score) for all 15 related species.
>  > Is it possible to somehow see the level of conservation for only the two
>  > species?
>  > 
>  > 
>  > Thanks for your advice,
>  > -Manisha
>  > 
>  >  
>  
>  
>  ------------------------------
>  
>  Message: 4
>  Date: Sat, 24 Feb 2007 10:39:22 -0600
>  From: Jeltje van Baren <jeltje at cse.wustl.edu>
>  Subject: [Genome] hg18 chr3 strange characters
>  To: genome at soe.ucsc.edu
>  Message-ID: <45E06A3A.3090200 at cse.wustl.edu>
>  Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>  
>  Hi Browserians
>  
>  One of my programs died because of strange characters in hg18 chr3:
>  
>  [jeltje at ardor chr3]$ grep -n R chr3.fa
>  1216118:CCRRGCTTGGTTCTAACAATGAATTTAATAAGAATTGTATTTAATCAATG
>  [jeltje at ardor chr3]$ grep -n M chr3.fa
>  1216113:TCTTCATTAGCGCTACATAGCTGMCTTATTATTCGTGGTCCCCTATGACC
>  
>  On the Browser, these are represented as N
>  I did not check the other chromosomes.
>  
>  -Jeltje
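
In IUPAC nucleotide notation, R stands for "A or G" and M for "A or C", so these look like ambiguity codes, not file corruption. The Python sketch below generalizes the two grep commands above: it reports every character other than A, C, G, T or N on the sequence lines of a FASTA file, with 1-based line numbers that count header lines the way `grep -n` does. The function name is made up for illustration.

```python
import re
import sys

# IUPAC ambiguity codes other than N, with the bases each one stands for.
IUPAC_AMBIGUITY = {
    "R": "A or G", "Y": "C or T", "S": "C or G", "W": "A or T",
    "K": "G or T", "M": "A or C", "B": "C, G or T", "D": "A, G or T",
    "H": "A, C or T", "V": "A, C or G",
}

# Anything that is not A, C, G, T or N (either case) is reported.
UNEXPECTED = re.compile(r"[^ACGTNacgtn]")


def find_unexpected_bases(lines):
    """Yield (line_number, column, character) for each unexpected character.

    Header lines (starting with '>') are skipped but still counted, so the
    line numbers agree with `grep -n` on the same file. Columns are 1-based.
    """
    for lineno, line in enumerate(lines, start=1):
        if line.startswith(">"):
            continue
        for match in UNEXPECTED.finditer(line.rstrip("\r\n")):
            yield lineno, match.start() + 1, match.group()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python scan_fasta.py chr3.fa
    with open(sys.argv[1]) as handle:
        for lineno, column, char in find_unexpected_bases(handle):
            meaning = IUPAC_AMBIGUITY.get(char.upper(), "not an IUPAC code")
            print(f"{lineno}:{column}\t{char}\t{meaning}")
```

A downstream program that only accepts A, C, G, T and N could use the same scan to decide whether to mask such positions as N before processing.
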
>  
>  
>  ------------------------------
>  
>  Message: 5
>  Date: Sat, 24 Feb 2007 10:41:09 -0600
>  From: Jeltje van Baren <jeltje at cse.wustl.edu>
>  Subject: [Genome] addition hg18 chr3
>  To: genome at soe.ucsc.edu
>  Message-ID: <45E06AA5.1030503 at cse.wustl.edu>
>  Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>  
>  I realize that information was incomplete:
>  
>  The chromosome was downloaded today from hgdownload hg18/chromosomes.
>  
>  -Jeltje
>  
>  
>  ------------------------------
>  
>  
>  
>  
>  End of Genome Digest, Vol 49, Issue 33
>  ************************************** 

