[Genome] Genome Digest, Vol 50, Issue 15

BIJU JOSEPH bjoseph5 at jhmi.edu
Tue Mar 13 08:08:36 PDT 2007


 Hi 
I got the following marker for THRB from GDB. 
THRB-1	GATCACAAGGATGCTAGAGT
THRB-2	TCAAAGGAGTCAGGCTGTAG
Amplified product sequence from GDB:
LOCUS       HSU08276                 216 bp    DNA     linear   PRI 29-AUG-1996
DEFINITION  Human clone 83-7601 MHC class II histocompatibility antigen
            (HLA-DRB1) gene, partial cds.
        1 tgtcattact tcaatgggac ggagcgggtg cggttcctgg agagatactt ccataaccag
       61 gaggagaacg tgcgcttcga cagcgacgtg ggggagttcc gggcggtgac ggagctgggg
      121 cggcctgatg ccgagtactg gaacagccag aaggacatcc tggaagacga gcgggccgcg
      181 gtggacacct actgcagaca caactacggg gttgtg
This region does not contain any of the repeats necessary for a marker. Besides, it does not
align with the chromosome 3p24.3 region where the THRB gene lies, but with regions on other chromosomes.
Could you help with the correct mapping of this region?
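
A quick sanity check of the mismatch described above, as a minimal sketch: search the GDB amplicon for each primer and for its reverse complement. The helper names are my own and are not part of any UCSC or GDB tool, and only exact matches are considered; a real check would tolerate mismatches, for example with In-Silico PCR.

```python
# Exact-match primer check against the amplicon quoted above (sketch only).
AMPLICON = (
    "tgtcattacttcaatgggacggagcgggtgcggttcctggagagatacttccataaccag"
    "gaggagaacgtgcgcttcgacagcgacgtgggggagttccgggcggtgacggagctgggg"
    "cggcctgatgccgagtactggaacagccagaaggacatcctggaagacgagcgggccgcg"
    "gtggacacctactgcagacacaactacggggttgtg"
).upper()

PRIMERS = {
    "THRB-1": "GATCACAAGGATGCTAGAGT",
    "THRB-2": "TCAAAGGAGTCAGGCTGTAG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_primer(primer, template):
    """Return (strand, 0-based offset) of the first exact hit, or None."""
    template = template.upper()
    candidates = (("+", primer.upper()), ("-", reverse_complement(primer)))
    for strand, query in candidates:
        pos = template.find(query)
        if pos != -1:
            return strand, pos
    return None


for name, primer in PRIMERS.items():
    print(name, find_primer(primer, AMPLICON))
```

If neither primer is found on either strand, the primer pair and the amplicon record probably do not belong together.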
Biju




Biju Joseph
Division of Endocrinology and Metabolism
Johns Hopkins School of Medicine
Suite 813, Hunterian building
1915, East Madison st.
Baltimore, MD   21287
Phone:  410-502-3046

----- Original Message -----
From: genome-request at soe.ucsc.edu
Date: Monday, March 12, 2007 2:44 pm
Subject: Genome Digest, Vol 50, Issue 15
To: genome at soe.ucsc.edu


> Send Genome mailing list submissions to
>  	genome at soe.ucsc.edu
>  
>  To subscribe or unsubscribe via the World Wide Web, visit
>  	
>  or, via email, send a message with subject or body 'help' to
>  	genome-request at soe.ucsc.edu
>  
>  You can reach the person managing the list at
>  	genome-owner at soe.ucsc.edu
>  
>  When replying, please edit your Subject line so it is more specific
>  than "Re: Contents of Genome digest..."
>  
>  
>  Today's Topics:
>  
>     1. Re: dbsnp 127 (Heather Trumbower)
>     2. get the sequnence by using an Ensembl gene ID (Yu Zhou)
>     3. SOD1 gene and its 5' upstream sequence (Sue Copland)
>     4. A problem with the soft-masking! (wang xiaosong)
>     5. Multiple alignments with Chimp 6x sequence? (Scott Doniger)
>     6. UCSC In-Silico PCR (Kasper Thorsen)
>     7. Re: get the sequnence by using an Ensembl gene ID
>        (Archana Thakkapallayil)
>     8. Re: Multiple alignments with Chimp 6x sequence?
>        (Archana Thakkapallayil)
>     9. Re: UCSC In-Silico PCR (Archana Thakkapallayil)
>  
>  
>  ----------------------------------------------------------------------
>  
>  Message: 1
>  Date: Sun, 11 Mar 2007 20:59:15 -0700 (PDT)
>  From: Heather Trumbower <heather at soe.ucsc.edu>
>  Subject: Re: [Genome] dbsnp 127
>  To: dmitriy <dms700 at gmail.com>
>  Cc: genome at soe.ucsc.edu
>  Message-ID: <Pine.LNX.4.63.0703112050350.26654 at growl.cse.ucsc.edu>
>  Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
>  
>  Dmitriy:
>  
>  I have done an initial load of dbSNP build 127 for hg18.   It is available
>  at    Please note that it has not yet been validated or reviewed.   That
>  work will be happening over the course of the next few weeks.
>  
>  I notice an increase in the number of SNPs with the reference allele not
>  reverse complemented.   Build 126 for hg18 has 120K such SNPs; build 127
>  appears to have 1.5 million.
>  
>  I also notice an increase in the number of SNPs with an unexpected format
>  for the observed string.  Build 126 for hg18 has 357 such SNPs; build 127
>  appears to have over 6,000.
>  
>  Feel free to write to me directly if you have any questions or comments
>  about this data.
>  
>  Are you primarily interested in human data?   If other species, which
>  ones?
>  
>  Heather Trumbower
>  UCSC Genome Bioinformatics Group
>  
>  
>  On Fri, 9 Mar 2007, dmitriy wrote:
>  
>  > Hi
>  >
>  > With today's release of dbsnp 127, what is the expected release of
>  > UCSC that will include it?
>  >
>  > Thanks
>  >
>  > Dmitriy Sonkin
>  > Software engineer / bioinformatician
>  > HPCGG (Harvard Partners Center for Genetics and Genomics)
>  > _______________________________________________
>  > Genome maillist  -  Genome at soe.ucsc.edu
>  > 
>  >
>  
>  
>  ------------------------------
>  
>  Message: 2
>  Date: Mon, 12 Mar 2007 15:30:30 +0800
>  From: "Yu Zhou" <zhouyubio at gmail.com>
>  Subject: [Genome] get the sequnence by using an Ensembl gene ID
>  To: Genome at soe.ucsc.edu
>  Message-ID:
>  	<613ffb490703120030m7eab2087u6bc19c619a6f7f50 at mail.gmail.com>
>  Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>  
>  Hi,
>  
>  If I know the Ensembl gene ID of a gene, how can I get the sequence (5'UTR,
>  exons, introns, 3'UTR) of the gene by using a SQL query to the database?
>  Because I have many genes to query and to process further, I would rather not
>  use the Table Browser, although it is very good!
>  
>  I know how to connect to the database, but I don't know exactly the relationships
>  between the tables. Could you tell me how to write the SQL clause to do
>  that?
>  
>  
>  -- 
>  Best Wishes!
>  
>  Yu ZHOU
>  
>  
>  ------------------------------
>  
>  Message: 3
>  Date: Mon, 12 Mar 2007 10:34:56 +1300
>  From: "Sue Copland" <sue.copland at auckland.ac.nz>
>  Subject: [Genome] SOD1 gene and its 5' upstream sequence
>  To: <genome at soe.ucsc.edu>
>  Message-ID:
>  	<1D2026B2CBA11847B5DB2645487DEA8C3C1C52 at FMHSX1.fmhs.auckland.ac.nz>
>  Content-Type: text/plain; charset="us-ascii"
>  
>  Hi
>  
>   
>  
>  I've got a concern about sequence information acquired through your UCSC
>  genome browser (March 2006 assembly, clade vertebrate, genome human).
>  
>   
>  
>  The problem is the sequence upstream of the SOD 1 gene in the genome
>  browser sequence (chr21:31953106-31954728) does not match published
>  genomic sequence for that region (please refer to the attached papers).
>  
>   
>  
>  I searched your site for CpG islands upstream of superoxide dismutase 1
>  (SOD 1) and focused on CpG 91. I downloaded sequence
>  chr21:31953106-31954728 which includes 500 bases flanking the 5' region
>  and 200 bases flanking the 3' region of CpG 91.
>  
>   
>  
>  This sequence runs in the same direction as the SOD 1 gene (+ strand)
>  and overlaps the 5' region of SOD 1. I confirmed this by locating the 5'
>  region of the SOD 1 gene (exon1 and part of intron1) within
>  chr21:31953106-31954728.
>  
>   
>  
>  Sequence upstream of the human SOD 1 gene has been extensively studied
>  and found to contain promoter elements such as the TATA box and well
>  known transcription factor binding sites and it's this sequence that I
>  want. Am I able to get the real sequence upstream of SOD 1 from the
>  genome browser?
>  
>   
>  
>   
>  
>  Kind regards
>  
>   
>  
>   
>  
>  Sue Copland
>  
>  Research Technician
>  
>  Liggins Institute
>  
>  The University of Auckland
>  
>  2-6 Park Ave, Grafton 1023
>  
>  Auckland, New Zealand
>  
>  Ph +649 373 7599 ext 83439
>  
>  fax +649 373 8763
>  
>  sue.copland at auckland.ac.nz
>  
>   
>  
>  
>  ------------------------------
>  
>  Message: 4
>  Date: Mon, 12 Mar 2007 06:41:08 +0800
>  From: "wang xiaosong" <dr.wang at hotmail.com>
>  Subject: [Genome] A problem with the soft-masking!
>  To: galt at soe.ucsc.edu
>  Cc: xiaosong at med.umich.edu, genome at soe.ucsc.edu
>  Message-ID: <BAY134-F33DDFE07950412A62DDB3AE87E0 at phx.gbl>
>  Content-Type: text/plain; charset=gb2312; format=flowed
>  
>  Dear BLAT experts
>  
>  I encountered a problem with a repetitive EST sequence when doing BLAT with
>  a soft-masked genome nib file. Take sequence AI306750 for example; the command
>  line is as follows:
>  ----------------------------------
>  faToNib -softMask hg18.fa hg18.nib
>  gfServer start server 2345  *.nib -stepSize=5 -mask
>  gfClient server 2345 /hg18/chromnibsoftmasked /data/AI306750.fa 
>  /data/AI306750.out -t=dna -q=rna -minScore=0 -minIdentity=0
>  ---------------------------------
>  I always get a result as follows, with no "rep match" bases declared.
>  ---------------------------------
>  >chr22 
>            Length = 49691432
>  
>   Score = 41 bits (107), Expect = 1e-03
>   Identities = 22/22 (100%)
>   Strand = Plus / Plus
>  
>  Query: 191      ctcaaaaaaaaaaaaaaaaaaa 212
>                  ||||||||||||||||||||||
>  Sbjct: 39310678 ctcaaaaaaaaaaaaaaaaaaa 39310699
>  ----------------------------------------------
>  match	mis- 	rep. 	N's	Q gap	Q gap	T gap	T gap
>       	match	match           	count	bases	count	bases
>  22	0	0	0	0	0	0	0
>  ----------------------------------------------
>  
>  It seems that the command line I used did not mask the polyA repeat here
>  and did not report "rep match" bases. Therefore, my questions are:
>  (1) How can I correctly soft-mask the genome sequence file, and how can I get
>  "rep match" bases reported in my results?
>  (2) Does the faToNib -softMask program mask the .fa file itself, or does it take
>  a soft-masked .fa file generated by another repeat-masking program?
>  (3) Does UCSC provide the soft-masked genome sequences in fa or nib format?
>  (4) There is a -mask=type option in the blat command; does it exist in
>  gfServer/gfClient? On my current machine, the memory only allows
>  gfServer/gfClient to run with the whole genome sequence.
>  Many thanks for the help!
>  >AI306750
>  TATACTGCTGCGAGAAGACGACAGAAGGGCAGTGACTCGACAAAGGCCACAGGCAGTCCAGGCCTCTCTC
>  TGCTCCATCCCCCTGCCTCCCATTCTGCACCACACCTGGCATGGTGCAGGGAGACATCTGCACCCCTGAG
>  TTGGGCAGCCAGGAGTGCCCCCGGGAATGGATAATAAAGATACTAGAGAACTCAAAAAAAAAAAAAAAAA
>  AAAAAAAAAAAAAAGTCGTATCGA
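
On question (2): soft-masking conventionally means repeats are written in lowercase and everything else in uppercase, so one way to check whether hg18.fa is already soft-masked before converting it is to count lowercase bases per record. The following is a minimal sketch under that assumption; `masked_fraction` is a hypothetical helper, not a UCSC utility, and the toy record is invented for illustration.

```python
import io


def masked_fraction(fasta_handle):
    """Return {record_name: (lowercase_bases, total_bases)} for a FASTA stream."""
    counts = {}
    name = None
    for line in fasta_handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            name = line[1:].split()[0]
            counts[name] = (0, 0)
        elif name is not None:
            lower, total = counts[name]
            lower += sum(1 for base in line if base.islower())
            counts[name] = (lower, total + len(line))
    return counts


# Toy record: the lowercase poly-A run stands in for a masked repeat.
demo = io.StringIO(">toy\nACGTACGTaaaaaaaaaaGGCC\n")
for record, (lower, total) in masked_fraction(demo).items():
    print(record, lower, total)
```

A file that reports zero lowercase bases for every chromosome is either unmasked or hard-masked (repeats replaced by N), and in that case no "rep match" bases should be expected downstream.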
>  
>  >From: Galt Barber <galt at soe.ucsc.edu>
>  >To: wang xiaosong <dr.wang at hotmail.com>
>  >CC: genome at soe.ucsc.edu, xiaosong at med.umich.edu
>  >Subject: Re: [Genome] The problem in the results of BLAT linux v34
>  >Date: Mon, 26 Feb 2007 14:47:29 -0800 (PST)
>  >
>  >
>  >Looks like you are using the hard-masked version of the chromosomes.
>  >I recommend using the soft-masked versions.  There are many
>  >repeats around the exons in question and that could affect
>  >the alignments.
>  >
>  >For the question about the score that hgBlat generates,
>  >please see the blat FAQ:
>  >
>  >
>  >
>  >Also, note that if you are doing batch queries
>  >it may be easier to just use stand-alone commandline
>  >"blat" instead of gfServer/gfClient.
>  >
>  >If memory is tight you can do one chrom at a time
>  >and then combine/filter psl results with pslReps
>  >and other tools like that.
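
A minimal sketch of the per-chromosome approach suggested above: concatenate the PSL output from each chromosome run and keep the best-scoring alignment per query. The score used here is my reading of the formula in the BLAT FAQ for nucleotide alignments (matches plus half the repeat matches, minus mismatches and the insert counts); treat it as an assumption and check it against the FAQ. `best_hits` is a hypothetical helper and a much cruder filter than pslReps, and the demo rows are invented.

```python
def psl_score(fields):
    """Score one PSL row (nucleotide case) from its first seven columns."""
    matches = int(fields[0])
    mismatches = int(fields[1])
    rep_matches = int(fields[2])
    q_num_insert = int(fields[4])
    t_num_insert = int(fields[6])
    return matches + rep_matches // 2 - mismatches - q_num_insert - t_num_insert


def best_hits(psl_lines):
    """Return {qName: (score, tName, tStart, tEnd)} keeping the top score per query."""
    best = {}
    for line in psl_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip header lines and anything that is not a 21-column PSL row.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        q_name = fields[9]
        score = psl_score(fields)
        if q_name not in best or score > best[q_name][0]:
            best[q_name] = (score, fields[13], int(fields[15]), int(fields[16]))
    return best


# Invented rows standing in for the output of two per-chromosome runs.
row_a = "\t".join(["90", "5", "10", "0", "1", "3", "2", "40", "+", "est1", "120",
                   "0", "103", "chrA", "1000", "100", "240", "3",
                   "30,40,35,", "0,33,73,", "100,150,205,"])
row_b = "\t".join(["22", "0", "0", "0", "0", "0", "0", "0", "+", "est1", "120",
                   "98", "120", "chrB", "2000", "500", "522", "1",
                   "22,", "98,", "500,"])
print(best_hits(["psLayout version 3\n", row_a, row_b]))
```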
>  >
>  >-Galt
>  >
>  >
>  >On Tue, 27 Feb 2007, wang xiaosong wrote:
>  >
>  > > Dear All,
>  > >
>  > > I'm Xiaosong Wang from Dr. Arul Chinnaiyan's lab at the University of
>  > > Michigan. We encountered a problem in the output of the BLAT linux version
>  > > 34. The linux version of BLAT usually overlooks one exon at either end of
>  > > the input sequence. For example, the chromosome matched regions of ERG and
>  > > TMPRSS2 sequences are 0-1128 and 55-1725 as mapped by the BLAT linux v34,
>  > > while the matched regions were changed to 1-1514 and 1-1725 with the
>  > > web-based BLAT (see attached file for BLAT results, and test.txt for the
>  > > sequence). The linux version BLAT lost the last exon of ERG (1128-1514) and
>  > > the first exon of TMPRSS2 (0-55).  The command line we use is as
>  > > follows:
>  > > -----------------------------------------------
>  > > gfServer start path-t1 7855 *.nib -minMatch=1
>  > > gfClient path-t1 7855 /data/chromnibmasked /data/test.fa /data/test.out
>  > > -t=dna -q=rna -minScore=0 -minIdentity=0
>  > > -----------------------------------------------
>  > > In addition, we find that the score in the web-based blat results was not
>  > > provided in the linux version results. Therefore, we wonder whether anyone
>  > > knows the algorithm behind this score.
>  > >
>  > > Thank you very much indeed.
>  > >
>  > > Xiaosong
>  > >
>  > >
>  > > Xiaosong Wang
>  > > Department of Pathology, University of Michigan Medical School
>  > > 1150 W.Medical Center Dr. Rm3232, Med Sci I, Ann Arbor, MI 48109
>  > > Phone: 734-763-1224
>  > >
>  
>  
>  
>  ------------------------------
>  
>  Message: 5
>  Date: Sun, 11 Mar 2007 13:18:15 -0500
>  From: Scott Doniger <swdoniger at gmail.com>
>  Subject: [Genome] Multiple alignments with Chimp 6x sequence?
>  To: <genome at soe.ucsc.edu>
>  Message-ID: <000f01c76409$a3a91940$6501a8c0 at scottlaptop>
>  Content-Type: text/plain;	charset="iso-8859-1"
>  
>  Hi there - Do you know when the next update to the 17-way multiple
>  vertebrate alignments will be released? I'm hoping to find these same
>  alignments including the 6x chimp genome. Also updating the PhastCons
>  Elements table would be great too.
>  
>  Thanks in advance for your help. You guys do a great job keeping all of
>  this data straight.
>  
>  Scott Doniger
>  Fay Lab
>  Department of Genetics
>  Center for Genome Sciences
>  Washington University School of Medicine
>  
>  ------------------------------
>  
>  Message: 6
>  Date: Mon, 12 Mar 2007 11:46:38 +0100
>  From: "Kasper Thorsen" <KASPER.THORSEN at KI.AU.DK>
>  Subject: [Genome] UCSC In-Silico PCR
>  To: <genome at soe.ucsc.edu>
>  Message-ID: <8E20326FC656DC4FAED041EF42DD5A5D01ED57A9 at skuld.svf.au.dk>
>  Content-Type: text/plain;	charset="us-ascii"
>  
>  Hello
>  
>   
>  
>  I am using the UCSC In-Silico PCR function and after submitting the
>  primer sequences the genome browser displays the amplified sequence. My
>  question is whether it is possible to have the primer position displayed
>  in a separate track so it is possible to see the primer position after
>  having e.g. zoomed out.
>  
>   
>  
>   
>  
>  Kind regards
>  
>   
>  
>  Kasper Thorsen, Ph.D.
>  
>  Dept. of Clinical Biochemistry
>  
>  Aarhus University Hospital, Skejby Sygehus
>  
>  E-mail: kasper.thorsen at ki.au.dk
>  
>   
>  
>  
>  
>  ------------------------------
>  
>  Message: 7
>  Date: Mon, 12 Mar 2007 10:01:22 -0700
>  From: Archana Thakkapallayil <archanat at soe.ucsc.edu>
>  Subject: Re: [Genome] get the sequnence by using an Ensembl gene ID
>  To: Yu Zhou <zhouyubio at gmail.com>
>  Cc: Genome at soe.ucsc.edu
>  Message-ID: <45F58762.8080507 at soe.ucsc.edu>
>  Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>  
>  Hello Yu,
>  
>  Unfortunately, you cannot retrieve sequence using our MySQL server
>  because the sequence is not stored in database tables. One way to do
>  this is by clicking on the gene alignment in the Browser and then on
>  the details page looking for the "Sequence" section and then clicking on
>  the Genomic sequence link there. This will give you the option to get
>  introns and UTR. But, you will have to do this individually for each gene.
>  
>  In order to get this information for many genes, the more efficient way
>  is to use the Table Browser. More information on using the Table Browser
>  is here:
>  
>  
>  I hope this is helpful to you. Please feel free to write back if you
>  need more instruction.
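
Although the sequence itself is not in the database, the coordinates are. As a minimal sketch, assuming a row with the usual gene-prediction columns (strand, cdsStart, cdsEnd, exonStarts, exonEnds, as in tables such as ensGene) has already been fetched, the exon, intron and UTR intervals can be derived as below; the bases would then have to be cut from the chromosome sequence files separately. `gene_parts` is a hypothetical helper, the example row is invented, and coordinates are taken to be 0-based, half-open. Non-coding rows (cdsStart equal to cdsEnd) are not handled.

```python
def parse_list(text):
    """Turn a comma-terminated list such as '100,300,500,' into [100, 300, 500]."""
    return [int(x) for x in text.strip(",").split(",") if x]


def gene_parts(strand, cds_start, cds_end, exon_starts, exon_ends):
    """Derive exon, intron and UTR intervals from gene-prediction columns."""
    exons = list(zip(parse_list(exon_starts), parse_list(exon_ends)))
    # Introns are the gaps between consecutive exons.
    introns = [(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)]
    left, right = [], []
    for start, end in exons:
        if start < cds_start:
            left.append((start, min(end, cds_start)))
        if end > cds_end:
            right.append((max(start, cds_end), end))
    # On the minus strand the 5' UTR lies at the higher coordinates.
    utr5, utr3 = (left, right) if strand == "+" else (right, left)
    return {"exons": exons, "introns": introns, "utr5": utr5, "utr3": utr3}


# Invented example row: three exons, coding region from 150 to 450.
parts = gene_parts("+", 150, 450, "100,300,500,", "200,400,600,")
for key, intervals in parts.items():
    print(key, intervals)
```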
>  
>  Regards,
>  
>  Archana
>  UCSC genome Bioinformatics Group
>  
>  
>  Yu Zhou wrote:
>  > Hi,
>  >
>  > If I know the Ensembl gene ID of a gene, how can I get the sequence (5'UTR,
>  > exons, introns, 3'UTR) of the gene by using a SQL query to the database?
>  > Because I have many genes to query and to process further, I would rather not
>  > use the Table Browser, although it is very good!
>  >
>  > I know how to connect to the database, but I don't know exactly the relationships
>  > between the tables. Could you tell me how to write the SQL clause to do
>  > that?
>  >
>  >
>  >   
>  
>  
>  ------------------------------
>  
>  Message: 8
>  Date: Mon, 12 Mar 2007 10:44:14 -0700
>  From: Archana Thakkapallayil <archanat at soe.ucsc.edu>
>  Subject: Re: [Genome] Multiple alignments with Chimp 6x sequence?
>  To: Scott Doniger <sdoniger at wustl.edu>
>  Cc: genome at soe.ucsc.edu
>  Message-ID: <45F5916E.70607 at soe.ucsc.edu>
>  Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>  
>  Hello Scott,
>  
>  Thanks for the compliments on the Browser. We are in the process of
>  generating a new multiple alignment for hg18. This will include panTro2
>  as well as additional new species. Be sure to check back for these updates.
>  
>  Regards,
>  
>  Archana
>  UCSC Genome Bioinformatics group
>  
>  
>  Scott Doniger wrote:
>  > Hi there - Do you know when the next update to the 17-way multiple
>  > vertebrate alignments will be released? I'm hoping to find these same
>  > alignments including the 6x chimp genome. Also updating the PhastCons
>  > Elements table would be great too.
>  >
>  > Thanks in advance for your help. You guys do a great job keeping all of
>  > this data straight.
>  >
>  > Scott Doniger
>  > Fay Lab
>  > Department of Genetics
>  > Center for Genome Sciences
>  > Washington University School of Medicine
>  >   
>  
>  
>  ------------------------------
>  
>  Message: 9
>  Date: Mon, 12 Mar 2007 11:40:58 -0700
>  From: Archana Thakkapallayil <archanat at soe.ucsc.edu>
>  Subject: Re: [Genome] UCSC In-Silico PCR
>  To: Kasper Thorsen <KASPER.THORSEN at KI.AU.DK>
>  Cc: genome at soe.ucsc.edu
>  Message-ID: <45F59EBA.1080309 at soe.ucsc.edu>
>  Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>  
>  Hello Kasper,
>  
>  Your request for display of PCR products is a reasonable one.
>  Unfortunately, we currently do not have a feature like this. This is in
>  our future implementation list, but there are some higher priority tasks
>  on the list as well. Right now I don't have any estimate of when it
>  would be completed.
>  
>  In the meantime, as a workaround, you could make a simple BED file out
>  of the locations of your PCR products and display them as a custom track.
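
A minimal sketch of that workaround, assuming the product positions are copied by hand from the In-Silico PCR result as 1-based, inclusive coordinates. BED uses 0-based starts, so the start is shifted by one. `bed_track` is a hypothetical helper and the coordinates below are invented placeholders.

```python
def bed_track(products, name="PCR_products", description="In-Silico PCR products"):
    """Build custom-track text from (chrom, start, end, label) tuples.

    start and end are 1-based inclusive positions as shown in the browser.
    """
    lines = ['track name="%s" description="%s" visibility=pack' % (name, description)]
    for chrom, start_1based, end_1based, label in products:
        # BED intervals are 0-based, half-open: subtract 1 from the start only.
        lines.append("%s\t%d\t%d\t%s" % (chrom, start_1based - 1, end_1based, label))
    return "\n".join(lines) + "\n"


# Placeholder coordinates; replace with the positions reported for your primers.
products = [
    ("chr1", 1001, 1250, "ampliconA"),
    ("chr1", 5001, 5400, "ampliconB"),
]
print(bed_track(products), end="")
```

The resulting text can be pasted into the custom track submission form, and the products then stay visible as their own track when zooming out.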
>  
>  Please let us know if you have any further questions.
>  
>  Regards,
>  
>  Archana
>  UCSC Genome Bioinformatics Group
>  
>  Kasper Thorsen wrote:
>  > Hello
>  >
>  >  
>  >
>  > I am using the UCSC In-Silico PCR function and after submitting the
>  > primer sequences the genome browser displays the amplified sequence. My
>  > question is whether it is possible to have the primer position displayed
>  > in a separate track so it is possible to see the primer position after
>  > having e.g. zoomed out.
>  >
>  >  
>  >
>  >  
>  >
>  > Kind regards
>  >
>  >  
>  >
>  > Kasper Thorsen, Ph.D.
>  >
>  > Dept. of Clinical Biochemistry
>  >
>  > Aarhus University Hospital, Skejby Sygehus
>  >
>  > E-mail: kasper.thorsen at ki.au.dk
>  >
>  >  
>  >
>  >   
>  
>  
>  ------------------------------
>  
>  _______________________________________________
>  Genome maillist  -  Genome at soe.ucsc.edu
>  
>  
>  
>  End of Genome Digest, Vol 50, Issue 15
>  ************************************** 

