[Genome] [Fwd: sequence retrieval problem]

Kayla Smith kayla at soe.ucsc.edu
Thu Jan 25 10:48:52 PST 2007


Gusti,

I did some poking around and I see that the Stanford HEEBO array has a
website here: http://www.microarray.org/sfgf/heebo.do

Their website also provides a spreadsheet that maps each probe name to
its probe sequence. That spreadsheet is here:
http://www.microarray.org/data/download/HEEBO_Human_Set_v1.00.xls

On their website, I also found a custom track of their data for the
UCSC Genome Browser (their site incorrectly calls it the UCSF Browser),
which may be useful to you:
http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg16&position=chr5:140654321-140987654&hgt.customText=http://microarray.org/data/download/UCSCGenomeBrowserFiles/Stanford_human_v1.0.bed.gz

Also, I noticed from the links in your question that you were using
our hg18 assembly. From the HEEBO website, it looks like their array
was designed for the hg16 assembly, so the steps below use hg16.

It looks to me like you want to retrieve sequences for a set of genes
for which you have LocusLink IDs. Here is how to use the Table Browser
to retrieve this information:

Set the following options:
clade: Vertebrate
genome: Human
assembly: July 2003 (hg16)
group: Genes and Gene Prediction Tracks
track: Known Genes
table: knownGene
filter: click "create"
   In the filter, under the "Linked Tables" section, check the box next
   to knownToLocusLink.
   Click "Allow Filtering Using Fields in Checked Tables".
   Under "hg16.knownToLocusLink based filters", set the value field to
   "does match" and paste in your LocusLink ID numbers.
   Click Submit.
Output format: sequence
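
If your LocusLink IDs sit in a tab-delimited export of your array data,
a small script can collect the unique, non-empty IDs before you paste
them into the filter. This is only a sketch: the column name
"LocusLink", the probe names, and the sample rows are made up for
illustration, so adjust them to match your own file.

```python
import csv
import io


def unique_locuslink_ids(lines, column="LocusLink"):
    """Return unique, non-empty IDs from a tab-delimited table,
    in the order they first appear. The column name is an assumption."""
    reader = csv.DictReader(lines, delimiter="\t")
    seen = set()
    ids = []
    for row in reader:
        value = (row.get(column) or "").strip()
        # Skip spots that have no ID, and IDs that were already collected.
        if value and value not in seen:
            seen.add(value)
            ids.append(value)
    return ids


# Tiny in-memory table standing in for a real export (hypothetical data).
example = io.StringIO(
    "Probe\tLocusLink\n"
    "probe_A\t7157\n"
    "probe_B\t\n"
    "probe_C\t672\n"
    "probe_D\t7157\n"
)
for locus_id in unique_locuslink_ids(example):
    print(locus_id)
```

With a real file you would pass an open file handle instead of the
in-memory table, then paste the printed IDs into the filter field.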

Your output will be the Known Genes entries that correspond to your
LocusLink IDs, along with their sequences.
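
Once you have saved the sequence output to a file, you may want to check
how many records came back. The following is a minimal sketch of a
generic FASTA reader; it assumes only the standard layout (a ">" header
line followed by sequence lines) and nothing specific about how the
Table Browser words its headers. The sample records are made up.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from FASTA-formatted text lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between records
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    # Emit the final record after the input is exhausted.
    if header is not None:
        yield header, "".join(chunks)


# Made-up records standing in for a saved output file.
sample = [
    ">example_record_1 some description",
    "ACGTACGT",
    "ACGT",
    ">example_record_2",
    "GGCC",
]
for header, sequence in read_fasta(sample):
    print(header.split()[0], len(sequence))
```

For a real file, pass the open file handle in place of the sample list.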

I hope these resources are helpful to you.  If this isn't what you had
in mind, please contact us again.

In the future, please send your questions about the Genome Browser to
our mailing list, genome at soe.ucsc.edu.

Kayla Smith
UCSC Genome Bioinformatics Group.



Branwyn Wagman wrote:
> Here's a browser question that was misdirected to the cbse webmaster.
> 
> Branwyn
> 
> -------- Original Message --------
> Subject: 	sequence retrieval problem
> Date: 	Wed, 24 Jan 2007 15:08:35 -0800
> From: 	Gus Zeiner <gusti at stanford.edu>
> To: 	cbseweb at cbse.ucsc.edu
> 
> 
> 
> Hi,
> 
> I have a ton of array data, and I would like to pull out a .fasta 
> sequence file for genes of interest. The data is organized such that it 
> has locus link ID and HEEBO probes for each array spot. I have been 
> trying to extract the sequences in batch using the *identifiers* link in 
> your Table Browser 
> (http://genome.ucsc.edu/cgi-bin/hgTables?hgsid=84830564&db=hg18), but I 
> seem to be only able to retrieve sequences by using a NCBI accession 
> number (I have played with all the tracks, using a combination of spot 
> identifiers, and consistently, the only identifier that gives me 
> anything is NM_xxxxxx). This is less than optimal for me, as not every 
> spot on the Stanford HEEBO array has an accession number. 
> 
> I am not sure if there is a broken link, or if I am trying to retrieve 
> the sequence file incorrectly. If there is a simple fix for this, please 
> let me know. 
> 
> Thanks much,
> Gus Zeiner
> 
> Gusti M. Zeiner, Ph.D.
> Boothroyd Lab
> Stanford University School of Medicine
> Fairchild D305
> 299 Campus Drive
> Stanford, CA 94305-5124
> p    650.723.7296
> f     650.723.6853
> e     gusti at stanford.edu
> 
> 
> 
> -- 
> Branwyn Stewart Wagman
> Communications & Human Resources
> Center for Biomolecular Science and Engineering (CBSE)
> Institute for Quantitative Biomedical Research (QB3)
> 501C Engineering 2 Building
> UC Santa Cruz
> 1156 High Street, MS: CBSE/ITI
> Santa Cruz CA 95064
> Tel: (831) 459-3077
> Fax: (831) 459-1809
> bwagman at soe.ucsc.edu
> http://www.cbse.ucsc.edu
> 


