[Genome] microarrays in genome browser

Rachel Harte hartera at soe.ucsc.edu
Fri Jun 3 13:56:07 PDT 2005


Hi Bogdan,
I am glad that the information was helpful for you. You can obtain the 
absolute numerical values from our website using the Table Browser. 

1. Go to http://genome.ucsc.edu and click on the "Tables" link on the 
blue bar at the top of the page, which will take you to the Table Browser.
2. For database, select "hgFixed" and for the table, select 
"hgFixed.gnfHumanAtlas2All". Clicking on "describe table schema" to the 
right of this gives an explanation of each table column and gives examples.
3. For the output format, you can choose "all fields" or "selected 
fields". If the "selected fields" option is chosen, you will be given a 
choice of fields from the table when you hit "get output".
4. You can enter a file name in the "output file" box to save the data 
to a file or just get the output in your browser if you leave this blank.
5. Click on the "get output" button. 

The table mentioned above, "gnfHumanAtlas2All", contains the data that 
you found on the GNF website that you mention below - these are the 
values that are between 30 and 2000. The first column in this table, 
"name", is the probe name; the second, "expCount", gives the number of 
data items in expScores; and the third, "expScores", gives the actual 
data values.
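
As a rough sketch of how a saved output file could be read, the Python 
below parses the three columns described above. It assumes the file is 
tab-separated, that any header line starts with "#", and that expScores 
is a comma-separated list which may end in a trailing comma; please check 
these assumptions against your own download. The function name 
parse_exp_table, the probe names and the values are made up for 
illustration.

```python
import csv
import io


def parse_exp_table(text):
    """Parse Table Browser output with columns name, expCount, expScores."""
    rows = {}
    for fields in csv.reader(io.StringIO(text), delimiter="\t"):
        # Skip blank lines and the header line (assumed to start with '#').
        if not fields or fields[0].startswith("#"):
            continue
        name, exp_count, exp_scores = fields[0], int(fields[1]), fields[2]
        # Assumption: expScores is comma-separated, possibly with a trailing comma.
        scores = [float(s) for s in exp_scores.split(",") if s.strip()]
        if len(scores) != exp_count:
            raise ValueError(f"{name}: expected {exp_count} scores, got {len(scores)}")
        rows[name] = scores
    return rows


# Made-up example rows; probe names and values are illustrative only.
sample = (
    "#name\texpCount\texpScores\n"
    "probe_A\t3\t120.5,98,2000,\n"
    "probe_B\t3\t30,45.2,61,\n"
)
table = parse_exp_table(sample)
print(table["probe_A"])
```

Checking expCount against the number of parsed values is a cheap way to 
notice a truncated or wrongly delimited file.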

The gray shades on the Known Genes details page in the Microarray section 
correspond to the median of the absolute values for the replicates for 
each tissue or cell type (see table description below). 

Here is a list of other tables that you may find useful - you will 
definitely need to look at hgFixed.gnfHumanAtlas2AllExps to interpret the 
data in the hgFixed.gnfHumanAtlas2All table. The data from each of these 
tables can be downloaded by following the instructions above.

hgFixed.gnfHumanAtlas2AllExps - this gives a list of the tissues or 
samples corresponding to the data items in the expScores column of the 
gnfHumanAtlas2All table. The numbering of these experiments starts at 0 
(id column). For example, the first and second rows (ids 0 and 1) in the 
gnfHumanAtlas2AllExps table have the name "ColorectalAdenocarcinoma" so 
the first and second data items in expScores for gnfHumanAtlas2All 
correspond to these 2 replicates for the "ColorectalAdenocarcinoma" 
tissue.
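
The correspondence described above can be sketched in Python as follows. 
This is only an illustration: the function name label_scores is 
hypothetical, the dictionary stands in for the id and name columns of 
gnfHumanAtlas2AllExps, and apart from ids 0 and 1 the names and all of the 
values are made up.

```python
def label_scores(scores, exps):
    """Pair each expScores value with its experiment name.

    exps maps the id column of the Exps table (starting at 0) to the name.
    """
    if len(scores) != len(exps):
        raise ValueError("expScores length does not match the Exps table")
    # The position of a value in expScores is the id in the Exps table.
    return [(exps[i], score) for i, score in enumerate(scores)]


# Only ids 0 and 1 follow the description above; id 2 and all values are made up.
exps = {
    0: "ColorectalAdenocarcinoma",
    1: "ColorectalAdenocarcinoma",
    2: "hypotheticalTissue",
}
scores = [310.0, 295.5, 42.0]
labeled = label_scores(scores, exps)
print(labeled[:2])
```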

hgFixed.gnfHumanAtlas2Median - these are the median values over all 
replicates for each tissue or cell type. 

hgFixed.gnfHumanAtlas2MedianExps - this table is similar to 
gnfHumanAtlas2AllExps except it shows the order of the experiments for the 
median values in the expScores for the gnfHumanAtlas2Median and 
gnfHumanAtlas2MedianRatio tables. Please note that the order in this table 
differs from that in gnfHumanAtlas2AllExps. For instance, here the 
first two rows (ids 0 and 1) are "fetal brain" and "whole brain" so the 
first value in expScores in the gnfHumanAtlas2Median table is the median 
of the absolute values for the "fetal brain" replicates and the second 
value in expScores is the median of the absolute values for the "whole 
brain" replicates.
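
To illustrate how the median values relate to the replicate values, here 
is a minimal Python sketch that groups replicates by tissue name and takes 
the median of each group. The function name median_by_tissue and all names 
and values other than "ColorectalAdenocarcinoma" are made up. Note that 
the result here is keyed by name in order of first appearance, which is 
not the order given in gnfHumanAtlas2MedianExps, so use the ids in that 
table when comparing against gnfHumanAtlas2Median.

```python
from collections import defaultdict
from statistics import median


def median_by_tissue(scores, names):
    """Group replicate values by tissue name and take the median of each group."""
    groups = defaultdict(list)
    for name, score in zip(names, scores):
        groups[name].append(score)
    return {name: median(values) for name, values in groups.items()}


# Made-up replicate values; names[i] is the experiment name for scores[i].
names = [
    "ColorectalAdenocarcinoma",
    "ColorectalAdenocarcinoma",
    "hypotheticalTissue",
    "hypotheticalTissue",
]
scores = [310.0, 295.5, 40.0, 44.0]
medians = median_by_tissue(scores, names)
print(medians)
```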

Please let me know if you have any further questions.

Rachel

 On Thu, 2 Jun 2005, tanasa wrote:

> 
> Greetings Rachel, 
> 
> Thank you very much for the very clear and comprehensive email:
> the explanations were extremely helpful. I would like to add one 
> more question: I looked into the files available on Novartis GNF 
> (http://wombat.gnf.org/index.html) and for each gene, a number 
> tells the level of expression in various tissues (usually, the number 
> is between 30 and 2000). For the files available at Novartis GNF or 
> in general, would it be possible please to let me know which are
> the (absolute) numerical values that tell if a gene is lowly or highly 
> expressed (considering for instance only the shades of grey and not 
> the expression ratios)? And is there any way in which I could obtain 
> these "expression numbers" from the Genome Browser?
> 
> Thank you very much again, 
> have a good weekend, 
> 
> Bogdan
>  
>  
> 
> 
> 
> 
> -----Original message-----
> From: Rachel Harte hartera at soe.ucsc.edu
> Date: Wed,  1 Jun 2005 16:53:18 -0400
> To: tanasa tanasa at cbrinstitute.org
> Subject: Re: [Genome] microarrays in genome browser
> 
> > Hi Bogdan,
> > For the GNF Atlas data, expression ratio is calculated in relation to the 
> > gene's expression in a particular tissue. Since certain tissues are 
> > overrepresented in this data, they are grouped together, so 
> > amygdala, fetal brain and whole brain are all in the "brain" group. On the 
> > page for which you provide the link below, the first 3 sets of tissues are 
> > all in the brain group (any cancer cell lines were excluded from the 
> > group). The next set beginning with the tissue "spinal cord" are in the 
> > "nerve" group. For this Human Atlas 2 data set the other groups are: 
> > immune, other, gland, muscle, and germ.
> > For a particular gene, the median expression level is calculated for 
> > this tissue grouping using the absolute values. Then, to calculate the ratio 
> > value for e.g. fetal brain, the expression value for NFATC2 is taken for 
> > fetal brain and divided by the median value for NFATC2 in all the tissues in 
> > the "brain" group. The ratio is expressed as a log2 value. So the ratio 
> > shows the relative expression of a gene in a tissue compared to the median 
> > expression level of the gene in all the non-cancerous tissues in the same 
> > group. Since there are replicates for each tissue, the median of the ratio 
> > values of the replicates is used. Red indicates that a gene is more highly 
> > expressed in a particular tissue than in those of the group to which 
> > the tissue belongs while green indicates that a gene is underexpressed in a 
> > tissue relative to those in the group. Black indicates that a gene is neither 
> > underexpressed nor overexpressed in a tissue and white indicates missing data. 
> > 
> > For the absolute values, the median value is taken for the replicates for 
> > each tissue. The darker the gray, the lower the expression value and the 
> > less abundant the transcript. Black indicates that there is no expression 
> > (value is < 1). 
> > For both the ratio values and the absolute values, the coloring is on a 
> > logarithmic scale.
> > 
> > I hope that this answers your question.
> > 
> > Rachel
> > 
> >  On Wed, 1 Jun 2005, tanasa wrote:
> > 
> > > 
> > > Hi - I am looking  for more information on how to interpret the "expression ratio" that is given for each molecule by the UCSC genome browser.
> > > 
> > > For instance, for NFATp (NFATc2) , at http://genome.ucsc.edu/cgi-bin/hgGene?hgsid=42436146&db=hg17&hgg_gene=U43342&hgg_chrom=chr20&hgg_start=49441339&hgg_end=49592665&hgg_type=knownGene
> > > 
> > > in distinct tissues, the expression ratio could be either high (red) or low (green). I assume that the expression ratio
> > > is proportionally related to the absolute expression level; however, if taking into account the absolute values (in grey,
> > > below the expression ratios), for many tissues, the expression level is high, but the absolute value is low (assuming
> > > dark grey = abundant and white = poorly expressed).  
> > > 
> > > I would appreciate your suggestion in this respect - is there any formula that calculates the expression ratio and is this 
> > > proportional to the absolute value ? Is dark grey the color code for abundant expression and white a code for low expression 
> > > level ?
> > > 
> > > Thanks very much, 
> > > 
> > > Bogdan  
> > > _______________________________________________
> > > Genome maillist  -  Genome at soe.ucsc.edu
> > > http://www.soe.ucsc.edu/mailman/listinfo/genome
> > > 
> > 
> > -- 
> > UCSC Genome Bioinformatics Group
> > http://genome.ucsc.edu
> > 
> > 
> > 
> > 
> 
> -----------------------
> Bogdan Tanasa, MD 
>  
> Center for Blood Research, 
> Harvard Medical School, 
> 200 Longwood Ave, 
> Boston, MA 02115
> phone: (617) 278 3183
> fax: (617) 278 3280
> email: tanasa at cbr.med.harvard.edu
> 

-- 
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu





