[Genome] Genome Graphs

Patrick Sullivan pfsulliv at med.unc.edu
Thu Feb 8 05:58:52 PST 2007


UCSC genome mavens:

First of all, this is an EXCELLENT tool. I have already circulated it to 
my colleagues and I would urge you to publicize it widely. It more or 
less solves one of the most pressing issues in the GWAS area 
(visualizing results in genomic context), and is far superior to any of 
the other paltry tools out there.


Second, as requested, I have a couple of suggestions and/or wishes.


1. Please allow the user to upload and plot qualitative data. In trying 
to understand the results of a genomewide association study, it is very 
useful to overlay external data from other studies, which are often 
qualitative rather than quantitative.

For example, a user might want simply to overlay the positions of 
linkage regions implicated in other studies or a candidate gene list. 
For these, one might only wish to mark where they are, e.g. with a bar. 
See Slide 1 in the attached .ppt for an example.

It would be brilliant if data input were accepted in several forms. 
Input might be any of the following:

a) List of standard HUGO gene names. Could match against knownGene and 
obtain chromosome and txsMin and txsMax (where txsMin is the minimum 
txStart over all isoforms and txsMax is the maximum txEnd over all 
isoforms). Example:

NRG1
DTNBP1
DISC1
COMT

For the gene NRG1 on chr8 (has multiple isoforms), txsMin=31616809 and 
txsMax=32741615. These values could be pre-computed for all knownGenes 
for efficiency.
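
A minimal sketch of that pre-computation, assuming the isoform rows have 
already been pulled from knownGene and joined to gene symbols, and that 
each symbol maps to a single chromosome. The tuple layout and the 
intermediate isoform coordinates are made up for illustration; only the 
NRG1 minimum and maximum come from the example above.

```python
def gene_extents(isoforms):
    """Collapse isoform rows to one (chrom, txsMin, txsMax) per gene symbol.

    isoforms: iterable of (symbol, chrom, txStart, txEnd) tuples
    (a hypothetical layout, not the actual knownGene schema).
    """
    extents = {}
    for symbol, chrom, tx_start, tx_end in isoforms:
        if symbol in extents:
            _, lo, hi = extents[symbol]
            extents[symbol] = (chrom, min(lo, tx_start), max(hi, tx_end))
        else:
            extents[symbol] = (chrom, tx_start, tx_end)
    return extents


# Two hypothetical NRG1 isoforms; the inner coordinates are invented.
rows = [
    ("NRG1", "chr8", 31616809, 32100000),
    ("NRG1", "chr8", 32000000, 32741615),
]
nrg1 = gene_extents(rows)["NRG1"]
```

Here `nrg1` holds the chromosome plus the outer bounds over both rows, 
which is all the overview plot would need to draw a bar.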

b) Regions in from-to format, where the endpoints could be given as 
chrN:x-y, a pair of SNP IDs, or a pair of STS markers. Example:

chr8:31616809-32741615
rs1234	rs5678
D19S123	D19S654

c) Coding suggestion - add an indicator flag in the first column for 
which sort of data are on that line. This would allow all types of data 
to be in one file.

TYPE	Field1	Field2
1	NRG1
1	DTNBP1
2	chr8:31616809-32741615
2	chr9:22616809-22741615
3	rs1234	rs5678
3	rs2222	rs3333
4	D19S123	D19S654
4	D1S111	D1S222

Type=1 for single standard gene names, Type=2 for chrN:from-to, Type=3 
for two SNP IDs, and Type=4 for two STS markers.

Could then split these into separate files, merge with the appropriate 
UCSC table to get the coordinates, and then concatenate for plotting.
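
The splitting step could be sketched roughly as follows. This assumes 
whitespace-delimited lines with the TYPE flag first, as in the example 
file above; the function names and the labels in `KINDS` are 
hypothetical. Only Type=2 lines can be resolved without a table lookup, 
so that is the only coordinate parsing shown.

```python
import re

# Hypothetical labels for the four TYPE flags proposed above.
KINDS = {"1": "gene", "2": "region", "3": "snp_pair", "4": "sts_pair"}
REGION_RE = re.compile(r"^(chr\w+):(\d+)-(\d+)$")


def split_by_type(lines):
    """Group data lines by TYPE flag; header and blank lines are skipped."""
    groups = {kind: [] for kind in KINDS.values()}
    for line in lines:
        fields = line.split()
        if not fields or fields[0] == "TYPE":
            continue
        kind = KINDS.get(fields[0])
        if kind is None:
            raise ValueError(f"unknown TYPE flag: {fields[0]!r}")
        groups[kind].append(fields[1:])
    return groups


def parse_region(text):
    """Turn 'chrN:from-to' into (chrom, start, end)."""
    m = REGION_RE.match(text)
    if m is None:
        raise ValueError(f"not a chrN:from-to region: {text!r}")
    return m.group(1), int(m.group(2)), int(m.group(3))


lines = [
    "TYPE\tField1\tField2",
    "1\tNRG1",
    "2\tchr8:31616809-32741615",
    "3\trs1234\trs5678",
    "4\tD19S123\tD19S654",
]
groups = split_by_type(lines)
regions = [parse_region(fields[0]) for fields in groups["region"]]
```

The gene, SNP-pair, and STS-pair groups would each still need to be 
merged against the appropriate table before concatenating.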


2. When the user selects an area on the genome overview page, the tool 
goes to the genome browser set on that area. Some suggestions for the 
user-defined tracks at the top:

a) Show the baseline for each user track (see Slide 2).

b) If the user wants the scores to be connected, put a little tick at 
the location of each marker. Otherwise it is hard to know the marker 
density in a region. See Slide 2.

c) Allow the user to select whether he/she wants the points connected or 
indicated by a vertical line (i.e., the -log(pvalue) for a SNP at that 
point). See Slide 3.

d) If full display is selected, display the SNP name on the user track. 
Clicking on that SNP should go to the appropriate page about that SNP 
(as is done under the current SNP track).


3. An exceptionally useful feature is the merging that occurs (for rs 
numbers or STS markers). In the output that describes the matching of 
results, please list those that failed to match so we can troubleshoot.
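
To make the request concrete, here is a rough sketch of a merge that 
keeps the failures instead of silently dropping them. The function name, 
the lookup dictionary, and the positions in it are all hypothetical; 
this is not how the tool does its matching.

```python
def match_markers(uploaded_ids, known_positions):
    """Split uploaded marker IDs into matched positions and failures.

    known_positions: dict of marker ID -> (chrom, position); a stand-in
    for whatever table the IDs are merged against.
    """
    matched, unmatched = {}, []
    for marker in uploaded_ids:
        if marker in known_positions:
            matched[marker] = known_positions[marker]
        else:
            unmatched.append(marker)
    return matched, unmatched


# Invented positions, for illustration only.
known = {"rs1234": ("chr8", 100), "rs5678": ("chr8", 200)}
matched, unmatched = match_markers(["rs1234", "rs9999", "rs5678"], known)
```

Reporting `unmatched` alongside the match counts would let the user see 
at a glance which IDs need checking.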


Again, many thanks for writing this tool. It will get a lot of use on 
our side.

-- 
Pat

---- Patrick Sullivan, MD, FRANZCP
---- Ray M. Hayworth & Family Distinguished Professor
---- UNC/Genetics & Carolina Center for Genome Sciences
---- CB#7264, 103 Mason Farm Road
---- Neuroscience Research Building, Room 4109D
---- Chapel Hill, NC, 27599-7264, USA
---- V: +919-966-3358  F: +919-966-3630

