[Genome] Genome Graphs
Patrick Sullivan
pfsulliv at med.unc.edu
Thu Feb 8 05:58:52 PST 2007
UCSC genome mavens:
First of all, this is an EXCELLENT tool. I have already circulated it to
my colleagues and I would urge you to publicize it widely. It more or
less solves one of the most pressing issues in the GWAS area
(visualizing results in genomic context), and is far superior to any of
the other paltry tools out there.
Second, as requested, I have a couple of suggestions and/or wishes.
1. Pls allow the user to upload and plot qualitative data. In trying to
understand the results of a genomewide association study, it is very
useful to overlay external data from other studies, which are often
qualitative rather than quantitative.
For example, a user might want simply to overlay the positions of
linkage regions implicated in other studies or a candidate gene list.
For these, one might only wish to mark where they are, without a score,
e.g., with a bar. See Slide 1 in attached .ppt for example.
It would be brilliant if data input could take several forms. Input
might be the following:
a) List of standard HUGO gene names. Could match against knownGene and
obtain chromosome and txsMin and txsMax (where txsMin is the minimum
txStart over all isoforms and txsMax is the maximum txEnd over all
isoforms). Example:
NRG1
DTNBP1
DISC1
COMT
For the gene NRG1 on chr8 (has multiple isoforms), txsMin=31616809 and
txsMax=32741615. These values could be pre-computed for all knownGenes
for efficiency.
b) Regions in from-to format, where the endpoints could be given as
chrN:x-y, a pair of SNP IDs, or a pair of STS markers. Example:
chr8:31616809-32741615
rs1234 rs5678
D19S123 D19S654
c) Coding suggestion - add an indicator flag in the first column for
which sort of data are on that line. This would allow all types of data
to be in one file.
TYPE Field1 Field2
1 NRG1
1 DTNBP1
2 chr8:31616809-32741615
2 chr9:22616809-22741615
3 rs1234 rs5678
3 rs2222 rs3333
4 D19S123 D19S654
4 D1S111 D1S222
Type=1 for single standard gene names, Type=2 for chrN:from-to, Type=3
for two SNP IDs, and Type=4 for two STS markers.
Could then split these into separate files, merge with the appropriate
UCSC table to get the coordinates, and then concatenate for plotting.
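To make the suggestion in item 1 concrete, here is a rough sketch in Python of how the typed input file might be split and how txsMin/txsMax might be pre-computed. It is only an illustration, not a description of how Genome Graphs works: the function names are hypothetical, the gene rows are assumed to already carry a gene symbol alongside chrom, txStart and txEnd, and the split of NRG1 into two isoforms below is made up (only the overall minimum and maximum come from the example above).

```python
from collections import defaultdict


def gene_extents(gene_rows):
    """Collapse isoform rows into one (chrom, txsMin, txsMax) per gene symbol.

    gene_rows: iterable of (symbol, chrom, txStart, txEnd) tuples
    (an assumed layout, not the actual knownGene schema).
    """
    extents = {}
    for symbol, chrom, tx_start, tx_end in gene_rows:
        if symbol not in extents:
            extents[symbol] = (chrom, tx_start, tx_end)
            continue
        first_chrom, lo, hi = extents[symbol]
        if chrom != first_chrom:
            continue  # assumption: ignore isoforms mapped to another chromosome
        extents[symbol] = (first_chrom, min(lo, tx_start), max(hi, tx_end))
    return extents


def split_by_type(lines):
    """Group data lines by the TYPE flag in the first column."""
    groups = defaultdict(list)
    for line in lines:
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # skip blank lines and the "TYPE Field1 Field2" header
        groups[int(fields[0])].append(fields[1:])
    return dict(groups)


def parse_region(text):
    """Turn 'chrN:x-y' into (chrom, start, end)."""
    chrom, span = text.split(":")
    start, end = span.split("-")
    return chrom, int(start), int(end)


# Illustrative isoform rows; only the overall min/max are from the example.
rows = [
    ("NRG1", "chr8", 31616809, 32700000),
    ("NRG1", "chr8", 32500000, 32741615),
]
extents = gene_extents(rows)

groups = split_by_type([
    "TYPE Field1 Field2",
    "1 NRG1",
    "2 chr8:31616809-32741615",
    "3 rs1234 rs5678",
])
gene_regions = [extents[fields[0]] for fields in groups[1]]
typed_regions = [parse_region(fields[0]) for fields in groups[2]]
```

Types 3 and 4 would need a lookup against the SNP and STS marker tables to turn each pair of IDs into a from-to region; that step is left out here.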
2. When the user selects an area on the genome overview page, the tool
goes to the genome browser set on that area. Some suggestions for the
user-defined tracks at the top:
a) Show the baseline for each user track (see Slide 2).
b) If the user wants the scores to be connected, put a little tick at
the location of each marker. Otherwise it is hard to know the marker
density in a region. See Slide 2.
c) Allow the user to select whether he/she wants the points connected or
indicated by a vertical line (i.e., the -log(pvalue) for a SNP at that
point). See Slide 3.
d) If full display is selected, display the SNP name on the user track.
Clicking on that SNP goes to the appropriate page about that SNP (as is
done under the current SNP track).
3. An exceptionally useful feature is the merging that occurs (for rs
numbers or STS markers). In the output that describes the matching of
results, pls list those that failed to match so we can trouble-shoot.
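For item 3, the kind of report I have in mind could be sketched as follows in Python. This is a hypothetical illustration, not the tool's actual matching code: `match_markers` and the `positions` lookup (marker ID to chromosome and position) are made-up names, and the coordinates are placeholders.

```python
def match_markers(marker_ids, positions):
    """Split uploaded marker IDs into those with known positions and those without.

    positions: dict mapping marker ID -> (chrom, position); an assumed
    stand-in for whatever SNP or STS table the IDs are merged against.
    """
    matched = {}
    unmatched = []
    for marker in marker_ids:
        if marker in positions:
            matched[marker] = positions[marker]
        else:
            unmatched.append(marker)  # keep input order for trouble-shooting
    return matched, unmatched


# Placeholder coordinates, for illustration only.
positions = {"rs1234": ("chr1", 100), "rs5678": ("chr1", 200)}
matched, unmatched = match_markers(["rs1234", "rs9999", "rs5678"], positions)

print(f"{len(matched)} markers matched, {len(unmatched)} failed to match")
for marker in unmatched:
    print("  no match:", marker)
```

Listing the failed IDs one per line would make it easy to paste them back into a lookup and see whether they were merged, retired, or simply mistyped.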
Again, many thanks for writing this tool. It will get a lot of use on
our side.
--
Pat
---- Patrick Sullivan, MD, FRANZCP
---- Ray M. Hayworth & Family Distinguished Professor
---- UNC/Genetics & Carolina Center for Genome Sciences
---- CB#7264, 103 Mason Farm Road
---- Neuroscience Research Building, Room 4109D
---- Chapel Hill, NC, 27599-7264, USA
---- V: +919-966-3358 F: +919-966-3630