[Genome] how many chr location I may have in custom wig file?
Khrebtukova, Irina
ikhrebtukova at illumina.com
Fri Jun 8 07:18:34 PDT 2007
thanks again! - that helped!
Irina
-----Original Message-----
From: Hiram Clawson [mailto:hiram at soe.ucsc.edu]
Sent: Thursday, June 07, 2007 8:22 PM
To: Khrebtukova, Irina
Cc: genome at soe.ucsc.edu
Subject: Re: [Genome] how many chr location I may have in custom wig file?
Good Afternoon Irina:
The difficulty is due to the type of input format. The four-column
input format is the most inefficient method of input for the wiggle
tracks. If you convert your data input into either fixedStep or
variableStep format, you will be able to load much more data.
As mentioned on this help page:
http://genome.ucsc.edu/goldenPath/help/wiggle.html
> In many cases this format will be too verbose to conveniently specify
> a large number of data values. Two other methods of specifying
> data were developed to be more space efficient.
We'll try to improve this warning to be much more explicit about the
inefficiency of this four-column data format.
What's happening is that each of your lines is being converted into
(chromEnd-chromStart) number of data points. There is a limit of
300,000,000 data points. Each of your input lines specifies 10,000 data
points, thus your line count input limit would be approximately 30,000 =
(300,000,000 / 10,000)
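
As a rough check of that arithmetic, here is a minimal Python sketch. The 300,000,000 figure is the limit quoted above; the function name and the constant name are made up for illustration and are not part of any UCSC tool.

```python
# Limit quoted in this message: about 300,000,000 data points per track.
# DATA_POINT_LIMIT and max_four_column_lines are hypothetical names.
DATA_POINT_LIMIT = 300_000_000


def max_four_column_lines(chrom_start: int, chrom_end: int,
                          limit: int = DATA_POINT_LIMIT) -> int:
    """Estimate how many four-column lines of this width fit under the limit.

    Each four-column line is expanded into (chromEnd - chromStart) data points.
    """
    points_per_line = chrom_end - chrom_start
    return limit // points_per_line


# First line of the example track: chr8 104060000 104070000 3
print(max_four_column_lines(104060000, 104070000))  # 30000
```

Wider windows use up the budget faster: at 10,000 bases per line the estimate is about 30,000 lines.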
Since your data does appear to be regular, it can be fixed step:
fixedStep chrom=chr8 start=104060001 step=10000 span=10000
3
2
1
0
2
-1
... etc ...
This type of data input only consumes one byte of storage for each data
value, and you can input 300,000,000 of these types of data points,
thereby covering 10,000 (span) * 300,000,000 genome bases (==
3,000,000,000,000 bases, roughly 1,000X human genome size). Note the 1-relative
start coordinate for fixedStep and variableStep, unlike bed chromStart
coordinates which are 0-relative.
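
A conversion along these lines could be sketched in Python as follows. This is only an illustration, not a UCSC utility: the function name and the row layout are assumptions, and it assumes the input is sorted and made of equal-width windows. It starts a new fixedStep declaration whenever the chromosome changes, a gap appears, or the window width changes, and it adds 1 to the 0-relative chromStart to get the 1-relative start.

```python
from typing import Iterable, Iterator, Tuple

# One four-column row: (chrom, chromStart, chromEnd, value).
# chromStart is 0-relative, as in bed.
Row = Tuple[str, int, int, str]


def four_column_to_fixed_step(rows: Iterable[Row]) -> Iterator[str]:
    """Yield fixedStep lines for sorted, equal-width four-column rows.

    Hypothetical helper for illustration only.
    """
    prev_chrom = None
    prev_end = None
    prev_span = None
    for chrom, start, end, value in rows:
        span = end - start
        # A new declaration line is needed when the run of contiguous,
        # same-width windows on one chromosome is broken.
        if chrom != prev_chrom or start != prev_end or span != prev_span:
            # fixedStep start is 1-relative, so add 1 to chromStart.
            yield f"fixedStep chrom={chrom} start={start + 1} step={span} span={span}"
        yield value
        prev_chrom, prev_end, prev_span = chrom, end, span


rows = [
    ("chr8", 104060000, 104070000, "3"),
    ("chr8", 104070000, 104080000, "2"),
    ("chr8", 104080000, 104090000, "1"),
]
for line in four_column_to_fixed_step(rows):
    print(line)
```

For the three rows shown, this prints one declaration line (start=104060001, step and span of 10000) followed by the values 3, 2 and 1, one per line.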
--Hiram
Khrebtukova, Irina wrote:
> Hi,
>
> am I right that if I'm making custom track type=wiggle_0 in bed format
> with 2 colors (for positive and negative values), like say:
>
> track type=wiggle_0 name="Whatever" visibility=full autoScale=off
> color=200,100,0 altColor=0,100,200
> chr8 104060000 104070000 3
> chr8 104070000 104080000 2
> chr8 104080000 104090000 1
> chr8 104090000 104100000 0
> chr8 104100000 104110000 2
> chr8 104110000 104120000 -1
> chr8 104120000 104130000 3
> chr8 104130000 104140000 3
> chr8 104140000 104150000 3
> chr8 104150000 104160000 2
> chr8 104160000 104170000 2
>
> am I right that in this case I can NOT put more than 25K chromosomal
> locations into one track? (it seems exactly like this...)
>
> though it's a bit strange because I was able so far to put MUCH more
> chromosomal locations in simple BED tracks (just chr start end) as
> well as simple WIG tracks (variableStep but fixed small span).
>
> is this your policy or I'm doing something wrong?
>
> thanks! I appreciate your help and needless to say (I think it's
> obvious) that your browser is the best! (I doubt anyone could
> argue...)
>
> Irina Khrebtukova, PhD
> Sr. Staff Bioinformatics Scientist
> Illumina Inc.
> 25861 Industrial Blvd.,
> Hayward, CA 94545
> ph: 510-723-9219
> ikhrebtukova at illumina.com
>
>
> _______________________________________________
> Genome maillist - Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome
>