[Genome] short mm8 chr1 and 2?
Galt Barber
galt at soe.ucsc.edu
Tue May 1 10:23:39 PDT 2007
Smart HTTP clients can auto-resume an interrupted
download because HTTP/1.1 supports range requests, which
can ask for any part of the file.
With HTTP, wget -c does this: it "continues" a partial
download instead of starting over.
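
A rough sketch of what "continue" means at the HTTP level, assuming a
partial chr1.fa.gz is already on disk. The helper name range_header_for is
made up for illustration and is not part of wget; it only prints the Range
header a resuming client would send for the bytes it still needs, and it
never contacts the server. The file URL in the comment is an assumption
built from the download directory plus the file name.

```shell
# Hypothetical helper (not part of wget): print the Range header that a
# resuming HTTP/1.1 client would send for a partially downloaded file.
range_header_for() {
    # $(( ... )) strips the padding that some wc implementations add.
    have=$(( $(wc -c < "$1") ))
    printf 'Range: bytes=%s-\n' "$have"
}

# Typical use (left as comments because the second line touches the network;
# the URL is assumed, not confirmed):
#   range_header_for chr1.fa.gz
#   wget -c 'http://hgdownload.cse.ucsc.edu/goldenPath/mm8/chromosomes/chr1.fa.gz'
```

In practice wget -c works out the offset itself, so the helper is only
there to show what is being asked of the server.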
-Galt
On Mon, 30 Apr 2007, Rachel Harte wrote:
> Hello Ben,
>
> It may be that if you are using Windows and HTTP, the files are not
> being completely downloaded. Try using wget and FTP instead, e.g. for chr1:
> wget 'ftp://hgdownload.cse.ucsc.edu/apache/htdocs/goldenPath/mm8/chromosomes/chr1.fa.gz' -O chr1.fa.gz
>
> --18:54:24--
> ftp://hgdownload.cse.ucsc.edu/apache/htdocs/goldenPath/mm8/chromosomes/chr1.fa.gz
> => `chr1.fa.gz'
> Resolving hgdownload.cse.ucsc.edu... 128.114.119.140
> Connecting to hgdownload.cse.ucsc.edu|128.114.119.140|:21... connected.
> Logging in as anonymous ... Logged in!
> ==> SYST ... done. ==> PWD ... done.
> ==> TYPE I ... done. ==> CWD /apache/htdocs/goldenPath/mm8/chromosomes
> ... done.
> ==> PASV ... done. ==> RETR chr1.fa.gz ... done.
> Length: 62,819,741 (60M) (unauthoritative)
>
> 100%[====================================>] 62,819,741 13.69M/s ETA
> 00:00
>
> 18:54:28 (14.16 MB/s) - `chr1.fa.gz' saved [62819741]
>
> then
> >sum chr1.fa.gz
> 52962 61348
>
> For chr2:
> --18:58:03--
> ftp://hgdownload.cse.ucsc.edu/apache/htdocs/goldenPath/mm8/chromosomes/chr2.fa.gz
> => `chr2.fa.gz'
> Resolving hgdownload.cse.ucsc.edu... 128.114.119.140
> Connecting to hgdownload.cse.ucsc.edu|128.114.119.140|:21... connected.
> Logging in as anonymous ... Logged in!
> ==> SYST ... done. ==> PWD ... done.
> ==> TYPE I ... done. ==> CWD /apache/htdocs/goldenPath/mm8/chromosomes
> ... done.
> ==> PASV ... done. ==> RETR chr2.fa.gz ... done.
> Length: 58,461,619 (56M) (unauthoritative)
>
> 100%[====================================>] 58,461,619 14.54M/s ETA
> 00:00
>
> 18:58:07 (14.51 MB/s) - `chr2.fa.gz' saved [58461619]
>
> then
> > sum chr2.fa.gz
> 24876 57092
>
> If your result does not look like the output above then the file is not
> being downloaded completely.
>
> I hope that this helps you. Please let us know if you have further
> questions.
>
> Rachel
>
> Rachel Harte
> UCSC Genome Bioinformatics Group
> http://genome.ucsc.edu
>
>
> On Mon, 30 Apr 2007, Ben Gantner wrote:
>
> > Hi there:
> >
> > Sorry for the bother but I've tried repeatedly to download
> > the mm8 release of the mouse genome from your site. I'm
> > downloading gzipped files from the following page:
> > http://hgdownload.cse.ucsc.edu/goldenPath/mm8/chromosomes/
> >
> > When I decompress the files I count the total sequence length
> > per chromosome (minus headers) and I get back the correct
> > lengths for most of the chromosomes as described on your
> > site:
> > http://genome.ucsc.edu/cgi-bin/hgTracks?hgsid=86540626&chromInfoPage=
> >
> > However, chr1 and chr2 show up short, and I'm missing a
> > significant amount of sequence:
> > chr1 get : 115212860, expect : 197,069,962
> > chr2 get : 102088474, expect : 181,976,762
> >
> > I've looked through the site FAQ and can't find anything
> > about this, any suggestions/ideas??
> >
> > Thanks so much,
> > Ben
> > _________________________
> > Ben Gantner
> > University of Chicago
> > Singh Lab, CIS Rm W519
> > 929 East 57th Street
> > Chicago, IL 60637
> > (Ph) 773.702.2912
> > _______________________________________________
> > Genome maillist - Genome at soe.ucsc.edu
> > http://www.soe.ucsc.edu/mailman/listinfo/genome
> >
>
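
For anyone checking their own copy, here is a minimal sketch of the two
checks discussed in this thread: whether the gzip file is complete, and
how many sequence characters it holds once the FASTA headers are excluded.
The function names gz_is_complete and count_bases are hypothetical, and
the sketch assumes gzip, grep, tr and wc are on the PATH.

```shell
# Hypothetical helper: succeed only if the gzip stream is complete.
# gzip -t tests integrity without writing out the decompressed data.
gz_is_complete() {
    gzip -t "$1" 2>/dev/null
}

# Hypothetical helper: count sequence characters in a gzipped FASTA file,
# skipping header lines (those starting with '>') and line endings.
count_bases() {
    echo $(( $(gzip -dc "$1" | grep -v '^>' | tr -d '\r\n' | wc -c) ))
}

# Typical use:
#   gz_is_complete chr1.fa.gz && count_bases chr1.fa.gz
# For mm8 chr1 the count should match the 197,069,962 expected above.
```

A truncated download will normally fail the gzip -t test, which is quicker
than decompressing and counting, but the count is the check that ties back
to the chromosome sizes listed on the site.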