[Genome] smaller size of mm8 vs hg18 all chain data from genome browser

Yiming Zhou yzhou at saturn.med.nyu.edu
Tue Mar 6 06:07:27 PST 2007


Dear Colleagues,

I downloaded all chain data for mm8 vs hg18 from
http://hgdownload.cse.ucsc.edu/goldenPath/mm8/vsHg18/ (ftp_data)
and from
http://genome.ucsc.edu/cgi-bin/hgTables?command=start (web_data)
with the following parameters:
1. clade: Vertebrate
2. genome: Mouse
3. assembly: Feb 2006
4. group: Comparative Genomics
5. track: Human Chain
6. table: chainHg18
7. region: genome
8. output format: all fields from selected table
9. all other parameters were default ones.

I think these two data sets represent the same thing: all chain data from
the mouse/human whole-genome pairwise alignment.

My question is why the sizes of the two data sets differ. I know the two
formats are different. Since ftp_data uses a more compact format, I expected
ftp_data to be the smaller one, but ftp_data is 684M while web_data is 259M.
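
Since the formats differ, comparing record counts may be more informative
than comparing bytes. Below is a minimal sketch of such a check. It assumes
ftp_data is in UCSC chain format (a line starting with "chain" opens each
chain, followed by its alignment block lines) and that web_data is a
tab-separated dump whose header line starts with "#". The function names and
the sample records are made up for illustration and are not taken from either
download.

```python
import io


def summarize_chain(lines):
    """Count chain header lines and alignment block lines in chain-format text."""
    chains = 0
    blocks = 0
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # a blank line separates consecutive chains
        if fields[0] == "chain":
            chains += 1  # header line: score, target and query coordinates, id
        else:
            blocks += 1  # block line: "size dt dq", or a lone "size" at the end
    return chains, blocks


def count_table_rows(lines):
    """Count data rows in a tab-separated dump, skipping '#' header lines."""
    return sum(1 for line in lines if line.strip() and not line.startswith("#"))


# Made-up sample data (assumption), only to show the shape of each format.
sample_chain = """chain 4900 chrY 58368225 + 25985403 25985638 chr5 151006098 - 43257292 43257528 1
9 1 0
10 0 5
61 4 0
16

chain 1000 chr1 1000 + 0 50 chr2 2000 + 10 60 2
50
"""
sample_table = "#bin\tscore\ttName\n585\t4900\tchrY\n585\t1000\tchr1\n"

chains, blocks = summarize_chain(io.StringIO(sample_chain))
rows = count_table_rows(io.StringIO(sample_table))
print(chains, blocks, rows)
```

On the real files, the same functions could be given open file handles
instead of the in-memory samples. If the number of chains matches the number
of table rows, the two data sets would cover the same chains and the size
difference would come from how much detail each format stores per chain.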

I am wondering what makes ftp_data bigger than web_data. I am going to use the
data to define conserved transcription factor binding sites. Could you
suggest which data set is better for that purpose? Thank you very much.

Best Regards,
Yiming
