[Genome] mirroring from BSC
Brooke Rhead
rhead at soe.ucsc.edu
Wed Aug 8 13:40:38 PDT 2007
Hello Alexis,
Thank you for your interest in mirroring the Genome Browser. The
instructions on our mirror site
(http://genome.ucsc.edu/admin/mirror.html) should be complete and
up-to-date. Once you have downloaded our source tree, be sure to look
in the directory "src/product" for various README files. These files
contain detailed instructions on setting up and using various parts of
the Browser.
> Do you have in mind future changes on naming protocol for genome
> files? full name, abbreviation based on directory name, uniformity
> maske vs hardmask, etc...
The differences you see in our naming conventions are mostly historical.
We originally had only chrom*.zip files, because at one point we only
had assemblies that were chromosome-based (not scaffold-based), and the
zip format was considered more universally usable. We later switched to
the .gz (or .tar.gz) format once we decided that it was well enough
supported across platforms and provided better compression than .zip.
The "chrom" name vs. the "database" name differences started when we
began displaying genomes that had not yet been assembled into
chromosomes. Usually these genomes are assembled into scaffolds, as is
the case with bosTau2 and anoCar1. Naming the files "chrom" does not
make much sense in those cases, so we substitute the database name instead.
In the future, we generally expect to use the .gz compression instead of
.zip (although sometimes .zip is still used on assemblies whose previous
versions used this format). We expect scaffold-based assemblies to have
names like "database.*.gz" and chromosome-based assemblies to have names
like "chrom*.gz".
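For an automated update process, the naming convention described above could be checked with something like the following. This is only a minimal sketch and not part of any UCSC tool: the names `classify` and `BigZipName` are made up for illustration, and it assumes that every file whose name does not start with "chrom" is named after its database, which is a simplification of the historical naming.

```python
from typing import NamedTuple, Optional


class BigZipName(NamedTuple):
    """Hypothetical record describing one file name under bigZips/."""
    layout: str                  # "chromosome" or "scaffold"
    database: Optional[str]      # e.g. "anoCar1"; None for chrom* files
    compression: Optional[str]   # "tar.gz", "gz", "zip", or None


def classify(path: str) -> BigZipName:
    """Guess the assembly layout and compression from a download path.

    Assumption: names starting with "chrom" belong to chromosome-based
    assemblies; any other name starts with the database name followed
    by a dot (scaffold-based assemblies).
    """
    name = path.rsplit("/", 1)[-1]  # keep only the file name
    if name.startswith("chrom"):
        layout, database = "chromosome", None
    else:
        layout, database = "scaffold", name.split(".", 1)[0]

    compression = None
    # Check ".tar.gz" before ".gz" so tarballs are not reported as plain gz.
    for suffix in (".tar.gz", ".gz", ".zip"):
        if name.endswith(suffix):
            compression = suffix.lstrip(".")
            break
    return BigZipName(layout, database, compression)


if __name__ == "__main__":
    for example in (
        "Anolis_carolinensis/bigZips/anoCar1.fa.masked.gz",
        "Anopheles_gambiae/bigZips/chromFaMasked.zip",
        "Canis_familiaris/bigZips/chromFaMasked.tar.gz",
        "Bos_taurus/bigZips/bosTau2.hardmask.fa.gz",
    ):
        print(example, classify(example))
```

The example paths are the ones from your message. A real update script would probably need extra cases for other files in the same directories.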
> About mirroring UCSC genomes and offer such mirroring
> through our website, there is some needed constraint to accomplish,
> something we must know? may we mirror only the databases without
> Genome-Browser?
The full Genome Browser is freely available to mirror for non-commercial
organizations. You can mirror as much or as little as you would like.
Our FAQs on the topic are located here:
http://genome.ucsc.edu/FAQ/FAQlicense.
Note that for future mirror-related questions, we have a different
mailing list specific to mirroring the Browser, at
genome-mirror at soe.ucsc.edu.
I hope this information helps. Good luck with your work.
--
Brooke Rhead
UCSC Genome Bioinformatics Group
Alexis Torrano wrote:
>
> Hello
>
> I am Alexis Torrano. I am mailing you from INB-BSC (National Institute in
> BioInformatics-Barcelona Supercomputing Center) asking for some advice.
> Our researchers make an important use of your databases. And we are
> interested in keeping a mirroring of such genome databases for them.
> We plan to add an update process of your databases to our authomatical
> database update process.
> Mainly we will follow the indications appearing on your website about
> rsync. We'd like to know if we should take into account some other issue
> which could make easier the update.
>
>
> We have observed that some files are of the kind *fa.gz, ga.masked,gz and
> others like *Fa.zip, *FaMasked.zip, hardmask.fa.gz.
>
> Also, there are some files that match in someway their specie directory,
> some others do not.
>
> Anolis_carolinensis/bigZips/anoCar1.fa.masked.gz
> Anopheles_gambiae/bigZips/chromFaMasked.zip
> Canis_familiaris/bigZips/chromFaMasked.tar.gz
> Bos_taurus/bigZips/bosTau2.hardmask.fa.gz
>
> Do you have in mind future changes on naming protocol for genome files?
> full name, abbreviation based on directory name, uniformity maske vs
> hardmask, etc...
>
> About mirroring UCSC genomes and offer such mirroring
> through our website, there is some needed constraint to accomplish,
> something we must know? may we mirror only the databases without
> Genome-Browser?
>
>
> thank you very much.
>
> Alexis Torrano.--
> -----------------------------------------------------
> Alexis Torrano Martinez
> e-mail: atorrano at bsc.es, atorrano at lsi.upc.edu
>
> Nodo Computacional GNHC-1 (inb.bsc.es)
> Instituto Nacional de Bioinformatica (www.inab.org)
> Barcelona Supercomputing Center Node (www.bsc.es)
> BSC-CNS (www.bsc.es)
> c/. Jordi Girona 29
> Edifici Nexus II, despatx 1B Tel: (+34) 93 413 7605
> E-08034 Barcelona Fax: (+34) 93
> Catalunya (Spain)
> Team info: http://inb.lsi.upc.edu/
> -----------------------------------------------------
> Berlin's Law of Computing - Computers don't do what you ask them to do,
> they do what you tell them to do. Named for Dean Berlin, noted observer
> of reality.
> _______________________________________________
> Genome maillist - Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome