From ann at soe.ucsc.edu Tue Apr 1 12:19:23 2008
From: ann at soe.ucsc.edu (Ann Zweig)
Date: Tue, 01 Apr 2008 12:19:23 -0700
Subject: [Genome-mirror] releasing new cow assembly
Message-ID: <47F28ABB.8000108@soe.ucsc.edu>

Hello mirror sites,

Later this week, we are intending to release the newest cow assembly:
bosTau4. Please be prepared to host the following new data:

    bosTau4 MySQL database:     17 G
    files in /gbdb/bosTau4/*:   15 G

Additionally there are net and chain tables from other organisms that
will be released to the respective annotation databases. These tables
are named netBosTau4 and *chainBosTau4*. The size of these tables is as
follows:

    hg18 database:      4.4 G
    canFam2 database:   3.6 G
    mm9 database:       2.6 G
    ornAna1 database:   1.6 G

Regards,

----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

From ann at soe.ucsc.edu Tue Apr 8 14:24:39 2008
From: ann at soe.ucsc.edu (Ann Zweig)
Date: Tue, 08 Apr 2008 14:24:39 -0700
Subject: [Genome-mirror] drop some tables
Message-ID: <47FBE297.7030106@soe.ucsc.edu>

Hello Mirror Sites,

Today we released the chains and nets from other databases to the
newly-released bosTau4 assembly. Consequently, after your next database
rsync, you can safely drop some tables from the hg18 and mm9 databases:

    *chainBosTau3
    *chainBosTau3Link
    netBosTau3

In the mm9 database, this is 71 tables. In hg18, 99 tables.

Regards,

----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

From hooverdm at helix.nih.gov Thu Apr 10 08:10:56 2008
From: hooverdm at helix.nih.gov (David Hoover)
Date: Thu, 10 Apr 2008 11:10:56 -0400
Subject: [Genome-mirror] hgcentral access
Message-ID: <47FE2E00.6030204@helix.nih.gov>

I can no longer access the hgcentral database through the
genome-mysql.cse.ucsc.edu server, and the flat file download at
http://hgdownload.cse.ucsc.edu/admin/hgcentral.sql is not regularly kept
up to date. Is there another way to keep up to date on changes to
hgcentral?

Thanks,
David Hoover, Helix Systems, CIT/NIH

From ann at soe.ucsc.edu Thu Apr 10 16:31:01 2008
From: ann at soe.ucsc.edu (Ann Zweig)
Date: Thu, 10 Apr 2008 16:31:01 -0700
Subject: [Genome-mirror] releasing large amount of data
Message-ID: <47FEA335.3080807@soe.ucsc.edu>

Hello mirror sites,

Today we have released a relatively large amount of data corresponding
with the hg17 and hg18 databases and data files. This is in support of
the new Affy ENCODE EC track.

    hg17 and hg18 databases total:     0.5 GB
    /gbdb/hg17 and /gbdb/hg18 total:   2.7 GB

Please plan accordingly during your next rsync.

Regards,

----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

From breannemackenzie at yahoo.com Fri Apr 11 10:48:48 2008
From: breannemackenzie at yahoo.com (BreAnne MacKenzie)
Date: Fri, 11 Apr 2008 10:48:48 -0700 (PDT)
Subject: [Genome-mirror] making lists of gene locations for acgh design
Message-ID: <759850.61937.qm@web51704.mail.re2.yahoo.com>

Hello,

I work at the University of Minnesota and I am designing an Array
Comparative Genome Hybridization plate to compare genome variation in
our mutants which have variable phenotypes. We want to test only certain
regions of the genome such as smad and apoptosis pathways.

Here is what I need my data to look like:

    Gene and Chromosomal Location (two columns in excel for example)
    Bmp4    chr14: 4700000-4800000

Then I will take that data and list it so it reads:

    Chr14:4700000-4800000|ChrX:XXXXXXX-XXXXXXXX|ChrX:XXXXXXXX-XXXXXXXXXX .....

Is there a faster way of doing this with the mouse genome browser?
Rather than looking up individual genes, I'd like to get them in a list
with the entire signaling pathway. These chromosomal locations then
allow us to make probes in these areas.

Any ideas?

Thanks a lot for your help!

BreAnne MacKenzie
612-626-2962

From pauline at soe.ucsc.edu Fri Apr 11 13:47:09 2008
From: pauline at soe.ucsc.edu (Pauline Fujita)
Date: Fri, 11 Apr 2008 13:47:09 -0700
Subject: [Genome-mirror] [Re: making lists of gene locations for acgh design]]
Message-ID: <47FFCE4D.2050701@soe.ucsc.edu>

Hello BreAnne,

Our Table Browser tool will format the data as you're describing. For an
overview of the different features offered in the Table Browser you can
consult this help doc:

http://genome.cse.ucsc.edu/goldenPath/help/hgTablesHelp.html

For your specific situation you can go to the Table Browser and select
your assembly of interest. Then select:

    group: Genes and Gene Prediction Tracks
    track: RefSeq Genes
    table: refGene
    region: (select genome)
    output format: select fields from primary and related tables

In the "identifiers (names/accessions):" section, click "paste list";
you can then paste a list of genes/regions of interest into the input
box that appears and click "submit". If you do not already have a list
of genes you can generate one by searching in the Genome Browser (i.e.,
you can do a search for "smad") and pasting this list into the Table
Browser.

Back at the Table Browser menu click on "get output". With "output
format" set to "select fields from primary and related tables" this will
take you to a table where you can select the fields (columns) you want
in your output (in this case: name, chrom, cdsStart, cdsEnd, and name2).
Then simply click "get output" and you should see your data returned as
text.

Hopefully this information was helpful and answers your question. If you
have further questions or require clarification feel free to contact the
mailing list at genome at soe.ucsc.edu.

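As an illustration only, the text returned by the steps above could be
turned into the pipe-separated region string from the original question
with a short script. This is a minimal sketch, not a Table Browser
feature. It assumes the output is tab-separated, that the columns appear
in the order name, chrom, cdsStart, cdsEnd, name2, and that any header
line starts with "#". The function name and the sample rows (including
the NM_EXAMPLE accessions) are hypothetical.

```python
import csv
import io


def regions_from_table(text: str) -> str:
    """Join chrom:start-end regions from tab-separated rows with '|'.

    Assumed column order: name, chrom, cdsStart, cdsEnd, name2.
    Blank lines and lines starting with '#' are skipped.
    """
    regions = []
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        _name, chrom, cds_start, cds_end = row[:4]
        regions.append(f"{chrom}:{cds_start}-{cds_end}")
    return "|".join(regions)


# Hypothetical sample rows; the Bmp4 coordinates are the placeholder
# values from the question, not real annotation data.
sample = (
    "#name\tchrom\tcdsStart\tcdsEnd\tname2\n"
    "NM_EXAMPLE1\tchr14\t4700000\t4800000\tBmp4\n"
    "NM_EXAMPLE2\tchrX\t100\t200\tGeneB\n"
)
print(regions_from_table(sample))
```

If the selected fields or their order differ, the unpacking line would
need to be adjusted to match.
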
Regards,

Pauline Fujita
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

-------- Original Message --------
Subject: [Genome-mirror] making lists of gene locations for acgh design
Date: Fri, 11 Apr 2008 10:48:48 -0700 (PDT)
From: BreAnne MacKenzie
To: genome-mirror at soe.ucsc.edu

Hello,

I work at the University of Minnesota and I am designing an Array
Comparative Genome Hybridization plate to compare genome variation in
our mutants which have variable phenotypes. We want to test only certain
regions of the genome such as smad and apoptosis pathways.

Here is what I need my data to look like:

    Gene and Chromosomal Location (two columns in excel for example)
    Bmp4    chr14: 4700000-4800000

Then I will take that data and list it so it reads:

    Chr14:4700000-4800000|ChrX:XXXXXXX-XXXXXXXX|ChrX:XXXXXXXX-XXXXXXXXXX .....

Is there a faster way of doing this with the mouse genome browser?
Rather than looking up individual genes, I'd like to get them in a list
with the entire signaling pathway. These chromosomal locations then
allow us to make probes in these areas.

Any ideas?

Thanks a lot for your help!

BreAnne MacKenzie
612-626-2962

_______________________________________________
Genome-mirror mailing list
Genome-mirror at soe.ucsc.edu
http://www.soe.ucsc.edu/mailman/listinfo/genome-mirror

From kayla at soe.ucsc.edu Fri Apr 11 16:07:59 2008
From: kayla at soe.ucsc.edu (Kayla Smith)
Date: Fri, 11 Apr 2008 16:07:59 -0700
Subject: [Genome-mirror] hgcentral access
In-Reply-To: <47FE2E00.6030204@helix.nih.gov>
References: <47FE2E00.6030204@helix.nih.gov>
Message-ID: <47FFEF4F.4070906@cse.ucsc.edu>

Hello David,

I looked at the directory you've mentioned, and the files are dated
4/4/08. This is the date of our most recent bi-weekly cgi-release. So I
believe we are on track as far as that is concerned.

Please let us know if we can assist you further.

Kayla Smith
UCSC Genome Bioinformatics Group

David Hoover wrote:
> I can no longer access the hgcentral database through the
> genome-mysql.cse.ucsc.edu server, and the flat file download at
> http://hgdownload.cse.ucsc.edu/admin/hgcentral.sql is not regularly kept
> up to date. Is there another way to keep up to date on changes to
> hgcentral?
>
> Thanks,
> David Hoover, Helix Systems, CIT/NIH

From ann at soe.ucsc.edu Thu Apr 17 09:46:28 2008
From: ann at soe.ucsc.edu (Ann Zweig)
Date: Thu, 17 Apr 2008 09:46:28 -0700
Subject: [Genome-mirror] another 3.9 GB of data
Message-ID: <48077EE4.4060705@soe.ucsc.edu>

Hello Mirror Sites,

Tomorrow we will be releasing 3.9 GB of data into the ce4 (C. elegans)
database. Please plan accordingly during your next rsync.

Regards,

----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

From rhead at soe.ucsc.edu Thu Apr 17 11:46:31 2008
From: rhead at soe.ucsc.edu (Brooke Rhead)
Date: Thu, 17 Apr 2008 11:46:31 -0700
Subject: [Genome-mirror] get ready for marmoset
Message-ID: <48079B07.9080600@soe.ucsc.edu>

Hello Mirror sites,

We are planning to release a new browser for marmoset, calJac1, next
week. Please be prepared to host the following data:

    calJac1 MySQL tables:    47 G
    /gbdb/calJac1/* files:   41 G

We will also be releasing net and chain tracks to calJac1 in the
following assembly databases. Here are the approximate sizes of the
netCalJac1 and %chainCalJac1% tables in each database:

    hg18:       6 G
    mm9:        3 G
    monDom4:    8 G
    ornAna1:    1 G
    panTro2:    6 G
    rheMac2:    4 G
    -------------
    total:    ~28 G

Please let us know if you have any questions or concerns.

--
Brooke Rhead
UCSC Genome Bioinformatics Group

From ann at soe.ucsc.edu Thu Apr 24 11:45:32 2008
From: ann at soe.ucsc.edu (Ann Zweig)
Date: Thu, 24 Apr 2008 11:45:32 -0700
Subject: [Genome-mirror] releasing 8GB GO database
Message-ID: <4810D54C.9080002@soe.ucsc.edu>

Hello Mirror Sites,

A heads-up that either tomorrow or early next week, we will be releasing
a new GO database: go080130. This will replace the existing go database:
go070111. This provides an up-to-date link to AmiGO Gene Ontology.

The size of this new database is ~8 GB. The size of the old go database
that this one replaces is ~4 GB.

Regards,

----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

From martin.hemberg at childrens.harvard.edu Wed Apr 30 08:16:46 2008
From: martin.hemberg at childrens.harvard.edu (Martin Hemberg)
Date: Wed, 30 Apr 2008 11:16:46 -0400
Subject: [Genome-mirror] genome browser download
Message-ID: <48188D5E.7050907@childrens.harvard.edu>

Dear Sir or Madam.

I am trying to download and install the UCSC genome browser and I am
following the instructions on the page:

http://genome.ucsc.edu/admin/mirror.html

So far I've reached step 4 where I run the command:

    rsync -avzP --delete --max-delete=20 \
        rsync://hgdownload.cse.ucsc.edu/gbdb/ /gbdb/

and according to the instructions, this should download 400 Gb worth of
data to my hard drive. However, the download is currently at 640 Gb and
I wonder how much more it is going to fetch before it is done?

Best Regards

Martin Hemberg

--
Martin Hemberg, PhD
Post-doc, Kreiman Lab
Department of Ophthalmology and Program in Neurobiology
Children's Hospital Boston, Harvard Medical School

From hiram at soe.ucsc.edu Wed Apr 30 10:00:56 2008
From: hiram at soe.ucsc.edu (Hiram Clawson)
Date: Wed, 30 Apr 2008 10:00:56 -0700
Subject: [Genome-mirror] genome browser download
In-Reply-To: <48188D5E.7050907@childrens.harvard.edu>
References: <48188D5E.7050907@childrens.harvard.edu>
Message-ID: <4818A5C8.9050108@soe.ucsc.edu>

Good Morning Martin:

If you run the rsync command with a --dry-run option before actually
fetching the data, you can see how much data it is going to transfer.
When I try this here today, it says:

    total size is 1025747436176

Which appears to be about 1 Tb. Please pardon our out-of-date numbers on
that mirror instruction page. We should be more clear there how to
identify the exact amount on a day by day basis. The data comes so fast
and furious these days, it is difficult even for us to keep up.

I would highly recommend trying the minimal browser installation before
attempting to transfer multiple terabytes of data.

http://genomewiki.ucsc.edu/index.php/Minimal_Browser_Installation

Is your mirror site going to be a public resource?

--Hiram

Martin Hemberg wrote:
> Dear Sir or Madam.
>
> I am trying to download and install the UCSC genome browser and I am
> following the instructions on the page:
>
> http://genome.ucsc.edu/admin/mirror.html
>
> So far I've reached step 4 where I run the command:
>
> rsync -avzP --delete --max-delete=20 \
> rsync://hgdownload.cse.ucsc.edu/gbdb/ /gbdb/
>
> and according to the instructions, this should download 400 Gb worth of
> data to my hard drive. However, the download is currently at 640 Gb and
> I wonder how much more it is going to fetch before it is done?
>
> Best Regards
>
> Martin Hemberg
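
The size check described in the reply above can be sketched in a few
lines of Python. This is only an illustration. It assumes the text of an
rsync --dry-run has already been captured, and it looks for the "total
size is" line quoted in the message. Allowing commas inside the number
is an extra assumption about how some rsync versions may print it, and
the function names are hypothetical.

```python
import re


def parse_total_size(rsync_output: str) -> int:
    """Return the byte count from an rsync 'total size is N' line."""
    match = re.search(r"total size is ([\d,]+)", rsync_output)
    if match is None:
        raise ValueError("no 'total size is' line found")
    # Strip thousands separators, if any, before converting.
    return int(match.group(1).replace(",", ""))


def to_terabytes(num_bytes: int) -> float:
    """Convert bytes to decimal terabytes (10**12 bytes)."""
    return num_bytes / 1e12


# The figure quoted in the reply above.
sample = "total size is 1025747436176"
size = parse_total_size(sample)
print(f"{size} bytes is about {to_terabytes(size):.2f} TB")
```

Comparing this figure with the free space on the target disk before
starting the real transfer avoids the surprise described in the original
question.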