[Genome] File documentation

Kayla Smith kayla at soe.ucsc.edu
Fri Apr 4 16:52:43 PDT 2008


Hello Xiaohong,

From http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/

chromFa.zip - The assembly sequence in one file per chromosome.
     Repeats from RepeatMasker and Tandem Repeats Finder (with period
     of 12 or less) are shown in lower case; non-repeating sequence is
     shown in upper case. Repeat masking was done using the following
     RepeatMasker/RepBase versions: RepBase Update 9.11, RM database
     version 20050112. The main assembly is found in the chrN.fa
     files, where N is the name of the chromosome. The chrN_random.fa
     files contain clones that are not yet finished or cannot be placed
     with certainty at a specific place on the chromosome. In some
     cases, including the human HLA region on chromosome 6, the
     chrN_random.fa files also contain haplotypes that differ from the
     main assembly.
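
As a rough illustration of the conventions described above, here is a
minimal Python sketch. It counts lower-case (repeat-masked) and upper-case
(non-repeating) bases in a FASTA stream and labels a file name as main
assembly or "random". The helper names open_fasta, masked_counts and
classify are hypothetical and not part of any UCSC tool. Counting N
separately assumes that runs of N mark gaps. The file-name pattern assumes
only the chrN.fa and chrN_random.fa forms named above, optionally gzipped.

```python
import gzip
import re


def open_fasta(path):
    """Open a FASTA file as text, handling a .gz suffix transparently."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def masked_counts(lines):
    """Return (lower, upper, n) base counts for an iterable of FASTA lines.

    Lower-case letters are repeat-masked sequence and upper-case letters
    are non-repeating sequence. N/n is tallied separately (assumption:
    runs of N mark gaps rather than real sequence).
    """
    lower = upper = n = 0
    for line in lines:
        if line.startswith(">"):  # skip FASTA header lines
            continue
        for ch in line.strip():
            if ch in "Nn":
                n += 1
            elif ch.islower():
                lower += 1
            elif ch.isupper():
                upper += 1
    return lower, upper, n


def classify(filename):
    """Return (chromosome, kind) for chrN.fa / chrN_random.fa names.

    kind is "main" or "random". Names that do not fit either pattern
    return None.
    """
    m = re.fullmatch(r"chr([^_.]+)(_random)?\.fa(\.gz)?", filename)
    if m is None:
        return None
    return m.group(1), "random" if m.group(2) else "main"
```

For a downloaded file this could be used as, for example:
with open_fasta("chr1.fa.gz") as fh: lower, upper, n = masked_counts(fh)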

Please also see this FAQ:

http://genome.ucsc.edu/FAQ/FAQdownloads#download10

I hope this information is helpful to you.  Please don't hesitate to 
contact us again if you require further assistance.

Kayla Smith
UCSC Genome Bioinformatics Group


Xiaohong Li wrote:
> Hello. I need help finding the documentation for the individual chromosome files
> (http://hgdownload.cse.ucsc.edu/goldenPath/hg18/chromosomes/).
> 
> Specifically, is there any documentation describing what the files "chr1.fa.gz" and "chr1_random.fa.gz" are?
> 
> Thanks,
> Xiaohong
> 
> 
> _______________________________________________
> Genome maillist  -  Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome


