[Genome] Question for a lower threshold for chain-score in the hg18 2006
Hiram Clawson
hiram at soe.ucsc.edu
Sun Dec 16 22:43:31 PST 2007
Good Evening Robert:
Which set of chains exactly are you referring to? The minimum
score differs depending on which chains they are.
I'm curious which chains you found in the May 2004
assembly but not in the March 2006 assembly. On the newest
assemblies, we keep only the chains to the most recent version
of each other organism. We do not keep the chains that existed
before the minimum score filter was applied. We would like to
verify which chains you mention were filtered at 10,000.
Most chains should be at a filter of 5,000 or 3,000.
You can fetch the chain files directly from the FTP server
directory:
ftp://hgdownload.cse.ucsc.edu/apache/htdocs/goldenPath/hg18/
Note the directory names there beginning with vs,
in the form vs<otherDb>, for example:
ftp://hgdownload.cse.ucsc.edu/apache/htdocs/goldenPath/hg18/vsMm9/
for the Mouse assembly. The file in that directory
hg18.mm9.all.chain.gz contains all the chains.
But these are the chains after the minimum score filter has
been applied. Note also the README.txt files in each of
these directories. They should state the minimum
score filter used in the alignment; for example, Mm9-Hg18
was 3,000.
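
For reference, here is a minimal Python sketch (not a UCSC tool) of how a downloaded .all.chain.gz file could be re-filtered by score once you have it locally. It assumes the standard chain header layout of 13 whitespace-separated fields (chain, score, tName, tSize, tStrand, tStart, tEnd, qName, qSize, qStrand, qStart, qEnd, id). The names ChainHeader, parse_chain_header, filter_chains and filter_chain_file are made up for illustration. Since the distributed files are already filtered, this can only raise the threshold; it cannot recover chains that scored below the filter listed in README.txt.

```python
import gzip
from typing import Iterable, Iterator, NamedTuple


class ChainHeader(NamedTuple):
    """Fields of a 'chain ...' header line (hypothetical helper type)."""
    score: int
    t_name: str
    t_size: int
    t_strand: str
    t_start: int
    t_end: int
    q_name: str
    q_size: int
    q_strand: str
    q_start: int
    q_end: int
    chain_id: str


def parse_chain_header(line: str) -> ChainHeader:
    """Parse one chain header line into named fields."""
    f = line.split()
    if len(f) != 13 or f[0] != "chain":
        raise ValueError(f"not a chain header line: {line!r}")
    return ChainHeader(
        int(f[1]), f[2], int(f[3]), f[4], int(f[5]), int(f[6]),
        f[7], int(f[8]), f[9], int(f[10]), int(f[11]), f[12],
    )


def filter_chains(lines: Iterable[str], min_score: int) -> Iterator[str]:
    """Yield only the lines of chains whose score is >= min_score.

    Each chain is a header line followed by its alignment block lines
    and a blank separator line; the whole group is kept or dropped.
    """
    keep = False
    for line in lines:
        if line.startswith("chain "):
            keep = parse_chain_header(line).score >= min_score
        if keep:
            yield line


def filter_chain_file(path: str, min_score: int) -> Iterator[str]:
    """Stream a gzip-compressed chain file through filter_chains."""
    with gzip.open(path, "rt") as handle:
        yield from filter_chains(handle, min_score)
```

For example, iterating over filter_chain_file("hg18.mm9.all.chain.gz", 5000) would yield only the chains scoring at least 5,000, assuming the file has already been downloaded into the current directory.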
--Hiram
Robert Olinski wrote:
> Dear Madam or Sir at UCSC,
>
> 1) We have a problem with the human chains from the UCSC Genome
> Browser May 2006 hg18. According to the statement "Chains
> scoring below a minimum score of 10,000 were discarded", the
> high threshold discarded information relevant to us on conserved
> repetitive sequences in mammals that was present in the 2004 release.
> Therefore, we would like to know how (and whether) we could obtain all
> human chains with a lower threshold (possibly the ones used in the
> 2004 release of the human genome data) for 2006-human genome data. The
> May 2006 release contains many more species of interest that we want
> to keep for PhastCons scores, but some human chains from the 2004 release
> are simply not displayed at all in the 2006 hg18 Genome Browser.
>
> 2) How can we download all chains displayed in the
> genome browser window at once? Should we use the Table Browser for this
> purpose? If so, would the following format be acceptable to
> simply copy and paste into the Table Browser to get the sequence for
> the given chain, e.g.
> chain 13475906 chr1 247249719 + 120974 741471 chr1 247249719 + 220708072 222295432 494
> or should we skip all coordinates and paste only numbers separated by a space?
> Thanks a lot for help and for all answers!
>
> With best regards from Japan,
> Robert
>
> Robert P Olinski, M.Sc.,PhD
> Graduate School of Bioscience and Biotechnology
> Department of Biological Sciences
> Tokyo Institute of Technology
> 4259 Nagatsuta-cho, Midori-ku, Yokohama
> 226-8501 Japan
> phone:+81-45-924-5744
> fax: +81-45-924-5835
> email: Robert.Olinski at neuro.uu.se
> email 2: robertinuppsala at hotmail.com
>
> _______________________________________________
> Genome maillist - Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome
>