[Genome] Chimpanzee self alignments
Hiram Clawson
hiram at soe.ucsc.edu
Wed Dec 12 12:41:37 PST 2007
Brian Raney wrote:
> Hello JJ,
>
> I posted what we have. Running the net process on the chains
> is not very difficult. You can discover the procedure in the
> Perl script we generally use to build chains and nets
> (kent/src/hg/utils/automation/doBlastzChainNet.pl)
> or in our genome build documentation files which are
> also in the kent source hierarchy
> (kent/src/hg/makeDb/doc/hg18.txt for example).
>
> In general, netting the self-chains may be counterproductive.
> Nets make a lot of sense when comparing two different species,
> but not so much within a single species where
> several possibly overlapping rounds of duplication have occurred.
>
> brian
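If you really do want to run the net process, here is a typical
sequence of commands (csh syntax) starting from the chains.
This example is taken from the hg18 vs. ponAbe2 (human vs. orangutan)
build, so substitute your own assembly names and paths: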
cd /cluster/data/hg18/bed/blastzPonAbe2.2007-10-02/axtChain
# Make nets ("noClass", i.e. without rmsk/class stats which are added later):
chainPreNet hg18.ponAbe2.all.chain.gz /cluster/data/hg18/chrom.sizes /cluster/data/ponAbe2/chrom.sizes stdout \
  | chainNet stdin -minSpace=1 /cluster/data/hg18/chrom.sizes /cluster/data/ponAbe2/chrom.sizes stdout /dev/null \
  | netSyntenic stdin noClass.net
# Make liftOver chains:
netChainSubset -verbose=0 noClass.net hg18.ponAbe2.all.chain.gz stdout \
  | chainStitchId stdin stdout | gzip -c > hg18.ponAbe2.over.chain.gz
# Make axtNet for download: one .axt per hg18 seq.
netSplit noClass.net net
cd ..
mkdir axtNet
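# Note: netToAxt reads per-sequence chain files from axtChain/chain, so
# this assumes they already exist (e.g. made earlier with chainSplit).
foreach f (axtChain/net/*.net)
  netToAxt $f axtChain/chain/$f:t:r.chain \
    /cluster/bluearc/scratch/data/hg18/nib /cluster/bluearc/scratch/data/ponAbe2/ponAbe2.2bit stdout \
    | axtSort stdin stdout \
    | gzip -c > axtNet/$f:t:r.hg18.ponAbe2.net.axt.gz
end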
# Make mafNet for multiz: one .maf per hg18 seq.
mkdir mafNet
foreach f (axtNet/*.hg18.ponAbe2.net.axt.gz)
  axtToMaf -tPrefix=hg18. -qPrefix=ponAbe2. $f \
    /cluster/data/hg18/chrom.sizes /cluster/data/ponAbe2/chrom.sizes \
    stdout \
    | gzip -c > mafNet/$f:t:r:r:r:r:r.maf.gz
end
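
A note for anyone not running csh/tcsh: the $f:t and :r modifiers in the
loops above take the last path component of $f (:t) and then strip one
trailing ".ext" per :r. The following is only a rough sketch of the same
idea in POSIX sh; the function name tail_root is made up for illustration
and is not part of the kent tools.

```shell
#!/bin/sh
# Hypothetical helper (not a kent utility): mimic csh's $f:t followed by
# N applications of :r, i.e. drop the directory, then strip N extensions.
tail_root() {
    name=${1##*/}              # like :t, keep only the last path component
    n=$2
    while [ "$n" -gt 0 ]; do
        name=${name%.*}        # like :r, remove one trailing ".ext" if present
        n=$((n - 1))
    done
    printf '%s\n' "$name"
}

# $f:t:r for a net file gives the sequence name:
tail_root axtChain/net/chr1.net 1
# $f:t:r:r:r:r:r strips .gz, .axt, .net, .ponAbe2 and .hg18 in turn:
tail_root axtNet/chr1.hg18.ponAbe2.net.axt.gz 5
```

Both calls print chr1, which is why the mafNet loop needs five :r modifiers
to get back to the bare sequence name.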