[Genome] Program to calculate AL CpG Islands available?
Donna Karolchik
donnak at soe.ucsc.edu
Wed Nov 8 16:24:49 PST 2006
Hi Shan,
The software used for computing the Andy Law CpG Islands annotation is actually
a combination of two programs. The first is a program in
kent/src/oneShot/preProcGgfAndy/ -- this must be compiled in the Genome Browser
source tree. See the FAQ (http://genome.ucsc.edu/FAQ/FAQdownloads#download27)
for more information on downloading and building the source tree. The second is
Andy Law's perl script (slightly modified by UCSC), which I will send you
off-list in a separate email message. To translate the output of Andy's script
into UCSC's particular table format, an inline perl command is tacked on at the end.
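Here's an example of how you would run these programs on a fasta file $f. The
shell variable $chr is expected to hold the sequence (chromosome) name, which
becomes the first column of the output: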
~/bin/$MACHTYPE/preProcGgfAndy $f \
| ggf-andy-cpg-island.pl \
| perl -wpe 'chomp; ($s,$e,$cpg,$n,$c,$g,$oE) = split("\t"); $s--; \
$gc = $c + $g; $pCpG = (100.0 * 2 * $cpg / $n); \
$pGc = (100.0 * $gc / $n); \
$_ = "'$chr'\t$s\t$e\tCpG: $cpg\t$n\t$cpg\t$gc\t" . \
"$pCpG\t$pGc\t$oE\n";' \
>> cpgIslandGgfAndy.bed
UCSC runs this on masked sequence for mammals so that Alus (especially in human)
are not tagged as CpG islands. However, in chicken we use unmasked sequence.
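
For readers who find the inline perl step hard to follow, here is a rough
Python sketch of the same translation. It is only an illustration, not what
UCSC runs. It assumes each input line carries seven tab-separated fields in
the order the perl command unpacks them (start, end, CpG count, length,
C count, G count, observed/expected ratio) and that the counts are integers.
The function name to_cpg_island_row is made up for this example. The start
is decremented as in the perl command, presumably to convert a 1-based start
to the 0-based start used in UCSC tables.

```python
def to_cpg_island_row(line, chrom):
    """Translate one line of the CpG script's output into one table row.

    Assumed input fields (tab-separated, same order as the perl one-liner):
    start, end, CpG count, length, C count, G count, obs/exp ratio.
    """
    # Like the perl list assignment, ignore any fields beyond the seventh.
    s, e, cpg, n, c, g, o_e = line.rstrip("\n").split("\t")[:7]

    start = int(s) - 1              # same as $s-- in the perl command
    length = int(n)
    cpg_num = int(cpg)
    gc_num = int(c) + int(g)        # $gc = $c + $g

    per_cpg = 100.0 * 2 * cpg_num / length   # $pCpG
    per_gc = 100.0 * gc_num / length         # $pGc

    fields = [
        chrom,
        str(start),
        e,
        f"CpG: {cpg_num}",
        str(length),
        str(cpg_num),
        str(gc_num),
        "%.15g" % per_cpg,   # approximates perl's default number formatting
        "%.15g" % per_gc,
        o_e,                 # passed through unchanged
    ]
    return "\t".join(fields) + "\n"
```

To use it, you would call the function on each line produced by the perl
script and append the returned rows to the .bed file, passing the sequence
name as chrom in place of the $chr shell variable.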
-Donna
-----------------------------------
Donna Karolchik
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu
----- Original Message -----
From: "Shan Yang" <yang21 at llnl.gov>
To: <genome at soe.ucsc.edu>
Sent: Tuesday, November 07, 2006 5:31 PM
Subject: [Genome] Program to calculate AL CpG Islands available?
> Hi,
>
> I have some DNA sequences and want to know Andy Law CpG islands in
> them. The CpG software on the Genome Browser only gives me the
> conventional CpG island, not AL CpG island. Is the program detecting
> AL CpG island available any where?
>
> Thanks a lot!
>
> Shan Yang, PhD
> Genome Biology Division, L-441
> Biosciences Directorate
> Lawrence Livermore National Laboratory
> 7000 East Ave, Livermore, CA, 94550
>
> Ph: 925-422-7389
> Fax: 925-422-2099
> _______________________________________________
> Genome maillist - Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome
>