[Genome] AT-Region over 90% in human genome?

Brooke Rhead rhead at soe.ucsc.edu
Wed May 9 11:00:58 PDT 2007


Hello Javad,

There is no straightforward way to find large AT-rich regions in the 
Genome Browser itself.  However, we do have a "GC Percent" track, in 
which GC percent is calculated in 5-bp windows, and you could use the 
program that generates that track to find larger AT-rich (GC-poor) 
regions.
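
To make the idea of windowed GC/AT percent concrete, here is a minimal Python sketch that scans a sequence in fixed, non-overlapping windows and reports those that are at least 90% AT.  It is only an illustration, not the code behind the GC Percent track; the function name, the 1000-base window and the 0.90 cutoff are assumptions chosen to match the question.

```python
def at_rich_windows(seq, win=1000, min_at=0.90):
    """Yield (start, end, at_fraction) for each non-overlapping window
    of `win` bases whose AT fraction is at least `min_at`.

    Coordinates are 0-based, half-open. Windows containing anything other
    than A, C, G or T (for example N in assembly gaps) are skipped.
    A trailing partial window is ignored.
    """
    seq = seq.upper()
    for start in range(0, len(seq) - win + 1, win):
        window = seq[start:start + win]
        at = window.count("A") + window.count("T")
        gc = window.count("G") + window.count("C")
        if at + gc < win:
            continue  # ambiguous bases present; skip this window
        frac = at / win
        if frac >= min_at:
            yield (start, start + win, frac)


if __name__ == "__main__":
    # Toy sequence: one pure-AT window, one pure-GC window, one 95% AT window.
    demo = "AT" * 500 + "GC" * 500 + "A" * 950 + "G" * 50
    for start, end, frac in at_rich_windows(demo):
        print(start, end, round(frac, 3))
```

Non-overlapping windows can miss an AT-rich stretch that straddles a window boundary, so a real scan would probably want overlapping windows as well.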

You would first need to download our source code.  This is free for 
academic, non-profit and personal use.  See this FAQ for instructions on 
downloading the source:

http://genome.ucsc.edu/FAQ/FAQdownloads#download27

Once you have that, you can use the program "hgGcPercent" to calculate 
regions of low GC percent in larger windows.  You also will need the 
.2bit file of the genome sequence, available here (for the latest human 
assembly):

http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/

To see instructions for running hgGcPercent, just run it without 
arguments.  A usage statement like this will be printed:

========
hgGcPercent - Calculate GC Percentage in 20kb windows
usage:
    hgGcPercent [options] database nibDir
      nibDir can be a .2bit file, a directory that contains a
      database.2bit file, or a directory that contains *.nib files.
      Loads gcPercent table with counts from sequence.
options:
    -win=<size> - change windows size (default 20000)
    -noLoad - do not load mysql table - create bed file
    -file=<filename> - output to <filename> (stdout OK) (implies -noLoad)
    -chr=<chrN> - process only chrN from the nibDir
    -noRandom - ignore randome chromosomes from the nibDir
    -noDots - do not display ... progress during processing
    -doGaps - process gaps correctly (default: gaps are not counted as GC)
    -wigOut - output wiggle ascii data ready to pipe to wigEncode
    -overlap=N - overlap windows by N bases (default 0)
    -verbose=N - display details to stderr during processing
    -bedRegionIn=input.bed   Read in a bed file for GC content in
                             specific regions and write to bedRegionsOut
    -bedRegionOut=output.bed Write a bed file of GC content in specific
                             regions from bedRegionIn

example:
   calculate GC percent in 5 base windows using a 2bit nib assembly (dp2):
   hgGcPercent -wigOut -doGaps -file=stdout -win=5 dp2 \
       /cluster/data/dp2 | wigEncode stdin gc5Base.wig gc5Base.wib
========
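
Going by the options above, a run such as "hgGcPercent -win=1000 -file=gc1k.bed hg18 hg18.2bit" should write one line per 1-kb window to a bed file (the file names here are placeholders).  The following Python sketch then keeps only the windows with GC at or below 10%, that is, AT of 90% or more.  It assumes that the output is whitespace-separated with the GC value in the fifth column, expressed in parts per thousand; please check a few lines of your own output before relying on that, since the usage statement does not spell out the column layout.

```python
def low_gc_windows(bed_lines, max_gc_ppt=100):
    """Yield (chrom, start, end, gc_ppt) for windows with GC <= max_gc_ppt.

    ASSUMPTION: each data line looks like
        chrom  start  end  name  gcValue
    with gcValue in parts per thousand (100 means 10% GC).
    Verify this against real hgGcPercent output.
    """
    for line in bed_lines:
        fields = line.split()
        # Skip blank lines, comments, and track/browser header lines.
        if len(fields) < 5 or fields[0].startswith("#") or fields[0] in ("track", "browser"):
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        gc_ppt = float(fields[4])
        if gc_ppt <= max_gc_ppt:
            yield (chrom, start, end, gc_ppt)


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file name): python low_gc.py gc1k.bed
    with open(sys.argv[1]) as bed:
        for chrom, start, end, gc_ppt in low_gc_windows(bed):
            print(chrom, start, end, gc_ppt, sep="\t")
```

Note that, per the usage statement, gaps are not counted as GC by default, so windows lying in assembly gaps may show up as apparently GC-poor; it is worth checking hits against the gap track or trying the -doGaps option.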

One of our developers previously ran this program on the hg16 assembly 
(an older assembly) with 20-kb and 100-kb windows and posted the most 
extremely GC-rich and GC-poor regions here, which may also be of 
interest to you:

http://genome-test.cse.ucsc.edu/~hiram/extremeGC.html

I hope this is helpful to you.  If you have further questions, please 
write back to the list at genome at soe.ucsc.edu.

--
Brooke Rhead
UCSC Genome Bioinformatics Group




Javad Karim Zad Hagh wrote:
> Dear Madam/Sir,
> 
> The background for the manifestation of FRA16B (fragile site 16B) and
> FRA10B (fragile site 10B) is an AT-rich composition of over 90% in
> these regions.
> 
> For example: FRA16B
> 
> Chr.16:63,913000-63915000bp
> 
> My question is:
> 
> How can I find similar AT-rich regions across the whole human genome?
> They must be at least 1 kb long.
> 
> Are there other regions in the human genome with an AT-rich
> composition of over 90% across 1-2 kb?
> 
> Centromeric regions should be excluded.
> 
> Please help me find the answer, because I could not find any programs
> that would help me solve this problem. It is very important for my
> thesis.
> 
> Best regards,
> Javad Karimzad
> PhD student, Institute for Human Genetics
> Düsseldorf, Germany

