[Genome] Retrieving intergenic regions from specific regions on chromosomes

Archana Thakkapallayil archanat at soe.ucsc.edu
Tue Oct 9 17:23:09 PDT 2007


Hello Travis,

You could get this information using our Table Browser. First of all, 
you will need to make a few custom tracks using the Table Browser.

1.  A custom track of the introns only from knownGene.
2.  A custom track of the exons only from knownGene.
3.  A custom track of the introns+exons from knownGene (make this by 
combining the first two CTs). Note that this is the same as simply making 
a custom track of the Known Genes as they are.
4.  A custom track of the complement of #3 for "everything else" (aka 
intergenic regions)

Some details on how to make these custom tracks:

1. Custom track of introns:
---------------------------
1a. Open the Table Browser
1b. set the following options:
   clade: Vertebrate
   genome: Human
   assembly: Mar 2006
   group: Genes and Gene Prediction Tracks
   track:  UCSC Genes
   table: knownGene
   region: position (enter the position that you are interested in, 
e.g. chr7:115,000,000-116,000,000, in the text box)
   output format: custom track
   Click "get output"
1c.  On the next page, select the radio button for "Introns", be sure to 
name this custom track appropriately, and press "get custom track in 
table browser."
1d.  You now have a custom track of the introns of the Known Genes for 
your region of interest.
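
If you later post-process the custom track outside the browser, keep in 
mind that a position typed into the text box is 1-based and inclusive, 
while BED custom tracks use a 0-based start and an exclusive end. Here is 
a minimal Python sketch of that conversion; the function name 
parse_position is just something I made up for illustration and is not 
part of any UCSC tool.

```python
import re


def parse_position(position):
    """Convert a browser-style position (1-based, inclusive) such as
    'chr7:115,000,000-116,000,000' into a BED-style tuple
    (chrom, start, end) with a 0-based start and an exclusive end."""
    match = re.fullmatch(r"\s*(\w+):([\d,]+)-([\d,]+)\s*", position)
    if match is None:
        raise ValueError(f"unrecognized position: {position!r}")
    chrom = match.group(1)
    # Strip the thousands separators before converting to integers.
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinates in position: {position!r}")
    # Only the start shifts by one; the inclusive 1-based end equals
    # the exclusive 0-based end.
    return chrom, start - 1, end


print(parse_position("chr7:115,000,000-116,000,000"))
# ('chr7', 114999999, 116000000)
```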

2. Custom track of exons:
-------------------------
Follow the above steps, except select the radio button for "Exons" in 
step 1c.
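
To make the relationship between the two tracks concrete: for a single 
transcript, the introns are simply the gaps between consecutive exons. 
The Python sketch below assumes the comma-separated exonStarts/exonEnds 
layout of the knownGene table (0-based starts, exclusive ends, trailing 
comma). The coordinates in the example are made up, and this is only an 
illustration of the arithmetic, not how the Table Browser does it 
internally.

```python
def exons_and_introns(exon_starts, exon_ends):
    """Split one transcript into exon and intron intervals.

    exon_starts / exon_ends are comma-separated strings in the style of
    the knownGene exonStarts / exonEnds columns, e.g. "100,300,600,".
    Returns (exons, introns) as lists of (start, end) tuples.
    """
    # The trailing comma yields an empty last field, which is skipped.
    starts = [int(s) for s in exon_starts.split(",") if s]
    ends = [int(e) for e in exon_ends.split(",") if e]
    if len(starts) != len(ends):
        raise ValueError("exonStarts and exonEnds differ in length")
    exons = sorted(zip(starts, ends))
    # An intron runs from the end of one exon to the start of the next.
    introns = [
        (prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]
    return exons, introns


# Hypothetical transcript with three exons.
exons, introns = exons_and_introns("100,300,600,", "200,400,700,")
print(exons)    # [(100, 200), (300, 400), (600, 700)]
print(introns)  # [(200, 300), (400, 600)]
```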

3. Custom track of introns+exons:
---------------------------------
3a. set the following options:
    clade: Vertebrate
    genome: Human
    assembly: Mar 2006
    group: Custom Tracks
    track:  tb_knownGene_INTRONS (this is what I named my CT of the 
introns) and select the related table
    region: position
    intersection: create
3b. On the intersection page, pull down the menu to choose your exons 
track, tb_knownGene_EXONS (I used this name for my CT of the exons).
    Choose "base-pair-wise union [OR] of tb_knownGene_INTRONS and 
tb_knownGene_EXONS".
    Click submit.
    output format: custom track
    Click "get output"
3c. Give an appropriate name to the CT ( I used "unionExonsIntrons" ) 
and choose "Create one BED record per: Whole gene".
    click 'get custom track in table Browser'.
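
For readers who want to see what a base-pair-wise union amounts to, here 
is a rough Python sketch that merges two lists of intervals on the same 
chromosome. The function name and the example coordinates are 
hypothetical; the Table Browser's own implementation may well differ, so 
treat this only as an illustration of the idea.

```python
def union(intervals_a, intervals_b):
    """Base-pair-wise union (OR) of two interval lists.

    Intervals are (start, end) tuples with a 0-based start and an
    exclusive end, all on the same chromosome. Overlapping or abutting
    intervals are merged into one.
    """
    merged = []
    for start, end in sorted(list(intervals_a) + list(intervals_b)):
        if merged and start <= merged[-1][1]:
            # Touches or overlaps the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(interval) for interval in merged]


# Hypothetical introns and exons: one three-exon gene plus a
# single-exon gene further along the chromosome.
introns = [(200, 300), (400, 600)]
exons = [(100, 200), (300, 400), (600, 700), (900, 1000)]
print(union(introns, exons))
# [(100, 700), (900, 1000)]
```

Because an intron starts exactly where the preceding exon ends, the 
pieces of each gene collapse back into one interval per gene, which is 
why step 3 is equivalent to using the Known Genes as they are.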

4. Custom track of the intergenic regions:
------------------------------------------
In this step you have to complement the unionExonsIntrons track to get 
the intergenic regions.
4a. Choose the track 'unionExonsIntrons'
4b. Create an intersection with itself by choosing "Base-pair-wise 
intersection (AND) of unionExonsIntrons and unionExonsIntrons"
4c. Also check both of the complement boxes (the same label appears 
twice because the track is being intersected with itself):
    Complement unionExonsIntrons before intersection/union
    Complement unionExonsIntrons before intersection/union
    Click submit.
4d. Back on the Table Browser, choose output format: custom track, name 
the CT (I used 'complementUnionIntronsExons'), and then press 'get 
custom track in table browser'.

This gives you the intergenic regions for the position: 
chr7:115,000,000-116,000,000
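
If you would rather do the last step yourself on exported intervals, the 
key point is to take the complement only inside your window, so that you 
do not pick up the two large flanking intervals you saw in Galaxy. Below 
is a minimal Python sketch of that idea. The function name and the toy 
coordinates are made up for illustration, and it assumes 0-based starts 
with exclusive ends on a single chromosome.

```python
def complement_within(intervals, window_start, window_end):
    """Return the parts of [window_start, window_end) that are not
    covered by any interval. Intervals may overlap each other and may
    extend beyond the window; nothing outside the window is reported."""
    gaps = []
    cursor = window_start  # first position not yet known to be covered
    for start, end in sorted(intervals):
        if end <= window_start or start >= window_end:
            continue  # lies entirely outside the window
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < window_end:
        gaps.append((cursor, window_end))
    return gaps


# Hypothetical gene spans around a toy window of 1000-2000.
genes = [(900, 1100), (1300, 1500), (1450, 1600), (2500, 2600)]
print(complement_within(genes, 1000, 2000))
# [(1100, 1300), (1600, 2000)]
```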

Please see this session that I've created for you:
http://genome.cse.ucsc.edu/cgi-bin/hgTracks?hgS_doOtherUser=submit&hgS_otherUserName=Archana&hgS_otherUserSessionName=hg18_intergenic_%20regions

I hope that this helps. Please let us know if you have further questions.

Regards,

Archana
UCSC Genome Bioinformatics Group

Travis Ptacek wrote:
> I referred to the following link for the method to retrieve intergenic sequence using Galaxy:
> http://www.soe.ucsc.edu/pipermail/genome/2007-June/013907.html
>  
> I can perform this method successfully, but I need to retrieve intergenic sequence from a specific region, not an entire chromosome or genome.
>  
> When I perform this method using a region of, for example, chr7:115,000,000-116,000,000 using the Known Genes track, Galaxy correctly retrieves information for the known genes in that region. However, when I complement the interval of the query, I get two huge intervals covering all of chromosome 7 upstream of 115,000,000 and downstream of 116,000,000 in addition to intergenic regions within chr7:115,000,000-116,000,000. What I want, in this example, are only the intergenic regions within chr7:115,000,000-116,000,000.
>  
> Thanks in advance,
>  
> Travis Ptacek
>


