[Genome] information about "phastcons" score

Ann Zweig ann at soe.ucsc.edu
Thu Apr 3 16:11:43 PDT 2008


Hello Hong Sun,

	Since there are so many parts to your question, I have embedded my answers 
within your questions below.  I am assuming that you are working with the latest 
mouse assembly (mm9) and human assembly (hg18).

hong sun wrote:
> Hello,
> We are interested in the pairwise alignment between the intergenic regions of 
> 50 mouse genes and the corresponding intergenic regions of human.
> The 50 intergenic regions of mouse genes are listed below in /*Data1*/. 
> What we are doing now is:
> 1 use the UCSC genome browser to browse the chr region of our data, 
> selecting only human for the pairwise alignment with mouse in the 
> Conservation Track Settings page.

I have two comments on this part of your question.  First, if you are not already 
doing so, I would suggest creating a Custom Track with your 50 intergenic mouse 
regions.  They will be displayed in the mouse genome browser, which makes them 
easier to navigate to.  Read about creating a custom track here: 
http://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html#CustomTracks
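
As a rough sketch (not an official UCSC tool), the positions in your Data1 list 
could be turned into BED lines for a custom track along these lines.  The 
function names, the track name and the "regionN" item names are made up for 
illustration.  The one detail to watch is that browser positions are 1-based 
while BED start coordinates are 0-based, so the start is reduced by one.

```python
import re


def position_to_bed(position, name=None):
    """Convert a browser position such as 'chr12:87772800-87773999'
    (1-based, inclusive) into a BED line (0-based start, exclusive end)."""
    cleaned = position.strip().replace(",", "")
    match = re.fullmatch(r"(\w+):(\d+)-(\d+)", cleaned)
    if match is None:
        raise ValueError("not a chrom:start-end position: %r" % position)
    chrom = match.group(1)
    start = int(match.group(2)) - 1  # BED starts are 0-based
    end = int(match.group(3))        # BED ends are exclusive, so unchanged
    fields = [chrom, str(start), str(end)]
    if name is not None:
        fields.append(name)
    return "\t".join(fields)


def build_custom_track(positions, track_name="intergenic_regions"):
    """Return custom-track text: one track line followed by one BED line
    per position.  The item names region1, region2, ... are hypothetical."""
    lines = ['track name="%s" description="Mouse intergenic regions"' % track_name]
    for index, position in enumerate(positions, start=1):
        lines.append(position_to_bed(position, name="region%d" % index))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # First three regions from the Data1 list.
    data1 = [
        "chr12:30523186-30524385",
        "chr3:95366249-95367448",
        "chr12:87772800-87773999",
    ]
    print(build_custom_track(data1), end="")
```

The printed text can be pasted into the custom track submission box on the 
mouse browser.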

Second, please note that although you are viewing only the human pairwise 
alignment, the phastCons wiggle values do not change accordingly: they are still 
derived from the full multiple alignment.  This is a common misconception.


> 2 then click on the blue conservation area on the genome browser 
> page, which gives alignments in the /*Result1*/ format (below),
>    *our first question: *is this the alignment between the mouse 
> intergenic region and the corresponding intergenic region of human?

This is the alignment between your mouse coordinates (whether they are 
intergenic or not) and the corresponding human coordinates (which may or may not 
be intergenic).


>    *our second question: *can we download the whole alignment at once 
> instead of downloading each block?

Yes, instead of downloading from this page in a block-by-block fashion, I would 
suggest using the Human Net track on the mouse browser.  This is a pairwise 
alignment between mouse and human.

Take your third region from your Data1 file, chr12:87772800-87773999.  In the 
Conservation details page, you will see these block-by-block alignments (as you 
have noted):

B D  Mouse  gctgggatttctgtatgtgtgacac-aggggattagagaagg-gattagc-gggggtgg-a-ggactgat
B D  Human  gctcgcgtgtc--aatatgtaacacaaggggattaaagaagg-aattacagtttgggat-g-gagaggat

However, from the Human Net details page (click on the "View alignment details 
of parts of net within browser window" link), you will see a base-by-base 
alignment (human on top, mouse on bottom):

75922219 gctcgcgtgtcaatatgtaacaca-aggggattaaagaaggaattacagtttgggatgga 75922277
 >>>>>>>> ||| |  | ||  |||||   ||| ||||||||| |||||| ||||  |   ||| |||| >>>>>>>>
87772800 gctgggatttctgtatgtgtgacacaggggattagagaagggattagcg---ggggtgga 87772856

...and so on.
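
If you copy the two sequence rows out of such a base-by-base alignment, a small 
script can summarise how similar they are.  The sketch below is only an 
illustration: the function name is made up, and it assumes both rows have the 
same length and use "-" for gaps.  It counts identical bases over the columns 
where neither species has a gap.

```python
def alignment_identity(top, bottom):
    """Return (matches, aligned_columns) for two aligned sequence rows.

    Columns in which either row has a gap ('-') are skipped; the
    comparison ignores upper/lower case."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    matches = 0
    aligned = 0
    for a, b in zip(top.lower(), bottom.lower()):
        if a == "-" or b == "-":
            continue  # gap column: nothing to compare
        aligned += 1
        if a == b:
            matches += 1
    return matches, aligned


if __name__ == "__main__":
    # Toy rows, not taken from the alignment above.
    matches, aligned = alignment_identity("gct-cgtg", "gctgggat")
    print("%d of %d aligned bases identical" % (matches, aligned))
```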


> 3 besides the pairwise alignment between the intergenic regions of mouse and 
> human, we are also interested in the conserved regions of the pairwise 
> alignment, for which we would like to use the PhastCons
>    conservation score, *our third question: *as we know, PhastCons is for 
> multiple species alignments, but we do pairwise alignment; can we also 
> get/use the PhastCons score to select the conserved regions?

There is no pairwise phastCons score computed or displayed on our website; the 
scores shown are derived from the multiple-species alignment.  You are certainly 
welcome to use high-scoring regions from those scores to decide what you 
consider "conserved".  I would suggest using the items in the "Most Conserved" 
track.

This genomewiki page might also be helpful to you in understanding the 
intricacies of the mm9 Conservation track: 
http://genomewiki.ucsc.edu/index.php/Mm9_multiple_alignment


> 4 Suppose we can use the PhastCons score. Here is the procedure we 
> used to get the conserved regions of the pairwise alignment.
>    We click table browser on the alignment page, and we choose 
> parameters like:
>    *group: Comparative Genomics
>    *track: Conservation
>    *table: phastCons17way
>    * region: position chr12:30523186-30524385
>    **filter:  dataValue is >= 0.9  our fourth question: is this 
> "dataValue" the threshold for the PhastCons conservation score?*
>    *output format: bed format,
>    With all of these, we get /*Result2*/ as follows.
> 

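Yes, "dataValue" is the phastCons score itself, so your filter keeps only the 
bases scoring 0.9 or higher (see the "# data values >= 0.9" line in your 
Results2).

If you decide to use the phastCons17way (or 30way) scores, I would recommend you use 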
the raw data available from our download server here: 
http://hgdownload.cse.ucsc.edu/downloads.html#mouse  Find the assembly you want 
and choose this link: "Conservation scores for alignments of XX vertebrate 
genomes with Mouse".
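
If you go the raw-data route, the thresholding you did in the Table Browser can 
be reproduced with a short script.  The sketch below assumes the score files are 
in fixedStep wiggle format with step=1 (please check the header lines of the 
files you actually download); the function name is made up for illustration.  It 
merges consecutive bases scoring at or above the threshold into BED-style 
intervals (0-based start, exclusive end).

```python
def conserved_runs(wig_lines, threshold=0.9):
    """Yield (chrom, start, end) for runs of consecutive bases whose score
    is >= threshold.  Input is an iterable of fixedStep wiggle lines.
    Assumes step=1, so neighbouring data lines are neighbouring bases."""
    chrom = None
    pos = 0            # 0-based position of the next data line
    step = 1
    run_start = None   # start of the currently open run, if any
    run_end = None
    for raw in wig_lines:
        line = raw.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        if line.startswith("fixedStep"):
            # A new declaration line starts a new block: close any open run.
            if run_start is not None:
                yield (chrom, run_start, run_end)
                run_start = None
            fields = dict(item.split("=", 1) for item in line.split()[1:])
            chrom = fields["chrom"]
            pos = int(fields["start"]) - 1   # wiggle starts are 1-based
            step = int(fields.get("step", "1"))
            continue
        score = float(line)
        if score >= threshold:
            if run_start is None:
                run_start = pos
            run_end = pos + 1
        elif run_start is not None:
            yield (chrom, run_start, run_end)
            run_start = None
        pos += step
    if run_start is not None:
        yield (chrom, run_start, run_end)


if __name__ == "__main__":
    # Toy data, not real phastCons scores.
    example = [
        "fixedStep chrom=chr12 start=101 step=1",
        "0.5", "0.95", "0.99", "0.2", "0.91",
    ]
    for chrom, start, end in conserved_runs(example):
        print("%s\t%d\t%d" % (chrom, start, end))
```

For a real file you would pass an open file handle (for example from 
gzip.open(path, "rt")) in place of the toy list.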

That said, I suggest that you instead use the "Most Conserved" track (and 
corresponding table: phastConsElements17way (or 30way)).
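
Once you have the "Most Conserved" items for your regions as BED lines (for 
example from the Table Browser), finding the conserved parts of each region is a 
simple interval intersection.  The following is a minimal sketch with a made-up 
function name; it assumes both inputs are (chrom, start, end) tuples in BED 
coordinates and makes no attempt at speed.

```python
def clip_elements_to_region(region, elements):
    """Return the parts of each conserved element that fall inside region.

    region and each element are (chrom, start, end) tuples using BED
    coordinates (0-based start, exclusive end)."""
    r_chrom, r_start, r_end = region
    overlaps = []
    for chrom, start, end in elements:
        if chrom != r_chrom:
            continue
        clipped_start = max(start, r_start)
        clipped_end = min(end, r_end)
        if clipped_start < clipped_end:  # keep non-empty overlaps only
            overlaps.append((chrom, clipped_start, clipped_end))
    return overlaps


if __name__ == "__main__":
    # Toy coordinates, not real phastConsElements items.
    region = ("chr12", 100, 200)
    elements = [("chr12", 50, 120), ("chr12", 150, 160), ("chr12", 200, 250)]
    for chrom, start, end in clip_elements_to_region(region, elements):
        print("%s\t%d\t%d" % (chrom, start, end))
```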

	This should be enough to point you in the right direction.  Please don't 
hesitate to contact the mail list again if you require further assistance.


Regards,

----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu


> We would like to know whether, given our goals, the procedure is correct. If not, 
> could you be so kind as to help us out? Thanks in advance! :-)
> 
> 
> 
> Many greetings,
> 
> Hong Sun
> 
> *Data1:*
> chr12:30523186-30524385
> chr3:95366249-95367448
> chr12:87772800-87773999
> chr14:68894360-68895559
> chr2:121139669-121140868
> chr19:53192853-53194052
> chr11:45726131-45727330
> chr12:29260496-29261695
> chr5:121854003-121855202
> chr2:52246683-52247882
> chr11:60353173-60354372
> chr15:11850199-11851398
> chr8:27250554-27251753
> chr5:125729944-125731143
> chr17:31365675-31366874
> chr15:103067650-103068849
> chr9:35200747-35201946
> chr4:134544636-134545835
> chr19:29714158-29715357
> chr4:144158764-144159963
> chr17:26235861-26237060
> chr14:120236192-120237391
> chr17:78322989-78324188
> chr13:115579408-115580607
> chr10:41964957-41966156
> chr19:12699353-12700552
> chr14:45739159-45740358
> chr19:60944139-60945338
> chr11:98856334-98857533
> chr7:125355803-125357002
> chr13:41349800-41350999
> chr4:146828776-146829975
> chr1:62636710-62637909
> chr12:9599505-9600704
> chr7:101294530-101295729
> chr14:68912552-68913751
> chr6:115960383-115961582
> chr14:49776557-49777756
> chr4:62045208-62046407
> chr13:95385780-95386979
> chr15:81187829-81189028
> chr6:112912491-112913690
> chr11:67781653-67782852
> chr18:69468890-69470089
> chr5:118287854-118289053
> chr2:157834425-157835624
> chr10:79751314-79752513
> chr2:152023876-152025075
> chr8:110324527-110325726
> chr16:43151087-43152286
> 
> *Results1:*
> Conservation score statistics 
> <http://genome.ucsc.edu/cgi-bin/hgc?hgsid=105549388&g=phastCons17way&i=phastCons17way&c=chr12&l=30523185&r=30524385&o=30523185&db=mm8&parentWigMaf=multiz17way> 
> 
> Capitalize exons based on show bases
> 
> Place cursor over species for alignment detail. Click on 'B' to link to 
> browser for aligned species, click on 'D' to get DNA for aligned species.
> 
> *Components not displayed:* 
> X. tropicalis Elephant Cow Dog Armadillo Chicken Opossum Tetraodon Tenrec Chimp Rhesus Rabbit Zebrafish Rat 
> 
> *Alignment block 1 of 9 in window, 30523186 - 30523549, 364 bps *
> B <http://genome.ucsc.edu/cgi-bin/hgTracks?db=mm8&ct=&position=chr12%3A30523186-30523549> D <http://genome.ucsc.edu/cgi-bin/hgc?o=30523185&g=getDna&i=chr12&c=chr12&l=30523185&r=30523549&db=mm8>  Mouse  agttgagttttatactctcctaggtgctcagtccaatcaagttgagaatcaggatcaactgtcacacctg
> B <http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg18&ct=&position=chr2%3A1727829-1735185> D <http://genome.ucsc.edu/cgi-bin/hgc?o=1727828&g=getDna&i=chr2&c=chr2&l=1727828&r=1735185&db=hg18&hgSeq.revComp=on>  Human  ======================================================================
> 
>      Mouse  ggctccagttccaaacctcacatttaagacctctgctcccttggttgtattgcctaacctggccttcctg
>      Human  ======================================================================
> 
>      Mouse  gctgaagaatggagagactggaaccccagggagaatcagagaactgtataaagtgtcagcattcaatctt
>      Human  ======================================================================
> 
>      Mouse  gcagagtacactctgatgttaacctcagggcttcccttgtcttaacgctgtccacgcaaaagccatccca
>      Human  ======================================================================
> 
>      Mouse  tcttccccacaagggttcctcattggcggtgaatgttggagacctcaggaatctctcgctagggagcttc
>      Human  ======================================================================
> 
>      Mouse  tatttctgcagcac
>      Human  ==============
> 
> ................................................
> 
> 
> *Results2:*
> track name="Conservation" description="Vertebrate Multiz Alignment & 
> Conservation"
> # db: 'mm8', track: 'phastCons17way', output date: 2008-04-02 08:37:49 UTC
> # chrom specified: chr12
> # position specified: 30523186-30524385
> # data values >= 0.9
> chr12 30524026 30524047 chr12.1
> chr12 30524281 30524382 chr12.2
> 
> 
> 
> 
> 
> Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
> 
> _______________________________________________
> Genome maillist  -  Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome

