From rhead at soe.ucsc.edu Tue Apr 1 11:52:28 2008
From: rhead at soe.ucsc.edu (Brooke Rhead)
Date: Tue, 01 Apr 2008 11:52:28 -0700
Subject: [Genome] colour affy?
In-Reply-To: <47EC0641.8070903@soe.ucsc.edu>
References: <47EAE968.8080708@soe.ucsc.edu> <47EC0641.8070903@soe.ucsc.edu>
Message-ID: <47F2846C.5080604@soe.ucsc.edu>

Hi Amanda,

We now have a yellow/blue display option for microarray tracks on our
test server, at http://genome-test.cse.ucsc.edu/cgi-bin/hgTracks . The
option is automatically available on custom microarray tracks. To turn
it on, go to the track control page (either by clicking on the blue
track name, or by clicking the gray or blue "mini-button" to the far
left of the track display), and look for a radio button to switch from a
red/green to a yellow/blue display.

Note that our test server also contains many experimental and untested
tracks. The option should be on our main server in about a week.

If you find any problems with the new yellow/blue functionality, or if
you have any suggestions for improvement, please let us know!

--
Brooke Rhead
UCSC Genome Bioinformatics Group


Brooke Rhead wrote:
> Hello again Amanda,
>
> Our developers are working on adding a yellow/blue display option for
> microarray tracks (including custom tracks) on the main Genome Browser
> display page. I will let you know when it is available.
>
> --
> Brooke Rhead
> UCSC Genome Bioinformatics Group
>
>
> Brooke Rhead wrote:
>> Hi Amanda,
>>
>> Ideally, we would have a setting for the main Genome Browser display
>> (http://genome.ucsc.edu/cgi-bin/hgTracks) to switch microarray tracks so
>> that they display in yellow/blue instead of red/green, similar to the
>> settings on the UCSC Genes details pages and the Gene Sorter. We are
>> looking into the feasibility of adding that feature.
>>
>> In the meantime, you could experiment with the only settings that we
>> currently use to control track color and shading: the "useScore"/"color"
>> track line parameters, or the "itemRgb" column in the BED file. I'm not
>> sure if these settings will do much on a microarray custom track.
>>
>> If we are able to implement the yellow/blue to green/red display toggle
>> in our main Genome Browser display, I will let you know.
>>
>> --
>> Brooke Rhead
>> UCSC Genome Bioinformatics Group
>>
>>
>> Amanda Miotto wrote:
>>> Dear Brooke
>>>
>>> Is there any way around the limitations for shading? Ideally, I was hoping
>>> to have a system that ranged from yellow to blue, similar to the microarray
>>> track, but even a single colour yellow shading would be fantastic. Sorry to
>>> be so fussy, as we are catering for colour-blindness.
>>> Thank you!
>>>
>>>
>>> Hello A. Miotto,
>>>
>>> Microarray track coloring is controlled by the "expScale" and "expStep"
>>> parameters. These are described in our genomewiki, here:
>>> http://genomewiki.ucsc.edu/index.php/Microarray_track
>>>
>>> The only colors available for shading based on score are blue, brown,
>>> and black (using the track line parameters "useScore" and "color"):
>>> http://genome.ucsc.edu/goldenPath/help/customTrack.html#TRACK
>>> Also see this previously-answered question on the topic:
>>> http://www.soe.ucsc.edu/pipermail/genome/2005-March/006980.html
>>>
>>> I am not familiar with cURL, but perhaps this section of the User's
>>> Guide that describes constructing URLs to custom tracks will give you
>>> the information you need:
>>> http://genome.ucsc.edu/goldenPath/help/customTrack.html#SHARE
>>> If this does not answer your question, please feel free to write back to
>>> the mailing list address.
>>>
>>> I hope this information is helpful.
>>>
>>> --
>>> Brooke Rhead
>>> UCSC Genome Bioinformatics Group
>>>
>>>
>>> Amanda Miotto wrote:
>>>> Dear Sir/Madam
>>>>
>>>> In regards to the genome browser, I am attempting to upload my own tracks
>>>> to view and have some queries:
>>>> - What parameters are used for colouring the Affy tissue data?
>>>> - Are there any colours other than blue, black and brown that offer
>>>>   shading in a single track?
>>>> - I am trying to write an automated script to upload my tracks
>>>>   automatically to the browser using cURL, and was curious how your
>>>>   browser inputs the filepath for the uploaded files using POST in
>>>>   your source. Is there a cURL script already available?
>>>>
>>>> Thank you very much for your time.
>>>>
>>>> A.Miotto
>>>>
>>> _______________________________________________
>>> Genome maillist - Genome at soe.ucsc.edu
>>> http://www.soe.ucsc.edu/mailman/listinfo/genome

From guptaas at mail.nih.gov Tue Apr 1 16:19:19 2008
From: guptaas at mail.nih.gov (Gupta, Ashutosh (NIH/NCI) [F])
Date: Tue, 1 Apr 2008 19:19:19 -0400
Subject: [Genome] request for hg18 to hg16
In-Reply-To: <47E819C9.2080408@cse.ucsc.edu>
References: <014DBF86B19310419F0DF8910FC564570104F901@nihcesmlbx10.nih.gov>
 <47E819C9.2080408@cse.ucsc.edu>
Message-ID: <014DBF86B19310419F0DF8910FC564570104FA6A@nihcesmlbx10.nih.gov>

Hi,

I am having a problem with conversions across different builds.

I have the liftOver tool for Mac OS X & all the relevant chain files.

Any help on this would be appreciated.

Thanks,

Ashutosh.

-----Original Message-----
From: Kayla Smith [mailto:kayla at soe.ucsc.edu]
Sent: Monday, March 24, 2008 5:15 PM
To: Gupta, Ashutosh (NIH/NCI) [F]
Cc: genome at soe.ucsc.edu
Subject: Re: [Genome] request for hg18 to hg16

Hello Ashutosh,

You can use our online liftOver tool to convert from hg18 to hg17, and
then from hg17 to hg16. Here is the link:
http://genome.ucsc.edu/cgi-bin/hgLiftOver

See this FAQ on downloading our source:
http://genome.ucsc.edu/FAQ/FAQdownloads#download27

I hope this information is helpful to you. Please don't hesitate to
contact us again if you require further assistance.

Kayla Smith
UCSC Genome Bioinformatics Group


Gupta, Ashutosh (NIH/NCI) [F] wrote:
> Hi,
>
> Would it be possible to get a liftover file for conversion from hg18 to
> hg16?
>
> Also, is there any Windows-based conversion mechanism? I need to convert
> about 50 NimbleGen ENCODE array hybridizations; a Windows-based tool
> would be very helpful.
>
> Even the conversion source code in C (or in Mathematica or Matlab) would
> be very helpful.
>
> Thanks,
>
> Ashutosh.
>
>
> PS: I can also help develop a tool for Windows systems, depending on
> complexity & time though. I am sure a lot of people would find it
> useful.

From rhead at soe.ucsc.edu Tue Apr 1 17:29:38 2008
From: rhead at soe.ucsc.edu (Brooke Rhead)
Date: Tue, 01 Apr 2008 17:29:38 -0700
Subject: [Genome] request for hg18 to hg16
In-Reply-To: <014DBF86B19310419F0DF8910FC564570104FA6A@nihcesmlbx10.nih.gov>
References: <014DBF86B19310419F0DF8910FC564570104F901@nihcesmlbx10.nih.gov>
 <47E819C9.2080408@cse.ucsc.edu>
 <014DBF86B19310419F0DF8910FC564570104FA6A@nihcesmlbx10.nih.gov>
Message-ID: <47F2D372.2060302@soe.ucsc.edu>

Hi Ashutosh,

What kind of problem are you experiencing?

If you just need instructions on how to use the command-line tool, you
can run the liftOver command with no arguments to see instructions. It
should look something like this:

-----
$ liftOver

liftOver - Move annotations from one assembly to another
usage:
   liftOver oldFile map.chain newFile unMapped
oldFile and newFile are in bed format by default, but can be in GFF and
maybe eventually others with the appropriate flags below.
The map.chain file has the old genome as the target and the new genome
as the query.

***********************************************************************
WARNING: liftOver was only designed to work between different
         assemblies of the same organism, it may not do what you want
         if you are lifting between different organisms.
***********************************************************************

options:
   -minMatch=0.N Minimum ratio of bases that must remap. Default 0.95
   -gff  File is in gff/gtf format.  Note that the gff lines are converted
         separately.  It would be good to have a separate check after this
         that the lines that make up a gene model still make a plausible gene
         after liftOver
   -genePred - File is in genePred format
   -sample - File is in sample format
   -bedPlus=N - File is bed N+ format
   -positions - File is in browser "position" format
   -hasBin - File has bin value (used only with -bedPlus)
   -tab - Separate by tabs rather than space (used only with -bedPlus)
   -pslT - File is in psl format, map target side only
   -minBlocks=0.N Minimum ratio of alignment blocks/exons that must map
                  (default 1.00)
   -fudgeThick    If thickStart/thickEnd is not mapped, use the closest
                  mapped base.  Recommended if using -minBlocks.
   -multiple               Allow multiple output regions
   -minChainT, -minChainQ  Minimum chain size in target/query, when mapping
                           to multiple output regions (default 0, 0)
   -minSizeT               deprecated synonym for -minChainT (ENCODE compat.)
   -minSizeQ               Min matching region size in query with -multiple.
   -chainTable             Used with -multiple, format is db.tablename,
                           to extend chains from net (preserves dups)
   -errorHelp              Explain error messages
-----

If you are only converting 50 positions from hg18 to hg16, it might be
easier to use the web-based tool, as Kayla suggested. (Or did I
misunderstand your original question, and you need to convert many more
than 50 positions?)

--
Brooke Rhead
UCSC Genome Bioinformatics Group

From guptaas at mail.nih.gov Tue Apr 1 18:59:28 2008
From: guptaas at mail.nih.gov (Gupta, Ashutosh (NIH/NCI) [F])
Date: Tue, 1 Apr 2008 21:59:28 -0400
Subject: [Genome] request for hg18 to hg16
In-Reply-To: <47F2D372.2060302@soe.ucsc.edu>
References: <014DBF86B19310419F0DF8910FC564570104F901@nihcesmlbx10.nih.gov>
 <47E819C9.2080408@cse.ucsc.edu>
 <014DBF86B19310419F0DF8910FC564570104FA6A@nihcesmlbx10.nih.gov>
 <47F2D372.2060302@soe.ucsc.edu>
Message-ID: <014DBF86B19310419F0DF8910FC564570104FA6D@nihcesmlbx10.nih.gov>

Thanks a lot for the quick reply.
Please have a look at the attached snapshot of my liftOver session.
I am not sure where I am going wrong. I have tried several different
formats, but the program never recognized the files. The files were
definitely there, as I could open them using other applications.

I had also ensured that the data is in the recommended BED format.

Thanks again for your help.
Regards,
Ashutosh.

PS: Also, I notice that you are just typing liftOver at the command
prompt, which never worked for me. I always got the error "command not
found". So I had to use the strategy shown in the attached file. Is there
some problem with the installation of the file? I am a Windows user &
relatively new to Mac/Unix systems.

From lillioja at uow.edu.au Tue Apr 1 18:10:01 2008
From: lillioja at uow.edu.au (Stephen Lillioja)
Date: Wed, 2 Apr 2008 12:10:01 +1100
Subject: [Genome] SNP Function identification - possible error
Message-ID: <200804020109.AJI80977@jinn.its.uow.edu.au>

I've been looking at the SNPs for the BRCA1 gene on chromosome 17.
The SNPs seem to be colour coded OK, but when you look at an individual
SNP, under 'Function' they all say they are 'untranslated, intron'.
I'm sure there must be an error here.

Regards,
Stephen Lillioja

Stephen Lillioja MB ChB(Otago) MD (UNSW) FRACP Grad Cert H Ed
Professor, Health and Behavioural Sciences
Professor, Graduate School of Medicine
Faculty of Health and Behavioural Sciences, University of Wollongong
Northfields Avenue, Wollongong NSW 2522 Australia
phone: 61 2 4221 5055
FAX: 61 2 4221 5850
Mobile: 0419 780 826
email: lillioja at uow.edu.au

From a.miotto at griffith.edu.au Tue Apr 1 23:32:25 2008
From: a.miotto at griffith.edu.au (Amanda Miotto)
Date: Wed, 2 Apr 2008 16:32:25 +1000
Subject: [Genome] search query
Message-ID:

For the Griffith mirror of the UCSC genome browser, I am looking to
alter the position bar so that I can search by the EnsemblID, Probeset ID
or Illumina search key. Would you be able to point me in the correct
direction for where in the source the search algorithm is, or is there a
pre-existing function set up for another mirror?

And thank you for your assistance with the Affy colours, it is greatly
appreciated!

A.Miotto
a.miotto at griffith.edu.au

From rhead at soe.ucsc.edu Wed Apr 2 00:15:21 2008
From: rhead at soe.ucsc.edu (Brooke Rhead)
Date: Wed, 02 Apr 2008 00:15:21 -0700
Subject: [Genome] request for hg18 to hg16
In-Reply-To: <014DBF86B19310419F0DF8910FC564570104FA6D@nihcesmlbx10.nih.gov>
References: <014DBF86B19310419F0DF8910FC564570104F901@nihcesmlbx10.nih.gov>
 <47E819C9.2080408@cse.ucsc.edu>
 <014DBF86B19310419F0DF8910FC564570104FA6A@nihcesmlbx10.nih.gov>
 <47F2D372.2060302@soe.ucsc.edu>
 <014DBF86B19310419F0DF8910FC564570104FA6D@nihcesmlbx10.nih.gov>
Message-ID: <47F33289.1020608@soe.ucsc.edu>

Hi Ashutosh,

I see these lines in your attached file:

nci-admins-computer-2:~ levensd$ /Volumes/nci10b.nci.nih.gov/Group/LP/Ashutosh/HG\ liftover/liftOver new.tsv hg17ToHg16.over.chain ne2 unMapped
Can't find file: new.tsv
nci-admins-computer-2:~ levensd$ /Volumes/nci10b.nci.nih.gov/Group/LP/Ashutosh/HG\ liftover/liftOver new.tsv hg17ToHg16.over.chain ne2 unMapped
Can't find file: new.tsv
nci-admins-computer-2:~ levensd$ /Volumes/nci10b.nci.nih.gov/Group/LP/Ashutosh/HG\ liftover/liftOver new hg17ToHg16.over.chain ne2 unMapped
Can't find file: new
nci-admins-computer-2:~ levensd$ /Volumes/nci10b.nci.nih.gov/Group/LP/Ashutosh/HG\ liftover/liftOver new.tsv hg17ToHg16.over.chain ne2 unMapped
Can't find file: new.tsv
nci-admins-computer-2:~ levensd$ /Volumes/nci10b.nci.nih.gov/Group/LP/Ashutosh/HG\ liftover/liftOver new.txt hg17ToHg16.over.chain ne2.txt unMapped
Can't find file: new.txt
nci-admins-computer-2:~ levensd$ \306\222f

For comparison, the format for running the liftOver command is:

   liftOver oldFile map.chain newFile unMapped

The first two files, "oldFile" and "map.chain" need to either be present
in your current working directory, or else you need to specify the paths
to the files. The second two files, "newFile" and "unMapped" do not need
to exist already -- the liftOver program will create files with the
names you specify.

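The rule above can be sketched in a few lines of Python. This is only an
illustration, not part of the liftOver distribution: the helper names
build_liftover_command and run_liftover are hypothetical, and the only
thing taken from the usage message is the argument order (oldFile
map.chain newFile unMapped). The sketch checks the two input files before
anything is run, so a missing file is reported with its full path.

```python
import os
import subprocess


def build_liftover_command(liftover, old_file, chain_file, new_file, unmapped_file):
    """Return the liftOver argument list after checking the two input files.

    Hypothetical helper: only the argument order
    (oldFile map.chain newFile unMapped) comes from the usage message.
    """
    # oldFile and map.chain must already exist; relative names are
    # resolved against the current working directory.
    for path in (old_file, chain_file):
        if not os.path.isfile(path):
            raise FileNotFoundError(
                "Can't find input file: " + os.path.abspath(path)
            )
    # newFile and unMapped are outputs, so they are not checked here.
    return [liftover, old_file, chain_file, new_file, unmapped_file]


def run_liftover(liftover, old_file, chain_file, new_file, unmapped_file):
    """Run liftOver; raises CalledProcessError on a non-zero exit status."""
    cmd = build_liftover_command(
        liftover, old_file, chain_file, new_file, unmapped_file
    )
    subprocess.run(cmd, check=True)
```

For example, run_liftover("/path/to/liftOver", "/path/to/new.bed",
"/path/to/hg17ToHg16.over.chain", "ne2", "unMapped") passes full paths for
the inputs (placeholders here), which avoids any dependence on the
current working directory.
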
Using your command:

   liftOver new.tsv hg17ToHg16.over.chain ne2 unMapped

liftOver is expecting a BED file of hg17 coordinates to be present in
the current directory, in a file called "new.tsv". The
hg17ToHg16.over.chain file should also be in the current directory.
liftOver will write the corresponding hg16 coordinates to a file called
"ne2" in the current directory, and it will create a file called
"unMapped" in the current directory and record any hg17 coordinates that
did not map to hg16 in that file.

Regarding your "PS" question: I see that you presently need to specify
the entire path to the liftOver executable to get it to work. This is
because the directory containing liftOver is not in your $PATH variable.
If you either (1) move the liftOver executable to a directory that is
already in $PATH, or (2) add the directory where your executable resides
(/Volumes/... in your case) to the $PATH variable, you should be able to
execute liftOver without specifying the path to it every time. Try the
command:

   echo $PATH

to see the directories that are currently in your $PATH variable.

I hope this explanation is helpful.

--
Brooke Rhead
UCSC Genome Bioinformatics Group

From granjeau at tagc.univ-mrs.fr Wed Apr 2 04:17:37 2008
From: granjeau at tagc.univ-mrs.fr (Samuel GRANJEAUD - IR/IFR137)
Date: Wed, 02 Apr 2008 13:17:37 +0200
Subject: [Genome] Detail about table knownGene
Message-ID: <47F36B51.5020603@tagc.univ-mrs.fr>

Hello!

I am using the May 2004 assembly.
I was wondering what the rationale is behind the cluster that links
together entries in the knownGene table.

Best regards.

From hong.sun at esat.kuleuven.be Wed Apr 2 01:57:27 2008
From: hong.sun at esat.kuleuven.be (hong sun)
Date: Wed, 02 Apr 2008 10:57:27 +0200
Subject: [Genome] information about "phastcons" score
References: 1183.129.175.112.92.1156324589.squirrel@serv1.igmors.u-psud.fr
Message-ID: <47F34A77.4030906@esat.kuleuven.be>

Hello,

We are interested in the pairwise alignment between the intergenic
regions of 50 mouse genes and the corresponding intergenic regions in
human. The 50 mouse intergenic regions are listed under Data1 below.
What we are doing now is:

1. We use the UCSC Genome Browser to browse the chromosomal region of
   our data, selecting only human for the pairwise alignment with mouse
   on the Conservation Track Settings page.

2. We then click on the blue conservation area on the Genome Browser
   page, which gives alignments in the format shown under Results1
   below.

   Our first question: is this alignment the alignment between the mouse
   intergenic region and the corresponding intergenic region of human?

   Our second question: can we download the whole alignment at once,
   rather than downloading each block separately?

3. Besides the pairwise alignment between the intergenic regions of
   mouse and human, we are also interested in the conserved regions of
   the pairwise alignment. Here we would like to use the PhastCons
   conservation score.

   Our third question: as far as we know, PhastCons is for multiple-species
   alignments, but we are doing a pairwise alignment. Can we still get
   and use the PhastCons score to select the conserved regions?

4. Suppose we can use the PhastCons score. Here is the procedure we
   followed to get the conserved regions of the pairwise alignment. We
   click on Table Browser on the alignment page and choose these
   parameters:

   group: Comparative Genomics
   track: Conservation
   table: phastCons17way
   region: position chr12:30523186-30524385
   filter: dataValue is >= 0.9
   output format: BED

   Our fourth question: is this "dataValue" the threshold for the
   PhastCons conservation score?

   With all of these settings, we get Results2 below.

Given our goals, we would like to know whether this procedure is
correct. If not, would you be so kind as to help us out?

Thanks in advance! :-)

Many greetings,
Hong Sun

Data1:
chr12:30523186-30524385
chr3:95366249-95367448
chr12:87772800-87773999
chr14:68894360-68895559
chr2:121139669-121140868
chr19:53192853-53194052
chr11:45726131-45727330
chr12:29260496-29261695
chr5:121854003-121855202
chr2:52246683-52247882
chr11:60353173-60354372
chr15:11850199-11851398
chr8:27250554-27251753
chr5:125729944-125731143
chr17:31365675-31366874
chr15:103067650-103068849
chr9:35200747-35201946
chr4:134544636-134545835
chr19:29714158-29715357
chr4:144158764-144159963
chr17:26235861-26237060
chr14:120236192-120237391
chr17:78322989-78324188
chr13:115579408-115580607
chr10:41964957-41966156
chr19:12699353-12700552
chr14:45739159-45740358
chr19:60944139-60945338
chr11:98856334-98857533
chr7:125355803-125357002
chr13:41349800-41350999
chr4:146828776-146829975
chr1:62636710-62637909
chr12:9599505-9600704
chr7:101294530-101295729
chr14:68912552-68913751
chr6:115960383-115961582
chr14:49776557-49777756
chr4:62045208-62046407
chr13:95385780-95386979
chr15:81187829-81189028
chr6:112912491-112913690
chr11:67781653-67782852
chr18:69468890-69470089
chr5:118287854-118289053
chr2:157834425-157835624
chr10:79751314-79752513
chr2:152023876-152025075
chr8:110324527-110325726
chr16:43151087-43152286

Results1:
Conservation score statistics
Capitalize exons based on show bases
Place cursor over species for alignment detail. Click on 'B' to link to
browser for aligned species, click on 'D' to get DNA for aligned species.

Components not displayed: X. tropicalis Elephant Cow Dog Armadillo
Chicken Opossum Tetraodon Tenrec Chimp Rhesus Rabbit Zebrafish Rat

Alignment block 1 of 9 in window, 30523186 - 30523549, 364 bps
B D Mouse agttgagttttatactctcctaggtgctcagtccaatcaagttgagaatcaggatcaactgtcacacctg
B D Human ======================================================================

    Mouse ggctccagttccaaacctcacatttaagacctctgctcccttggttgtattgcctaacctggccttcctg
    Human ======================================================================

    Mouse gctgaagaatggagagactggaaccccagggagaatcagagaactgtataaagtgtcagcattcaatctt
    Human ======================================================================

    Mouse gcagagtacactctgatgttaacctcagggcttcccttgtcttaacgctgtccacgcaaaagccatccca
    Human ======================================================================

    Mouse tcttccccacaagggttcctcattggcggtgaatgttggagacctcaggaatctctcgctagggagcttc
    Human ======================================================================

    Mouse tatttctgcagcac
    Human ==============

................................................

Results2:
track name="Conservation" description="Vertebrate Multiz Alignment & Conservation"
# db: 'mm8', track: 'phastCons17way', output date: 2008-04-02 08:37:49 UTC
# chrom specified: chr12
# position specified: 30523186-30524385
# data values >= 0.9
chr12 30524026 30524047 chr12.1
chr12 30524281 30524382 chr12.2

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm

From Sabine.Endele at humgenet.uni-erlangen.de Wed Apr 2 00:40:24 2008
From: Sabine.Endele at humgenet.uni-erlangen.de (Sabine Endele)
Date: Wed, 02 Apr 2008 09:40:24 +0200
Subject: [Genome] appearance of new NCBI Build 36.2 or 36.3
Message-ID:

Dear ladies and gentlemen,

we are interested in getting some information about the NCBI Build 36.1
(released March 2006), which is currently used in your genome browser.
Could you please tell us when a new release (e.g. Build 36.2 or 36.3) will appear on your genome gateway?

Thank you very much in advance for your assistance.

Yours sincerely
Sabine Endele

Dr. rer. nat. Sabine Endele
Direktionsassistentin
Humangenetisches Institut
Universitätsklinikum Erlangen
Schwabachanlage 10
91054 Erlangen
Tel.: ++49 (0)9131-8522020
Fax: ++49(0)9131-8523232
Mail: sendele at humgenet.uni-erlangen.de

From agarwal1975 at gmail.com Wed Apr 2 08:55:19 2008
From: agarwal1975 at gmail.com (Ashish Agarwal)
Date: Wed, 2 Apr 2008 11:55:19 -0400
Subject: [Genome] url to password protected track file
Message-ID:

Hi. I am trying to construct a url that references files on our password protected server. For example:

http://genome.ucsc.edu/cgi-bin/hgTracks?db=ce4&position=chrI&hgt.customText=http://myserver.org/file.wig

However, myserver.org requires a password and an "Authorization Required" message is displayed at the top of the genome browser page. In the list archives, I found a message stating that the format

protocol://user:password@server.com/somepath

can be used, but I cannot get this to work. What exactly is the syntax? I tried providing "protocol://user:password@myserver.org/file.wig" as the value for hgt.customText but that does not work. Without the quotes, I get an "Unrecognized format" error. With the quotes, no error message is displayed but neither is the track.

Thank you in advance for any help.

From sonjalthammer at yahoo.com Wed Apr 2 09:34:56 2008
From: sonjalthammer at yahoo.com (sonja althammer)
Date: Wed, 2 Apr 2008 09:34:56 -0700 (PDT)
Subject: [Genome] expIds
Message-ID: <965264.22258.qm@web37913.mail.mud.yahoo.com>

browsing the expression data from the gnf atlas2 my question arose:
are the expression-Ids equal for human and mouse, i.e. do they refer to the same type of tissue?
best regards!
From fungazid at yahoo.com Wed Apr 2 10:36:52 2008
From: fungazid at yahoo.com (Fungazid)
Date: Wed, 2 Apr 2008 10:36:52 -0700 (PDT)
Subject: [Genome] UCSC homologene
Message-ID: <880628.92535.qm@web52007.mail.re2.yahoo.com>

I would like to ask 2 questions:

1) Are there UCSC tables that group homologous genes? More specifically, tables that group together genes whose conservation is measured at the protein level or the mRNA level. For example, the table should look like:

groupNumber  gene          species
1            NM_001011874  mm8
1            NM_001011971  rn4
1            NM_052898     hg18
...

+ some conservation measures

2) There are similar tables in the homologene database:

http://www.ncbi.nlm.nih.gov/sites/entrez?db=homologene&cmd=search&term=

Is there a way to link these tables to UCSC refGene tables, or similar UCSC gene tables?

Many thanks, Avi

From galt at soe.ucsc.edu Wed Apr 2 10:40:49 2008
From: galt at soe.ucsc.edu (Galt Barber)
Date: Wed, 2 Apr 2008 10:40:49 -0700 (PDT)
Subject: [Genome] url to password protected track file
In-Reply-To:
References:
Message-ID:

The syntax for URLs that pass the username and password is this:
http://user:password@server.com/path/file?options

but the additional trick here is that this URL is inside the main URL (it's just the value of hgt.customText), so you also need to escape that URL.

Here is a link on the URL escape mechanism:
http://www.december.com/html/spec/esccodes.html
and you can write your own little routine for escaping, or there is probably one in the code libraries available to you.

Ah, here's a free online URL-encoder that you could use manually:
http://www.motobit.com/util/url-encoder.asp

-Galt

On Wed, 2 Apr 2008, Ashish Agarwal wrote:

> Hi.
> I am trying to construct a url that references files on our password
> protected server. For example:
>
> http://genome.ucsc.edu/cgi-bin/hgTracks?db=ce4&position=chrI&hgt.customText=http://myserver.org/file.wig
>
> However, myserver.org requires a password and an "Authorization Required"
> message is displayed at the top of the genome browser page. In the list
> archives, I found a message stating that the format
>
> protocol://user:password@server.com/somepath
>
> can be used, but I cannot get this to work. What exactly is the syntax? I
> tried providing "protocol://user:password@myserver.org/file.wig" as the
> value for hgt.customText but that does not work. Without the quotes, I get
> an "Unrecognized format" error. With the quotes, no error message is
> displayed but neither is the track.
>
> Thank you in advance for any help.
> _______________________________________________
> Genome maillist - Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome
>

From ann at soe.ucsc.edu Wed Apr 2 10:43:06 2008
From: ann at soe.ucsc.edu (Ann Zweig)
Date: Wed, 02 Apr 2008 10:43:06 -0700
Subject: [Genome] Detail about table knownGene
In-Reply-To: <47F36B51.5020603@tagc.univ-mrs.fr>
References: <47F36B51.5020603@tagc.univ-mrs.fr>
Message-ID: <47F3C5AA.4040002@soe.ucsc.edu>

Hello Samuel,

To create the Known Gene track for the May 2004 assembly (hg17), we used a process we call KG II. You can read about the process by pressing on the 'mini-button' to the left of the actual track display, or by clicking on the hyperlinked track name in the track controls (below the display).

To cluster together the genes in this track, we used a program from our source code called hgClusterGenes. Then among the genes that overlap, the longest gene is chosen to be the canonical gene to represent the cluster.

Since then, we have changed the way we create the Known Gene track. On the next human assembly (hg18), we use the KG III process.
You may want to take a look at this track as well for comparison's sake.

I hope this information is helpful to you. Please don't hesitate to contact the mail list again if you require further assistance.

Regards,

----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

Please feel free to search the Genome mailing list archives by visiting our home page, clicking on "Contact Us", then typing a word or phrase into the search box. On that same page (http://genome.ucsc.edu/contacts.html), you can subscribe to the Genome mailing list.

Samuel GRANJEAUD - IR/IFR137 wrote:
> Hello!
>
> I am using May2004 assembly. I was wondering what is the rationale behind
> cluster that links together entries in the knownGene table.
>
> Best regards.

From galt at soe.ucsc.edu Wed Apr 2 10:57:11 2008
From: galt at soe.ucsc.edu (Galt Barber)
Date: Wed, 2 Apr 2008 10:57:11 -0700 (PDT)
Subject: [Genome] url to password protected track file
In-Reply-To:
References:
Message-ID:

I should add that this can work on systems that support it for ftp and other protocols as well. We don't support https at this time, so you can't use https in places like the hgt.customText variable.

Security note: With http and ftp, you should be aware of the security issues. The username and password given on the URL would be potentially visible to anyone capable of snooping on the tcp connection. Note, however, that if the https secure protocol were involved, nobody could snoop the user/password.

-Galt

On Wed, 2 Apr 2008, Galt Barber wrote:

>
> The syntax for URLs that pass the username and password is this:
> http://user:password@server.com/path/file?options
>
> but the additional trick here is that this url is inside
> the main url (it's just the value of hgt.customText)
> therefore you need to also escape that url.
>
> Here is a link on the url escape mechanism:
> http://www.december.com/html/spec/esccodes.html
> and you can write your own little routine for escaping,
> or probably there is one in the code libraries
> available to you.
>
> Ah, here's a free online URL-encoder that
> you could use manually:
> http://www.motobit.com/util/url-encoder.asp
>
> -Galt
>
>
> On Wed, 2 Apr 2008, Ashish Agarwal wrote:
>
> > Hi. I am trying to construct a url that references files on our password
> > protected server. For example:
> >
> > http://genome.ucsc.edu/cgi-bin/hgTracks?db=ce4&position=chrI&hgt.customText=http://myserver.org/file.wig
> >
> > However, myserver.org requires a password and an "Authorization Required"
> > message is displayed at the top of the genome browser page. In the list
> > archives, I found a message stating that the format
> >
> > protocol://user:password@server.com/somepath
> >
> > can be used, but I cannot get this to work. What exactly is the syntax? I
> > tried providing "protocol://user:password@myserver.org/file.wig" as the
> > value for hgt.customText but that does not work. Without the quotes, I get
> > an "Unrecognized format" error. With the quotes, no error message is
> > displayed but neither is the track.
> >
> > Thank you in advance for any help.
From pauline at soe.ucsc.edu Wed Apr 2 13:27:18 2008
From: pauline at soe.ucsc.edu (Pauline Fujita)
Date: Wed, 02 Apr 2008 13:27:18 -0700
Subject: [Genome] SNP Function identification - possible error
Message-ID: <47F3EC26.8050005@soe.ucsc.edu>

Hello Stephen,

The developer that created this track had the following comment in response to your question:

I clicked into a SNP in an exon of BRCA1 (the first match, on chr17), rs1800704, and indeed it has both missense (which implies coding) and intron in its function list. However, clicking through to the NCBI SNP details page:

http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?type=rs&rs=rs1800704

-- scroll down until you see a bunch of coral rows with a few yellow rows interspersed. At the same genomic position, the SNP alleles' functions are annotated with respect to various RefSeq IDs. Most are exonic but there are a few RefSeqs that have an intron there. You can see that in the RefSeq Genes track too.

The coloring reflects only the coding functions because there is a color priority order (red > green > blue > gray > black). Coding non-synon is red and intron is black; red overrides all other colors, and black is overridden by any other color. The colors can be changed in the track controls page to change the relative priority of the different functions that may be assigned to the same SNP.

At chr17:38,487,901-38,488,200 there is a small exon shared by all RefSeq Genes; I checked the few SNPs there and none have intron in their annotated function list.

Hopefully this information was helpful and answers your question.
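
To illustrate the color priority rule described above, here is a minimal Python sketch. It is not the Genome Browser's own code. Only the priority order and the two assignments (coding non-synonymous is red, intron is black) come from the developer's comment; the function labels, the fallback color, and the helper name `display_color` are assumptions made for the example.

```python
# Priority order quoted in the message: red > green > blue > gray > black.
COLOR_PRIORITY = ["red", "green", "blue", "gray", "black"]

# Only these two assignments are stated in the message. The label spellings
# are illustrative (assumption), not values taken from a UCSC table.
FUNCTION_COLOR = {
    "missense": "red",   # coding non-synonymous
    "intron": "black",
}


def display_color(functions, function_color=FUNCTION_COLOR):
    """Return the highest-priority color among a SNP's annotated functions."""
    colors = [function_color[f] for f in functions if f in function_color]
    if not colors:
        return "black"  # assumed fallback for functions with no known color
    # The color that appears earliest in COLOR_PRIORITY wins.
    return min(colors, key=COLOR_PRIORITY.index)


# A SNP annotated as both missense and intron is drawn in red.
print(display_color(["missense", "intron"]))
```

Passing a different `function_color` mapping mimics changing the colors on the track controls page: the function assigned the higher-priority color is the one that decides how the SNP is drawn.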
If you have further questions or require clarification, feel free to contact the mailing list at genome at soe.ucsc.edu.

Regards,

Pauline Fujita
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

Stephen Lillioja wrote:
> I've been looking at the SNPs for the BRCA1 gene on chromosome 17. The SNPs
> seem to be colour coded OK but when you look at the individual SNP under
> 'Function' they all say they are 'untranslated, intron', I'm sure there must
> be an error here.
>
> Regards,
>
> Stephen Lillioja
> Stephen Lillioja MB ChB(Otago) MD (UNSW) FRACP Grad Cert H Ed
> Professor, Health and Behavioural Sciences
> Professor, Graduate School of Medicine
> Faculty of Health and Behavioural Sciences,
> University of Wollongong
> Northfields Avenue,
> Wollongong NSW 2522
> Australia
> phone: 61 2 4221 5055
> FAX: 61 2 4221 5850
> Mobile: 0419 780 826
> email: lillioja at uow.edu.au

From pauline at soe.ucsc.edu Wed Apr 2 13:28:41 2008
From: pauline at soe.ucsc.edu (Pauline Fujita)
Date: Wed, 02 Apr 2008 13:28:41 -0700
Subject: [Genome] new identifiers questions
Message-ID: <47F3EC79.6090706@soe.ucsc.edu>

Hello Juliette,

As a follow-up to your question, we heard back from the developer responsible for assigning the UCSC gene identifiers, and he had this additional comment on how the identifiers will change in the future:

In general the gene identifiers will change little. The identifiers are in the form "accession.version" (e.g. uc001aaa.1). Right now all the "versions" are "1". In cases where a gene has a little extra sequence added to the start or end but is otherwise unchanged, the version number will be incremented in the next build, which we are working on currently. New splicing variants and new genes will get a new accession.
In some relatively rare cases genes will be dropped, as in cases where GenBank or RefSeq records supporting the gene are withdrawn.

I hope this information is helpful to you. Please don't hesitate to contact the mail list again if you have further questions.

Regards,

Pauline Fujita
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

Juliette Aury Landas wrote:
> Good morning,
>
> I am an intern at the Institut Curie (Paris, France). I would just
> like to ask you a few questions about the UCSC data, especially about the
> new identifiers which recently appeared in files available on the ftp
> website (e.g. uc002ide.1).
> I am in charge of a database development. The main goal of it is to link
> all aliases of different identifiers (genes, transcripts and proteins).
> I also want to keep the history of each identifier. For example I would
> like to be able to know which transcript names were linked to a gene at
> a specific date. That is why I am interested in these new UCSC
> identifiers:
> How do you create these identifiers?
> What is the cardinality? One UCSC identifier for one gene, or one UCSC
> identifier for one transcript, or one UCSC identifier for one
> annotation...? I don't really understand this new nomenclature (numbers
> and letters).
> Is there a link between these identifiers and the NCBI GeneID?
> How does the UCSC identifier change when a Gene Symbol is updated or
> when the annotation changes (for example TREX1 was annotated as a single
> gene with two non-overlapping coding regions; now the downstream coding
> region is represented by TREX1 and the upstream coding region is
> represented by ATRIP)?
>
> Thanks in advance.
>
> Juliette Aury-Landas
>

From ann at soe.ucsc.edu Wed Apr 2 13:55:27 2008
From: ann at soe.ucsc.edu (Ann Zweig)
Date: Wed, 02 Apr 2008 13:55:27 -0700
Subject: [Genome] expIds
In-Reply-To: <965264.22258.qm@web37913.mail.mud.yahoo.com>
References: <965264.22258.qm@web37913.mail.mud.yahoo.com>
Message-ID: <47F3F2BF.1090907@soe.ucsc.edu>

Hello Sonja,

You can find the tissue types for each of the GNF Atlas 2 tracks in the corresponding tables in the hgFixed database:

gnfHumanAtlas2AllExps
gnfMouseAtlas2AllExps

To explore the contents of database tables, use the Table Browser tool on the website ('Tables' from the top blue navigation bar).

group: All Tables
database: hgFixed
table: hgFixed.gnfHumanAtlas2AllExps

output format: all fields from selected table

The id and name fields contain the information you are looking for. For example, from the hgFixed.gnfHumanAtlas2AllExps table, the first 5 rows are:

#id  name
0    ColorectalAdenocarcinoma
1    ColorectalAdenocarcinoma 2
2    WHOLEBLOOD
3    WHOLEBLOOD 2
4    BM-CD33+Myeloid

You might also be interested in reading the paper associated with this track: http://www.pnas.org/cgi/content/abstract/101/16/6062

I hope this information is helpful to you. Please don't hesitate to contact the mail list again if you require further assistance.

Regards,

----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

Please feel free to search the Genome mailing list archives by visiting our home page, clicking on "Contact Us", then typing a word or phrase into the search box. On that same page (http://genome.ucsc.edu/contacts.html), you can subscribe to the Genome mailing list.

sonja althammer wrote:
> browsing the expression data from the gnf atlas2 my question arose:
> are the expression-Ids equal for human and mouse, i.e. do they refer to the same type of tissue?
> best regards!
From ann at soe.ucsc.edu Wed Apr 2 14:06:28 2008
From: ann at soe.ucsc.edu (Ann Zweig)
Date: Wed, 02 Apr 2008 14:06:28 -0700
Subject: [Genome] search query
In-Reply-To:
References:
Message-ID: <47F3F554.1020403@soe.ucsc.edu>

Hello Amanda,

I think you will find everything you need to know about adding search terms in this document:
http://genome.ucsc.edu/google/admin/hgFindSpecHowTo.html

If, after reading this document, you still have questions, please do not hesitate to contact the mail list again for more detailed instruction.

Regards,

----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

Amanda Miotto wrote:
> For the Griffith mirror of the UCSC genome browser, I am looking to alter
> the position bar so that I can search by the EnsemblID, Probeset ID or
> Illumina search key. Would you be able to point me in the correct direction
> for where in the source the search algorithm is, or is there any preceding
> function set up for another mirror? And thank you for your assistance with
> the Affy colours, it is greatly appreciated!
>
> A.Miotto
> a.miotto at griffith.edu.au

From jayuan2008 at yahoo.com Wed Apr 2 16:00:09 2008
From: jayuan2008 at yahoo.com (Yuan Jian)
Date: Wed, 2 Apr 2008 16:00:09 -0700 (PDT)
Subject: [Genome] table
Message-ID: <415108.19920.qm@web46002.mail.sp1.yahoo.com>

Hi UCSC,

Ensembl has exon IDs like ENSEXXXX. Does UCSC have its own exon IDs?

yuan

From ann at soe.ucsc.edu Wed Apr 2 16:10:13 2008
From: ann at soe.ucsc.edu (Ann Zweig)
Date: Wed, 02 Apr 2008 16:10:13 -0700
Subject: [Genome] UCSC homologene
Message-ID: <47F41255.9040501@soe.ucsc.edu>

Hello Avi,

We have several data sets available that show homologous genes. The BlastTab tables show pairwise orthologs between the following organisms:

organism         tableName
--------         ---------
human            hgBlastTab
mouse            mmBlastTab
rat              rnBlastTab
zebrafish        drBlastTab
D. melanogaster  dmBlastTab
C. elegans       ceBlastTab
S. cerevisiae    scBlastTab

Orthologies between human, mouse, and rat are computed by taking the best BLASTP hit, and filtering out non-syntenic hits. For more distant species reciprocal-best BLASTP hits are used.

The Conservation tracks show multiple alignments of a number of species along with measures of evolutionary conservation.
For example, the Conservation track on the latest human assembly (hg18) shows alignment with 27 other vertebrates.

You can read about the details behind any track (description, methods, display, credits, references) by pressing on the 'mini-button' to the left of the actual track display, or by clicking on the hyperlinked track name in the track controls (below the display).

You can use the Table Browser tool on our website ('Tables' from the top blue navigation bar) to view the contents of tables. Alternatively, you can download tables from our download server here:
http://hgdownload.cse.ucsc.edu/downloads.html

This should be enough to get you started. Please don't hesitate to contact the mail list again if you require further assistance.

Regards,

----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

Please feel free to search the Genome mailing list archives by visiting our home page, clicking on "Contact Us", then typing a word or phrase into the search box. On that same page (http://genome.ucsc.edu/contacts.html), you can subscribe to the Genome mailing list.

Fungazid wrote:
> I would like to ask 2 questions:
>
> 1) Are there UCSC tables that group homologous genes?
> More specifically, tables that group together genes
> whose conservation is measured at the protein
> level or the mRNA level.
> For example, the table should look like:
>
> groupNumber  gene          species
> 1            NM_001011874  mm8
> 1            NM_001011971  rn4
> 1            NM_052898     hg18
> ...
>
> + some conservation measures
>
>
> 2) There are similar tables in the homologene database:
>
> http://www.ncbi.nlm.nih.gov/sites/entrez?db=homologene&cmd=search&term=
>
> Is there a way to link these tables to UCSC refGene
> tables, or similar UCSC gene tables?
>
> Many thanks, Avi
From ann at soe.ucsc.edu Wed Apr 2 17:37:11 2008
From: ann at soe.ucsc.edu (Ann Zweig)
Date: Wed, 02 Apr 2008 17:37:11 -0700
Subject: [Genome] table
In-Reply-To: <415108.19920.qm@web46002.mail.sp1.yahoo.com>
References: <415108.19920.qm@web46002.mail.sp1.yahoo.com>
Message-ID: <47F426B7.2050103@soe.ucsc.edu>

Hello Yuan,

We do not assign specific IDs to exons in any of the gene tracks that we create in the genome browser. However, you can use the Table Browser to download the exons of any gene track. Read more about using the Table Browser here:
http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html

If you have more specific questions, please don't hesitate to contact the mail list again.

Regards,

----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

Yuan Jian wrote:
> Hi UCSC,
>
> Ensembl has exon IDs like ENSEXXXX. Does UCSC have its own exon IDs?
>
> yuan

From jayuan2008 at yahoo.com Wed Apr 2 18:00:12 2008
From: jayuan2008 at yahoo.com (Yuan Jian)
Date: Wed, 2 Apr 2008 18:00:12 -0700 (PDT)
Subject: [Genome] table
In-Reply-To: <47F426B7.2050103@soe.ucsc.edu>
Message-ID: <675766.64359.qm@web46005.mail.sp1.yahoo.com>

Ann, thanks.

Can I download a table including kgID + ENSEXX + chr + strand + exonStart + exonStop?

Yuan

Ann Zweig wrote:
Hello Yuan,

We do not assign specific IDs to exons in any of the gene tracks that we create in the genome browser. However, you can use the Table Browser to download the exons of any gene track.
Read more about using the Table Browser here:
http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html

If you have more specific questions, please don't hesitate to contact the mail list again.

Regards,

----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

Yuan Jian wrote:
> Hi UCSC,
>
> Ensembl has exon IDs like ENSEXXXX. Does UCSC have its own exon IDs?
>
> yuan

From jayuan2008 at yahoo.com Wed Apr 2 23:14:12 2008
From: jayuan2008 at yahoo.com (Yuan Jian)
Date: Wed, 2 Apr 2008 23:14:12 -0700 (PDT)
Subject: [Genome] snps128
Message-ID: <784634.97489.qm@web46008.mail.sp1.yahoo.com>

Hi UCSC,

I downloaded SNPs128 from UCSC. Most of the values in the avHet and avHetSE columns are zero, but I want to get a table with genotype frequencies for all alleles. Which table should I download?

Yu
From pauline.fujita at gmail.com Wed Apr 2 18:41:31 2008
From: pauline.fujita at gmail.com (Pauline Fujita)
Date: Wed, 02 Apr 2008 18:41:31 -0700
Subject: [Genome] appearance of new NCBI Build 36.2 or 36.3
Message-ID: <47F435CB.9010401@soe.ucsc.edu>

From vmittal3 at mail.gatech.edu Wed Apr 2 19:38:08 2008
From: vmittal3 at mail.gatech.edu (Vinay Kumar Mittal)
Date: Wed, 02 Apr 2008 22:38:08 -0400
Subject: [Genome] regarding UCSC genes (Human)
Message-ID: <1207190288.47f4431037e7d@webmail.mail.gatech.edu>

Hi,

I have downloaded the UCSC known gene table (for human) from the Table Browser. Is there any way to find descriptions for all these genes in a tabular format? I know about refseqSummary.txt.gz, but not all UCSC genes are present in this description file.

Thanks
--
Vinay Kumar Mittal,
School of Biology,
Georgia Institute of Technology

From ann at soe.ucsc.edu Thu Apr 3 08:48:29 2008
From: ann at soe.ucsc.edu (Ann Zweig)
Date: Thu, 03 Apr 2008 08:48:29 -0700
Subject: [Genome] regarding UCSC genes (Human)
In-Reply-To: <1207190288.47f4431037e7d@webmail.mail.gatech.edu>
References: <1207190288.47f4431037e7d@webmail.mail.gatech.edu>
Message-ID: <47F4FC4D.6050501@soe.ucsc.edu>

Hello Vinay,

The table you should use is the kgXref table. It contains (among other things) the UCSC ID, the Gene Symbol, and the short description you are looking for. For example, for the GABRA3 gene:

mysql> select * from kgXref where geneSymbol='GABRA3'\G
*************************** 1. row ***************************
       kgID: uc004ffn.1
       mRNA: NM_000808
       spID: P34903
spDisplayID: GBRA3_HUMAN
 geneSymbol: GABRA3
     refseq: NM_000808
    protAcc: NP_000799
description: gamma-aminobutyric acid A receptor, alpha 3

Regards,

----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

Please feel free to search the Genome mailing list archives by visiting our home page, clicking on "Contact Us", then typing a word or phrase into the search box.
On that same page (http://genome.ucsc.edu/contacts.html), you can subscribe to the Genome mailing list.

Vinay Kumar Mittal wrote:
> Hi,
>
> I have downloaded known UCSC gene table (for Humans) from table browser. Is
> there anyway to find description for all these genes in a tabular format. I
> know about refseqSummary.txt.gz, but all UCSC genes are not present in this
> description file.
>
> Thanks
> --
> Vinay Kumar Mittal,
> School of Biology,
> Georgia Institute of Technology

From sonjalthammer at yahoo.com Thu Apr 3 09:07:33 2008
From: sonjalthammer at yahoo.com (sonja althammer)
Date: Thu, 3 Apr 2008 09:07:33 -0700 (PDT)
Subject: [Genome] expIds
In-Reply-To: <47F4FBA5.8080501@soe.ucsc.edu>
Message-ID: <912882.51297.qm@web37907.mail.mud.yahoo.com>

hello!
thanks, but i already got the table you printed for me...
my question referred to something else: in the table hgFixed.gnfHumanAtlas2AllExps there are IDs in the range from 0 to 157, while there are just 79 expIds for human in the gnf atlas2 for expression and regulation. for mouse it is analogous...
so is it correct to conclude that

0 (ColorectalAdenocarcinoma) and 1 (ColorectalAdenocarcinoma 2) refer to exprId 1
2,3 refer to exprId 2
4,5 refer to exprId 3
...
156,157 refer to exprId 79

you understand what i mean? basically i need to get the expression values from all common tissues in mouse and human, or rather the conservation of the expression in all common tissues...
best regards!
sonja

Ann Zweig wrote:
Hello again Sonja,

I have printed the contents of the tables for your use (see below). This should help you correlate between mouse and human.

In the future, please direct your questions to the genome mailing list at genome at soe.ucsc.edu -- our moderated forum for user questions and discussion. You will likely get a quicker response to your question.
Regards,

----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

hgFixed.gnfMouseAtlas2AllExps

#id  name
0    substantia nigra
1    substantia nigra 2
2    spinal cord upper
3    spinal cord upper 2
4    spinal cord lower
5    spinal cord lower 2
6    hypothalamus
7    hypothalamus 2
8    preoptic
9    preoptic 2
10   frontal cortex
11   frontal cortex 2
12   cerebral cortex
13   cerebral cortex 2
14   amygdala
15   amygdala 2
16   dorsal striatum
17   dorsal striatum 2
18   hippocampus
19   hippocampus 2
20   olfactory bulb
21   olfactory bulb 2
22   cerebellum
23   cerebellum 2
24   trigeminal
25   trigeminal 2
26   dorsal root ganglia
27   dorsal root ganglia 2
28   pituitary
29   pituitary 2
30   eye
31   eye 2
32   lymph node
33   lymph node 2
34   trachea
35   trachea 2
36   uterus
37   uterus 2
38   ovary
39   ovary 2
40   adipose tissue
41   adipose tissue 2
42   adrenal gland
43   adrenal gland 2
44   bladder
45   bladder 2
46   embryo day 10.5
47   embryo day 10.5 2
48   embryo day 9.5
49   embryo day 9.5 2
50   embryo day 8.5
51   embryo day 8.5 2
52   embryo day 7.5
53   embryo day 7.5 2
54   embryo day 6.5
55   embryo day 6.5 2
56   epidermis
57   epidermis 2
58   digits
59   digits 2
60   snout epidermis
61   snout epidermis 2
62   tongue
63   tongue 2
64   medial olfactory epithelium
65   medial olfactory epithelium 2
66   prostate
67   prostate 2
68   vomeralnasal organ
69   vomeralnasal organ 2
70   lung
71   lung 2
72   umbilical cord
73   umbilical cord 2
74   stomach
75   stomach 2
76   large intestine
77   large intestine 2
78   bone marrow
79   bone marrow 2
80   bone
81   bone 2
82   spleen
83   spleen 2
84   thymus
85   thymus 2
86   B220+ B-cells
87   B220+ B-cells 2
88   CD4+T-cells
89   CD4+T-cells 2
90   CD8+T-cells
91   CD8+T-cells 2
92   brown fat
93   brown fat 2
94   heart
95   heart 2
96   skeletal muscle
97   skeletal muscle 2
98   placenta
99   placenta 2
100  mammary gland (lact)
101  mammary gland (lact) 2
102  blastocysts
103  blastocysts 2
104  kidney
105  kidney 2
106  small intestine
107  small intestine 2
108  salivary gland
109  salivary gland 2
110  thyroid
111  thyroid 2
112  liver
113  liver 2
114  fertilized egg
115  fertilized egg 2
116  oocyte
117  oocyte 2
118  pancreas
119  pancreas 2
120  testis
121  testis 2

hgFixed.gnfHumanAtlas2AllExps

#id  name
0    ColorectalAdenocarcinoma
1    ColorectalAdenocarcinoma 2
2    WHOLEBLOOD
3    WHOLEBLOOD 2
4    BM-CD33+Myeloid
5    BM-CD33+Myeloid 2
6    PB-CD14+Monocytes
7    PB-CD14+Monocytes 2
8    PB-BDCA4+Dentritic Cells
9    PB-BDCA4+Dentritic Cells 2
10   PB-CD56+NKCells
11   PB-CD56+NKCells 2
12   PB-CD4+Tcells
13   PB-CD4+Tcells 2
14   PB-CD8+Tcells
15   PB-CD8+Tcells 2
16   PB-CD19+Bcells
17   PB-CD19+Bcells 2
18   BM-CD105+Endothelial
19   BM-CD105+Endothelial 2
20   BM-CD34+
21   BM-CD34+ 2
22   leukemialymphoblastic(molt4)
23   leukemialymphoblastic(molt4) 2
24   721 B lymphoblasts
25   721 B lymphoblasts 2
26   lymphomaburkittsRaji
27   lymphomaburkittsRaji 2
28   leukemiapromyelocytic(hl60)
29   leukemiapromyelocytic(hl60) 2
30   lymphomaburkittsDaudi
31   lymphomaburkittsDaudi 2
32   leukemiachronicmyelogenous(k562)
33   leukemiachronicmyelogenous(k562) 2
34   thymus
35   thymus 2
36   Tonsil
37   Tonsil 2
38   lymphnode
39   lymphnode 2
40   fetalliver
41   fetalliver 2
42   BM-CD71+EarlyErythroid
43   BM-CD71+EarlyErythroid 2
44   bonemarrow
45   bonemarrow 2
46   TemporalLobe
47   TemporalLobe 2
48   globuspallidus
49   globuspallidus 2
50   CerebellumPeduncles
51   CerebellumPeduncles 2
52   cerebellum
53   cerebellum 2
54   caudatenucleus
55   caudatenucleus 2
56   WholeBrain
57   WholeBrain 2
58   ParietalLobe
59   ParietalLobe 2
60   MedullaOblongata
61   MedullaOblongata 2
62   Amygdala
63   Amygdala 2
64   PrefrontalCortex
65   PrefrontalCortex 2
66   OccipitalLobe
67   OccipitalLobe 2
68   Hypothalamus
69   Hypothalamus 2
70   Thalamus
71   Thalamus 2
72   subthalamicnucleus
73   subthalamicnucleus 2
74   CingulateCortex
75   CingulateCortex 2
76   Pons
77   Pons 2
78   spinalcord
79   spinalcord 2
80   fetalbrain
81   fetalbrain 2
82   adrenalgland
83   adrenalgland 2
84   Lung
85   Lung 2
86   Heart
87   Heart 2
88   Liver
89   Liver 2
90   kidney
91   kidney 2
92   Prostate
93   Prostate 2
94   Uterus
95   Uterus 2
96   Thyroid
97   Thyroid 2
98   fetalThyroid
99   fetalThyroid 2
100  fetallung
101  fetallung 2
102  PLACENTA
103  PLACENTA 2
104  CardiacMyocytes
105  CardiacMyocytes 2
106  SmoothMuscle
107  SmoothMuscle 2
108  bronchialepithelialcells
109  bronchialepithelialcells 2
110  ADIPOCYTE
111  ADIPOCYTE 2
112  Pancreas
113  Pancreas 2
114  PancreaticIslets
115  PancreaticIslets 2
116  testis
117  testis 2
118  TestisLeydigCell
119  TestisLeydigCell 2
120  TestisGermCell
121  TestisGermCell 2
122  TestisInterstitial
123  TestisInterstitial 2
124  TestisSeminiferousTubule
125  TestisSeminiferousTubule 2
126  salivarygland
127  salivarygland 2
128  trachea
129  trachea 2
130  AdrenalCortex
131  AdrenalCortex 2
132  Ovary
133  Ovary 2
134  Appendix
135  Appendix 2
136  skin
137  skin 2
138  ciliaryganglion
139  ciliaryganglion 2
140  TrigeminalGanglion
141  TrigeminalGanglion 2
142  atrioventricularnode
143  atrioventricularnode 2
144  DRG
145  DRG 2
146  SuperiorCervicalGanglion
147  SuperiorCervicalGanglion 2
148  SkeletalMuscle
149  SkeletalMuscle 2
150  UterusCorpus
151  UterusCorpus 2
152  TONGUE
153  TONGUE 2
154  OlfactoryBulb
155  OlfactoryBulb 2
156  Pituitary
157  Pituitary 2

sonja althammer wrote:
> hello ann!
> thanks a lot! i found the tables i was looking for!
> but what i still do not understand about the tables is, why #id (in
> hgFixed.gnfHumanAtlas2AllExps) is in the range from 0 to 157 while there
> are just 79 expIds for human in the gnf atlas2 for expression and
> regulation.
> for mouse it is analog...
> as the tissues appear twice, do i conclude right like this:
> to get the according tissue to expId 79 i select #id 156 from
> hgFixed.gnfHumanAtlas2AllExps (expId*2 -2)
>
> in the end i want to work with the intersection of the tissues in human
> and mouse...
> best regards!
> sonja > > */Ann Zweig /* wrote: > > Hello Sonja, > > You can find the tissue types for each of the GNF Atlas 2 tracks in the > corresponding tables in the hgFixed database: > > gnfHumanAtlas2AllExps > gnfMouseAtlas2AllExps > > To explore the contents of database tables, use the Table Browser tool > on the website ('Tables' from the top blue navigation bar). > > group: All Tables > database: hgFixed > table: hgFixed.gnfHumanAtlas2AllExps > > output format: all fields from selected table > > The id and name fields contain the information you are looking for. > For example, from the hgFixed.gnfHumanAtlas2AllExps table, the first 5 > rows are: > > #id name > 0 ColorectalAdenocarcinoma > 1 ColorectalAdenocarcinoma 2 > 2 WHOLEBLOOD > 3 WHOLEBLOOD 2 > 4 BM-CD33+Myeloid > > > You might also be interested in reading the paper associated with this > track: http://www.pnas.org/cgi/content/abstract/101/16/6062 > > I hope this information is helpful to you. Please don't hesitate to > contact the mail list again if you require further assistance. > > Regards, > > ---------- > Ann Zweig > UCSC Genome Bioinformatics Group > http://genome.ucsc.edu > > Please feel free to search the Genome mailing list archives by visiting > our home page, clicking on "Contact Us", then typing a word or phrase > into the search box. On that same page > (http://genome.ucsc.edu/contacts.html), you can subscribe to the Genome > mailing list. > > > > > > > sonja althammer wrote: > > browsing the expression data from the gnf atlas2 my question arose: > > are the expression-Ids equal for human and mouse, i.e. do they > refer to the same type of tissue? > > best regards! > > > > > > --------------------------------- > > You rock. That's why Blockbuster's offering you one month of > Blockbuster Total Access, No Cost. 
> > _______________________________________________
> > Genome maillist - Genome at soe.ucsc.edu
> > http://www.soe.ucsc.edu/mailman/listinfo/genome

From ann at soe.ucsc.edu Thu Apr 3 09:15:47 2008
From: ann at soe.ucsc.edu (Ann Zweig)
Date: Thu, 03 Apr 2008 09:15:47 -0700
Subject: [Genome] table
In-Reply-To: <675766.64359.qm@web46005.mail.sp1.yahoo.com>
References: <675766.64359.qm@web46005.mail.sp1.yahoo.com>
Message-ID: <47F502B3.2090203@soe.ucsc.edu>

Hello again Yuan,

You can use the Table Browser ('Tables' from the top blue navigation bar) to combine two tables, which will give you the information you are looking for. The knownGene table contains the UCSC kgID. The knownToEnsembl table relates the kgID to the Ensembl ID. The two tables are related on the name field like so:

knownToEnsembl.name <-> knownGene.name

knownGene;
+------------+
| Field      |
+------------+
| name       |
| chrom      |
| strand     |
| txStart    |
| txEnd      |
| cdsStart   |
| cdsEnd     |
| exonCount  |
| exonStarts |
| exonEnds   |
| proteinID  |
| alignID    |
+------------+

knownToEnsembl;
+-------+
| Field |
+-------+
| name  |
| value |
+-------+

You can use the Table Browser to join these two tables on the name field and get the output you want. Read more about using the Table Browser here:
http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html

Regards,

----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

Yuan Jian wrote:
> Ann thanks,
>
> can I download a table including kgID + ENSEXX + chr + strand + exonStart + exonStop?
>
> Yuan
>
> */Ann Zweig /* wrote:
>
> Hello Yuan,
>
> We do not assign specific IDs to exons in any of the gene tracks that we
> create in the genome browser. However, you can use the Table Browser to
> download the exons of any gene track. Read more about using the Table
> Browser here:
> http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html
>
> If you have more specific questions, please don't hesitate to contact the
> mail list again.
>
> Regards,
>
> ----------
> Ann Zweig
> UCSC Genome Bioinformatics Group
> http://genome.ucsc.edu
>
> Yuan Jian wrote:
> > Hi UCSC,
> >
> > Ensembl has exon IDs like ENSEXXXX. Does UCSC have its own exon IDs?
> >
> > yuan
> > _______________________________________________
> > Genome maillist - Genome at soe.ucsc.edu
> > http://www.soe.ucsc.edu/mailman/listinfo/genome

From ann at soe.ucsc.edu Thu Apr 3 09:45:04 2008
From: ann at soe.ucsc.edu (Ann Zweig)
Date: Thu, 03 Apr 2008 09:45:04 -0700
Subject: [Genome] expIds
In-Reply-To: <912882.51297.qm@web37907.mail.mud.yahoo.com>
References: <912882.51297.qm@web37907.mail.mud.yahoo.com>
Message-ID: <47F50990.9090304@soe.ucsc.edu>

Hello Sonja,

Sorry that I misunderstood your question. Yes, you have understood the relationship between the table and the display correctly.

Regards,

----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

sonja althammer wrote:
> hello!
> thanks, but i already got the table you printed for me...
> my question referred to something else:
> in the table hgFixed.gnfHumanAtlas2AllExps there are IDs in the range from 0 to 157, while there are just 79 expIds for human in the gnf atlas2 for expression and regulation.
> for mouse it is analogous...
> so is it correct to conclude that
>
> 0 ColorectalAdenocarcinoma and
> 1 ColorectalAdenocarcinoma 2
> refer to the exprId 1
>
> 2,3 refer to exprId 2
> 4,5 refer to exprId 3
> ...
> 156,157 refer to exprId 79
>
> you understand what i mean?
> basically i need to get the expression values from all common tissues in mouse and human, or rather the conservation of the expression in all common tissues...
> best regards!
> sonja
>
> Ann Zweig wrote:
> Hello again Sonja,
>
> I have printed the contents of the tables for your use (see below). This
> should help you correlate between mouse and human.
>
> In the future, please direct your questions to the genome mailing list at
> genome at soe.ucsc.edu -- our moderated forum for user questions and discussion.
> You will likely get a quicker response to your question.
> > > Regards, > > ---------- > Ann Zweig > UCSC Genome Bioinformatics Group > http://genome.ucsc.edu > > > hgFixed.gnfMouseAtlas2AllExps > #id name > 0 substantia nigra > 1 substantia nigra 2 > 2 spinal cord upper > 3 spinal cord upper 2 > 4 spinal cord lower > 5 spinal cord lower 2 > 6 hypothalamus > 7 hypothalamus 2 > 8 preoptic > 9 preoptic 2 > 10 frontal cortex > 11 frontal cortex 2 > 12 cerebral cortex > 13 cerebral cortex 2 > 14 amygdala > 15 amygdala 2 > 16 dorsal striatum > 17 dorsal striatum 2 > 18 hippocampus > 19 hippocampus 2 > 20 olfactory bulb > 21 olfactory bulb 2 > 22 cerebellum > 23 cerebellum 2 > 24 trigeminal > 25 trigeminal 2 > 26 dorsal root ganglia > 27 dorsal root ganglia 2 > 28 pituitary > 29 pituitary 2 > 30 eye > 31 eye 2 > 32 lymph node > 33 lymph node 2 > 34 trachea > 35 trachea 2 > 36 uterus > 37 uterus 2 > 38 ovary > 39 ovary 2 > 40 adipose tissue > 41 adipose tissue 2 > 42 adrenal gland > 43 adrenal gland 2 > 44 bladder > 45 bladder 2 > 46 embryo day 10.5 > 47 embryo day 10.5 2 > 48 embryo day 9.5 > 49 embryo day 9.5 2 > 50 embryo day 8.5 > 51 embryo day 8.5 2 > 52 embryo day 7.5 > 53 embryo day 7.5 2 > 54 embryo day 6.5 > 55 embryo day 6.5 2 > 56 epidermis > 57 epidermis 2 > 58 digits > 59 digits 2 > 60 snout epidermis > 61 snout epidermis 2 > 62 tongue > 63 tongue 2 > 64 medial olfactory epithelium > 65 medial olfactory epithelium 2 > 66 prostate > 67 prostate 2 > 68 vomeralnasal organ > 69 vomeralnasal organ 2 > 70 lung > 71 lung 2 > 72 umbilical cord > 73 umbilical cord 2 > 74 stomach > 75 stomach 2 > 76 large intestine > 77 large intestine 2 > 78 bone marrow > 79 bone marrow 2 > 80 bone > 81 bone 2 > 82 spleen > 83 spleen 2 > 84 thymus > 85 thymus 2 > 86 B220+ B-cells > 87 B220+ B-cells 2 > 88 CD4+T-cells > 89 CD4+T-cells 2 > 90 CD8+T-cells > 91 CD8+T-cells 2 > 92 brown fat > 93 brown fat 2 > 94 heart > 95 heart 2 > 96 skeletal muscle > 97 skeletal muscle 2 > 98 placenta > 99 placenta 2 > 100 mammary gland (lact) > 101 mammary 
gland (lact) 2 > 102 blastocysts > 103 blastocysts 2 > 104 kidney > 105 kidney 2 > 106 small intestine > 107 small intestine 2 > 108 salivary gland > 109 salivary gland 2 > 110 thyroid > 111 thyroid 2 > 112 liver > 113 liver 2 > 114 fertilized egg > 115 fertilized egg 2 > 116 oocyte > 117 oocyte 2 > 118 pancreas > 119 pancreas 2 > 120 testis > 121 testis 2 > > > hgFixed.gnfHumanAtlas2AllExps > #id name > 0 ColorectalAdenocarcinoma > 1 ColorectalAdenocarcinoma 2 > 2 WHOLEBLOOD > 3 WHOLEBLOOD 2 > 4 BM-CD33+Myeloid > 5 BM-CD33+Myeloid 2 > 6 PB-CD14+Monocytes > 7 PB-CD14+Monocytes 2 > 8 PB-BDCA4+Dentritic Cells > 9 PB-BDCA4+Dentritic Cells 2 > 10 PB-CD56+NKCells > 11 PB-CD56+NKCells 2 > 12 PB-CD4+Tcells > 13 PB-CD4+Tcells 2 > 14 PB-CD8+Tcells > 15 PB-CD8+Tcells 2 > 16 PB-CD19+Bcells > 17 PB-CD19+Bcells 2 > 18 BM-CD105+Endothelial > 19 BM-CD105+Endothelial 2 > 20 BM-CD34+ > 21 BM-CD34+ 2 > 22 leukemialymphoblastic(molt4) > 23 leukemialymphoblastic(molt4) 2 > 24 721 B lymphoblasts > 25 721 B lymphoblasts 2 > 26 lymphomaburkittsRaji > 27 lymphomaburkittsRaji 2 > 28 leukemiapromyelocytic(hl60) > 29 leukemiapromyelocytic(hl60) 2 > 30 lymphomaburkittsDaudi > 31 lymphomaburkittsDaudi 2 > 32 leukemiachronicmyelogenous(k562) > 33 leukemiachronicmyelogenous(k562) 2 > 34 thymus > 35 thymus 2 > 36 Tonsil > 37 Tonsil 2 > 38 lymphnode > 39 lymphnode 2 > 40 fetalliver > 41 fetalliver 2 > 42 BM-CD71+EarlyErythroid > 43 BM-CD71+EarlyErythroid 2 > 44 bonemarrow > 45 bonemarrow 2 > 46 TemporalLobe > 47 TemporalLobe 2 > 48 globuspallidus > 49 globuspallidus 2 > 50 CerebellumPeduncles > 51 CerebellumPeduncles 2 > 52 cerebellum > 53 cerebellum 2 > 54 caudatenucleus > 55 caudatenucleus 2 > 56 WholeBrain > 57 WholeBrain 2 > 58 ParietalLobe > 59 ParietalLobe 2 > 60 MedullaOblongata > 61 MedullaOblongata 2 > 62 Amygdala > 63 Amygdala 2 > 64 PrefrontalCortex > 65 PrefrontalCortex 2 > 66 OccipitalLobe > 67 OccipitalLobe 2 > 68 Hypothalamus > 69 Hypothalamus 2 > 70 Thalamus > 71 Thalamus 2 > 72 
subthalamicnucleus > 73 subthalamicnucleus 2 > 74 CingulateCortex > 75 CingulateCortex 2 > 76 Pons > 77 Pons 2 > 78 spinalcord > 79 spinalcord 2 > 80 fetalbrain > 81 fetalbrain 2 > 82 adrenalgland > 83 adrenalgland 2 > 84 Lung > 85 Lung 2 > 86 Heart > 87 Heart 2 > 88 Liver > 89 Liver 2 > 90 kidney > 91 kidney 2 > 92 Prostate > 93 Prostate 2 > 94 Uterus > 95 Uterus 2 > 96 Thyroid > 97 Thyroid 2 > 98 fetalThyroid > 99 fetalThyroid 2 > 100 fetallung > 101 fetallung 2 > 102 PLACENTA > 103 PLACENTA 2 > 104 CardiacMyocytes > 105 CardiacMyocytes 2 > 106 SmoothMuscle > 107 SmoothMuscle 2 > 108 bronchialepithelialcells > 109 bronchialepithelialcells 2 > 110 ADIPOCYTE > 111 ADIPOCYTE 2 > 112 Pancreas > 113 Pancreas 2 > 114 PancreaticIslets > 115 PancreaticIslets 2 > 116 testis > 117 testis 2 > 118 TestisLeydigCell > 119 TestisLeydigCell 2 > 120 TestisGermCell > 121 TestisGermCell 2 > 122 TestisInterstitial > 123 TestisInterstitial 2 > 124 TestisSeminiferousTubule > 125 TestisSeminiferousTubule 2 > 126 salivarygland > 127 salivarygland 2 > 128 trachea > 129 trachea 2 > 130 AdrenalCortex > 131 AdrenalCortex 2 > 132 Ovary > 133 Ovary 2 > 134 Appendix > 135 Appendix 2 > 136 skin > 137 skin 2 > 138 ciliaryganglion > 139 ciliaryganglion 2 > 140 TrigeminalGanglion > 141 TrigeminalGanglion 2 > 142 atrioventricularnode > 143 atrioventricularnode 2 > 144 DRG > 145 DRG 2 > 146 SuperiorCervicalGanglion > 147 SuperiorCervicalGanglion 2 > 148 SkeletalMuscle > 149 SkeletalMuscle 2 > 150 UterusCorpus > 151 UterusCorpus 2 > 152 TONGUE > 153 TONGUE 2 > 154 OlfactoryBulb > 155 OlfactoryBulb 2 > 156 Pituitary > 157 Pituitary 2 > > sonja althammer wrote: >> hello ann! >> thanks a lot! i found the tables i was looking for! >> but what i still do not understand about the tables is, why #id (in >> hgFixed.gnfHumanAtlas2AllExps) is in the range from 0 to 157 while there >> are just 79 expIds for human in the gnf atlas2 for expression and >> regulation. >> for mouse it is analog... 
>> as the tissues appear twice, do i conclude right like this: >> to get the according tissue to expId 79 i select #id 156 from >> hgFixed.gnfHumanAtlas2AllExps (expId*2 -2) >> >> in the end i want to work with the intersection of the tissues in human >> and mouse... >> best regards! >> sonja >> >> */Ann Zweig /* wrote: >> >> Hello Sonja, >> >> You can find the tissue types for each of the GNF Atlas 2 tracks in the >> corresponding tables in the hgFixed database: >> >> gnfHumanAtlas2AllExps >> gnfMouseAtlas2AllExps >> >> To explore the contents of database tables, use the Table Browser tool >> on the website ('Tables' from the top blue navigation bar). >> >> group: All Tables >> database: hgFixed >> table: hgFixed.gnfHumanAtlas2AllExps >> >> output format: all fields from selected table >> >> The id and name fields contain the information you are looking for. >> For example, from the hgFixed.gnfHumanAtlas2AllExps table, the first 5 >> rows are: >> >> #id name >> 0 ColorectalAdenocarcinoma >> 1 ColorectalAdenocarcinoma 2 >> 2 WHOLEBLOOD >> 3 WHOLEBLOOD 2 >> 4 BM-CD33+Myeloid >> >> >> You might also be interested in reading the paper associated with this >> track: http://www.pnas.org/cgi/content/abstract/101/16/6062 >> >> I hope this information is helpful to you. Please don't hesitate to >> contact the mail list again if you require further assistance. >> >> Regards, >> >> ---------- >> Ann Zweig >> UCSC Genome Bioinformatics Group >> http://genome.ucsc.edu >> >> Please feel free to search the Genome mailing list archives by visiting >> our home page, clicking on "Contact Us", then typing a word or phrase >> into the search box. On that same page >> (http://genome.ucsc.edu/contacts.html), you can subscribe to the Genome >> mailing list. >> >> >> >> >> >> >> sonja althammer wrote: >> > browsing the expression data from the gnf atlas2 my question arose: >> > are the expression-Ids equal for human and mouse, i.e. do they >> refer to the same type of tissue? 
>> > best regards!
>> > _______________________________________________
>> > Genome maillist - Genome at soe.ucsc.edu
>> > http://www.soe.ucsc.edu/mailman/listinfo/genome
> _______________________________________________
> Genome maillist - Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome

From ann at soe.ucsc.edu Thu Apr 3 09:43:24 2008
From: ann at soe.ucsc.edu (Ann Zweig)
Date: Thu, 03 Apr 2008 09:43:24 -0700
Subject: [Genome] snps128
In-Reply-To: <784634.97489.qm@web46008.mail.sp1.yahoo.com>
References: <784634.97489.qm@web46008.mail.sp1.yahoo.com>
Message-ID: <47F5092C.2070105@soe.ucsc.edu>

Hello Yu,

We create the SNP128 track from data available through dbSNP at NCBI. NCBI doesn't give avHet and avHetSE for most SNPs -- this page has an explanation:

http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpsnpfaq.section.Content.Heterozygosity_Data

Click on the second question to expand the answer:
-----------------------------------------------------------------------
Q: We have sequenced a cDNA containing a base change that corresponds to a SNP in the SNP database, yet dbSNP shows no heterozygosity data for this SNP. Why is this?

We compute heterozygosity at dbSNP based on submitted allele frequency for the SNP. In the case of an example SNP, say rs4779794, the frequency data were not directly submitted, so we were unable to compute the heterozygosity value.
-----------------------------------------------------------------------

Regards,

----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

Please feel free to search the Genome mailing list archives by visiting our home page, clicking on "Contact Us", then typing a word or phrase into the search box. On that same page (http://genome.ucsc.edu/contacts.html), you can subscribe to the Genome mailing list.

Yuan Jian wrote:
> Hi UCSC,
>
> I downloaded SNPs128 from UCSC.
> Most values in the avHet and avHetSE columns are zero,
> but I want to get a table with genotype frequency for all alleles.
> Which table should I download?
>
> Yu
> _______________________________________________
> Genome maillist - Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome

From pauline at soe.ucsc.edu Thu Apr 3 10:04:46 2008
From: pauline at soe.ucsc.edu (Pauline Fujita)
Date: Thu, 03 Apr 2008 10:04:46 -0700
Subject: [Genome] appearance of new NCBI Build 36.2 or 36.3
Message-ID: <47F50E2E.4030003@soe.ucsc.edu>

Hi Everyone,

Apologies for the blank message - I'm not sure what happened. The message in my "sent" folder isn't blank. Also, I sent it from my soe account using my laptop at home, but everyone's replies went to my gmail. Very confusing. Here is the original message I tried to forward, a reply to one of the mailing list questions from yesterday (I forgot to cc the mailing list in my original reply).
Pauline

-------- Original Message --------
Subject: Re: [Genome] appearance of new NCBI Build 36.2 or 36.3
Date: Wed, 02 Apr 2008 13:53:40 -0700
From: UCSC Genome Browser Help Desk
Reply-To: genome at soe.ucsc.edu
Organization: UCSC Genome Browser
To: Sabine Endele
References:

Hello Sabine,

We will not be releasing intermediate updates but will wait for the next full assembly release, presumably Build 37 (and this will be released as "hg19").

Hopefully this information was helpful and answers your question. If you have further questions or require clarification, feel free to contact the mailing list at genome at soe.ucsc.edu.

Regards,

Pauline Fujita
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

Sabine Endele wrote:
> Dear ladies and gentlemen,
>
> we are interested in getting some information about the NCBI Build 36.1 (released March 2006), which is used currently in your genome browser.
> Could you please tell us when a new release (e.g. Build 36.2 or 36.3) will appear on your genome gateway?
>
> Thank you very much in advance for your assistance.
>
> Yours sincerely
>
> Sabine Endele
>
> Dr. rer. nat. Sabine Endele
> Direktionsassistentin
> Humangenetisches Institut
> Universitätsklinikum Erlangen
> Schwabachanlage 10
> 91054 Erlangen
> Tel.: ++49 (0)9131-8522020
> Fax: ++49(0)9131-8523232
> Mail: sendele at humgenet.uni-erlangen.de
>
> _______________________________________________
> Genome maillist - Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome
>

From jayunit100 at gmail.com Thu Apr 3 10:39:37 2008
From: jayunit100 at gmail.com (Jay Vyas)
Date: Thu, 3 Apr 2008 13:39:37 -0400
Subject: [Genome] Using MySQL Api To get a gene's sequence between a start and stop sight...
Message-ID: <79ceddbc0804031039p39771f42p68b04f1a30939b5a@mail.gmail.com>

Hi guys: I'm trying to figure out how to run a query through the mysql interface that will return the entire intergenic region between 2 genes. Can somebody post a sample query to do this?
Currently the way I do this -

Go to UCSC, enter the gene name (let's call it X).
Zoom in on the end of the gene preceding X (let's call this one Y).
Click on the end of Y, write down the end nucleotide number (call this Yend).
Click on the beginning of X, write down this number (call it Xbeg).
Enter the value Yend-Xbeg in the position text field.
Click on the image and uncheck all boxes other than 5' UTR.
Download the sequence in fasta format.

--
Thanks !!!
Jay Vyas
BioSystems Modelling Group/UCHC
http://folding.uchc.edu

From rhead at soe.ucsc.edu Thu Apr 3 11:25:02 2008
From: rhead at soe.ucsc.edu (Brooke Rhead)
Date: Thu, 03 Apr 2008 11:25:02 -0700
Subject: [Genome] snps128
In-Reply-To: <47F5092C.2070105@soe.ucsc.edu>
References: <784634.97489.qm@web46008.mail.sp1.yahoo.com> <47F5092C.2070105@soe.ucsc.edu>
Message-ID: <47F520FE.7060504@soe.ucsc.edu>

Hi Yu,

You may also be interested in the information in the HapMap SNPs track, which contains about four million SNPs that were genotyped in four populations (click on the track name to read more details; also see: http://www.hapmap.org/). This track contains allele frequency data for each of the four populations. The tables that contain information on each of the populations are:

hapmapSnpsCEU
hapmapSnpsCHB
hapmapSnpsJPT
hapmapSnpsYRI

There is also a summary table:

hapmapAllelesSummary

The 'score' field of hapmapAllelesSummary is a calculation of heterozygosity over all four populations. Note that this is not exactly the same as the average heterozygosity calculated by dbSNP (described here: http://www.ncbi.nlm.nih.gov/SNP/Hetfreq.html). The heterozygosity calculated for the HapMap SNPs track is computed by UCSC as 2*p*q (*1000 for bed scoring), where p is (allele1 / total) and q is (allele2 / total). A score of 500 in the hapmapAllelesSummary.score field corresponds to a heterozygosity of 50%.
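
As a rough illustration of the 2*p*q scoring described above, the arithmetic might be sketched like this. The function name and the allele counts are made up for the example and are not taken from UCSC code; the sketch also assumes exactly two alleles per SNP and that the score is rounded to the nearest integer, which the description above does not specify.

```python
def hapmap_het_score(allele1_count, allele2_count):
    """Return (heterozygosity, BED-style score) for a two-allele SNP.

    Hypothetical helper: heterozygosity is 2*p*q, and the score is that
    value scaled by 1000 (rounding is an assumption of this sketch).
    """
    total = allele1_count + allele2_count
    if total == 0:
        raise ValueError("no allele observations")
    p = allele1_count / total  # frequency of allele 1
    q = allele2_count / total  # frequency of allele 2
    het = 2 * p * q
    return het, round(het * 1000)


# Equal allele counts give the maximum heterozygosity of 50%.
het, score = hapmap_het_score(60, 60)
print(het, score)  # 0.5 500
```

With equal counts, p = q = 0.5, so 2*p*q = 0.5 and the scaled score is 500, which matches the 50% example in the message.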

--
Brooke Rhead
UCSC Genome Bioinformatics Group

Ann Zweig wrote:
> Hello Yu,
>
> We create the SNP128 track from data available through dbSNP at NCBI. NCBI
> doesn't give avHet and avHetSE for most SNPs -- this page has an
> explanation:
>
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpsnpfaq.section.Content.Heterozygosity_Data
>
> click on the second question to expand the answer:
> -----------------------------------------------------------------------
> Q: We have sequenced a cDNA containing a base change that corresponds
> to a SNP in the SNP database, yet dbSNP shows no heterozygosity data
> for this SNP. Why is this?
>
> We compute heterozygosity at dbSNP based on submitted allele frequency
> for the SNP. In the case of an example SNP, say rs4779794, the
> frequency data were not directly submitted, so we were unable to
> compute the heterozygosity value.
> -----------------------------------------------------------------------
>
> Regards,
>
> ----------
> Ann Zweig
> UCSC Genome Bioinformatics Group
> http://genome.ucsc.edu
>
> Please feel free to search the Genome mailing list archives by visiting our home
> page, clicking on "Contact Us", then typing a word or phrase into the search
> box. On that same page
> (http://genome.ucsc.edu/contacts.html), you can subscribe to the Genome mailing
> list.
>
> Yuan Jian wrote:
>> Hi UCSC,
>>
>> I downloaded SNPs128 from UCSC.
>> the most columns of avHet avHetSE are zero.
>> but I want to get a table with genotype frequency for all alleles.
>> which table should I download?
>>
>> Yu
>> _______________________________________________
>> Genome maillist - Genome at soe.ucsc.edu
>> http://www.soe.ucsc.edu/mailman/listinfo/genome
> _______________________________________________
> Genome maillist - Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome

From sde at mrc-lmb.cam.ac.uk Thu Apr 3 12:02:09 2008
From: sde at mrc-lmb.cam.ac.uk (Subhajyoti De)
Date: Thu, 03 Apr 2008 20:02:09 +0100
Subject: [Genome] Viewing Orthologs in Genome browser
In-Reply-To: <47F50990.9090304@soe.ucsc.edu>
References: <912882.51297.qm@web37907.mail.mud.yahoo.com> <47F50990.9090304@soe.ucsc.edu>
Message-ID: <47F529B1.3040906@mrc-lmb.cam.ac.uk>

Hi,

I am interested in viewing the syntenic relationship between my favourite gene and its ortholog in a second species (say mouse). The human gene would be shown along with its neighbouring genomic window (say 10kb) and genes therein. The corresponding mouse gene would also be shown with its neighbouring genomic window and genes therein.

Is it possible to somehow show that in the genome browser?

cheers, Subho

From ann at soe.ucsc.edu Thu Apr 3 13:09:34 2008
From: ann at soe.ucsc.edu (Ann Zweig)
Date: Thu, 03 Apr 2008 13:09:34 -0700
Subject: [Genome] Viewing Orthologs in Genome browser
In-Reply-To: <47F529B1.3040906@mrc-lmb.cam.ac.uk>
References: <912882.51297.qm@web37907.mail.mud.yahoo.com> <47F50990.9090304@soe.ucsc.edu> <47F529B1.3040906@mrc-lmb.cam.ac.uk>
Message-ID: <47F5397E.6090001@soe.ucsc.edu>

Hello Subho,

It is possible to view orthologous genes in human and mouse, but not in the same display at the same time. However, you can easily navigate from one to the other using either of these methods.

Method One:
Find your gene in the human genome browser. Click on that gene to go to the details page. On this page, click on "Other Species". In this table, click on the "Genome Browser" link in the Mouse column. This will open the mouse genome browser to the orthologous gene. These orthologies between human and mouse are computed by taking the best BLASTP hit, and filtering out non-syntenic hits.

Method Two:
Find your gene in the human genome browser. Turn on the Mouse Net track. The Mouse Net track shows the best mouse/human chain for every part of the human genome. It is useful for finding orthologous regions and for studying genome rearrangement. In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. Click on the top-level Mouse Net to go to the details page. On this page, click on the "Open Mouse browser" link for the position corresponding to the part of the chain that is in this window. This will open the mouse genome browser to the location of the orthology (whether there is a gene there or not).

I hope this information is helpful to you. Please don't hesitate to contact the mail list again if you require further assistance.

Regards,

----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

Please feel free to search the Genome mailing list archives by visiting our home page, clicking on "Contact Us", then typing a word or phrase into the search box. On that same page (http://genome.ucsc.edu/contacts.html), you can subscribe to the Genome mailing list.

Subhajyoti De wrote:
> Hi,
>
> I am interested in viewing syntenic relationship between my
> favourite gene and its ortholog in a second species (say mouse). The
> human gene would be shown along with its neighbouring genomic window
> (say 10kb) and genes therein. Corresponding mouse gene would also be
> shown with its neighbouring genomic window and genes therein.
>
> Is it possible to somehow show that in the genome browser?
> cheers, Subho
> _______________________________________________
> Genome maillist - Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome

From fungazid at yahoo.com Thu Apr 3 13:34:32 2008
From: fungazid at yahoo.com (Fungazid)
Date: Thu, 3 Apr 2008 13:34:32 -0700 (PDT)
Subject: [Genome] UCSC homologene
In-Reply-To: <47F41255.9040501@soe.ucsc.edu>
Message-ID: <368542.80271.qm@web52011.mail.re2.yahoo.com>

Ann, thank you for your nice help, I must say. I have another 2 related questions (I hope they are not too detailed):

1) What do you mean by: "For more distant species reciprocal-best BLASTP hits are used."? For example, how can I find the target gene of dog or opossum for a query mouse gene?

2) About the mmBlastTab table in the hg18 database. As far as I understand:
a. 'query' field = the name of the human gene,
b. 'target' = the name of the homologous gene in mouse,
c. the hg18 and mm8 kgXref tables give aliases to the query and target gene names respectively.
I hope this is right.

Thank you again,
Avi

--- Ann Zweig wrote:
> Hello Avi,
>
> We have several data sets available that show homologous genes.
>
> The BlastTab tables show pairwise orthologs between the following
> organisms:
>
> organism         tableName
> --------         ---------
> human            hgBlastTab
> mouse            mmBlastTab
> rat              rnBlastTab
> zebrafish        drBlastTab
> D. melanogaster  dmBlastTab
> C. elegans       ceBlastTab
> S. cerevisiae    scBlastTab
>
> Orthologies between human, mouse, and rat are computed by taking the
> best BLASTP hit, and filtering out non-syntenic hits. For more distant
> species reciprocal-best BLASTP hits are used.
>
> The Conservation tracks show multiple alignments of a number of species
> along with measures of evolutionary conservation. For example, the
> Conservation track on the latest human assembly (hg18), shows alignment
> with 27 other vertebrates.
>
> You can read about the details behind any track (description, methods,
> display, credits, references) by pressing on the 'mini-button' to the
> left of the actual track display, or by clicking on the hyperlinked
> track name in the track controls (below the display).
>
> You can use the Table Browser tool on our website ('Tables' from the
> top blue navigation bar) to view the contents of tables. Alternatively,
> you can download tables from our download server here:
> http://hgdownload.cse.ucsc.edu/downloads.html
>
> This should be enough to get you started. Please don't hesitate to
> contact the mail list again if you require further assistance.
>
> Regards,
>
> ----------
> Ann Zweig
> UCSC Genome Bioinformatics Group
> http://genome.ucsc.edu
>
> Please feel free to search the Genome mailing list archives by visiting
> our home page, clicking on "Contact Us", then typing a word or phrase
> into the search box. On that same page
> (http://genome.ucsc.edu/contacts.html), you can subscribe to the Genome
> mailing list.
>
> Fungazid wrote:
> > I would like to ask 2 questions:
> >
> > 1) Are there UCSC tables that group homologous genes?
> > More specifically, tables that group together genes
> > whose conservation is measured at the protein
> > level, or mRNA level.
> > For example, the table should look like:
> >
> > groupNumber  gene          species
> > 1            NM_001011874  mm8
> > 1            NM_001011971  rn4
> > 1            NM_052898     hg18
> > ...
> >
> > + some conservation measures
> >
> > 2) There are similar tables in the homologene database:
> >
> > http://www.ncbi.nlm.nih.gov/sites/entrez?db=homologene&cmd=search&term=
> >
> > Is there a way to link these tables to UCSC refGene
> > tables, or similar UCSC gene tables?
> >
> > Many thanks, Avi
> > _______________________________________________
> > Genome maillist - Genome at soe.ucsc.edu
> > http://www.soe.ucsc.edu/mailman/listinfo/genome

From ann at soe.ucsc.edu Thu Apr 3 13:45:27 2008
From: ann at soe.ucsc.edu (Ann Zweig)
Date: Thu, 03 Apr 2008 13:45:27 -0700
Subject: [Genome] Using MySQL Api To get a gene's sequence between a start and stop sight...
In-Reply-To: <79ceddbc0804031039p39771f42p68b04f1a30939b5a@mail.gmail.com>
References: <79ceddbc0804031039p39771f42p68b04f1a30939b5a@mail.gmail.com>
Message-ID: <47F541E7.4060709@soe.ucsc.edu>

Hi Jay,

If you are just looking for intergenic sequence between two genes, then the method you have described works fine. Once you have the intergenic area displayed in the genome browser, you can also click on the "DNA" link in the top blue navigation bar. This will give you the DNA sequence for the exact position you are viewing in the browser.

For ease of navigation, try enabling the "Next/previous item navigation" on the configuration page (the 'configure' button from the genome browser). Read more about this and other configuration options here:
http://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html#FineTuning

It is not possible to extract the DNA using the MySQL public server, as the underlying sequence is not kept in a database table.

Regards,

----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

Please feel free to search the Genome mailing list archives by visiting our home page, clicking on "Contact Us", then typing a word or phrase into the search box.
On that same page (http://genome.ucsc.edu/contacts.html), you can
subscribe to the Genome mailing list.

Jay Vyas wrote:
> Hi guys: I'm trying to figure out how to run a query through the MySQL
> interface that will return the entire intragenic region of 2 genes. Can
> somebody post a sample query to do this?
>
> Currently the way I do this -
>
> Go to UCSC, enter the gene name (let's call it X).
> Zoom in on the end of the gene preceding X (let's call this one Y)
> Click on the end of Y, write down the end nucleotide number (call
> this Yend)
> Click on the beginning of X, write down this number (call it Xbeg)
> enter in the value Yend-Xbeg in the position text field.
> click on the image and uncheck all boxes other than 5' UTR
> Download the sequence in fasta format.
>
> --
> Thanks !!!
> Jay Vyas
> BioSystems Modelling Group/UCHC
> http://folding.uchc.edu

From ann at soe.ucsc.edu Thu Apr 3 14:08:41 2008
From: ann at soe.ucsc.edu (Ann Zweig)
Date: Thu, 03 Apr 2008 14:08:41 -0700
Subject: [Genome] request for hg18 to hg16
In-Reply-To:
References:
Message-ID: <47F54759.609@soe.ucsc.edu>

Hello Ashutosh,

Please note that our mail list software strips all attachments sent to
the list. That said, I was able to deduce what is happening with your
input files.

The scores that you have entered in your input file are in the 6th
field. They must be moved to the 5th field. The 6th field is reserved
for strand (+/-). Please read about the BED format here:
http://genome.ucsc.edu/goldenPath/help/customTrack.html#BED

You should be able to run all of your conversions at one time. Just
place them all in your new.tsv file and run it as Brooke explained
previously.
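
For illustration only, the column fix described above could be sketched
in Python roughly as follows. This is a minimal sketch under assumptions
that are not confirmed by the thread: the input is tab-separated, the
score currently sits in the 6th field, and whatever is in the 5th field
is a placeholder that can be overwritten. The function names are
hypothetical and are not part of liftOver or any UCSC tool.

```python
def move_score_to_fifth_field(line):
    """Return one BED line with the 6th-field value moved into the 5th field.

    Assumption: fields are tab-separated and the 5th field holds a
    placeholder that may be discarded. Lines with fewer than six fields
    are returned unchanged (minus the trailing newline).
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 6:
        return "\t".join(fields)
    fields[4] = fields[5]          # the score belongs in the 5th field
    return "\t".join(fields[:5])   # drop the 6th field (reserved for strand)


def fix_bed_text(text):
    """Apply the fix to every data line; keep header and blank lines as-is."""
    fixed = []
    for line in text.splitlines():
        if not line.strip() or line.startswith(("track", "browser", "#")):
            fixed.append(line)
        else:
            fixed.append(move_score_to_fifth_field(line))
    return "\n".join(fixed)


if __name__ == "__main__":
    # Hypothetical example line, not taken from the original input file.
    print(fix_bed_text("chr1\t100\t200\tprobe1\t0\t0.75"))
```

The sketch drops the 6th field entirely instead of writing a strand into
it, because the thread does not say which strand the features are on. The
BED documentation linked above describes what the score field is expected
to contain.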

Regards,

----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

Please feel free to search the Genome mailing list archives by visiting
our home page, clicking on "Contact Us", then typing a word or phrase
into the search box. On that same page
(http://genome.ucsc.edu/contacts.html), you can subscribe to the Genome
mailing list.

> Hey Brooke,
>
> Thanks a lot for the help.
>
> It did work. However, I had to use the same strategy as earlier (that
> is, to give the complete path).
>
> I could not find the $PATH directories to copy my files over there.
>
> But the problem is that the data values were changed.
>
> This is before conversion:
>
> This is after conversion:
>
> Note the special characters & changed score values in the last column.
>
> Exactly the same thing happened when I was using the web version of the
> liftOver program.
>
> How to fix this? I could just take the coordinates from the shifted
> file, & scores from the original file; would it be correct to do so?
>
> Also, is there a way to run a lot of these liftOver commands at one go?
> I need to run about 150 conversions; doing them one by one will take
> forever. If I can write all the commands in a single text file, how do I
> execute that file on a Mac?
>
> Thank you so much for your help. I really appreciate it.
>
> Regards,
>
> Ashutosh.
>
> PS: In case you can't see the embedded pictures in this email, I have
> attached a copy of this email in the pdf format too.
>
> -----Original Message-----
> From: Brooke Rhead [mailto:rhead at soe.ucsc.edu]
> Sent: Wednesday, April 02, 2008 3:15 AM
> To: Gupta, Ashutosh (NIH/NCI) [F]
> Cc: genome at soe.ucsc.edu
> Subject: Re: [Genome] request for hg18 to hg16
>
> Hi Ashutosh,
>
> I see these lines in your attached file:
>
> nci-admins-computer-2:~ levensd$ /Volumes/nci10b.nci.nih.gov/Group/LP/Ashutosh/HG\ liftover/liftOver new.tsv hg17ToHg16.over.chain ne2 unMapped
> Can't find file: new.tsv
> nci-admins-computer-2:~ levensd$ /Volumes/nci10b.nci.nih.gov/Group/LP/Ashutosh/HG\ liftover/liftOver new.tsv hg17ToHg16.over.chain ne2 unMapped
> Can't find file: new.tsv
> nci-admins-computer-2:~ levensd$ /Volumes/nci10b.nci.nih.gov/Group/LP/Ashutosh/HG\ liftover/liftOver new hg17ToHg16.over.chain ne2 unMapped
> Can't find file: new
> nci-admins-computer-2:~ levensd$ /Volumes/nci10b.nci.nih.gov/Group/LP/Ashutosh/HG\ liftover/liftOver new.tsv hg17ToHg16.over.chain ne2 unMapped
> Can't find file: new.tsv
> nci-admins-computer-2:~ levensd$ /Volumes/nci10b.nci.nih.gov/Group/LP/Ashutosh/HG\ liftover/liftOver new.txt hg17ToHg16.over.chain ne2.txt unMapped
> Can't find file: new.txt
> nci-admins-computer-2:~ levensd$ \306\222f
>
> For comparison, the format for running the liftOver command is:
>
> liftOver oldFile map.chain newFile unMapped
>
> The first two files, "oldFile" and "map.chain" need to either be present
> in your current working directory, or else you need to specify the paths
> to the files. The second two files, "newFile" and "unMapped" do not
> need to exist already -- the liftOver program will create files with the
> names you specify.
>
> Using your command:
>
> liftOver new.tsv hg17ToHg16.over.chain ne2 unMapped
>
> liftOver is expecting a BED file of hg17 coordinates to be present in
> the current directory, in a file called "new.tsv". The
> hg17ToHg16.over.chain file should also be in the current directory.
> LiftOver will create a file containing the corresponding hg16
> coordinates in a file called "ne2" in the current directory, and it will
> create a file called "unMapped" in the current directory and record any
> hg17 coordinates that did not map to hg16 in that file.
>
> Regarding your "PS" question: I see that you presently need to specify
> the entire path to the liftOver executable to get it to work. This is
> because the path to liftOver is not in your $PATH variable. If you
> either (1) move the liftOver executable to a directory that is already
> in $PATH, or if you (2) add the path where your executable resides
> (/Volumes/... in your case) to the $PATH variable, you should be able to
> execute liftOver without specifying the path to it every time. Try the
> command:
>
> echo $PATH
>
> to see the directories that are currently in your $PATH variable.
>
> I hope this explanation is helpful.
>
> --
> Brooke Rhead
> UCSC Genome Bioinformatics Group
>
> Gupta, Ashutosh (NIH/NCI) [F] wrote:
>> Thanks a lot for the quick reply.
>> Please have a look at the attached snapshot of my liftOver session.
>> I am not sure where I am going wrong. I have tried several different
>> formats, but the program never recognized the files. The files were
>> definitely there as I could open them using other applications.
>>
>> I had also ensured that the data is in the recommended BED format.
>>
>> Thanks again for your help.
>> Regards,
>> Ashutosh.
>>
>> PS: Also, I notice that you are just typing liftOver from the command
>> prompt, which never worked for me. I always got the error "command not
>> found". So I had to use the strategy as in the attached file. Is there
>> some problem with the installation of the file? I am a windows user &
>> relatively new to mac/unix system.
>>
>> -----Original Message-----
>> From: Brooke Rhead [mailto:rhead at soe.ucsc.edu]
>> Sent: Tuesday, April 01, 2008 8:30 PM
>> To: Gupta, Ashutosh (NIH/NCI) [F]
>> Cc: genome at soe.ucsc.edu
>> Subject: Re: [Genome] request for hg18 to hg16
>>
>> Hi Ashutosh,
>>
>> What kind of problem are you experiencing?
>>
>> If you just need instructions on how to use the command-line tool, you
>> can run the liftOver command with no arguments to see instructions. It
>> should look something like this:
>>
>> -----
>> $ liftOver
>>
>> liftOver - Move annotations from one assembly to another
>> usage:
>>    liftOver oldFile map.chain newFile unMapped
>> oldFile and newFile are in bed format by default, but can be in GFF and
>> maybe eventually others with the appropriate flags below.
>> The map.chain file has the old genome as the target and the new genome
>> as the query.
>>
>> ***********************************************************************
>> WARNING: liftOver was only designed to work between different
>>          assemblies of the same organism, it may not do what you want
>>          if you are lifting between different organisms.
>> ***********************************************************************
>>
>> options:
>>    -minMatch=0.N Minimum ratio of bases that must remap. Default 0.95
>>    -gff  File is in gff/gtf format. Note that the gff lines are converted
>>          separately. It would be good to have a separate check after this
>>          that the lines that make up a gene model still make a plausible
>>          gene after liftOver
>>    -genePred - File is in genePred format
>>    -sample - File is in sample format
>>    -bedPlus=N - File is bed N+ format
>>    -positions - File is in browser "position" format
>>    -hasBin - File has bin value (used only with -bedPlus)
>>    -tab - Separate by tabs rather than space (used only with -bedPlus)
>>    -pslT - File is in psl format, map target side only
>>    -minBlocks=0.N Minimum ratio of alignment blocks/exons that must map
>>                   (default 1.00)
>>    -fudgeThick    If thickStart/thickEnd is not mapped, use the closest
>>                   mapped base. Recommended if using -minBlocks.
>>    -multiple      Allow multiple output regions
>>    -minChainT, -minChainQ Minimum chain size in target/query, when mapping
>>                   to multiple output regions (default 0, 0)
>>    -minSizeT      deprecated synonym for -minChainT (ENCODE compat.)
>>    -minSizeQ      Min matching region size in query with -multiple.
>>    -chainTable    Used with -multiple, format is db.tablename,
>>                   to extend chains from net (preserves dups)
>>    -errorHelp     Explain error messages
>>
>> -----
>>
>> If you are only converting 50 positions from hg18 to hg16, it might be
>> easier to use the web-based tool, as Kayla suggested. (Or did I
>> misunderstand your original question, and you need to convert many more
>> than 50 positions?)
>>
>> --
>> Brooke Rhead
>> UCSC Genome Bioinformatics Group
>>
>> Gupta, Ashutosh (NIH/NCI) [F] wrote:
>>
>> > Hi,
>> >
>> > I am having problem with conversions across different builds.
>> >
>> > I have the liftOver tool for Mac OS X & all the relevant chain files.
>> >
>> > Any help on this would be appreciated.
>> >
>> > Thanks,
>> >
>> > Ashutosh.
>> >
>> > -----Original Message-----
>> > From: Kayla Smith [mailto:kayla at soe.ucsc.edu]
>> > Sent: Monday, March 24, 2008 5:15 PM
>> > To: Gupta, Ashutosh (NIH/NCI) [F]
>> > Cc: genome at soe.ucsc.edu
>> > Subject: Re: [Genome] request for hg18 to hg16
>> >
>> > Hello Ashutosh,
>> >
>> > You can use our online liftOver tool to convert from hg18 to hg17, and
>> > then from hg17 to hg16. Here is the link:
>> > http://genome.ucsc.edu/cgi-bin/hgLiftOver
>> >
>> > See this FAQ on downloading our source:
>> > http://genome.ucsc.edu/FAQ/FAQdownloads#download27
>> >
>> > I hope this information is helpful to you. Please don't hesitate to
>> > contact us again if you require further assistance.
>> >
>> > Kayla Smith
>> > UCSC Genome Bioinformatics Group
>> >
>> > Gupta, Ashutosh (NIH/NCI) [F] wrote:
>> >
>> >> Hi,
>> >>
>> >> Would it be possible to get a liftover file for conversion from hg18
>> >> to hg16?
>> >>
>> >> Also, is there any windows based conversion mechanism? I need to
>> >> convert about 50 nimblegen encode array hybridization, a windows based
>> >> tool would be very helpful.
>> >>
>> >> Even the conversion source code in C (or in Mathematica or Matlab)
>> >> would be very helpful.
>> >>
>> >> Thanks,
>> >>
>> >> Ashutosh.
>> >>
>> >> PS: I can also help develop a tool for windows system, depends on
>> >> complexity & time though. I am sure a lot of people would find it
>> >> useful.

From ann at soe.ucsc.edu Thu Apr 3 14:25:48 2008
From: ann at soe.ucsc.edu (Ann Zweig)
Date: Thu, 03 Apr 2008 14:25:48 -0700
Subject: [Genome] UCSC homologene
In-Reply-To: <368542.80271.qm@web52011.mail.re2.yahoo.com>
References: <368542.80271.qm@web52011.mail.re2.yahoo.com>
Message-ID: <47F54B5C.9020703@soe.ucsc.edu>

Hello again Avi,

We have only created BlastTab tables with pairwise orthologs between the
seven organisms that I listed in my previous email. Your best bet for
viewing the relationship between the mouse and the dog (or opossum) is
to use the Chain/Net tracks. In the mouse genome browser, turn on the
Chain and Net tracks to the Dog (or Opossum).

The Dog Chain track in the mouse browser shows alignments of dog
(canFam2, May 2005) to the mouse genome using a gap scoring system that
allows longer gaps than traditional affine gap scoring systems. It can
also tolerate gaps in both dog and mouse simultaneously. These
"double-sided" gaps can be caused by local inversions and overlapping
deletions in both species.

You can read about the details behind this or any track (description,
methods, display, credits, references) by pressing on the 'mini-button'
to the left of the actual track display, or by clicking on the
hyperlinked track name in the track controls (below the display).

As for question 2, you are correct: query is the species in this
database (in this case, hg18), and target is the species to which it is
aligned (in this case, mm9). The kgXref tables in each database give
several aliases for each gene ID.

Regards,

----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

Fungazid wrote:
> Ann,
> thank you for your kind help. I have two more related questions (I hope
> they are not too detailed):
>
> 1) What do you mean by "For more distant species reciprocal-best BLASTP
> hits are used"? For example, how can I find the target gene in dog or
> opossum for a query mouse gene?
>
> 2) About the mmBlastTab table in the hg18 database, as far as I
> understand:
> a. the 'query' field = the name of the human gene,
> b. the 'target' field = the name of the homologous gene in mouse,
> c. the hg18 and mm8 kgXref tables give aliases for the query and target
> gene names, respectively.
> I hope this is right.
>
> Thank you again, Avi
>
> --- Ann Zweig wrote:
>
>> Hello Avi,
>>
>> We have several data sets available that show homologous genes.
>>
>> The BlastTab tables show pairwise orthologs between the following
>> organisms:
>>
>> organism         tableName
>> --------         ---------
>> human            hgBlastTab
>> mouse            mmBlastTab
>> rat              rnBlastTab
>> zebrafish        drBlastTab
>> D. melanogaster  dmBlastTab
>> C. elegans       ceBlastTab
>> S. cerevisiae    scBlastTab
>>
>> Orthologies between human, mouse, and rat are computed by taking the
>> best BLASTP hit, and filtering out non-syntenic hits. For more distant
>> species reciprocal-best BLASTP hits are used.
>>
>> The Conservation tracks show multiple alignments of a number of species
>> along with measures of evolutionary conservation. For example, the
>> Conservation track on the latest human assembly (hg18), shows alignment
>> with 27 other vertebrates.
>>
>> You can read about the details behind any track (description, methods,
>> display, credits, references) by pressing on the 'mini-button' to the
>> left of the actual track display, or by clicking on the hyperlinked
>> track name in the track controls (below the display).
>>
>> You can use the Table Browser tool on our website ('Tables' from the
>> top blue navigation bar) to view the contents of tables. Alternatively,
>> you can download tables from our download server here:
>> http://hgdownload.cse.ucsc.edu/downloads.html
>>
>> This should be enough to get you started. Please don't hesitate to
>> contact the mail list again if you require further assistance.
>>
>> Regards,
>>
>> ----------
>> Ann Zweig
>> UCSC Genome Bioinformatics Group
>> http://genome.ucsc.edu
>>
>> Please feel free to search the Genome mailing list archives by visiting
>> our home page, clicking on "Contact Us", then typing a word or phrase
>> into the search box. On that same page
>> (http://genome.ucsc.edu/contacts.html), you can subscribe to the Genome
>> mailing list.
>>
>> Fungazid wrote:
>> > I would like to ask 2 questions:
>> >
>> > 1) Are there UCSC tables that group homologous genes? More
>> > specifically, tables that group together genes whose conservation is
>> > measured at the protein level or the mRNA level. For example, the
>> > table should look like:
>> >
>> > groupNumber  gene          species
>> > 1            NM_001011874  mm8
>> > 1            NM_001011971  rn4
>> > 1            NM_052898     hg18
>> > ...
>> >
>> > + some conservation measures
>> >
>> > 2) There are similar tables in the HomoloGene database:
>> > http://www.ncbi.nlm.nih.gov/sites/entrez?db=homologene&cmd=search&term=
>> >
>> > Is there a way to link these tables to UCSC refGene tables, or similar
>> > UCSC gene tables?
>> >
>> > Many thanks, Avi

From ann at soe.ucsc.edu Thu Apr 3 16:11:43 2008
From: ann at soe.ucsc.edu (Ann Zweig)
Date: Thu, 03 Apr 2008 16:11:43 -0700
Subject: [Genome] information about "phastcons" score
In-Reply-To: <47F34A77.4030906@esat.kuleuven.be>
References: 1183.129.175.112.92.1156324589.squirrel@serv1.igmors.u-psud.fr <47F34A77.4030906@esat.kuleuven.be>
Message-ID: <47F5642F.4020309@soe.ucsc.edu>

Hello Hong Sun,

Since there are so many parts to your question, I have embedded my
answers within your questions below. I am assuming that you are working
with the latest mouse assembly (mm9) and human assembly (hg18).

hong sun wrote:
> Hello,
> We are interested in the pairwise alignment between intergenic region of
> 50 mouse genes and the corresponding intergenic region of human.
> The 50 intergenic regions of mouse genes are as follows in /*Data1*/;
> what we are doing now is:
> 1 use the UCSC genome browser to browse the chr region of our data,
> selecting only human for the pairwise alignment with mouse on the
> Conservation Track Settings page.

I have two comments on this part of your question.

If you are not already doing it, I would suggest that you create a
Custom Track with your 50 intergenic mouse regions. They will be
displayed in the mouse genome browser and will be easier to navigate to.
Read about creating a custom track here:
http://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html#CustomTracks

Please note that although you are viewing only the human pairwise
alignment, the phastCons wiggle values do not change correspondingly.
This is a common misconception.

> 2 then click on the blue area conservation part on the genome browser
> page, then it gives the alignments like /*Result1*/ format (followings),
> *our first question: *is this alignment the alignment between mouse
> intergenic region and the corresponding intergenic region of human?

This is the alignment between your mouse coordinates (whether they are
intergenic or not) and the corresponding human coordinates (which may or
may not be intergenic).

> *our second question: *can we download the alignment once but not
> download each block?

Yes, instead of downloading from this page in a block-by-block fashion,
I would suggest using the Human Net track on the mouse browser. This is
a pairwise alignment between mouse and human.

Take your third region from your Data1 file, chr12:87772800-87773999.
In the Conservation details page, you will see these block-by-block
alignments (as you have noted):

B D Mouse gctgggatttctgtatgtgtgacac-aggggattagagaagg-gattagc-gggggtgg-a-ggactgat
B D Human gctcgcgtgtc--aatatgtaacacaaggggattaaagaagg-aattacagtttgggat-g-gagaggat

However, from the Human Net details page (click on the "View alignment
details of parts of net within browser window" link), you will see a
base-by-base alignment (human on top, mouse on bottom):

75922219 gctcgcgtgtcaatatgtaacaca-aggggattaaagaaggaattacagtttgggatgga 75922277
>>>>>>>> ||| |  | ||  |||||   ||| ||||||||| |||||| ||||  |   ||| |||| >>>>>>>>
87772800 gctgggatttctgtatgtgtgacacaggggattagagaagggattagcg---ggggtgga 87772856

...and so on.

> 3 beside the pairwise alignment between intergenic region of
In the Conservation details page, you will see these block-by-block alignments (as you have noted): B D Mouse gctgggatttctgtatgtgtgacac-aggggattagagaagg-gattagc-gggggtgg-a-ggactgat B D Human gctcgcgtgtc--aatatgtaacacaaggggattaaagaagg-aattacagtttgggat-g-gagaggat However, from the Human Net details page (click on the "View alignment details of parts of net within browser window" link), you will see a base-by-base alignment (human on top, mouse on bottom): 75922219 gctcgcgtgtcaatatgtaacaca-aggggattaaagaaggaattacagtttgggatgga 75922277 >>>>>>>> ||| | | || ||||| ||| ||||||||| |||||| |||| | ||| |||| >>>>>>>> 87772800 gctgggatttctgtatgtgtgacacaggggattagagaagggattagcg---ggggtgga 87772856 ...and so on. > 3 beside the pairwise alignment between intergenic region of