[Genome] request for hg18 to hg16

Hiram Clawson hiram at soe.ucsc.edu
Sun Apr 6 11:16:00 PDT 2008


Good Morning Ashutosh:

You can download a binary for the Mac from the location:
http://hgdownload.cse.ucsc.edu/admin/exe/
although this is a PPC binary only, not an Intel binary.

This is just the bare program, not a Mac dmg package.
Simply set the execute bits with 'chmod', then run the
binary from a command line.  It will print its usage
message when run with no arguments.
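
As a rough sketch of those steps, and of one way to run the many
conversions asked about in the quoted message below, something like the
following could work.  The function names and the ".hg16.bed" and
".unMapped" output suffixes are made up for illustration; they are not
part of liftOver.  Only the argument order (oldFile map.chain newFile
unMapped) comes from the liftOver usage message.

```shell
#!/bin/sh
# Sketch only. The function names and the ".hg16.bed" / ".unMapped" output
# suffixes are illustrative assumptions, not part of liftOver.

# Set the execute bits on the downloaded binary.
# usage: make_executable /path/to/liftOver
make_executable() {
    chmod +x "$1"
}

# Run one liftOver conversion per input file, one after another.
# usage: lift_all /path/to/liftOver map.chain file1 [file2 ...]
lift_all() {
    liftover=$1
    chain=$2
    shift 2
    for f in "$@"; do
        # Argument order follows the usage message:
        #   liftOver oldFile map.chain newFile unMapped
        "$liftover" "$f" "$chain" "$f.hg16.bed" "$f.unMapped" ||
            echo "liftOver failed for $f" >&2
    done
}
```

After loading these definitions into a shell, one would call
'make_executable ./liftOver' once, then
'lift_all ./liftOver hg17ToHg16.over.chain file1 file2 ...'.
Quoting "$f" keeps paths that contain spaces intact, and a failed
conversion is reported without stopping the remaining ones.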

--Hiram

Gupta, Ashutosh (NIH/NCI) [F] wrote:
> Thanks a lot Ann for the quick reply! It was very helpful.
> I had read the details for BED format, but didn't realize that optional
> arguments also had a format. 
> 
> The problem with adding all the data to the new.tsv file is that each file
> is about 50 MB, so combining all the files into one (~3 GB) would take
> forever, and then the conversion might take forever again.
> 
> The easiest way around it seems to be the following.
> I can write a text file with all the conversion commands, e.g.
> liftOver file1 chainFile newFile1 unMapped1
> liftOver file2 chainFile newFile2 unMapped2
> ...
> liftOver file150 chainFile newFile150 unMapped150
> 
> And then run that text file from terminal server on the Mac.
> 
> Question is: How do I run it from terminal server?
> 
> Thanks a lot for all the help.
> Best regards,
> Ashutosh.
> 
> 
> -----Original Message-----
> From: Ann Zweig [mailto:ann at soe.ucsc.edu] 
> Sent: Thursday, April 03, 2008 5:09 PM
> To: Gupta, Ashutosh (NIH/NCI) [F]
> Cc: 'genome at soe.ucsc.edu'
> Subject: Re: RE: [Genome] request for hg18 to hg16
> 
> Hello Ashutosh,
> 
> 	Please note that our mail list software strips all attachments sent to the
> list.  That said, I was able to deduce what is happening with your input
> files.
> 
> 	The scores that you have entered in your input file are in the 6th field.
> They must be moved to the 5th field.  The 6th field is reserved for strand
> (+/-).  Please read about the BED format here:
> http://genome.ucsc.edu/goldenPath/help/customTrack.html#BED
> 
> 	You should be able to run all of your conversions at one time.  Just place
> them all in your new.tsv file and run it as Brooke explained previously.
> 
> 
> Regards,
> 
> ----------
> Ann Zweig
> UCSC Genome Bioinformatics Group
> http://genome.ucsc.edu
> 
> Please feel free to search the Genome mailing list archives by visiting
> our home page, clicking on "Contact Us", then typing a word or phrase into
> the search box.  On that same page (http://genome.ucsc.edu/contacts.html),
> you can subscribe to the Genome mailing list.
> 
> 
> 
>> Hey Brooke,
>>
>> Thanks a lot for the help.
>>
>> It did work. However I had to use the same strategy as earlier (that is,
>> to give the complete path).
>>
>> I could not find the $PATH directories to copy my files over there.
>>
>>  
>>
>> But the problem is that the data values were changed.
>>
>> This is before conversion:
>>
>> This is after conversion:
>>
>>  
>>
>> Note the special characters & changed score values in the last column.
>>
>>  
>>
>> Exactly the same thing happened when I was using the web version of the
>> liftOver program.
>>
>> How to fix this? I could just take the coordinates from the shifted 
>> file, & scores from the original file, would it be correct to do so?
>>
>>  
>>
>> Also, is there a way to run a lot of these liftOver commands in one go? I
>> need to run about 150 conversions, and doing them one by one will take
>> forever. If I can write all the commands in a single text file, how do I
>> execute that file on the Mac?
>>
>>  
>>
>> Thank you so much for your help. I really appreciate it.
>>
>>  
>>
>> Regards,
>>
>> Ashutosh.
>>
>> PS: In case you can't see the embedded pictures in this email, I have 
>> attached a copy of this email in the pdf format too.
>>
>>  
>>
>> -----Original Message-----
>> From: Brooke Rhead [mailto:rhead at soe.ucsc.edu]
>> Sent: Wednesday, April 02, 2008 3:15 AM
>> To: Gupta, Ashutosh (NIH/NCI) [F]
>> Cc: genome at soe.ucsc.edu
>> Subject: Re: [Genome] request for hg18 to hg16
>>
>>  
>>
>> Hi Ashutosh,
>>
>>  
>>
>> I see these lines in your attached file:
>>
>>  
>>
>> nci-admins-computer-2:~ levensd$ /Volumes/nci10b.nci.nih.gov/Group/LP/Ashutosh/HG\ liftover/liftOver new.tsv hg17ToHg16.over.chain ne2 unMapped
>> Can't find file: new.tsv
>>
>> nci-admins-computer-2:~ levensd$ /Volumes/nci10b.nci.nih.gov/Group/LP/Ashutosh/HG\ liftover/liftOver new.tsv hg17ToHg16.over.chain ne2 unMapped
>> Can't find file: new.tsv
>>
>> nci-admins-computer-2:~ levensd$ /Volumes/nci10b.nci.nih.gov/Group/LP/Ashutosh/HG\ liftover/liftOver new hg17ToHg16.over.chain ne2 unMapped
>> Can't find file: new
>>
>> nci-admins-computer-2:~ levensd$ /Volumes/nci10b.nci.nih.gov/Group/LP/Ashutosh/HG\ liftover/liftOver new.tsv hg17ToHg16.over.chain ne2 unMapped
>> Can't find file: new.tsv
>>
>> nci-admins-computer-2:~ levensd$ /Volumes/nci10b.nci.nih.gov/Group/LP/Ashutosh/HG\ liftover/liftOver new.txt hg17ToHg16.over.chain ne2.txt unMapped
>> Can't find file: new.txt
>>
>> nci-admins-computer-2:~ levensd$ \306\222f
>>
>>  
>>
>>  
>>
>> For comparison, the format for running the liftOver command is:
>>
>>  
>>
>> liftOver oldFile map.chain newFile unMapped
>>
>>  
>>
>> The first two files, "oldFile" and "map.chain" need to either be present
>> in your current working directory, or else you need to specify the paths
>> to the files.  The second two files, "newFile" and "unMapped" do not
>> need to exist already -- the liftOver program will create files with the
>> names you specify.
>>
>>  
>>
>> Using your command:
>>
>> liftOver new.tsv hg17ToHg16.over.chain ne2 unMapped
>>
>>  
>>
>> liftOver is expecting a BED file of hg17 coordinates to be present in
>> the current directory, in a file called "new.tsv".  The
>> hg17ToHg16.over.chain file should also be in the current directory.
>> LiftOver will write the corresponding hg16 coordinates to a file called
>> "ne2" in the current directory, and it will create a file called
>> "unMapped" in the current directory and record any hg17 coordinates
>> that did not map to hg16 in that file.
>>
>>  
>>
>> Regarding your "PS" question: I see that you presently need to specify
>> the entire path to the liftOver executable to get it to work.  This is
>> because the path to liftOver is not in your $PATH variable.  If you
>> either (1) move the liftOver executable to a directory that is already
>> in $PATH, or (2) add the path where your executable resides
>> (/Volumes/... in your case) to the $PATH variable, you should be able to
>> execute liftOver without specifying the path to it every time.  Try the
>> command:
>>
>> echo $PATH
>>
>> to see the directories that are currently in your $PATH variable.
>>
>>  
>>
>> I hope this explanation is helpful.
>>
>>  
>>
>> --
>>
>> Brooke Rhead
>>
>> UCSC Genome Bioinformatics Group
>>
>>  
>>
>>  
>>
>>  
>>
>> Gupta, Ashutosh (NIH/NCI) [F] wrote:
>>
>>>  Thanks a lot for the quick reply.
>>>  Please have a look at the attached snapshot of my liftOver session.
>>>  I am not sure where I am going wrong. I have tried several different
>>>  formats, but the program never recognized the files. The files were
>>>  definitely there as I could open them using other applications.
>>>  I had also ensured that the data is in the recommended BED format.
>>>  Thanks again for your help.
>>>  Regards,
>>>  Ashutosh.
>>>  PS: Also, I notice that you are just typing liftOver from the command
>>>  prompt, which never worked for me. I always got the error "command not
>>>  found". So I had to use the strategy as in the attached file. Is there
>>>  some problem with the installation of the file? I am a Windows user &
>>>  relatively new to the Mac/Unix system.
>>>  -----Original Message-----
>>>  From: Brooke Rhead [mailto:rhead at soe.ucsc.edu]
>>>  Sent: Tuesday, April 01, 2008 8:30 PM
>>>  To: Gupta, Ashutosh (NIH/NCI) [F]
>>>  Cc: genome at soe.ucsc.edu
>>>  Subject: Re: [Genome] request for hg18 to hg16
>>>  Hi Ashutosh,
>>>  What kind of problem are you experiencing?
>>>  If you just need instructions on how to use the command-line tool, you
>>>  can run the liftOver command with no arguments to see instructions. It
>>>  should look something like this:
>>>  -----
>>>  $ liftOver
>>>  liftOver - Move annotations from one assembly to another
>>>  usage:
>>>     liftOver oldFile map.chain newFile unMapped
>>>  oldFile and newFile are in bed format by default, but can be in GFF and
>>>  maybe eventually others with the appropriate flags below.
>>>  The map.chain file has the old genome as the target and the new genome
>>>  as the query.
>>>  ***********************************************************************
>>>  WARNING: liftOver was only designed to work between different
>>>           assemblies of the same organism, it may not do what you want
>>>           if you are lifting between different organisms.
>>>  ***********************************************************************
>>>  options:
>>>     -minMatch=0.N Minimum ratio of bases that must remap. Default 0.95
>>>     -gff  File is in gff/gtf format.  Note that the gff lines are converted
>>>           separately.  It would be good to have a separate check after this
>>>           that the lines that make up a gene model still make a plausible gene
>>>           after liftOver
>>>     -genePred - File is in genePred format
>>>     -sample - File is in sample format
>>>     -bedPlus=N - File is bed N+ format
>>>     -positions - File is in browser "position" format
>>>     -hasBin - File has bin value (used only with -bedPlus)
>>>     -tab - Separate by tabs rather than space (used only with -bedPlus)
>>>     -pslT - File is in psl format, map target side only
>>>     -minBlocks=0.N Minimum ratio of alignment blocks/exons that must map
>>>                    (default 1.00)
>>>     -fudgeThick    If thickStart/thickEnd is not mapped, use the closest
>>>                    mapped base.  Recommended if using -minBlocks.
>>>     -multiple               Allow multiple output regions
>>>     -minChainT, -minChainQ  Minimum chain size in target/query, when mapping
>>>                             to multiple output regions (default 0, 0)
>>>     -minSizeT               deprecated synonym for -minChainT (ENCODE compat.)
>>>     -minSizeQ               Min matching region size in query with -multiple.
>>>     -chainTable             Used with -multiple, format is db.tablename,
>>>                                 to extend chains from net (preserves dups)
>>>     -errorHelp              Explain error messages
>>>  -----
>>>  If you are only converting 50 positions from hg18 to hg16, it might be
>>>  easier to use the web-based tool, as Kayla suggested.  (Or did I
>>>  misunderstand your original question, and you need to convert many more
>>>  than 50 positions?)
>>>  --
>>>  Brooke Rhead
>>>  UCSC Genome Bioinformatics Group
>>>  Gupta, Ashutosh (NIH/NCI) [F] wrote:
>>>  
>>>> Hi,
>>>> I am having problems with conversions across different builds.
>>>> I have the liftOver tool for Mac OS X & all the relevant chain files.
>>>> Any help on this would be appreciated.
>>>> Thanks,
>>>> Ashutosh.
>>>> -----Original Message-----
>>>> From: Kayla Smith [mailto:kayla at soe.ucsc.edu]
>>>> Sent: Monday, March 24, 2008 5:15 PM
>>>> To: Gupta, Ashutosh (NIH/NCI) [F]
>>>> Cc: genome at soe.ucsc.edu
>>>> Subject: Re: [Genome] request for hg18 to hg16
>>>> Hello Ashutosh,
>>>> You can use our online liftOver tool to convert from hg18 to hg17, and
>>>> then from hg17 to hg16.  Here is the link:
>>>> http://genome.ucsc.edu/cgi-bin/hgLiftOver
>>>> See this FAQ on downloading our source:
>>>> http://genome.ucsc.edu/FAQ/FAQdownloads#download27
>>>> I hope this information is helpful to you.  Please don't hesitate
> to
>>>> contact us again if you require further assistance.
>>>> Kayla Smith
>>>> UCSC Genome Bioinformatics Group
>>>> Gupta, Ashutosh (NIH/NCI) [F] wrote:
>>>>    
>>>>> Hi,
>>>>> Would it be possible to get a liftover file for conversion from hg18 to
>>>>> hg16?
>>>>> Also, is there any Windows-based conversion mechanism? I need to convert
>>>>> about 50 NimbleGen ENCODE array hybridizations, so a Windows-based tool
>>>>> would be very helpful.
>>>>> Even the conversion source code in C (or in Mathematica or Matlab) would
>>>>> be very helpful.
>>>>> Thanks,
>>>>> Ashutosh.
>>>>> PS: I can also help develop a tool for windows system, depends on
>>>>> complexity & time though. I am sure a lot of people would find it
>>>>> useful.
>>>>> _______________________________________________
>>>>> Genome maillist  -  Genome at soe.ucsc.edu
>>>>> http://www.soe.ucsc.edu/mailman/listinfo/genome

