[Genome] request for hg18 to hg16
Hiram Clawson
hiram at soe.ucsc.edu
Sun Apr 6 11:16:00 PDT 2008
Good Morning Ashutosh:
You can download a binary for the Mac from the location:
http://hgdownload.cse.ucsc.edu/admin/exe/
although this is a PPC binary only, not an Intel binary.
This is just the program itself; it is not a packaged Mac
.dmg. Set the execute bit with 'chmod', then run the binary
from a command line. It prints its usage message when run
with no arguments.
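
As a rough sketch of the chmod step, assuming the download was saved as
./liftOver in the current directory (the helper name make_executable is
only for illustration):

```shell
# make_executable FILE
# Set the execute bit on a downloaded program and confirm that it is set.
# FILE is wherever the binary was saved, e.g. ./liftOver (assumed location).
make_executable() {
    chmod +x "$1" && [ -x "$1" ]
}
```

With that defined, "make_executable ./liftOver && ./liftOver" should set
the bit and then print the usage message.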
--Hiram
Gupta, Ashutosh (NIH/NCI) [F] wrote:
> Thanks a lot Ann for the quick reply! It was very helpful.
> I had read the details for BED format, but didn't realize that optional
> arguments also had a format.
>
> The problem with adding all the data to the new.tsv file is that each file
> is about 50 MB, so combining all the files into one (~3 GB) would take
> forever, and then the conversion might take just as long.
>
> The easiest way around it seems to be the following.
> I can write a text file with all the conversion commands, e.g.
> liftOver file1 chainFile newFile1 unMapped1
> liftOver file2 chainFile newFile2 unMapped2
> ...
> liftOver file150 chainFile newFile150 unMapped150
>
> And then run that text file from terminal server on the Mac.
>
> Question is: How do I run it from terminal server?
>
> Thanks a lot for all the help.
> Best regards,
> Ashutosh.
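
On the question of running the list of commands: a plain text file of
commands like the ones above can be run with "sh commands.txt" (the file
name is only an example). A loop avoids writing out 150 lines by hand.
This is a minimal sketch, not a tested recipe; the function name lift_all
and the .lifted/.unmapped output suffixes are made up for illustration.

```shell
# lift_all CHAIN FILE...
# Run liftOver once per input BED file, using the same chain file each time.
# Output names are hypothetical: <input>.lifted holds the converted records,
# <input>.unmapped holds the records that did not map.
lift_all() {
    chain=$1
    shift
    for bed in "$@"; do
        # Argument order from the usage message: oldFile map.chain newFile unMapped
        liftOver "$bed" "$chain" "$bed.lifted" "$bed.unmapped" || return 1
    done
}
```

For example, "lift_all hg17ToHg16.over.chain file*" would process every
matching file in the current directory and stop at the first failure.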
>
>
> -----Original Message-----
> From: Ann Zweig [mailto:ann at soe.ucsc.edu]
> Sent: Thursday, April 03, 2008 5:09 PM
> To: Gupta, Ashutosh (NIH/NCI) [F]
> Cc: 'genome at soe.ucsc.edu'
> Subject: Re: RE: [Genome] request for hg18 to hg16
>
> Hello Ashutosh,
>
> Please note that our mail list software strips all attachments sent to
> the list. That said, I was able to deduce what is happening with your
> input files.
>
> The scores that you have entered in your input file are in the 6th
> field. They must be moved to the 5th field. The 6th field is reserved
> for strand (+/-). Please read about the BED format here:
> http://genome.ucsc.edu/goldenPath/help/customTrack.html#BED
>
> You should be able to run all of your conversions at one time. Just
> place them all in your new.tsv file and run it as Brooke explained
> previously.
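
One possible way to move the score from the 6th field to the 5th is a
short awk command. This sketch assumes the input is tab-separated with six
columns, and that whatever sits in the 5th column is a placeholder that
can be dropped; neither assumption is confirmed by the thread. The
function name fix_bed_score is hypothetical.

```shell
# fix_bed_score IN OUT
# Rewrite a six-column, tab-separated file so the score (column 6 in the
# input) becomes column 5. The original column 5 is discarded, which is an
# assumption about the data, and no strand column is written.
fix_bed_score() {
    awk 'BEGIN { FS = OFS = "\t" } { print $1, $2, $3, $4, $6 }' "$1" > "$2"
}
```

The result is a five-column file (chrom, start, end, name, score), leaving
the 6th field free for strand if it is ever needed.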
>
>
> Regards,
>
> ----------
> Ann Zweig
> UCSC Genome Bioinformatics Group
> http://genome.ucsc.edu
>
> Please feel free to search the Genome mailing list archives by visiting
> our home page, clicking on "Contact Us", then typing a word or phrase
> into the search box. On that same page
> (http://genome.ucsc.edu/contacts.html), you can subscribe to the Genome
> mailing list.
>
>
>
>> Hey Brooke,
>>
>> Thanks a lot for the help.
>>
>> It did work. However, I had to use the same strategy as earlier (that is,
>> to give the complete path).
>>
>> I could not find the $PATH directories to copy my files over there.
>>
>>
>>
>> But the problem is that the data values were changed.
>>
>> This is before conversion:
>>
>> This is after conversion:
>>
>>
>>
>> Note the special characters & changed score values in the last column.
>>
>>
>>
>> Exactly the same thing happened when I was using the web version of the
>> liftOver program.
>>
>> How can I fix this? I could just take the coordinates from the shifted
>> file and the scores from the original file. Would it be correct to do so?
>>
>>
>>
>> Also, is there a way to run a lot of these liftOver commands in one go? I
>> need to run about 150 conversions, and doing them one by one will take
>> forever. If I can write all the commands in a single text file, how do I
>> execute that file on the Mac?
>>
>>
>>
>> Thank you so much for your help. I really appreciate it.
>>
>>
>>
>> Regards,
>>
>> Ashutosh.
>>
>> PS: In case you can't see the embedded pictures in this email, I have
>> attached a copy of this email in the pdf format too.
>>
>>
>>
>> -----Original Message-----
>> From: Brooke Rhead [mailto:rhead at soe.ucsc.edu]
>> Sent: Wednesday, April 02, 2008 3:15 AM
>> To: Gupta, Ashutosh (NIH/NCI) [F]
>> Cc: genome at soe.ucsc.edu
>> Subject: Re: [Genome] request for hg18 to hg16
>>
>>
>>
>> Hi Ashutosh,
>>
>>
>>
>> I see these lines in your attached file:
>>
>>
>>
>> nci-admins-computer-2:~ levensd$ /Volumes/nci10b.nci.nih.gov/Group/LP/Ashutosh/HG\ liftover/liftOver new.tsv hg17ToHg16.over.chain ne2 unMapped
>> Can't find file: new.tsv
>> nci-admins-computer-2:~ levensd$ /Volumes/nci10b.nci.nih.gov/Group/LP/Ashutosh/HG\ liftover/liftOver new.tsv hg17ToHg16.over.chain ne2 unMapped
>> Can't find file: new.tsv
>> nci-admins-computer-2:~ levensd$ /Volumes/nci10b.nci.nih.gov/Group/LP/Ashutosh/HG\ liftover/liftOver new hg17ToHg16.over.chain ne2 unMapped
>> Can't find file: new
>> nci-admins-computer-2:~ levensd$ /Volumes/nci10b.nci.nih.gov/Group/LP/Ashutosh/HG\ liftover/liftOver new.tsv hg17ToHg16.over.chain ne2 unMapped
>> Can't find file: new.tsv
>> nci-admins-computer-2:~ levensd$ /Volumes/nci10b.nci.nih.gov/Group/LP/Ashutosh/HG\ liftover/liftOver new.txt hg17ToHg16.over.chain ne2.txt unMapped
>> Can't find file: new.txt
>>
>> nci-admins-computer-2:~ levensd$ \306\222f
>>
>>
>>
>>
>>
>> For comparison, the format for running the liftOver command is:
>>
>>
>>
>> liftOver oldFile map.chain newFile unMapped
>>
>>
>>
>> The first two files, "oldFile" and "map.chain", need to either be present
>> in your current working directory, or else you need to specify the paths
>> to the files. The second two files, "newFile" and "unMapped", do not
>> need to exist already -- the liftOver program will create files with the
>> names you specify.
>>
>>
>>
>> Using your command:
>>
>> liftOver new.tsv hg17ToHg16.over.chain ne2 unMapped
>>
>>
>>
>> liftOver is expecting a BED file of hg17 coordinates to be present in
>> the current directory, in a file called "new.tsv". The
>> hg17ToHg16.over.chain file should also be in the current directory.
>> liftOver will create a file called "ne2" in the current directory
>> containing the corresponding hg16 coordinates, and it will create a file
>> called "unMapped" in the current directory and record any hg17
>> coordinates that did not map to hg16 in that file.
>>
>>
>>
>> Regarding your "PS" question: I see that you presently need to specify
>> the entire path to the liftOver executable to get it to work. This is
>> because the directory containing liftOver is not in your $PATH variable.
>> If you either (1) move the liftOver executable to a directory that is
>> already in $PATH, or (2) add the directory where your executable resides
>> (/Volumes/... in your case) to the $PATH variable, you should be able to
>> execute liftOver without specifying the path to it every time. Try the
>> command:
>>
>> echo $PATH
>>
>> to see the directories that are currently in your $PATH variable.
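
For option (2), a minimal sketch of adding a directory to $PATH for the
current shell session. The directory "$HOME/bin" is only a placeholder for
wherever the executable actually lives; quoting the value keeps a
directory name containing spaces intact.

```shell
# Placeholder directory holding the liftOver executable; substitute the real one.
tool_dir="$HOME/bin"

# Append it to PATH for this shell session only, then show the result.
PATH="$PATH:$tool_dir"
export PATH
echo "$PATH"
```

The change lasts until the terminal window is closed. Putting the same
lines in a shell startup file (for bash, typically ~/.bash_profile) is the
usual way to make it permanent, assuming bash is the login shell.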
>>
>>
>>
>> I hope this explanation is helpful.
>>
>>
>>
>> --
>>
>> Brooke Rhead
>>
>> UCSC Genome Bioinformatics Group
>>
>>
>>
>>
>>
>>
>>
>> Gupta, Ashutosh (NIH/NCI) [F] wrote:
>>
>>> Thanks a lot for the quick reply.
>>> Please have a look at the attached snapshot of my liftOver session.
>>> I am not sure where I am going wrong. I have tried several different
>>> formats, but the program never recognized the files. The files were
>>> definitely there, as I could open them using other applications.
>>> I had also ensured that the data is in the recommended BED format.
>>> Thanks again for your help.
>>> Regards,
>>> Ashutosh.
>>> PS: Also, I notice that you are just typing liftOver at the command
>>> prompt, which never worked for me. I always got the error "command not
>>> found". So I had to use the strategy shown in the attached file. Is there
>>> some problem with the installation of the file? I am a Windows user and
>>> relatively new to Mac/Unix systems.
>>> -----Original Message-----
>>> From: Brooke Rhead [mailto:rhead at soe.ucsc.edu]
>>> Sent: Tuesday, April 01, 2008 8:30 PM
>>> To: Gupta, Ashutosh (NIH/NCI) [F]
>>> Cc: genome at soe.ucsc.edu
>>> Subject: Re: [Genome] request for hg18 to hg16
>>> Hi Ashutosh,
>>> What kind of problem are you experiencing?
>>> If you just need instructions on how to use the command-line tool, you
>>> can run the liftOver command with no arguments to see instructions. It
>>> should look something like this:
>>> -----
>>> $ liftOver
>>> liftOver - Move annotations from one assembly to another
>>> usage:
>>>    liftOver oldFile map.chain newFile unMapped
>>> oldFile and newFile are in bed format by default, but can be in GFF and
>>> maybe eventually others with the appropriate flags below.
>>> The map.chain file has the old genome as the target and the new genome
>>> as the query.
>>>
>>> ***********************************************************************
>>> WARNING: liftOver was only designed to work between different
>>>          assemblies of the same organism, it may not do what you want
>>>          if you are lifting between different organisms.
>>> ***********************************************************************
>>>
>>> options:
>>>    -minMatch=0.N Minimum ratio of bases that must remap. Default 0.95
>>>    -gff  File is in gff/gtf format. Note that the gff lines are converted
>>>          separately. It would be good to have a separate check after this
>>>          that the lines that make up a gene model still make a plausible
>>>          gene after liftOver
>>>    -genePred - File is in genePred format
>>>    -sample - File is in sample format
>>>    -bedPlus=N - File is bed N+ format
>>>    -positions - File is in browser "position" format
>>>    -hasBin - File has bin value (used only with -bedPlus)
>>>    -tab - Separate by tabs rather than space (used only with -bedPlus)
>>>    -pslT - File is in psl format, map target side only
>>>    -minBlocks=0.N Minimum ratio of alignment blocks/exons that must map
>>>                   (default 1.00)
>>>    -fudgeThick    If thickStart/thickEnd is not mapped, use the closest
>>>                   mapped base. Recommended if using -minBlocks.
>>>    -multiple      Allow multiple output regions
>>>    -minChainT, -minChainQ  Minimum chain size in target/query, when mapping
>>>                            to multiple output regions (default 0, 0)
>>>    -minSizeT      deprecated synonym for -minChainT (ENCODE compat.)
>>>    -minSizeQ      Min matching region size in query with -multiple.
>>>    -chainTable    Used with -multiple, format is db.tablename,
>>>                   to extend chains from net (preserves dups)
>>>    -errorHelp     Explain error messages
>>> -----
>>> If you are only converting 50 positions from hg18 to hg16, it might be
>>> easier to use the web-based tool, as Kayla suggested. (Or did I
>>> misunderstand your original question, and you need to convert many more
>>> than 50 positions?)
>>> --
>>> Brooke Rhead
>>> UCSC Genome Bioinformatics Group
>>> Gupta, Ashutosh (NIH/NCI) [F] wrote:
>>>
>>>> Hi,
>>>> I am having a problem with conversions across different builds.
>>>> I have the liftOver tool for Mac OS X and all the relevant chain files.
>>>> Any help on this would be appreciated.
>>>> Thanks,
>>>> Ashutosh.
>>>> -----Original Message-----
>>>> From: Kayla Smith [mailto:kayla at soe.ucsc.edu]
>>>> Sent: Monday, March 24, 2008 5:15 PM
>>>> To: Gupta, Ashutosh (NIH/NCI) [F]
>>>> Cc: genome at soe.ucsc.edu
>>>> Subject: Re: [Genome] request for hg18 to hg16
>>>> Hello Ashutosh,
>>>> You can use our online liftOver tool to convert from hg18 to hg17, and
>>>> then from hg17 to hg16. Here is the link:
>>>> http://genome.ucsc.edu/cgi-bin/hgLiftOver
>>>> See this FAQ on downloading our source:
>>>> http://genome.ucsc.edu/FAQ/FAQdownloads#download27
>>>> I hope this information is helpful to you. Please don't hesitate to
>>>> contact us again if you require further assistance.
>>>> Kayla Smith
>>>> UCSC Genome Bioinformatics Group
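
The same two-step route (hg18 to hg17, then hg17 to hg16) can be sketched
for the command-line tool. The chain file name hg18ToHg17.over.chain is an
assumption based on the naming of hg17ToHg16.over.chain used elsewhere in
this thread, and the function name and intermediate file names are
hypothetical. Both chain files are expected in the current directory.

```shell
# lift_hg18_to_hg16 SRC DST
# Convert hg18 coordinates in SRC to hg16 coordinates in DST in two steps.
# Records that fail to map are written to a separate file at each step.
lift_hg18_to_hg16() {
    src=$1
    dst=$2
    mid="$dst.hg17"   # intermediate hg17 coordinates (hypothetical name)
    liftOver "$src" hg18ToHg17.over.chain "$mid" "$dst.unmapped18to17" &&
    liftOver "$mid" hg17ToHg16.over.chain "$dst" "$dst.unmapped17to16"
}
```

The second step runs only if the first one succeeds, so a missing chain
file or input file should stop the conversion early.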
>>>> Gupta, Ashutosh (NIH/NCI) [F] wrote:
>>>>
>>>>> Hi,
>>>>> Would it be possible to get a liftover file for conversion from hg18 to
>>>>> hg16?
>>>>> Also, is there any Windows-based conversion mechanism? I need to convert
>>>>> about 50 NimbleGen ENCODE array hybridizations, so a Windows-based tool
>>>>> would be very helpful.
>>>>> Even the conversion source code in C (or in Mathematica or Matlab) would
>>>>> be very helpful.
>>>>> Thanks,
>>>>> Ashutosh.
>>>>> PS: I can also help develop a tool for Windows systems, depending on
>>>>> complexity and time. I am sure a lot of people would find it
>>>>> useful.
>>>>> _______________________________________________
>>>>> Genome maillist - Genome at soe.ucsc.edu
>>>>> http://www.soe.ucsc.edu/mailman/listinfo/genome
>>>>>