[Genome] nib utilities?
Hiram Clawson
hiram at soe.ucsc.edu
Mon Oct 16 08:09:02 PDT 2006
Good Morning Davide:
I'm not certain exactly what you are asking about here.
You can get nibFrag to write its output to stdout by
using the special name "stdout" in the command in place
of the out.fa argument. All the kent utilities recognize
the special words: stdin stdout stderr in place of any
filename.
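As a rough sketch of the "stdout" trick, the Python helper below builds a nibFrag command line with "stdout" in place of out.fa and appends the captured FASTA to one growing file. The function names, chr1.nib and the output path are hypothetical. The argument order is taken from the nibFrag usage message quoted further down.

```python
import subprocess


def nib_frag_cmd(nib_path, start, end, strand="+"):
    """Build a nibFrag command that writes FASTA to stdout instead of a file."""
    # The usage message gives strand as + (plus) or m (minus).
    if strand not in ("+", "m"):
        raise ValueError("strand must be '+' or 'm'")
    # Argument order follows the usage line: file.nib start end strand out.fa,
    # with the special name "stdout" standing in for out.fa.
    return ["nibFrag", nib_path, str(start), str(end), strand, "stdout"]


def append_fragment(nib_path, start, end, out_path, strand="+"):
    """Run nibFrag (assumed to be on PATH) and append its output to out_path."""
    result = subprocess.run(
        nib_frag_cmd(nib_path, start, end, strand),
        check=True,
        capture_output=True,
        text=True,
    )
    with open(out_path, "a") as out:
        out.write(result.stdout)
```

Calling append_fragment("chr1.nib", 100, 200, "all.fa") repeatedly would collect every fragment in all.fa, though each call still re-opens the nib file.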
You could also use the single 2bit file for the genome
and use the twoBitToFa command to extract sequences.
See usage message attached below.
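For illustration only, a twoBitToFa call for a single range might be assembled as follows. The -seq, -start and -end options come from the usage message below; the helper names and hg18.2bit are assumptions, and the output again goes to the special name "stdout".

```python
import subprocess


def two_bit_to_fa_cmd(two_bit_path, seq, start, end, out="stdout"):
    """Build a twoBitToFa command for one range of one sequence.

    Per the usage message, start is zero-based and end is non-inclusive.
    """
    return [
        "twoBitToFa",
        f"-seq={seq}",
        f"-start={start}",
        f"-end={end}",
        two_bit_path,
        out,
    ]


def fetch_range(two_bit_path, seq, start, end):
    """Run twoBitToFa (assumed to be on PATH) and return the FASTA text."""
    result = subprocess.run(
        two_bit_to_fa_cmd(two_bit_path, seq, start, end),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

For example, fetch_range("hg18.2bit", "chr1", 100, 200) would return the 100 bases at zero-based positions 100 through 199.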
--Hiram
On 2006 Oct 16, at 1:52 AM, Davide Cittaro wrote:
> Hi all, in order to get sequences from genomes I found that nibFrag
> does the job I need (-> get a sequence in a specified range).
> Nevertheless, every time I launch it, it reads the requested nib file and
> writes the output to a fasta file. The repeated reads slow the process
> down (getting 100000 sequences takes some time), and it seems that the
> output can be neither appended nor sent to stdout.
> I wonder if there is a way to get fragments from a gfServer that is
> already running (so that nib files have been loaded previously) and
> which other utilities can handle nib files.
>
> nibFrag - Extract part of a nib file as .fa (all bases/gaps lower case
> by default)
> usage:
> nibFrag [options] file.nib start end strand out.fa
> where strand is + (plus) or m (minus)
> options:
> -masked - use lower case characters for bases meant to be masked out
> -hardMasked - use upper case for not masked-out and 'N' characters
> for masked-out bases
> -upper - use upper case characters for all bases
> -name=name Use given name after '>' in output sequence
> -dbHeader=db Add full database info to the header, with or without
> -name option
> -tbaHeader=db Format header for compatibility with tba, takes
> database name as argument
twoBitToFa - Convert all or part of .2bit file to fasta
usage:
twoBitToFa input.2bit output.fa
options:
-seq=name - restrict this to just one sequence
-start=X - start at given position in sequence (zero-based)
-end=X - end at given position in sequence (non-inclusive)
-seqList=file - file containing list of sequence names
to output, of the form seqSpec[:start-end]
-noMask - convert sequence to all upper case
Sequence and range may also be specified as part of the input
file name using the syntax:
/path/input.2bit:name
or
/path/input.2bit:name:start-end
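
When many ranges are needed, the -seqList option and the file-name syntax above suggest extracting everything in a single twoBitToFa run rather than one process per fragment. The sketch below only formats the specs and the command line; all function names and file names in it are hypothetical, and coordinates are assumed to follow the zero-based, end-exclusive convention stated for -start and -end.

```python
def seq_spec(name, start=None, end=None):
    """Format one entry as seqSpec[:start-end]."""
    if start is None and end is None:
        return name
    if start is None or end is None:
        raise ValueError("give both start and end, or neither")
    return f"{name}:{start}-{end}"


def seq_list_text(regions):
    """Return the contents of a -seqList file, one spec per line."""
    return "".join(seq_spec(name, start, end) + "\n" for name, start, end in regions)


def two_bit_path_spec(two_bit_path, name, start=None, end=None):
    """Format /path/input.2bit:name or /path/input.2bit:name:start-end."""
    return f"{two_bit_path}:{seq_spec(name, start, end)}"


def seq_list_cmd(two_bit_path, list_path, out="stdout"):
    """Build a twoBitToFa command that extracts every spec in list_path."""
    return ["twoBitToFa", f"-seqList={list_path}", two_bit_path, out]
```

Writing seq_list_text(regions) to a file such as ranges.txt and then running the command from seq_list_cmd("hg18.2bit", "ranges.txt") should produce all the requested fragments in one pass over the 2bit file.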