Ubuntu Manpage: makehmmerdb - build nhmmer database from a sequence file

NAME

       makehmmerdb - build nhmmer database from a sequence file

SYNOPSIS

       makehmmerdb [options] seqfile binaryfile

DESCRIPTION

       makehmmerdb creates a binary file from a DNA sequence file. This binary file may be used as a target
       database for the DNA search tool nhmmer. With default nhmmer settings, this yields a roughly 10-fold
       acceleration with a small loss of sensitivity on benchmarks.
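
       As a minimal sketch of how an invocation might be assembled from a script, the Python helper below
       builds an argument list that follows the SYNOPSIS. The helper name makehmmerdb_command and the file
       names genome.fa and genome.hmmerdb are hypothetical; only the option names come from this page.

```python
import shlex


def makehmmerdb_command(seqfile, binaryfile, informat=None,
                        bin_length=None, sa_freq=None, block_size=None):
    """Assemble an argument list for makehmmerdb (hypothetical helper)."""
    args = ["makehmmerdb"]
    if informat is not None:
        args += ["--informat", informat]
    if bin_length is not None:
        args += ["--bin_length", str(bin_length)]
    if sa_freq is not None:
        # This page states that --sa_freq must be a power of 2.
        if sa_freq < 1 or sa_freq & (sa_freq - 1):
            raise ValueError("--sa_freq must be a power of 2")
        args += ["--sa_freq", str(sa_freq)]
    if block_size is not None:
        args += ["--block_size", str(block_size)]
    # Options first, then the two positional arguments, as in the SYNOPSIS.
    return args + [seqfile, binaryfile]


# File names are placeholders chosen for illustration.
cmd = makehmmerdb_command("genome.fa", "genome.hmmerdb", informat="fasta")
print(shlex.join(cmd))
```

       The resulting list could be passed to subprocess.run(cmd, check=True) on a system where makehmmerdb is
       installed.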

OPTIONS

       -h     Help; print a brief reminder of command line usage and all available options.

OTHER OPTIONS

       --informat <s>
              Assert that the input seqfile is in format <s>, bypassing format autodetection. Common choices
              for <s> include: fasta, embl, genbank. Alignment formats also work; common choices include:
              stockholm, a2m, afa, psiblast, clustal, phylip. For more information, and for codes for some
              less common formats, see the main documentation. The string <s> is case-insensitive (fasta or
              FASTA both work).

       --bin_length <n>
              Bin length. The binary file depends on a data structure called the FM index, which organizes a
              permuted copy of the sequence in bins of length <n>. A longer bin length leads to smaller files
              (because summary data is stored once per bin, and longer bins mean fewer of them) and possibly
              slower queries. The default is 256. Values much larger than 512 may noticeably reduce search
              speed.
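
              The size and speed trade-off can be illustrated with a generic FM-index-style rank query. The
              sketch below is not HMMER's implementation; it assumes per-bin cumulative letter counts
              ("checkpoints"), and the names build_checkpoints and rank are hypothetical. A longer bin means
              fewer checkpoints to store but more letters to scan per query.

```python
def build_checkpoints(bwt, bin_length):
    """Store cumulative letter counts at the start of every bin."""
    running = {ch: 0 for ch in set(bwt)}
    checkpoints = []
    for i, ch in enumerate(bwt):
        if i % bin_length == 0:
            checkpoints.append(dict(running))
        running[ch] += 1
    return checkpoints


def rank(bwt, checkpoints, bin_length, ch, pos):
    """Count occurrences of ch in bwt[:pos] (bwt must be non-empty)."""
    # Jump to the checkpoint for the bin containing pos (clamped at the end).
    bin_index = min(pos // bin_length, len(checkpoints) - 1)
    count = checkpoints[bin_index].get(ch, 0)
    # Scan forward from the bin start: at most bin_length letters.
    for i in range(bin_index * bin_length, pos):
        if bwt[i] == ch:
            count += 1
    return count


# Any string works for the rank demonstration; this stands in for the
# permuted sequence copy.
permuted = "ACGTTGCA" * 8
for bin_length in (4, 16):
    cps = build_checkpoints(permuted, bin_length)
    assert rank(permuted, cps, bin_length, "G", 37) == permuted[:37].count("G")
    print(f"bin_length={bin_length}: {len(cps)} checkpoints stored")
```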

       --sa_freq <n>
              Suffix array sample rate. The FM index structure also samples from the underlying suffix array
              for the sequence database. More frequent sampling (a smaller value for <n>) yields a larger file
              and faster search (until the file becomes large enough for I/O to become a bottleneck). The
              default is 8. <n> must be a power of 2.
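
              To show why the sample rate trades file size against search time, here is a toy FM index with
              a sampled suffix array. It is a textbook-style sketch, not HMMER's code: it assumes sampling by
              text position (every position divisible by sa_freq), and build_index and locate are
              hypothetical names. Unsampled rows are resolved by walking backwards through the text, which
              takes at most sa_freq - 1 steps.

```python
def build_index(text, sa_freq):
    """Build a toy FM index, keeping every sa_freq-th suffix array entry."""
    text += "$"  # sentinel; sorts before A, C, G and T
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)
    # Keep only rows whose text position is a multiple of sa_freq.
    sampled = {row: pos for row, pos in enumerate(sa) if pos % sa_freq == 0}
    # first[c]: row of the first suffix that starts with letter c.
    first = {}
    for offset, ch in enumerate(sorted(bwt)):
        first.setdefault(ch, offset)
    return bwt, sampled, first, sa  # full sa returned only for checking


def locate(bwt, sampled, first, row):
    """Recover the text position of a suffix array row from the samples."""
    steps = 0
    while row not in sampled:
        ch = bwt[row]
        # LF mapping: move to the row of the suffix one letter earlier.
        row = first[ch] + bwt[:row].count(ch)
        steps += 1
    return sampled[row] + steps


for sa_freq in (1, 2, 4, 8):
    bwt, sampled, first, sa = build_index("GATTACAGATTACA", sa_freq)
    assert all(locate(bwt, sampled, first, r) == sa[r] for r in range(len(sa)))
    print(f"sa_freq={sa_freq}: {len(sampled)} of {len(sa)} entries stored")
```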

       --block_size <n>
              The input sequence is broken into blocks of size <n> million letters. An FM index is built for
              each block, rather than one FM index for the entire sequence database. The default is 50.
              Larger blocks do not seem to yield a substantial speed increase.
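
              As a rough illustration of the blocking arithmetic, the sketch below computes letter ranges
              for each block. The function name split_into_blocks is hypothetical, and the sketch assumes
              simple non-overlapping blocks; this page does not say how the real tool treats block or
              sequence boundaries.

```python
def split_into_blocks(seq_length, block_size_millions=50):
    """Return (start, end) letter ranges per block; end is exclusive."""
    block_len = block_size_millions * 1_000_000
    return [(start, min(start + block_len, seq_length))
            for start in range(0, seq_length, block_len)]


# A 120-million-letter input with the default block size of 50 (million).
blocks = split_into_blocks(120_000_000)
for start, end in blocks:
    print(f"block covers letters {start}..{end - 1}")
```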

COPYRIGHT

       Copyright (C) 2023 Howard Hughes Medical Institute.
       Freely distributed under the BSD open source license.

       For additional information on copyright and licensing, see the file called COPYRIGHT in your HMMER source
       distribution, or see the HMMER web page (http://hmmer.org/).

AUTHOR

       http://eddylab.org

HMMER 3.4                                           Aug 2023                                      makehmmerdb(1)
