NAME

       hmmemit - sample sequences from a profile

SYNOPSIS

       hmmemit [options] hmmfile

DESCRIPTION

       The  hmmemit  program  samples  (emits)  sequences from the profile HMM(s) in hmmfile, and
       writes them to output.  Sampling sequences may  be  useful  for  a  variety  of  purposes,
       including creating synthetic true positives for benchmarks or tests.

       The default is to sample one unaligned sequence from the core probability model, which
       means that each sequence consists of one full-length domain.  Alternatively, with the -c
       option, you can emit a simple plurality-rule consensus sequence; or with the -a option,
       you can emit an alignment (in which case, you probably also want to set -N to something
       other than its default of 1 sequence per model).

       Finally, with the -p option, you can sample a sequence from a fully configured HMMER
       search profile. This means sampling a 'homologous sequence' by HMMER's definition,
       including  nonhomologous  flanking  sequences,  local alignments, and multiple domains per
       sequence, depending on the length model and alignment mode chosen for the profile.

       The hmmfile may contain a library of HMMs, in which case each HMM will be used in turn.

       hmmfile may be '-' (dash), which means reading this input from stdin rather than a file.

COMMON OPTIONS

       -h     Help; print a brief reminder of command line usage and all available options.

       -o <f> Direct the output sequences to file <f>, rather than to stdout.

       -N <n> Sample <n> sequences per model, rather than just one.
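
       As a minimal sketch of the common options, the following Python snippet builds an
       hmmemit command line from -N and -o and runs it only if the program and the input file
       are present. The helper emit_command and the file names globins4.hmm and samples.fa are
       hypothetical, chosen only for illustration.

```python
import os
import shutil
import subprocess

def emit_command(hmmfile, n=1, outfile=None):
    """Build an argument list for hmmemit using only -N and -o (hypothetical helper)."""
    cmd = ["hmmemit"]
    if n != 1:
        cmd += ["-N", str(n)]    # sample n sequences per model instead of one
    if outfile is not None:
        cmd += ["-o", outfile]   # write sequences to a file rather than stdout
    cmd.append(hmmfile)          # "-" here would mean: read the HMM from stdin
    return cmd

# Hypothetical file names; substitute your own.
cmd = emit_command("globins4.hmm", n=10, outfile="samples.fa")
print(" ".join(cmd))
# prints: hmmemit -N 10 -o samples.fa globins4.hmm

# Run only when hmmemit is installed and the input file exists.
if shutil.which("hmmemit") and os.path.exists("globins4.hmm"):
    subprocess.run(cmd, check=True)
```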

OPTIONS CONTROLLING WHAT TO EMIT

       The default is to sample N sequences from the core model. Alternatively, you may choose
       one (and only one) of the following.

       -a     Emit an alignment for each HMM in the hmmfile rather than sampling unaligned
              sequences one at a time.

       -c     Emit a plurality-rule consensus sequence, instead of sampling a sequence from the
              profile HMM's probability distribution. The consensus sequence is formed by
              selecting the maximum probability residue at each match state.

       -C     Emit a fancier plurality-rule consensus sequence than the -c option. If the maximum
              probability residue has p < minl show it as a lower case 'any' residue (n or x); if
              p >= minl and < minu show it as a lower case residue; and if p >= minu show it as
              an upper case residue. The default settings of minu and minl are both 0.0, which
              means -C gives the same output as -c unless you also set minu and minl to what you
              want.

       -p     Sample unaligned sequences from the implicit search profile, not from the core
              model. The core model consists only of the homologous states (between the begin
              and end states of a HMMER Plan7 model). The profile includes the nonhomologous N,
              C, and J states, local/glocal and uni/multihit algorithm configuration, and the
              target length model. Therefore sequences sampled from a profile may include
              nonhomologous as well as homologous sequences, and may contain more than one
              homologous sequence segment. By default, the profile is in multihit local mode, and
              the target sequence length is configured for L=400.

OPTIONS CONTROLLING EMISSION FROM PROFILES

       These options require that you have set the -p option.

       -L <n> Configure the profile's target sequence length model to generate a mean  length  of
              approximately <n> rather than the default of 400.

       --local
              Configure the profile for multihit local alignment.

       --unilocal
              Configure the profile for unihit local alignment (Smith/Waterman).

       --glocal
              Configure the profile for multihit glocal alignment.

       --uniglocal
              Configure the profile for unihit glocal alignment.
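
       The four alignment-mode flags combine two independent choices (unihit or multihit, local
       or glocal). The sketch below tabulates them and builds a -p command line. The helper
       profile_command, the MODES table, and the file name globins4.hmm are hypothetical names
       used only for illustration; the flags themselves are the ones listed above.

```python
# Flag -> (hit mode, alignment mode), as described in the option list.
MODES = {
    "--local":     ("multihit", "local"),
    "--unilocal":  ("unihit", "local"),
    "--glocal":    ("multihit", "glocal"),
    "--uniglocal": ("unihit", "glocal"),
}

def profile_command(hmmfile, length=None, mode=None):
    """Build an hmmemit command line that samples from the search profile."""
    if mode is not None and mode not in MODES:
        raise ValueError("unknown mode flag: " + mode)
    cmd = ["hmmemit", "-p"]          # -L and the mode flags require -p
    if length is not None:
        cmd += ["-L", str(length)]   # target mean length; the default is 400
    if mode is not None:
        cmd.append(mode)             # the default is multihit local
    cmd.append(hmmfile)
    return cmd

print(" ".join(profile_command("globins4.hmm", length=200, mode="--uniglocal")))
# prints: hmmemit -p -L 200 --uniglocal globins4.hmm
```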

OPTIONS CONTROLLING FANCY CONSENSUS EMISSION

       These options require that you have set the -C option.

       --minl <x>
              Sets the minl threshold for showing weakly conserved residues as lower case.  (0 <=
              x <= 1)

       --minu <x>
              Sets the minu threshold for showing strongly conserved residues as upper case.   (0
              <= x <= 1)
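
       The casing rule applied by -C can be sketched in a few lines of Python. This is an
       illustration of the rule as described on this page, not HMMER's own code: the function
       name, the list-of-dicts input format, and the choice of 'x' as the 'any' residue (as
       would suit a protein alphabet) are assumptions.

```python
def plurality_consensus(match_probs, minl=0.0, minu=0.0, any_residue="x"):
    """Return a consensus string, one character per match state.

    match_probs is a list of dicts mapping residue -> probability
    (an assumed toy representation of the match emission distributions).
    """
    out = []
    for probs in match_probs:
        # Plurality rule: take the most probable residue at this match state.
        residue, p = max(probs.items(), key=lambda kv: kv[1])
        if p < minl:
            out.append(any_residue)       # too weak: lower case 'any' residue
        elif p < minu:
            out.append(residue.lower())   # weakly conserved: lower case
        else:
            out.append(residue.upper())   # strongly conserved: upper case
    return "".join(out)

columns = [
    {"A": 0.9, "C": 0.1},
    {"A": 0.5, "C": 0.3, "G": 0.2},
    {"A": 0.3, "C": 0.3, "G": 0.4},
]
print(plurality_consensus(columns))                      # prints: AAG
print(plurality_consensus(columns, minl=0.45, minu=0.8)) # prints: Aax
```

       With both thresholds left at 0.0, every residue is shown in upper case, which is why -C
       without --minl or --minu gives the same output as -c.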

OTHER OPTIONS

       --seed <n>
              Seed the random number generator with <n>, an integer >= 0.  If <n> is nonzero, any
              stochastic simulations will be reproducible; the same command will  give  the  same
              results.   If  <n>  is  0,  the  random number generator is seeded arbitrarily, and
              stochastic simulations will vary from run to run of the same command.  The  default
              is  0:  use  an  arbitrary  seed, so different hmmemit runs will generate different
              samples.
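
       The seed convention (0 means an arbitrary seed, any other value means reproducible
       output) can be illustrated with Python's standard generator. This only mimics the
       convention; HMMER uses its own random number generator, and the helpers make_rng and
       draw are hypothetical.

```python
import random

def make_rng(seed=0):
    """Mimic the --seed convention: 0 seeds arbitrarily, nonzero is reproducible."""
    if seed < 0:
        raise ValueError("seed must be an integer >= 0")
    # random.Random(None) seeds from an OS entropy source or the clock.
    return random.Random(seed if seed != 0 else None)

def draw(seed, k=5):
    """Draw k residue indices (0..19) from a freshly seeded generator."""
    rng = make_rng(seed)
    return [rng.randrange(20) for _ in range(k)]

print(draw(42) == draw(42))   # prints: True (same nonzero seed, same samples)
print(draw(0))                # varies from run to run
```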

COPYRIGHT

       Copyright (C) 2019 Howard Hughes Medical Institute.
       Freely distributed under the BSD open source license.

       For additional information on copyright and licensing, see the file  called  COPYRIGHT  in
       your HMMER source distribution, or see the HMMER web page (http://hmmer.org/).

AUTHOR

       http://eddylab.org