Ubuntu Manpage: pocketsphinx_continuous - Run speech recognition in continuous listening mode

Provided by: pocketsphinx_0.8.0+real5prealpha+1-6ubuntu4_amd64

NAME

       pocketsphinx_continuous - Run speech recognition in continuous listening mode

SYNOPSIS

       pocketsphinx_continuous [ -infile filename.wav ] [ -inmic yes ] [ options ]...

DESCRIPTION

       This  program  opens  the audio device or a file and waits for speech.  When it detects an
       utterance, it performs speech recognition on it.

       To record from the microphone and decode, use

       -inmic yes

       To decode a 16 kHz, 16-bit mono WAV file, use

       -infile filename.wav
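
       The input format stated above can be checked before a file is handed to -infile.  The
       following is a minimal sketch using Python's standard wave module; the function name
       is_decodable_wav is hypothetical and is not part of pocketsphinx.  It assumes the 16 kHz
       rate given above; a model trained at another rate would need a different value.

```python
import wave


def is_decodable_wav(path, rate=16000):
    """Return True if the file is mono, 16-bit PCM at the given sample rate."""
    with wave.open(path, "rb") as w:
        return (
            w.getnchannels() == 1        # mono
            and w.getsampwidth() == 2    # 2 bytes per sample = 16 bits
            and w.getframerate() == rate
        )
```

       A file that fails this check would typically be resampled or downmixed with an external
       tool before decoding.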

       You can also specify -lm, -fsg, or -kws, depending on whether you are using a statistical
       language model, using a finite-state grammar, or looking for a keyphrase.
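
       The combinations described above can be assembled programmatically.  The sketch below
       builds an argument list from the options named in this section (-inmic, -infile, -lm,
       -fsg, -kws).  The helper names build_command and run, and any file names passed to them,
       are hypothetical; the sketch only runs the program if it is found on the PATH.

```python
import shutil
import subprocess

# Maps a search mode to the option named in this manual page.
SEARCH_FLAGS = {"lm": "-lm", "fsg": "-fsg", "kws": "-kws"}


def build_command(infile=None, search=None, search_path=None, extra=()):
    """Assemble an argument list for pocketsphinx_continuous.

    infile=None selects microphone input (-inmic yes); otherwise the
    given WAV file is decoded (-infile).
    """
    cmd = ["pocketsphinx_continuous"]
    cmd += ["-infile", infile] if infile else ["-inmic", "yes"]
    if search is not None:
        cmd += [SEARCH_FLAGS[search], search_path]
    cmd += list(extra)
    return cmd


def run(cmd):
    """Run the command if the binary is installed; return captured stdout, else None."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, capture_output=True, text=True).stdout
```

       For example, build_command("speech.wav", "kws", "keys.list") yields a command that
       decodes a file while spotting the keyphrases listed in keys.list (both file names are
       placeholders).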

OPTIONS

       -adcdev
              Name of audio device to use for input.

       -agc   Automatic gain control for c0 ('max', 'emax', 'noise', or 'none')

       -agcthresh
              Initial threshold for automatic gain control

       -allphone
              Perform phoneme decoding with phonetic lm

       -allphone_ci
              Perform phoneme decoding with phonetic lm and context-independent units only

       -alpha Preemphasis parameter

       -argfile
              Argument file giving extra arguments.

       -ascale
              Inverse of acoustic model scale for confidence score calculation

       -aw    Inverse weight applied to acoustic scores.

       -backtrace
              Print results and backtraces to log file.

       -beam  Beam  width  applied  to  every  frame in Viterbi search (smaller values mean wider
              beam)

       -bestpath
              Run bestpath (Dijkstra) search over word lattice (3rd pass)

       -bestpathlw
              Language model probability weight for bestpath search

       -ceplen
              Number of components in the input feature vector

       -cmn   Cepstral mean normalization scheme ('current', 'prior', or 'none')

       -cmninit
              Initial values (comma-separated) for cepstral mean when 'prior' is used

       -compallsen
              Compute all senone scores in every  frame  (can  be  faster  when  there  are  many
              senones)

       -debug Verbosity level for debugging messages

       -dict  Main pronunciation dictionary (lexicon) input file

       -dictcase
              Dictionary  is case sensitive (NOTE: case insensitivity applies to ASCII characters
              only)

       -dither
              Add 1/2-bit noise

       -doublebw
              Use double bandwidth filters (same center freq)

       -ds    Frame GMM computation downsampling ratio

       -fdict Noise word pronunciation dictionary input file

       -feat  Feature stream type, depends on the acoustic model

       -featparams
              File containing feature extraction parameters.

       -fillprob
              Filler word transition probability

       -frate Frame rate

       -fsg   Sphinx format finite state grammar file

       -fsgusealtpron
              Add alternate pronunciations to FSG

       -fsgusefiller
              Insert filler words at each state.

       -fwdflat
              Run forward flat-lexicon search over word lattice (2nd pass)

       -fwdflatbeam
              Beam width applied to every frame in second-pass flat search

       -fwdflatefwid
              Minimum number of end frames for a word to be searched in fwdflat search

       -fwdflatlw
              Language model probability weight for flat lexicon (2nd pass) decoding

       -fwdflatsfwin
              Window of frames in lattice to search for successor words in fwdflat search

       -fwdflatwbeam
              Beam width applied to word exits in second-pass flat search

       -fwdtree
              Run forward lexicon-tree search (1st pass)

       -hmm   Directory containing acoustic model files.

       -infile
              Audio file to transcribe.

       -inmic Transcribe audio from microphone.

       -input_endian
              Endianness of input data, big or little, ignored if NIST or MS Wav

       -jsgf  JSGF grammar file

       -keyphrase
              Keyphrase to spot

       -kws   File with keyphrases to spot, one per line

       -kws_delay
              Delay to wait for best detection score

       -kws_plp
              Phone loop probability for keyword spotting

       -kws_threshold
              Threshold for p(hyp)/p(alternatives) ratio

       -latsize
              Initial backpointer table size

       -lda   File containing transformation matrix to be applied to features (single-stream
              features only)

       -ldadim
              Dimensionality of output of feature transformation (0 to use entire matrix)

       -lifter
              Length of sin-curve for liftering, or 0 for no liftering.

       -lm    Word trigram language model input file

       -lmctl Specify a set of language models

       -lmname
              Which language model in -lmctl to use by default

       -logbase
              Base in which all log-likelihoods are calculated

       -logfn File to write log messages in

       -logspec
              Write out logspectral files instead of cepstra

       -lowerf
              Lower edge of filters

       -lpbeam
              Beam width applied to last phone in words

       -lponlybeam
              Beam width applied to last phone in single-phone words

       -lw    Language model probability weight

       -maxhmmpf
              Maximum number of active HMMs to maintain at each frame (or -1 for no pruning)

       -maxwpf
              Maximum number of distinct word exits at each frame (or -1 for no pruning)

       -mdef  Model definition input file

       -mean  Mixture gaussian means input file

       -mfclogdir
              Directory to log feature files to

       -min_endfr
              Nodes ignored in lattice construction if they persist for fewer than N frames

       -mixw  Senone mixture weights input file (uncompressed)

       -mixwfloor
              Senone mixture weights floor (applied to data from -mixw file)

       -mllr  MLLR transformation to apply to means and variances

       -mmap  Use memory-mapped I/O (if possible) for model files

       -ncep  Number of cep coefficients

       -nfft  Size of FFT

       -nfilt Number of filter banks

       -nwpen New word transition penalty

       -pbeam Beam width applied to phone transitions

       -pip   Phone insertion penalty

       -pl_beam
              Beam width applied to phone loop search for lookahead

       -pl_pbeam
              Beam width applied to phone loop transitions for lookahead

       -pl_pip
              Phone insertion penalty for phone loop

       -pl_weight
              Weight for phoneme lookahead penalties

       -pl_window
              Phoneme lookahead window size, in frames

       -rawlogdir
              Directory to log raw audio files to

       -remove_dc
              Remove DC offset from each frame

       -remove_noise
              Remove noise with spectral subtraction in mel-energies

       -remove_silence
              Enables VAD, removes silence frames from processing

       -round_filters
              Round mel filter frequencies to DFT points

       -samprate
              Sampling rate

       -seed  Seed for random number generator; if less than zero, pick our own

       -sendump
              Senone dump (compressed mixture weights) input file

       -senlogdir
              Directory to log senone score files to

       -senmgau
              Senone to codebook mapping input file (usually not needed)

       -silprob
              Silence word transition probability

       -smoothspec
              Write out cepstral-smoothed logspectral files

       -svspec
              Subvector specification (e.g., 24,0-11/25,12-23/26-38 or 0-12/13-25/26-38)

       -time  Print word times in file transcription.

       -tmat  HMM state transition matrix input file

       -tmatfloor
              HMM state transition probability floor (applied to -tmat file)

       -topn  Maximum number of top Gaussians to use in scoring.

       -topn_beam
              Beam width used to determine top-N Gaussians (or a list, per-feature)

       -toprule
              Start rule for JSGF (first public rule is default)

       -transform
              Which type of transform to use to calculate cepstra (legacy, dct, or htk)

       -unit_area
              Normalize mel filters to unit area

       -upperf
              Upper edge of filters

       -uw    Unigram weight

       -vad_postspeech
              Number of silence frames to keep after the transition from speech to silence.

       -vad_prespeech
              Number of speech frames to keep before the transition from silence to speech.

       -vad_startspeech
              Number of speech frames needed to trigger the VAD transition from silence to speech.

       -vad_threshold
              Threshold  for  decision between noise and silence frames. Log-ratio between signal
              level and noise level.

       -var   Mixture gaussian variances input file

       -varfloor
              Mixture gaussian variance floor (applied to data from -var file)

       -varnorm
              Variance normalize each utterance (only if CMN == current)

       -verbose
              Show input filenames

       -warp_params
              Parameters defining the warping function

       -warp_type
              Warping function type (or shape)

       -wbeam Beam width applied to word exits

       -wip   Word insertion penalty

       -wlen  Hamming window length
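
       The note under -beam that smaller values mean a wider beam can be illustrated with a
       short sketch.  It assumes the usual form of beam pruning, in which the beam is a
       probability ratio relative to the best-scoring hypothesis in a frame, so that in the log
       domain the cutoff is the best score plus log(beam).  The function name prune and the
       scores are made up for illustration; the decoder's internal score representation (see
       -logbase) is not modelled here.

```python
import math


def prune(log_scores, beam):
    """Keep hypotheses whose log score is within log(beam) of the best one."""
    threshold = max(log_scores.values()) + math.log(beam)
    return {hyp: s for hyp, s in log_scores.items() if s >= threshold}


# Made-up natural-log scores for three competing hypotheses.
scores = {"a": -10.0, "b": -40.0, "c": -120.0}

narrow = prune(scores, 1e-10)   # cutoff is about -33.0: only "a" survives
wide = prune(scores, 1e-48)     # cutoff is about -120.5: all three survive
```

       Under this assumption, the same reasoning applies to the other beam options listed above
       (-pbeam, -wbeam, -lpbeam and so on): a smaller ratio lowers the cutoff, keeps more
       hypotheses alive, and trades speed for a more thorough search.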

AUTHOR

       Written by numerous people at CMU from 1994 onwards.  This manual page by  David  Huggins-
       Daines <dhuggins@cs.cmu.edu>

COPYRIGHT

       Copyright © 1994-2016 Carnegie Mellon University.  See the file LICENSE included with this
       package for more information.
