Provided by: libfec-dev_1.0-26-gc5d935f-1_amd64 bug

NAME

       create_viterbi27,    set_viterbi27_polynomial,    init_viterbi27,    update_viterbi27_blk,
       chainback_viterbi27,   delete_viterbi27,   create_viterbi29,    set_viterbi_29_polynomial,
       init_viterbi29,      update_viterbi29_blk,      chainback_viterbi29,     delete_viterbi29,
       create_viterbi39,   set_viterbi_39_polynomial,    init_viterbi39,    update_viterbi39_blk,
       chainback_viterbi39,   delete_viterbi39,   create_viterbi615,   set_viterbi615_polynomial,
       init_viterbi615,  update_viterbi615_blk,  chainback_viterbi615,  delete_viterbi615  - IA32
       SIMD-assisted Viterbi decoders

SYNOPSIS

       #include "fec.h"
       void *create_viterbi27(int blocklen);
       void set_viterbi27_polynomial(int polys[2]);
       int init_viterbi27(void *vp,int starting_state);
       int update_viterbi27_blk(void *vp,unsigned char syms[],int nbits);
       int chainback_viterbi27(void *vp, unsigned char *data,unsigned int nbits,unsigned int endstate);
       void delete_viterbi27(void *vp);

       void *create_viterbi29(int blocklen);
       void set_viterbi29_polynomial(int polys[2]);
       int init_viterbi29(void *vp,int starting_state);
       int update_viterbi29_blk(void *vp,unsigned char syms[],int nbits);
       int chainback_viterbi29(void *vp, unsigned char *data,unsigned int nbits,unsigned int endstate);
       void delete_viterbi29(void *vp);

       void *create_viterbi39(int blocklen);
       void set_viterbi39_polynomial(int polys[3]);
       int init_viterbi39(void *vp,int starting_state);
       int update_viterbi39_blk(void *vp,unsigned char syms[],int nbits);
       int chainback_viterbi39(void *vp, unsigned char *data,unsigned int nbits,unsigned int endstate);
       void delete_viterbi39(void *vp);

       void *create_viterbi615(int blocklen);
       void set_viterbi615_polynomial(int polys[6]);
       int init_viterbi615(void *vp,int starting_state);
       int update_viterbi615_blk(void *vp,unsigned char syms[],int nbits);
       int chainback_viterbi615(void *vp, unsigned char *data,unsigned int nbits,unsigned int endstate);
       void delete_viterbi615(void *vp);

DESCRIPTION

       These  functions implement high performance Viterbi decoders for four convolutional codes:
       a  rate  1/2  constraint  length  7  (k=7)  code  ("viterbi27"),  a  rate  1/2  k=9   code
       ("viterbi29"),  a rate 1/3 k=9 code ("viterbi39") and a rate 1/6 k=15 code ("viterbi615").
       The decoders use the Intel IA32 or PowerPC SIMD instruction sets, if available, to improve
       decoding speed.

       On  the IA32 there are three different SIMD instruction sets. The first and most common is
       MMX, introduced on later Intel Pentiums and then on the Intel Pentium II  and  most  Intel
       clones  (AMD  K6, Transmeta Crusoe, etc).  SSE was introduced on the Pentium III and later
       implemented in the AMD Athlon 4 (AMD calls it "3D  Now!   Professional").  Most  recently,
       SSE2  was introduced in the Intel Pentium 4, and has been adopted by more recent AMD CPUs.
       The presence of SSE2 implies the existence of SSE, which in turn implies MMX.

       Altivec is the PowerPC SIMD instruction set. It is roughly comparable to SSE2. Altivec was
       introduced  to the general public in the Apple Macintosh G4; it is also present in the G5.
       Altivec is actually a Motorola trademark; Apple calls it "Velocity Engine" and  IBM  calls
       it "VMX". All refer to the same thing.

       When  built  for  the  IA32 or PPC architectures, the functions automatically use the most
       powerful SIMD instruction set available. If no SIMD instructions are available, or if  the
       library  is  built  for  a  non-IA32,  non-PPC  machine,  a portable C version is executed
       instead.

USAGE

       Four versions of each function  are  provided,  one  for  each  code.   In  the  following
       discussion,  change  "viterbi" to "viterbi27", "viterbi29", "viterbi39" or "viterbi615" as
       desired.

       Before  Viterbi  decoding  can  begin,  an   instance   must   first   be   created   with
       create_viterbi().   This  function  creates  and  returns a pointer to an internal control
       structure containing the path metrics and the branch decisions. create_viterbi() takes one
       argument that gives the length of the data block in bits. You must not attempt to decode a
       block longer than the length given to create_viterbi().

       Before decoding a new frame, init_viterbi() must be called to reset the decoder state.  It
       accepts  the  instance pointer returned by create_viterbi() and the initial starting state
       of the convolutional encoder (usually 0). If the initial  starting  state  is  unknown  or
       incorrect,  the  decoder  will still function but the decoded data may be incorrect at the
       start of the block.

       Blocks of received symbols are processed with calls to  update_viterbi_blk().   The  nbits
       parameter  specifies the number of data bits (not channel symbols) represented by the syms
       buffer. (For rate 1/2 codes, the number of symbols in syms is twice  nbits,  and  so  on.)
       Each  symbol is expected to range from 0 through 255, with 0 corresponding to a "strong 0"
       and 255 corresponding to a "strong 1". The  caller  is  responsible  for  determining  the
       proper pairing of input symbols (commonly known as decoder symbol phasing).

       At  the  end  of  the block, the data is recovered with a call to chainback_viterbi(). The
       arguments are the pointer to the decoder instance, a pointer  to  a  user-supplied  buffer
       into which the decoded data is to be written, the number of data bits (not bytes) that are
       to be decoded, and the terminal state of the convolutional encoder at the end of the frame
       (usually  0).  If the terminal state is incorrect or unknown, the decoded data bits at the
       end of the frame may be unreliable. The decoded data is written in big-endian order, i.e.,
       the  first  bit  in  the frame is written into the high order bit of the first byte in the
       buffer. If the frame is not an integral number of bytes long, the low order  bits  of  the
       last byte in the frame will be unused.

       Note  that the decoders assume the use of a tail, i.e., the encoding and transmission of a
       sufficient number of  padding  bits  beyond  the  end  of  the  user  data  to  force  the
       convolutional encoder into the known terminal state given to chainback_viterbi(). The tail
       is always one bit less than the constraint length of the code, so the k=7 code uses 6 tail
       bits  (12 tail symbols), the k=9 code uses 8 tail bits (16 tail symbols) and the k=15 code
       uses 14 tail bits (84 tail symbols).

       The  tail  bits  are  not  included  in  the  length  arguments  to  create_viterbi()  and
       chainback_viterbi(). For example, if the block contains 1000 user bits, then this would be
       the  length  parameter  given  to  create_viterbi27()   and   chainback_viterbi27(),   and
       update_viterbi27_blk()  would be called with a total of 2012 symbols - the last 12 encoded
       symbols representing the tail bits.

       After the  call  to  chainback_viterbi(),  the  decoder  may  be  reset  with  a  call  to
       init_viterbi()  and  another block can be decoded.  Alternatively, delete_viterbi() can be
       called to free all resources used by the Viterbi decoder.

       The set_viterbi_polynomial() function allows use of other than the default code  generator
       polynomials. Although only one set of polynomials are generally used with each code, there
       can are different conventions as to their order and symbol polarity, and  these  functions
       simplifies their use.

       The  default  polynomials  for  the  viterbi27 routes are those of the NASA-JPL convention
       without symbol inversion.  The NASA-JPL convention normally inverts the first symbol.  The
       CCSDS/NASA-GSFC convention swaps the two symbols and inverts the second.

       To set the NASA-JPL convention with symbol inversion:

       int polys[2] = { -V27POLYA,V27POLYB };
       set_viterbi27_polynomial(polys);

       and to set the CCSDS convention with symbol inversion:

       int polys[2] = { V27POLYB,-V27POLYA };
       set_viterbi27_polynomial(polys);

       The  default  polynomials  for  the  viterbi615  routines  are  those  used by the Cassini
       spacecraft without symbol inversion. Mars Pathfinder (MPF) and STEREO swap the  third  and
       fourth  polynomials.  Both conventions invert the first, third and fifth symbols. Refer to
       fec.h for the polynomial constant definitions.

       To set the Cassini convention with symbol inversion, do the following:

       int polys[6] = { -V615POLYA,V615POLYB,-V615POLYC,V615POLYD,-V615POLYE,V615POLYF };
       set_viterbi615_polynomial(polys);

       and to set the MPF/STEREO convention with symbol inversion:

       int polys[6] = { -V615POLYA,V615POLYB,-V615POLYD,V615POLYC,-V615POLYE,V615POLYF };
       set_viterbi615_polynomial(polys);

       For performance reasons, calling this function changes the code generator polynomials  for
       all instances of corresponding Viterbi decoder, including those already created.

ERROR PERFORMANCE

       These  decoders  have  all  been  extensively  tested  and  found  to  provide performance
       consistent with that expected for soft-decision Viterbi decoding with 8-bit symbols.

       Due to internal differences, the implementations vary slightly in  error  performance.  In
       general, the portable C versions exhibit the best error performance because they use full-
       sized branch metrics, and the MMX versions exhibit the worst because they use 8-bit branch
       metrics  with  modulo  comparisons. The SSE, SSE2 and Altivec implementations of the r=1/2
       k=7 and r=1/2 k=9 codes use unsigned 8-bit branch metrics, and are almost as good as the C
       versions.   The r=1/3 k=9 and r=1/6 k=15 codes are implemented with 16-bit path metrics in
       all SIMD versions.

DIRECT ACCESS TO SPECIFIC FUNCTION VERSIONS

       Calling the functions listed above automatically calls  the  appropriate  version  of  the
       function  depending  on the CPU type and available SIMD instructions. A particular version
       can also be called directly by appending the appropriate suffix to the function name.  The
       available suffixes are "_mmx", "_sse", "_sse2", "_av" and "_port", for the MMX, SSE, SSE2,
       Altivec and portable  versions,  respectively.  For  example,  the  SSE2  version  of  the
       update_viterbi27_blk() function can be invoked as update_viterbi27_blk_sse2().

       Naturally,  the  _av  functions  are  only available on the PowerPC and the _mmx, _sse and
       _sse2 versions are only available on IA-32. Calling a SIMD-enabled function on a CPU  that
       doesn't  support the appropriate set of instructions will result in an illegal instruction
       exception.

RETURN VALUES

       create_viterbi returns a pointer to the structure containing the decoder state.  The other
       functions return -1 on error, 0 otherwise.

AUTHOR & COPYRIGHT

       Phil Karn, KA9Q (karn@ka9q.net)

LICENSE

       This  software  may  be  used  under  the  terms of the GNU Limited General Public License
       (LGPL).

                                                                                  SIMD-VITERBI(3)