Provided by: libfec-dev_1.0-26-gc5d935f-1_amd64

NAME

       create_viterbi27,  set_viterbi27_polynomial,  init_viterbi27,  update_viterbi27_blk, chainback_viterbi27,
       delete_viterbi27,  create_viterbi29,  set_viterbi29_polynomial,  init_viterbi29,   update_viterbi29_blk,
       chainback_viterbi29,   delete_viterbi29,   create_viterbi39,  set_viterbi39_polynomial,  init_viterbi39,
       update_viterbi39_blk,         chainback_viterbi39,          delete_viterbi39,          create_viterbi615,
       set_viterbi615_polynomial,       init_viterbi615,       update_viterbi615_blk,      chainback_viterbi615,
       delete_viterbi615 - IA32 SIMD-assisted Viterbi decoders

SYNOPSIS

       #include "fec.h"
       void *create_viterbi27(int blocklen);
       void set_viterbi27_polynomial(int polys[2]);
       int init_viterbi27(void *vp,int starting_state);
       int update_viterbi27_blk(void *vp,unsigned char syms[],int nbits);
       int chainback_viterbi27(void *vp, unsigned char *data,unsigned int nbits,unsigned int endstate);
       void delete_viterbi27(void *vp);

       void *create_viterbi29(int blocklen);
       void set_viterbi29_polynomial(int polys[2]);
       int init_viterbi29(void *vp,int starting_state);
       int update_viterbi29_blk(void *vp,unsigned char syms[],int nbits);
       int chainback_viterbi29(void *vp, unsigned char *data,unsigned int nbits,unsigned int endstate);
       void delete_viterbi29(void *vp);

       void *create_viterbi39(int blocklen);
       void set_viterbi39_polynomial(int polys[3]);
       int init_viterbi39(void *vp,int starting_state);
       int update_viterbi39_blk(void *vp,unsigned char syms[],int nbits);
       int chainback_viterbi39(void *vp, unsigned char *data,unsigned int nbits,unsigned int endstate);
       void delete_viterbi39(void *vp);

       void *create_viterbi615(int blocklen);
       void set_viterbi615_polynomial(int polys[6]);
       int init_viterbi615(void *vp,int starting_state);
       int update_viterbi615_blk(void *vp,unsigned char syms[],int nbits);
       int chainback_viterbi615(void *vp, unsigned char *data,unsigned int nbits,unsigned int endstate);
       void delete_viterbi615(void *vp);

DESCRIPTION

       These functions implement high performance Viterbi decoders for four  convolutional  codes:  a  rate  1/2
       constraint  length  7  (k=7)  code  ("viterbi27"), a rate 1/2 k=9 code ("viterbi29"), a rate 1/3 k=9 code
       ("viterbi39") and a rate 1/6 k=15 code ("viterbi615").  The decoders use the Intel IA32 or  PowerPC  SIMD
       instruction sets, if available, to improve decoding speed.

       On the IA32 there are three different SIMD instruction sets. The first and most common is MMX, introduced
       on later Intel Pentiums and then on the Intel Pentium II and most Intel clones (AMD K6, Transmeta Crusoe,
       etc).   SSE was introduced on the Pentium III and later implemented in the AMD Athlon 4 (AMD calls it "3D
       Now!  Professional"). Most recently, SSE2 was introduced in the Intel Pentium 4, and has been adopted  by
       more recent AMD CPUs. The presence of SSE2 implies the existence of SSE, which in turn implies MMX.

       Altivec  is the PowerPC SIMD instruction set. It is roughly comparable to SSE2. Altivec was introduced to
       the general public in the Apple Macintosh G4; it is also  present  in  the  G5.  Altivec  is  actually  a
       Motorola trademark; Apple calls it "Velocity Engine" and IBM calls it "VMX". All refer to the same thing.

       When  built  for  the  IA32  or PPC architectures, the functions automatically use the most powerful SIMD
       instruction set available. If no SIMD instructions are available, or if the library is built for  a  non-
       IA32, non-PPC machine, a portable C version is executed instead.

USAGE

       Four  versions  of  each  function  are provided, one for each code.  In the following discussion, change
       "viterbi" to "viterbi27", "viterbi29", "viterbi39" or "viterbi615" as desired.

       Before Viterbi decoding can begin, an  instance  must  first  be  created  with  create_viterbi().   This
       function  creates  and returns a pointer to an internal control structure containing the path metrics and
       the branch decisions. create_viterbi() takes one argument that gives the length  of  the  data  block  in
       bits. You must not attempt to decode a block longer than the length given to create_viterbi().

       Before  decoding  a  new frame, init_viterbi() must be called to reset the decoder state.  It accepts the
       instance pointer returned by create_viterbi() and the initial starting state of the convolutional encoder
       (usually  0).  If the initial starting state is unknown or incorrect, the decoder will still function but
       the decoded data may be incorrect at the start of the block.

       Blocks of received symbols are  processed  with  calls  to  update_viterbi_blk().   The  nbits  parameter
       specifies  the  number  of  data bits (not channel symbols) represented by the syms buffer. (For rate 1/2
       codes, the number of symbols in syms is twice nbits, and so on.)  Each symbol is expected to range from 0
       through  255,  with  0 corresponding to a "strong 0" and 255 corresponding to a "strong 1". The caller is
       responsible for determining the proper pairing  of  input  symbols  (commonly  known  as  decoder  symbol
       phasing).
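
       As an illustration of the 0 through 255 symbol range described above, the following  sketch  maps  a
       demodulated  sample  to  a  soft  symbol.  The function name soft_symbol(), the amplitude parameter,
       and the assumption that a positive sample means a transmitted 1 are hypothetical and  are  not  part
       of the library; adapt the polarity and scaling to your demodulator.

```c
#include <assert.h>

/* Map a demodulated sample to the 0..255 soft-symbol range.
 * Assumption (not from the library): -amplitude is a "strong 0",
 * +amplitude is a "strong 1", and 0.0 is an erasure-like midpoint. */
unsigned char soft_symbol(double sample, double amplitude)
{
    assert(amplitude > 0.0);

    /* Center on 127.5 so that the range is symmetric about zero. */
    double v = 127.5 + 127.5 * (sample / amplitude);

    /* Clip samples that exceed the nominal amplitude. */
    if (v < 0.0)
        v = 0.0;
    if (v > 255.0)
        v = 255.0;

    /* Round to the nearest integer symbol value. */
    return (unsigned char)(v + 0.5);
}
```

       Samples beyond the nominal amplitude are clipped rather than wrapped, so a very strong  sample  stays
       a strong 0 or a strong 1.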

       At  the end of the block, the data is recovered with a call to chainback_viterbi(). The arguments are the
       pointer to the decoder instance, a pointer to a user-supplied buffer into which the decoded data is to be
       written,  the  number  of  data  bits  (not  bytes) that are to be decoded, and the terminal state of the
       convolutional encoder at the end of the frame (usually 0). If the terminal state is incorrect or unknown,
       the  decoded  data  bits  at  the end of the frame may be unreliable. The decoded data is written in big-
       endian order, i.e., the first bit in the frame is written into the high order bit of the  first  byte  in
       the  buffer. If the frame is not an integral number of bytes long, the low order bits of the last byte in
       the frame will be unused.
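
       The big-endian bit packing described above can be read back with a small helper.  This is  a  sketch
       only; decoded_bytes() and decoded_bit() are hypothetical names that do not exist in the library.

```c
#include <stddef.h>

/* Number of bytes the output buffer needs to hold nbits decoded bits. */
size_t decoded_bytes(size_t nbits)
{
    return (nbits + 7) / 8;
}

/* Return bit i of the decoded frame (0 or 1). Bit 0 is the first bit
 * of the frame and lives in the high-order bit of data[0]. */
int decoded_bit(const unsigned char *data, size_t i)
{
    return (data[i / 8] >> (7 - (i % 8))) & 1;
}
```

       For a 1000-bit frame, decoded_bytes(1000) gives 125, which is the size of the  buffer  to  pass  to
       chainback_viterbi().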

       Note that the decoders assume the use of a tail, i.e., the encoding  and  transmission  of  a  sufficient
       number  of padding bits beyond the end of the user data to force the convolutional encoder into the known
       terminal state given to chainback_viterbi(). The tail is always one bit less than the  constraint  length
       of  the  code, so the k=7 code uses 6 tail bits (12 tail symbols), the k=9 codes use 8 tail bits (16 tail
       symbols at rate 1/2, 24 at rate 1/3) and the k=15 code uses 14 tail bits (84 tail symbols).
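
       To make the role of the tail concrete, here is a minimal sketch of a rate 1/2 encoder  that  appends
       k-1  zero  bits  after the user data.  It is not part of the library; the name encode_r12_with_tail()
       and the shift register bit ordering are assumptions, and the output is noiseless (only 0  and  255).

```c
#include <stddef.h>

/* XOR of all bits in x. */
static int parity(unsigned int x)
{
    int p = 0;
    while (x) {
        p ^= (int)(x & 1u);
        x >>= 1;
    }
    return p;
}

/* Encode nbits user bits (one bit per array element) with a rate 1/2
 * code of constraint length k, then flush the encoder with k-1 zero
 * tail bits so that it ends in state 0.
 * Writes 2*(nbits + k - 1) hard symbols (0 or 255) to syms and
 * returns that count. The caller must size syms accordingly. */
size_t encode_r12_with_tail(const unsigned char *bits, size_t nbits, int k,
                            unsigned int poly_a, unsigned int poly_b,
                            unsigned char *syms)
{
    unsigned int sr = 0;                    /* encoder shift register */
    unsigned int mask = (1u << k) - 1u;     /* keep only k bits */
    size_t total = nbits + (size_t)(k - 1); /* user bits plus tail */
    size_t n = 0;

    for (size_t i = 0; i < total; i++) {
        unsigned int bit = (i < nbits) ? (bits[i] & 1u) : 0u;
        sr = ((sr << 1) | bit) & mask;
        syms[n++] = parity(sr & poly_a) ? 255 : 0;
        syms[n++] = parity(sr & poly_b) ? 255 : 0;
    }
    return n;
}
```

       For the k=7 code, the polynomial arguments would typically be the V27POLYA and  V27POLYB  constants
       from fec.h, and the terminal state to give chainback_viterbi27() would then be 0.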

       The tail bits are not included in the length arguments to create_viterbi() and  chainback_viterbi().  For
       example,  if  the  block  contains  1000  user  bits,  then  this  would be the length parameter given to
       create_viterbi27() and chainback_viterbi27(), and update_viterbi27_blk() would be called with a total  of
       2012 symbols - the last 12 encoded symbols representing the tail bits.
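
       The length arithmetic above can be written out as follows.  The helper names  are  hypothetical;  n
       is the number of symbols per bit (2 for rate 1/2, 3 for rate 1/3, 6 for rate 1/6).  The sketch assumes
       that the nbits values passed to update_viterbi_blk() add up to  the  user  bits  plus  the  tail,  as
       implied by the 2012-symbol figure.

```c
#include <stddef.h>

/* Tail length in bits for a code of constraint length k. */
size_t tail_bits(int k)
{
    return (size_t)(k - 1);
}

/* Total number of bits to feed through update_viterbi_blk():
 * user bits plus tail bits. */
size_t update_nbits(size_t user_bits, int k)
{
    return user_bits + tail_bits(k);
}

/* Total number of channel symbols for a rate 1/n code. */
size_t total_symbols(size_t user_bits, int k, int n)
{
    return update_nbits(user_bits, k) * (size_t)n;
}
```

       With 1000 user bits on the k=7, rate 1/2 code, update_nbits() gives 1006  and  total_symbols()  gives
       2012, while create_viterbi27() and chainback_viterbi27() still receive 1000.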

       After the call to chainback_viterbi(), the decoder may be reset with a call to init_viterbi() and another
       block can be decoded.  Alternatively, delete_viterbi() can be called to free all resources  used  by  the
       Viterbi decoder.

       The  set_viterbi_polynomial()  function  allows use of other than the default code generator polynomials.
       Although only one set of polynomials is generally used with each code, there are different conventions as
       to their order and symbol polarity, and these functions simplify their use.

       The  default  polynomials  for the viterbi27 routines are those of the NASA-JPL convention without symbol
       inversion.  The NASA-JPL convention normally inverts the first symbol.   The  CCSDS/NASA-GSFC  convention
       swaps the two symbols and inverts the second.

       To set the NASA-JPL convention with symbol inversion:

       int polys[2] = { -V27POLYA,V27POLYB };
       set_viterbi27_polynomial(polys);

       and to set the CCSDS convention with symbol inversion:

       int polys[2] = { V27POLYB,-V27POLYA };
       set_viterbi27_polynomial(polys);
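
       In the two examples above, a negated polynomial marks the  symbol  that  is  inverted.   One  way  to
       picture  this is the sketch below, which computes the noiseless symbol a decoder would expect for one
       signed polynomial.  The function expected_symbol() is hypothetical, and treating the sign  purely  as
       an output inversion flag is an assumption drawn from the usage shown here.

```c
#include <stdlib.h>

/* XOR of all bits in x. */
static int parity_of(unsigned int x)
{
    int p = 0;
    while (x) {
        p ^= (int)(x & 1u);
        x >>= 1;
    }
    return p;
}

/* Expected noiseless symbol (0 or 255) for one generator polynomial,
 * given the encoder shift register contents sr.
 * Assumption: the magnitude of poly selects the taps and a negative
 * sign means the resulting symbol is inverted. */
unsigned char expected_symbol(int poly, unsigned int sr)
{
    int bit = parity_of(sr & (unsigned int)abs(poly));

    if (poly < 0)
        bit = !bit;
    return bit ? 255 : 0;
}
```

       Under this reading, swapping two array entries changes the order in which symbols  are  paired,  and
       negating an entry flips that symbol's polarity.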

       The  default  polynomials  for  the  viterbi615 routines are those used by the Cassini spacecraft without
       symbol inversion. Mars Pathfinder  (MPF)  and  STEREO  swap  the  third  and  fourth  polynomials.   Both
       conventions  invert  the  first,  third  and  fifth  symbols.  Refer to fec.h for the polynomial constant
       definitions.

       To set the Cassini convention with symbol inversion, do the following:

       int polys[6] = { -V615POLYA,V615POLYB,-V615POLYC,V615POLYD,-V615POLYE,V615POLYF };
       set_viterbi615_polynomial(polys);

       and to set the MPF/STEREO convention with symbol inversion:

       int polys[6] = { -V615POLYA,V615POLYB,-V615POLYD,V615POLYC,-V615POLYE,V615POLYF };
       set_viterbi615_polynomial(polys);

       For performance reasons, calling this function changes the code generator polynomials for  all  instances
       of the corresponding Viterbi decoder, including those already created.

ERROR PERFORMANCE

       These  decoders  have  all  been extensively tested and found to provide performance consistent with that
       expected for soft-decision Viterbi decoding with 8-bit symbols.

       Due to internal differences, the implementations vary slightly in  error  performance.  In  general,  the
       portable  C  versions  exhibit the best error performance because they use full-sized branch metrics, and
       the MMX versions exhibit the worst because they use 8-bit branch metrics  with  modulo  comparisons.  The
       SSE,  SSE2  and  Altivec  implementations  of the r=1/2 k=7 and r=1/2 k=9 codes use unsigned 8-bit branch
       metrics, and are almost as good as the C versions.  The r=1/3 k=9 and r=1/6 k=15  codes  are  implemented
       with 16-bit path metrics in all SIMD versions.

DIRECT ACCESS TO SPECIFIC FUNCTION VERSIONS

       Calling  the functions listed above automatically calls the appropriate version of the function depending
       on the CPU type and available SIMD instructions. A particular version can  also  be  called  directly  by
       appending  the  appropriate  suffix  to  the  function  name.  The available suffixes are "_mmx", "_sse",
       "_sse2", "_av" and "_port", for the MMX, SSE, SSE2, Altivec  and  portable  versions,  respectively.  For
       example,    the   SSE2   version   of   the   update_viterbi27_blk()   function   can   be   invoked   as
       update_viterbi27_blk_sse2().

       Naturally, the _av functions are only available on the PowerPC and the _mmx, _sse and _sse2 versions  are
       only  available  on  IA32.  Calling a SIMD-enabled function on a CPU that doesn't support the appropriate
       set of instructions will result in an illegal instruction exception.

RETURN VALUES

       create_viterbi() returns a pointer to the structure containing the decoder  state.   The  other  functions
       return -1 on error, 0 otherwise.

AUTHOR & COPYRIGHT

       Phil Karn, KA9Q (karn@ka9q.net)

LICENSE

       This software may be used under the terms of the GNU Lesser General Public License (LGPL).

                                                                                                 SIMD-VITERBI(3)