Provided by: mjpegtools_2.1.0+debian-4_amd64

NAME

       YUV4MPEG2 - video stream format used by pipe-based MJPEGtools

DESCRIPTION

       Many  of  the  MJPEGtools  communicate via pipes and act as filters (or sources or sinks).
       The format of video data used in  the  pipes  is  referred  to  as  "YUV4MPEG",  or,  more
       precisely,  "YUV4MPEG2".   (The  format  was  extended  and  codified during v1.5.x of the
       tools.)

       The basic structure is a stream header followed by an unlimited number  of  frames.   Each
       frame  itself consists of a header followed by video data.  The headers are vaguely human-
       readable ASCII, but the video data is simple byte-size binary.

       The MJPEGtools distribution has a C library (libmjpegutils) which contains  functions  for
       manipulating  YUV4MPEG2  streams.   We  recommend  that  you  use this library rather than
       writing your own code if possible.  See the header file "yuv4mpeg.h" for a description  of
       these functions.

       Design Goals:

       o      Easy to parse in both C and sh.

       o      Extensible; easy to add new parameters while maintaining backwards compatibility.

       o      Simple upgrade from original "YUV4MPEG" format.

       Drawbacks:

       o      Frame headers do not have constant size, so streams are not seekable.

GRAMMAR

       The precise description of the YUV4MPEG2 stream format is as follows:

       STREAM consists of

       -  one STREAM-HEADER

       -  unlimited number of FRAMEs

       STREAM-HEADER consists of

       -  magic string "YUV4MPEG2"

       -  unlimited number of TAGGED-FIELDs, each preceded by a ' ' (single space) separator

       -  single '\n' line terminator

       FRAME consists of

       -  one FRAME-HEADER

       -  "length" octets of planar Y'CbCr image data, where "length" follows from the frame
          dimensions and the chroma format (4:2:0 by default). If the stream is interlaced,
          then the two fields per frame are interleaved, with proper spatial ordering.

       FRAME-HEADER consists of

       -  magic string "FRAME"

       -  unlimited number of TAGGED-FIELDs, each preceded by a ' ' (single space) separator

       -  single '\n' line terminator
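
       The FRAME grammar above can be walked with a short loop. The following Python sketch
       is illustrative only: read_frames is a hypothetical name, not part of libmjpegutils,
       and the caller is assumed to supply frame_length (the number of image octets per
       frame) and a buffered binary stream positioned just after the stream header.

```python
import io


def read_frames(stream, frame_length):
    """Yield (tagged_fields, image_data) for each FRAME in a binary stream.

    Assumes `stream` is a buffered binary file object (so read(n) returns
    fewer than n octets only at end of input) positioned after the
    stream header.
    """
    while True:
        header = stream.readline()
        if not header:
            return  # clean end of stream
        if not header.endswith(b"\n"):
            raise ValueError("truncated frame header")
        magic, *fields = header[:-1].split(b" ")
        if magic != b"FRAME":
            raise ValueError("missing FRAME magic string")
        data = stream.read(frame_length)
        if len(data) != frame_length:
            raise ValueError("truncated frame data")
        yield [field.decode("ascii") for field in fields], data


# Two tiny frames of 6 octets each, the second carrying an X tag.
sample = io.BytesIO(b"FRAME\n" + bytes(6) + b"FRAME Xnote\n" + bytes(range(6)))
frames = list(read_frames(sample, 6))
```

       Because the image data is read by length rather than by searching for a terminator,
       octets inside a frame that happen to equal '\n' do not confuse the loop.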

       TAGGED-FIELD consists of

       -  single ASCII character tag

       -  VALUE (which does not contain whitespace)

       VALUE consists of

       -  RATIO, or

       -  integer (base 10 ASCII representation), or

       -  single ASCII character, or

       -  string (multiple ASCII characters)

       RATIO consists of

       -  numerator (base 10 ASCII integer)

       -  ':' (a colon)

       -  denominator (base 10 ASCII integer)
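
       A RATIO value can be split on its colon, as in the following Python sketch. The
       function name parse_ratio is hypothetical and not part of libmjpegutils. Returning
       None for 0:0 mirrors the "unknown" convention of the F and A stream-header tags;
       rejecting any other zero denominator is an assumption, since the grammar does not
       cover that case.

```python
from fractions import Fraction


def parse_ratio(value):
    """Parse a RATIO such as '30000:1001' into a Fraction.

    Returns None for '0:0' (treated here as "unknown").
    """
    num_text, sep, den_text = value.partition(":")
    for text in (num_text, den_text):
        # Accept only plain ASCII decimal digits.
        if not (text.isascii() and text.isdigit()):
            raise ValueError(f"not a RATIO: {value!r}")
    if sep != ":":
        raise ValueError(f"not a RATIO: {value!r}")
    numerator, denominator = int(num_text), int(den_text)
    if numerator == 0 and denominator == 0:
        return None
    if denominator == 0:
        # Assumption: a zero denominator is only meaningful in 0:0.
        raise ValueError(f"zero denominator in RATIO: {value!r}")
    return Fraction(numerator, denominator)


ntsc_rate = parse_ratio("30000:1001")
```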

       Header tags fall into three categories: optional, required, and has-default. Optional
       tags may be omitted from a header. Required tags must be present in a header.
       Has-default tags have a default value, which is implied if the tag is not present.
       Independent of these categories, some tags allow an "unknown" value and some do not.

       The supported tags for the STREAM-HEADER:

              W[integer] - frame width in pixels, must be > 0 (required)

              H[integer] - frame height in pixels, must be > 0 (required)

              C[string] - chroma subsampling, image data format (has default)
                      420jpeg  - 4:2:0 with JPEG/MPEG-1 siting (default)
                      420mpeg2 - 4:2:0 with MPEG-2 siting
                      420paldv - 4:2:0 with PAL-DV siting
                      411      - 4:1:1, cosited
                      422      - 4:2:2, cosited
                      444      - 4:4:4 (no subsampling)
                      444alpha - 4:4:4 with an alpha channel
                      mono     - luma (Y') plane only

              I[char] - interlacing specification:  (has default)
                      ? - unknown (default)
                      p - progressive/none
                      t - top-field-first
                      b - bottom-field-first
                      m - mixed-mode: refer to 'I' tag in frame header

              F[ratio] - frame-rate (has default of 0:0 == unknown)

              A[ratio] - sample aspect ratio (has default of 0:0 == unknown)

              X[string] - 'metadata' (optional; unparsed, but passed around)
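
       The stream-header tags and their defaults can be applied as in the following Python
       sketch. It is not the libmjpegutils parser: parse_stream_header and the dictionary
       layout are hypothetical, values other than W and H are left as unparsed strings, and
       the sample X value is made up for illustration.

```python
MAGIC = "YUV4MPEG2"
# Defaults implied when a has-default tag is absent.
DEFAULTS = {"C": "420jpeg", "I": "?", "F": "0:0", "A": "0:0"}


def parse_stream_header(line):
    """Parse one STREAM-HEADER line (bytes, including the trailing newline)."""
    text = line.decode("ascii")
    if not text.endswith("\n"):
        raise ValueError("stream header must end with a newline")
    magic, *fields = text[:-1].split(" ")
    if magic != MAGIC:
        raise ValueError("missing YUV4MPEG2 magic string")
    info = dict(DEFAULTS)
    info["X"] = []  # X may appear several times; keep every value in order
    for field in fields:
        if not field:
            raise ValueError("empty tagged field")
        tag, value = field[0], field[1:]
        if tag == "X":
            info["X"].append(value)
        else:
            info[tag] = value
    for tag in ("W", "H"):
        if tag not in info:
            raise ValueError("required tag " + tag + " is missing")
        info[tag] = int(info[tag])
        if info[tag] <= 0:
            raise ValueError("tag " + tag + " must be > 0")
    return info


info = parse_stream_header(b"YUV4MPEG2 W720 H576 F25:1 Ip Xhello\n")
```

       Tags that this sketch does not recognize are stored unchanged rather than rejected,
       in keeping with the extensibility goal of the format.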

       The currently supported tags for the FRAME-HEADER:

              I[string] - framing and sampling (required if and only if "Im" is present in the
                     stream header). Value is a string of three characters "xyz", which have
                     the following meanings:
                      x: frame presentation
                         t - top-field-first
                         T - top-field-first and repeat
                         b - bottom-field-first
                         B - bottom-field-first and repeat
                         1 - single progressive frame
                         2 - double progressive frame (repeat)
                         3 - triple progressive frame (repeat)
                      y: frame temporal sampling
                         p - progressive (fields sampled at same time)
                         i - interlaced (fields sampled at different times)
                      z: frame chroma-subsampling
                         p - progressive (subsampling over whole frame)
                         i - interlaced (each field subsampled independently)
                         ? - unknown (allowed only for non-4:2:0 subsampling)

              X[string] - 'metadata' (optional; unparsed, but passed around)
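
       The three-character value of the frame-header I tag can be decoded with plain lookup
       tables, as in the following Python sketch. The function name decode_frame_i_tag is
       hypothetical; the tables simply restate the list above.

```python
PRESENTATION = {
    "t": "top-field-first",
    "T": "top-field-first and repeat",
    "b": "bottom-field-first",
    "B": "bottom-field-first and repeat",
    "1": "single progressive frame",
    "2": "double progressive frame (repeat)",
    "3": "triple progressive frame (repeat)",
}
TEMPORAL = {"p": "progressive", "i": "interlaced"}
CHROMA = {"p": "progressive", "i": "interlaced", "?": "unknown"}


def decode_frame_i_tag(value):
    """Decode the 'xyz' value of a frame-header I tag into three descriptions."""
    if len(value) != 3:
        raise ValueError("I tag value must be exactly three characters")
    x, y, z = value
    try:
        return PRESENTATION[x], TEMPORAL[y], CHROMA[z]
    except KeyError:
        raise ValueError("unrecognized I tag value: " + repr(value)) from None


decoded = decode_frame_i_tag("tii")
```

       The sketch does not enforce the rule that "?" is allowed only for non-4:2:0
       subsampling, because that check needs the C tag from the stream header.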

       Note that a filter application must faithfully forward all "X" tags  from  input  pipe  to
       output  pipe  (unless it uses one of those tags, of course).  The supplied library will do
       this automatically if the functions y4m_copy_stream_info() and  y4m_copy_frame_info()  are
       used appropriately.
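
       For a filter written without the library, the forwarding rule can be met by editing
       only the tags the filter changes and copying every other field through untouched. The
       following Python sketch shows one way to do that; rewrite_header is a hypothetical
       helper, it assumes a well-formed header line, and it is not how
       y4m_copy_stream_info() is implemented.

```python
def rewrite_header(line, changes):
    """Return a header line with the tags in `changes` replaced.

    `line` is a text header ending in a newline; `changes` maps a
    single-character tag to its new value. X fields and any tags not
    named in `changes` are forwarded as-is, in their original order.
    """
    magic, *fields = line.rstrip("\n").split(" ")
    out = [magic]
    replaced = set()
    for field in fields:
        if not field:
            raise ValueError("empty tagged field")
        tag = field[0]
        if tag != "X" and tag in changes:
            out.append(tag + str(changes[tag]))
            replaced.add(tag)
        else:
            out.append(field)
    # Tags that were absent from the input are appended at the end.
    for tag, value in changes.items():
        if tag not in replaced:
            out.append(tag + str(value))
    return " ".join(out) + "\n"


# A hypothetical downscaling filter halves the frame size.
scaled = rewrite_header("YUV4MPEG2 W720 H576 Xfoo Xbar\n", {"W": 360, "H": 288})
```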

NOTES ON IMAGE DATA

       All image data is in the CCIR-601 Y'CbCr colorspace, presented plane-by-plane in
       row-major order. Each sample within each plane is one octet (8 bits) in size. When all
       planes are present, they are transmitted in the order Y', Cb, Cr, potentially followed
       by an alpha/transparency mask plane (for the 444alpha chroma format). The alpha channel
       data follows the same range as the Y' luma channel: full transparency is at 16 and
       full opacity is at 235.
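
       As a small worked example of that range, the following Python sketch maps an alpha
       sample onto the interval from 0.0 (fully transparent) to 1.0 (fully opaque). The
       function name alpha_to_unit is hypothetical, and clamping samples outside 16..235 is
       an assumption; the format description does not say how such values should be treated.

```python
def alpha_to_unit(sample):
    """Map an alpha octet (16 = transparent, 235 = opaque) to 0.0..1.0."""
    clamped = min(max(sample, 16), 235)  # assumption: out-of-range values are clamped
    return (clamped - 16) / 219  # 219 = 235 - 16


half = alpha_to_unit(126)
```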

       All Y' and alpha planes consist of (height X width) octets.  The size of each chroma
       plane depends on the subsampling mode:

       -  4:4:4 - (height X width) octets

       -  4:2:2 - (height X width) / 2 octets

       -  4:1:1 - (height X width) / 4 octets

       -  4:2:0 - (height X width) / 4 octets
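
       The plane sizes above determine the number of image octets in each frame. The
       following Python sketch computes them from the W, H and C values of the stream
       header. The names plane_sizes and frame_length are hypothetical, and the sketch
       assumes frame dimensions that divide evenly by the subsampling factors; it raises an
       error otherwise instead of guessing a rounding rule.

```python
# (horizontal, vertical) chroma subsampling factors per chroma family.
SUBSAMPLING = {"444": (1, 1), "422": (2, 1), "411": (4, 1), "420": (2, 2)}


def plane_sizes(width, height, chroma="420jpeg"):
    """Return the size in octets of each plane, in transmission order."""
    luma = width * height
    if chroma == "mono":
        return [luma]  # Y' only
    if chroma == "444alpha":
        return [luma, luma, luma, luma]  # Y', Cb, Cr, alpha
    family = chroma[:3]  # "420jpeg", "420mpeg2", "420paldv" -> "420"
    if family not in SUBSAMPLING:
        raise ValueError("unknown chroma format: " + repr(chroma))
    x_factor, y_factor = SUBSAMPLING[family]
    if width % x_factor or height % y_factor:
        raise ValueError("dimensions do not divide evenly for " + chroma)
    chroma_size = (width // x_factor) * (height // y_factor)
    return [luma, chroma_size, chroma_size]  # Y', Cb, Cr


def frame_length(width, height, chroma="420jpeg"):
    """Total number of image octets that follow each frame header."""
    return sum(plane_sizes(width, height, chroma))


pal_420 = frame_length(720, 576)
```

       For a 720 x 576 frame in the default 4:2:0 format this gives 414720 luma octets plus
       two chroma planes of 103680 octets each, or 622080 octets per frame.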

NOTES ON FRAMING

       (More to come here.)

AUTHOR

       This manual page Copyright 2004 Matthew J. Marjanovic.
