Ubuntu Manpage: scan_utf8 - decode an unsigned integer from UTF-8 encoding

Provided by: libowfat-dev_0.32-4.1build1_amd64

NAME

       scan_utf8 - decode an unsigned integer from UTF-8 encoding

SYNTAX

       #include <libowfat/scan.h>

       size_t scan_utf8(const char *src,size_t len,uint32_t *dest);

       size_t scan_utf8_sem(const char *src,size_t len,uint32_t *dest);

DESCRIPTION

       scan_utf8 decodes an unsigned integer in UTF-8 encoding from a memory area holding binary
       data.  It writes the decoded value to dest and returns the number of bytes it read from
       src.

       scan_utf8  never reads more than len bytes from src.  If the sequence is longer than that,
       or the memory area contains an invalid sequence, scan_utf8 returns 0 and  does  not  touch
       dest.

       The longest sequence scan_utf8 accepts is 6 bytes long (RFC 3629 itself limits UTF-8 to
       4 bytes).
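
       The contract described above can be illustrated with a minimal sketch.  sketch_scan_utf8
       is a hypothetical stand-in written only from this description; it is not libowfat's
       source.  Treating overlong forms as syntactically invalid is an assumption, as is the
       requirement that dest be non-NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for scan_utf8, NOT the libowfat implementation.
 * Returns the number of bytes consumed, or 0 on failure.
 * On failure *dest is left untouched. */
size_t sketch_scan_utf8(const char *src, size_t len, uint32_t *dest) {
    const unsigned char *s = (const unsigned char *)src;
    unsigned char lead;
    size_t n;        /* total length of the sequence in bytes */
    uint32_t value;  /* payload bits collected so far */
    uint32_t min;    /* smallest value that really needs n bytes */
    size_t i;

    assert(dest != NULL);   /* assumption of this sketch only */
    if (len == 0) return 0;
    lead = s[0];

    if (lead < 0x80)      { n = 1; value = lead;        min = 0; }
    else if (lead < 0xc0) { return 0; }  /* stray continuation byte */
    else if (lead < 0xe0) { n = 2; value = lead & 0x1f; min = 0x80; }
    else if (lead < 0xf0) { n = 3; value = lead & 0x0f; min = 0x800; }
    else if (lead < 0xf8) { n = 4; value = lead & 0x07; min = 0x10000; }
    else if (lead < 0xfc) { n = 5; value = lead & 0x03; min = 0x200000; }
    else if (lead < 0xfe) { n = 6; value = lead & 0x01; min = 0x4000000; }
    else                  { return 0; }  /* 0xfe and 0xff never start a sequence */

    if (n > len) return 0;  /* never read more than len bytes */

    for (i = 1; i < n; ++i) {
        if ((s[i] & 0xc0) != 0x80) return 0;  /* not a continuation byte */
        value = (value << 6) | (uint32_t)(s[i] & 0x3f);
    }

    if (value < min) return 0;  /* overlong form (assumed to be rejected) */

    *dest = value;
    return n;
}
```

       With this sketch, the three bytes e2 82 ac decode to 0x20ac with a return value of 3.
       The same buffer passed with len set to 2 yields 0 and leaves dest unchanged.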

       scan_utf8  will reject syntactically invalid encodings, but not semantically invalid ones.
       scan_utf8_sem will additionally reject surrogates.

NOTE

       fmt_utf8 and scan_utf8 implement the UTF-8 encoding scheme, but are meant to store
       arbitrary integers, not just Unicode code points.  Values above 0x10ffff are not valid
       UTF-8.  If you are using this function to parse UTF-8, you need to reject them (see RFC
       3629).
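
       For strict UTF-8 parsing, the range check described in this note could look like the
       following sketch.  is_rfc3629_code_point is a hypothetical helper name, not part of
       libowfat.  It also rejects surrogates (0xd800 to 0xdfff), which RFC 3629 prohibits as
       well, so it can be applied to a value decoded by either scan_utf8 or scan_utf8_sem.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of libowfat.
 * Returns true if value may appear in UTF-8 as defined by RFC 3629. */
bool is_rfc3629_code_point(uint32_t value) {
    if (value > 0x10ffff) return false;                    /* beyond the Unicode range */
    if (value >= 0xd800 && value <= 0xdfff) return false;  /* UTF-16 surrogates */
    return true;
}
```

       A caller would typically treat a decoded value that fails this check the same way as a
       return value of 0 from the decoder.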
