Ubuntu Manpage: unicode_u_ucs4_native, unicode_u_ucs2_native, unicode_convert_init, unicode

Provided by: libcourier-unicode-dev_2.0-2_amd64

NAME

       unicode_u_ucs4_native, unicode_u_ucs2_native, unicode_convert_init, unicode_convert,
       unicode_convert_deinit, unicode_convert_tocbuf_init, unicode_convert_tou_init,
       unicode_convert_fromu_init, unicode_convert_uc, unicode_convert_tocbuf_toutf8_init,
       unicode_convert_tocbuf_fromutf8_init, unicode_convert_toutf8, unicode_convert_fromutf8,
       unicode_convert_tobuf, unicode_convert_tou_tobuf, unicode_convert_fromu_tobuf - unicode
       character set conversion

SYNOPSIS

       #include <courier-unicode.h>

                extern const char unicode_u_ucs4_native[];

                extern const char unicode_u_ucs2_native[];

       unicode_convert_handle_t unicode_convert_init(const char *src_chset,
                                                     const char *dst_chset,
                                                     int (*cb_func)(const char *, size_t, void *),
                                                     void *cb_arg);

       int unicode_convert(unicode_convert_handle_t handle, const char *text, size_t cnt);

       int unicode_convert_deinit(unicode_convert_handle_t handle, int *errptr);

       unicode_convert_handle_t unicode_convert_tocbuf_init(const char *src_chset,
                                                            const char *dst_chset,
                                                            char **cbufptr_ret,
                                                            size_t *cbufsize_ret,
                                                            int nullterminate);

       unicode_convert_handle_t unicode_convert_tocbuf_toutf8_init(const char *src_chset,
                                                                   char **cbufptr_ret,
                                                                   size_t *cbufsize_ret,
                                                                   int nullterminate);

       unicode_convert_handle_t unicode_convert_tocbuf_fromutf8_init(const char *dst_chset,
                                                                     char **cbufptr_ret,
                                                                     size_t *cbufsize_ret,
                                                                     int nullterminate);

       unicode_convert_handle_t unicode_convert_tou_init(const char *src_chset,
                                                         char32_t **ucptr_ret,
                                                         size_t *ucsize_ret, int nullterminate);

       unicode_convert_handle_t unicode_convert_fromu_init(const char *dst_chset,
                                                           char **cbufptr_ret,
                                                           size_t *cbufsize_ret,
                                                           int nullterminate);

       int unicode_convert_uc(unicode_convert_handle_t handle, const char32_t *text, size_t cnt);

       char *unicode_convert_toutf8(const char *text, const char *charset, int *error);

       char *unicode_convert_fromutf8(const char *text, const char *charset, int *error);

       char *unicode_convert_tobuf(const char *text, const char *charset, const char *dstcharset,
                                   int *error);

       int unicode_convert_tou_tobuf(const char *text, size_t text_l, const char *charset,
                                     char32_t **uc, size_t *ucsize, int *error);

       int unicode_convert_fromu_tobuf(const char32_t *utext, size_t utext_l,
                                       const char *charset, char **c, size_t *csize, int *error);

DESCRIPTION

       unicode_u_ucs4_native[] contains the string “UCS-4BE” or “UCS-4LE”, matching the native
       char32_t endianness.

       unicode_u_ucs2_native[] contains the string “UCS-2BE” or “UCS-2LE”, matching the native
       char32_t endianness.
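
       As an illustration of what "matching the native endianness" means, the sketch below
       picks a UCS-4 name by probing the byte order at run time. It is not the library's
       implementation, and native_ucs4_name() is a hypothetical helper that does not exist in
       the library.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the name of the UCS-4 encoding whose byte
   order matches the machine this code runs on. */
const char *native_ucs4_name(void)
{
    const uint32_t probe = 0x01020304u;
    unsigned char bytes[4];

    /* Copy the in-memory representation so it can be inspected bytewise. */
    memcpy(bytes, &probe, sizeof(bytes));

    /* Only pure big-endian and little-endian layouts are handled here. */
    assert(bytes[0] == 0x01 || bytes[0] == 0x04);

    /* Little-endian machines store the least significant byte first. */
    return bytes[0] == 0x04 ? "UCS-4LE" : "UCS-4BE";
}
```

       A program that uses the library has no need for such a probe, since it can pass
       unicode_u_ucs4_native directly as a character set name.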

       unicode_convert_init(), unicode_convert(), and unicode_convert_deinit() are an adaptation of
       the iconv(3)[1] API that uses the same calling convention as the other algorithms in this
       unicode library, with some value-added features. These functions use iconv(3) to effect
       the actual character set conversion.

       unicode_convert_init() returns a non-NULL handle for the requested conversion, or NULL if
       the requested conversion is not available.  unicode_convert_init() takes a pointer to the
       output function that receives the converted character text. The output function
       receives a pointer to the converted character text, and the number of characters in the
       converted text. The output function gets repeatedly called, until it receives the entire
       converted text.

       The character text to convert gets passed, repeatedly, to unicode_convert(). Each call to
       unicode_convert() results in the output function getting invoked, zero or more times, with
       each successive part of the converted text. Finally, unicode_convert_deinit() stops the
       conversion and deallocates the conversion handle.

       It's possible that a call to unicode_convert_deinit() results in some additional calls to
       the output function, passing the remaining, final parts, of the converted text, before
       unicode_convert_deinit() deallocates the handle, and returns.
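
       The pattern described above, in which converted text arrives at the output function in
       pieces followed by a final flush, can be sketched directly on top of iconv(3). This is
       only a model of the behavior, not the library's code: convert_chunked(), collect(),
       struct collected, and the output_fn typedef are hypothetical names, and the output
       function signature is an assumption based on the description above.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Assumed shape of an output function: chunk, length, caller's argument. */
typedef int (*output_fn)(const char *chunk, size_t len, void *arg);

/* Example output function: appends each chunk to a fixed 64-byte array. */
struct collected { char data[64]; size_t used; int calls; };

int collect(const char *chunk, size_t len, void *arg)
{
    struct collected *c = arg;

    if (len > sizeof(c->data) - c->used)
        return -1;                      /* no room left: report an error */
    memcpy(c->data + c->used, chunk, len);
    c->used += len;
    c->calls++;
    return 0;
}

/* Converts text[0..cnt) from src_chset to dst_chset, handing the result to
   out in pieces of at most 8 bytes. Returns 0 on success, -1 on error. */
int convert_chunked(const char *src_chset, const char *dst_chset,
                    const char *text, size_t cnt, output_fn out, void *arg)
{
    assert(text != NULL && out != NULL);

    iconv_t cd = iconv_open(dst_chset, src_chset);  /* destination first */
    if (cd == (iconv_t)-1)
        return -1;

    char *in = (char *)text;    /* iconv() wants char **, the text is not modified */
    size_t inleft = cnt;
    int flushing = 0;           /* set once all the input has been consumed */
    int rc = 0;

    for (;;) {
        char piece[8];
        char *outp = piece;
        size_t outleft = sizeof(piece);

        /* A NULL input asks iconv() to emit any pending shift sequence. */
        size_t r = flushing ? iconv(cd, NULL, NULL, &outp, &outleft)
                            : iconv(cd, &in, &inleft, &outp, &outleft);
        int err = errno;

        /* Pass on whatever was produced, even if iconv() reported E2BIG. */
        if (outp != piece && out(piece, (size_t)(outp - piece), arg) != 0) {
            rc = -1;
            break;
        }
        if (r == (size_t)-1) {
            if (err == E2BIG && outp != piece)
                continue;       /* piece filled up: produce more next pass */
            rc = -1;            /* invalid or incomplete input, or no progress */
            break;
        }
        if (flushing)
            break;              /* the final flush is done */
        flushing = 1;           /* input consumed: do one flush pass */
    }
    iconv_close(cd);
    return rc;
}
```

       Converting a 16-byte ISO-8859-1 string that contains three accented letters to UTF-8
       with this sketch produces 19 bytes, delivered to collect() over more than one call.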

       The output function should return 0 normally. A non-0 return indicates an error condition.
       unicode_convert_deinit() returns non-zero if any previous invocation of the output
       function returned non-zero (this includes any invocations of the output function resulting
       from this call, or prior unicode_convert() calls), or 0 if all invocations of the output
       function returned 0.
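
       A minimal sketch of an output function that follows this return convention is shown
       below. The signature (chunk pointer, length, callback argument) is an assumption drawn
       from the description above, and struct sink and sink_append() are hypothetical names
       chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical destination for converted text: a small fixed-size array. */
struct sink {
    char data[16];
    size_t used;
};

/* Output function: appends one chunk of converted text to the sink passed
   as the callback argument. Returns 0 normally, non-zero on error. */
int sink_append(const char *chunk, size_t len, void *arg)
{
    struct sink *s = arg;

    assert(s != NULL);

    /* Refuse chunks that do not fit; nothing is copied in that case. */
    if (len > sizeof(s->data) - s->used)
        return -1;

    memcpy(s->data + s->used, chunk, len);
    s->used += len;
    return 0;
}
```

       With a function like this, the address of a struct sink would be the callback argument
       given to unicode_convert_init(), and a chunk that does not fit makes the conversion
       report failure.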

       If errptr is not NULL, *errptr gets set to non-zero if there were any conversion
       errors -- that is, if there was any text that could not be converted to the destination
       character set.

       unicode_convert() also returns non-zero if it calls the output function and the output
       function returns non-zero. However, the conversion handle remains allocated, so
       unicode_convert_deinit() must still be called to clean it up.

   Collecting converted text into a buffer
       Call unicode_convert_tocbuf_init() instead of unicode_convert_init(), then call
       unicode_convert() and unicode_convert_deinit() normally. The first two parameters to
       unicode_convert_tocbuf_init() specify the source and the destination character sets.
       unicode_convert_tocbuf_toutf8_init() is just an alias that specifies UTF-8 as the
       destination character set.  unicode_convert_tocbuf_fromutf8_init() is just an alias that
       specifies UTF-8 as the source character set.

       These functions supply an output function that collects the converted text into a
       malloc()ed buffer. If unicode_convert_deinit() returns 0, *cbufptr_ret gets initialized to
       a malloc()ed buffer, and the number of converted characters (the size of the malloc()ed
       buffer) gets placed into *cbufsize_ret.

           Note
           If the converted string is an empty string, *cbufsize_ret gets set to 0, but
           *cbufptr_ret still gets initialized (to a dummy malloced buffer).

       A non-zero nullterminate places a trailing \0 character after the converted string (this
       is included in *cbufsize_ret).
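
       The buffer semantics described in this section (a malloc()ed result, a size that counts
       the optional trailing \0, and a non-NULL pointer even for empty output) can be modelled
       with the sketch below. It is a stand-in written for illustration, not the library's
       implementation; struct accum, accum_append(), and accum_finish() are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical accumulator standing in for the library's internal state. */
struct accum {
    char *buf;
    size_t size;
};

/* Output function: grows the buffer and appends one chunk. */
int accum_append(const char *chunk, size_t len, void *arg)
{
    struct accum *a = arg;

    assert(a != NULL);
    if (len == 0)
        return 0;

    char *p = realloc(a->buf, a->size + len);
    if (p == NULL)
        return -1;              /* the old buffer is still owned by *a */

    memcpy(p + a->size, chunk, len);
    a->buf = p;
    a->size += len;
    return 0;
}

/* Hands the collected text to the caller. When nullterminate is non-zero a
   trailing \0 is appended and counted in *cbufsize_ret. Returns 0 on
   success; on failure the collected text is freed and -1 is returned. */
int accum_finish(struct accum *a, int nullterminate,
                 char **cbufptr_ret, size_t *cbufsize_ret)
{
    int failed = nullterminate && accum_append("", 1, a) != 0;

    /* Empty output still yields a dummy allocation for the caller to free. */
    if (!failed && a->buf == NULL) {
        a->buf = malloc(1);
        failed = a->buf == NULL;
    }
    if (failed) {
        free(a->buf);
        a->buf = NULL;
        a->size = 0;
        return -1;
    }

    *cbufptr_ret = a->buf;
    *cbufsize_ret = a->size;
    a->buf = NULL;              /* ownership has moved to the caller */
    a->size = 0;
    return 0;
}
```

       In this model, collecting "abc" and "de" and finishing with a non-zero nullterminate
       yields a size of 6: five characters plus the terminator.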

   Converting between character sets and unicode
       unicode_convert_tou_init() converts character text into a char32_t buffer. It works just
       like unicode_convert_tocbuf_init(), except that only the source character set gets
       specified and the output buffer is a char32_t buffer.  nullterminate terminates the
       converted unicode characters with a U+0000.

       unicode_convert_fromu_init() converts char32_ts to the output character set, and also
       works like unicode_convert_tocbuf_init(). Additionally, in this case, unicode_convert_uc()
       works just like unicode_convert() except that the input sequence is a char32_t sequence,
       and the count parameter is the number of unicode characters.
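
       The difference between a count of unicode characters and a count of bytes can be seen
       in the following sketch, which converts a char32_t array to UTF-8 with iconv(3). It
       illustrates the idea only and is not how the library is implemented; ucs4_to_utf8() is
       a hypothetical helper.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <uchar.h>

/* Converts cnt char32_t code points to UTF-8 in out[0..outsize).
   Returns the number of bytes written, or (size_t)-1 on error. */
size_t ucs4_to_utf8(const char32_t *text, size_t cnt, char *out, size_t outsize)
{
    assert(text != NULL && out != NULL);

    /* Pick the UCS-4 variant that matches this machine's byte order. */
    const uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    const char *native = first == 1 ? "UCS-4LE" : "UCS-4BE";

    iconv_t cd = iconv_open("UTF-8", native);       /* destination first */
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    /* iconv() counts bytes, so the character count has to be scaled. */
    char *in = (char *)text;
    size_t inleft = cnt * sizeof(*text);
    char *outp = out;
    size_t outleft = outsize;

    size_t r = iconv(cd, &in, &inleft, &outp, &outleft);
    iconv_close(cd);

    return r == (size_t)-1 ? (size_t)-1 : (size_t)(outp - out);
}
```

       The caller passes the number of array elements, as with unicode_convert_uc(), and the
       scaling to bytes happens inside the helper.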

   One-shot conversions
       unicode_convert_toutf8() converts the specified text in the specified character set into
       a UTF-8 string, returning a malloced buffer. If error is not NULL, *error gets set to a
       non-zero value if a character conversion error has occurred and some characters could not
       be converted, even if unicode_convert_toutf8() returns a non-NULL value.

       unicode_convert_fromutf8() does a similar conversion from UTF-8 text to the specified
       character set.

       unicode_convert_tobuf() does a similar conversion between two different character sets.

       unicode_convert_tou_tobuf() calls unicode_convert_tou_init(), feeds the character string
       through unicode_convert(), then calls unicode_convert_deinit(). If this function returns
       0, *uc and *ucsize are set to a malloced buffer+size holding the unicode char array.

       unicode_convert_fromu_tobuf() calls unicode_convert_fromu_init(), feeds the unicode array
       through unicode_convert_uc(), then calls unicode_convert_deinit(). If this function
       returns 0, *c and *csize are set to a malloced buffer+size holding the char array.

AUTHOR

       Sam Varshavchik
           Author

NOTES

        1. iconv(3)
           http://manpages.courier-mta.org/htmlman3/iconv.3.html
