Provided by: libunicode-maputf8-perl_1.11-4_all bug

NAME

       Unicode::MapUTF8 - Conversions to and from arbitrary character sets and UTF8

SYNOPSIS

        use Unicode::MapUTF8 qw(to_utf8 from_utf8 utf8_supported_charset);

        # Convert a string in 'ISO-8859-1' to 'UTF8'
        my $output = to_utf8({ -string => 'An example', -charset => 'ISO-8859-1' });

        # Convert a string in 'UTF8' encoding to encoding 'ISO-8859-1'
        my $other  = from_utf8({ -string => 'Other text', -charset => 'ISO-8859-1' });

        # List available character set encodings
        my @character_sets = utf8_supported_charset;

        # Add a character set alias
        utf8_charset_alias({ 'ms-japanese' => 'sjis' });

        # Convert between two arbitrary (but largely compatible) charset encodings
        # (SJIS to EUC-JP)
        my $utf8_string   = to_utf8({ -string =>$sjis_string, -charset => 'sjis'});
        my $euc_jp_string = from_utf8({ -string => $utf8_string, -charset => 'euc-jp' })

        # Verify that a specific character set is supported
        if (utf8_supported_charset('ISO-8859-1') {
            # Yes
        }

DESCRIPTION

       Provides an adapter layer between core routines for converting to and from UTF8 and other
       encodings. In essence, a way to give multiple existing Unicode modules a single common
       interface so you don't have to know the underlaying implementations to do simple UTF8 to-
       from other character set encoding conversions. As such, it wraps the Unicode::String,
       Unicode::Map8, Unicode::Map and Jcode modules in a standardized and simple API.

       This also provides general character set conversion operation based on UTF8 - it is
       possible to convert between any two compatible and supported character sets via a simple
       two step chaining of conversions.

       As with most things Perlish - if you give it a few big chunks of text to chew on instead
       of lots of small ones it will handle many more characters per second.

       By design, it can be easily extended to encompass any new charset encoding conversion
       modules that arrive on the scene.

       This module is intended to provide good Unicode support to versions of Perl prior to 5.8.
       If you are using Perl 5.8.0 or later, you probably want to be using the Encode module
       instead. This module does work with Perl 5.8, but Encode is the preferred method in that
       environment.

CHANGES

        1.11 2005.10.10   Documentation changes. Addition of Build.PL support.
                          Added various build tests, LICENSE, Artistic_License.txt,
                          GPL_License.txt. Split documentation into seperate
                          .pod file. Added Japanese translation of POD.

        1.10 2005.05.22 - Fixed bug in conversion of ISO-2022-JP to UTF-8.
                          Problem and fix found by Masahiro HONMA
                          <masahiro.honma@tsutaya.co.jp>.

                          Similar bugs in conversions of shift_jis and euc-jp
                          to UTF-8 fixed as well.

        1.09 2001.08.22 - Fixed multiple typo occurances of 'uft'
                          where 'utf' was meant in code. Problem affected
                          utf16 and utf7 encodings. Problem found
                          by devon smith <devon@taller.PSCL.cwru.edu>

        1.08 2000.11.06 - Added 'utf8_charset_alias' function to
                          allow for runtime setting of character
                          set aliases. Added several alternate
                          names for 'sjis' (shiftjis, shift-jis,
                          shift_jis, s-jis, and s_jis).

                          Corrected 'croak' messages for
                          'from_utf8' functions to appropriate
                          function name.

                          Tightened up initialization encapsulation

                          Corrected fatal problem in jcode from
                          unicode internals. Problem and fix
                          found by Brian Wisti <wbrian2@uswest.net>.

        1.07 2000.11.01 - Added 'croak' to use Carp declaration to
                          fix error messages.  Problem and fix
                          found by Brian Wisti
                          <wbrian2@uswest.net>.

        1.06 2000.10.30 - Fix to handle change in stringification
                          of overloaded objects between Perl 5.005
                          and 5.6. Problem noticed by Brian Wisti
                          <wbrian2@uswest.net>.

        1.05 2000.10.23 - Error in conversions from UTF8 to
                          multibyte encodings corrected

        1.04 2000.10.23 - Additional diagnostic messages added
                          for internal error conditions

        1.03 2000.10.22 - Bug fix for load time autodetction of
                          Unicode::Map8 encodings

        1.02 2000.10.22 - Added load time autodetection of
                          Unicode::Map8 supported character set
                          encodings.

                          Fixed internal calling error for some
                          character sets with 'from_utf8'. Thanks
                          goes to Ilia Lobsanov
                          <ilia@lobsanov.com> for reporting this
                          problem.

        1.01 2000.10.02 - Fixed handling of empty strings and
                          added more identification for error
                          messages.

        1.00 2000.09.29 - Pre-release version

FUNCTIONS

       utf8_charset_alias({ $alias => $charset });
           Used for runtime assignment of character set aliases.

           Called with no parameters, returns a hash of defined aliases and the character sets
           they map to.

           Example:

             my $aliases     = utf8_charset_alias;
             my @alias_names = keys %$aliases;

           If called with ONE parameter, returns the name of the 'real' charset if the alias is
           defined. Returns undef if it is not found in the aliases.

           Example:

               if (! utf8_charset_alias('VISCII')) {
                   # No alias for this
               }

           If called with a list of 'alias' => 'charset' pairs, defines those aliases for use.

           Example:

               utf8_charset_alias({ 'japanese' => 'sjis', 'japan' => 'sjis' });

           Note: It will croak if a passed pair does not map to a character set defined in the
           predefined set of character encoding. It is NOT allowed to alias something to another
           alias.

           Multiple character set aliases can be set with a single call.

           To clear an alias, pass a character set mapping of undef.

           Example:

               utf8_charset_alias({ 'japanese' => undef });

           While an alias is set, the 'utf8_supported_charset' function will return the alias as
           if it were a predefined charset.

           Overriding a base defined character encoding with an alias will generate a warning
           message to STDERR.

       utf8_supported_charset($charset_name);
           Returns true if the named charset is supported (including user defined aliases).

           Returns false if it is not.

           Example:

               if (! utf8_supported_charset('VISCII')) {
                   # No support yet
               }

           If called in a list context with no parameters, it will return a list of all supported
           character set names (including user defined aliases).

           Example:

               my @charsets = utf8_supported_charset;

       to_utf8({ -string => $string, -charset => $source_charset });
           Returns the string converted to UTF8 from the specified source charset.

       from_utf8({ -string => $string, -charset => $target_charset});
           Returns the string converted from UTF8 to the specified target charset.

VERSION

       1.11 2005.10.10

TODO

       Regression tests for Jcode, 2-byte encodings and encoding aliases

SEE ALSO

       Unicode::String Unicode::Map8 Unicode::Map Jcode Encode

COPYRIGHT

       Copyright 2000-2005, Benjamin Franz. All rights reserved.

AUTHOR

       Benjamin Franz <snowhare@nihongo.org>

LICENSE

       This program is free software; you can redistribute it and/or modify it under the same
       terms and conditions as Perl itself.

       This means that you can, at your option, redistribute it and/or modify it under either the
       terms the GNU Public License (GPL) version 1 or later, or under the Perl Artistic License.

       See http://dev.perl.org/licenses/

DISCLAIMER

       THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES,
       INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
       PARTICULAR PURPOSE.

       Use of this software in any way or in any form, source or binary, is not allowed in any
       country which prohibits disclaimers of any implied warranties of merchantability or
       fitness for a particular purpose or any disclaimers of a similar nature.

       IN NO EVENT SHALL I BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL,  OR
       CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION
       (INCLUDING, BUT NOT LIMITED TO, LOST PROFITS) EVEN IF I HAVE BEEN ADVISED OF THE
       POSSIBILITY OF SUCH DAMAGE