Provided by: libintl-perl_1.33-1_all bug

NAME

       Locale::TextDomain::FAQ - Frequently asked questions for libintl-perl

DESCRIPTION

       This FAQ

QUESTIONS AND ANSWERS

   Why is libintl-perl so big?  Why don't you use Encode(3pm) for character set conversion
       instead of rolling your own version?
       Encode(3pm) requires at least Perl 5.7.x, whereas libintl-perl needs to be operational on
       Perl 5.004.  Internally, libintl-perl uses Encode(3pm) if it is available.

   Why do the gettext functions always unset the utf-8 flag on the strings it returns?
       Because the gettext functions do not know whether the string is encoded in utf-8 or not.
       Instead of taking guesses, it rather unsets the flag.

   Can I set the utf-8 flag on strings returned by the gettext family of functions?
       Yes, but it is not recommended.  If you absolutely want to do it, use the function
       bind_textdomain_filter in Locale::Messages for it.

       The strings returned by gettext and friends are by default encoded in the preferred
       charset for the user's locale, but there is no portable way to find out, whether this is
       utf-8 or not.  That means, you either have to enforce utf-8 as the output character set
       (by means of bind_textdomain_codeset() and/or the environment variable OUTPUT_CHARSET) and
       override the user preference, or you run the risk of marking strings as utf-8 which really
       aren't utf-8.

       The whole concept behind that utf-8 flag introduced in Perl 5.6 is seriously broken, and
       the above described dilemma is a proof for that.  The best thing you can do with that flag
       is get rid of it, and turn it off.  Your code will benefit from it and become less error
       prone, more portable and faster.

   Why do non-ASCII characters in my Gtk2 application look messed up?
       The Perl binding of Gtk2 has a design flaw.  It expects all UI messages to be in UTF-8 and
       it also expects messages to be flagged as utf-8.  The only solution for you is to enforce
       all your po files to be encoded in utf-8 (convert them manually, if you need to), and also
       enforce that charset in your application, regardless of the user's locale settings.
       Assumed that your textdomain is "org.bar.foo", you have to code the following into your
       main module or script:

         BEGIN {
             bind_textdomain_filter 'org.bar.foo', \&turn_utf_8_on;
             bind_textdomain_codeset 'org.bar.foo', 'utf-8';
         }

       See the File GTestRunner.pm of Test::Unit::GTestRunner(3pm) for details.

   How do I interface Glade2 UI definitions with libintl-perl?
       Gtk2::GladeXML(3pm) seems to ignore calls to bind_textdomain().  See the File
       GTestRunner.pm of Test::Unit::GTestRunner(3pm) for a possible solution.

   Why does Locale::TextDomain use a double underscore?  I am used to a single underscore from C
       or other languages.
       Function names that consist of exactly one non-alphanumerical character make the function
       automatically global in Perl.  Besides, in Perl 6 the concatenation operator will be the
       underscore instead of the dot.

   How do I switch languages or force a certain language independently from user settings read
       from the environment?
       The simple answer is:

           use POSIX qw (setlocale LC_ALL);

           my $language = 'fr';
           my $country = 'FR';
           my $charset = 'iso-8859-1';

           setlocale LC_ALL, "${language}_$country.$charset";

       Sadly enough, this will fail in many cases.  The problem is that locale identifiers are
       not standardized and are completely system-dependent.  Not only their overall format, but
       also other details like case-sensitivity.  Some systems are very forgiving about the
       system - for example normalizing charset descriptions - others very strict.  In order to
       be reasonably platform independent, you should try a list of possible locale identifiers
       for your desired settings.  This is about what I would try for achieving the above:

          my @tries = qw (
               fr_FR.iso-8859-1 fr_FR.iso8859-1 fr_FR.iso88591
               fr_FR.ISO-8859-1 fr_FR.ISO8859-1 fr_FR.ISO88591
               fr.iso-8859-1 fr.iso8859-1 fr.iso88591
               fr.ISO-8859-1 fr.ISO8859-1 fr.ISO88591
               fr_FR
               French_France.iso-8859-1 French_France.iso8859-1 French_France.iso88591
               French_France.ISO-8859-1 French_France.ISO8859-1 French_France.ISO88591
               French.iso-8859-1 French.iso8859-1 French.iso88591
               French.ISO-8859-1 French.ISO8859-1 French.ISO88591
          );
          foreach my $try (@tries) {
               last if setlocale LC_ALL, $try;
          }

       Set Locale::Util(3pm) for functions that help you with this.

       Alternatively, you can force a certain language by setting the environment variables
       LANGUAGE, LANG and OUTPUT_CHARSET, but this is only guaranteed to work, if you use the
       pure Perl implementation of gettext (see the documentation for select_package() in
       Locale::Messages(3pm)). You would do the above like this:

           use Locale::Messages qw (nl_putenv);

           # LANGUAGE is a colon separated list of languages.
           nl_putenv("LANGUAGE=fr_FR");

           # If LANGUAGE is set, LANG should be set to the primary language.
           # This is not needed for gettext, but for other parts of the system
           # it is.
           nl_putenv("LANG=fr_FR");

           # Force an output charset like this:
           nl_putenv("OUTPUT_CHARSET=iso-8859-1");

           setlocale (LC_MESSAGES, 'C');

       These environment variables are GNU extensions, and they are also honored by libintl-perl.
       Still, you should always try to set the locale with setlocale for the catch-all category
       LC_ALL.  If you miss to do so, your program's output maybe cluttered, mixing languages and
       charsets, if the system runs in a locale that is not compatible with your own language
       settings.

       Remember that these environment variables are not guaranteed to work, if you use an XS
       version of gettext.  In order to force usage of the pure Perl implementation, do the
       following:

           Locale::Messages->select_package ('gettext_pp');

       If you think, this is brain-damaged, you are right, but I cannot help you.  Actually there
       should be a more flexible API than setlocale, but at the time of this writing there isn't.
       Until then, the recommentation goes like this:

               1) Try setting LC_ALL with Locale::Util.
               2) If that does not succeed, either give up or ...
               3) Reset LC_MESSAGES to C/POSIX.
               4) Switch to pure Perl for gettext.
               5) Set the environment variables LANGUAGE, LANG,
                  and OUTPUT_CHARSET to your desired values.

   What is the advantage of libintl-perl over Locale::Maketext?
       Of course, I can only give my personal opinion as an answer.

       Locale::Maketext claims to fix design flaws in gettext.  These alleged design flaws,
       however, boil down to one pathological case which always has a workaround.  But both
       programmers and translators pay this fix with an unnecessarily complicated interface.

       The paramount advantage of libintl-perl is that it uses an approved technology and
       concept.  Except for Java(tm) programs, this is the state-of-the-art concept for
       localizing Un*x software.  Programmers that have already localized software in C, C++, C#,
       Python, PHP, or a number of other languages will feel instantly at home, when localizing
       software written in Perl with libintl-perl.  The same holds true for the translators,
       because the files they deal with have exactly the same format as those for other
       programming languages.  They can use the same set of tools, and even the commands they
       have to execute are the same.

       With libintl-perl refactoring of the software is painless, even if you modify, add or
       delete translatable strings.  The gettext tools are powerful enough to reduce the effort
       of the translators to the bare minimum.  Maintaining the message catalogs of
       Locale::Maketext in larger scale projects, is IMHO unfeasible.

       Editing the message catalogs of Locale::Maketext - they are really Perl modules - asks too
       much from most translators, unless they are programmers.  The portable object (po) files
       used by libintl-perl have a simple syntax, and there are a bunch of specialized GUI
       editors for these files, that facilitate the translation process and hide most complexity
       from the user.

       Furthermore, libintl-perl makes it possible to mix programming languages without a
       paradigm shift in localization.  Without any special efforts, you can write a localized
       software that has modules written in C, modules in Perl, and builds a Gtk user interface
       with Glade.  All translatable strings end up in one single message catalog.

       Last but not least, the interface used by libintl-perl is plain simple:  Prepend
       translatable strings with a double underscore, and you are done in most cases.

   Why do single-quoted strings not work?
       You probably write something like this:

           print __'Hello';

       And you get an error message like "Can't find string terminator "'" anywhere before EOF at
       ...", or even "Bareword found where operator expected at ... Might be a runaway multi-line
       '' string starting on".  The above line is (really!) essentially the same as writing:

           print __::Hello';

       A lesser know feature of Perl is that you can use a single quote ("'") as the separator in
       packages instead of the double colon (":").  What the Perl parser sees in the first
       example is a valid package name ("__") followed by the separator ("'"), then another valid
       package name ("Hello") followed by a lone single quote.  It is therefore not a problem in
       libintl-perl but simple wrong Perl syntax.  You have to correct alternatives:

           print __ 'Hello';   # Insert a space to disambiguate.

       Or use double-quotes:

           print __"Hello";

       Thanks to Slavi Agafonkin for pointing me to the solution of this mystery.

   What options should be used with xgettext?
       More precise, the question should be which '--keyword' and '--flag' options for xgettext
       should be used.  All other options are completely dependent on your use-case.

       If you are using Locale::Messages or Locale::Gettext for localizing Perl code, the default
       keywords and default flags built into xgettext are correct.

       If you are using Locale::TextDomain you have to use a long plethora of command-line
       options for xgettext.  Beginning with libintl-perl 1.28 you can use the library itself to
       produce these options:

           perl -MLocale::TextDomain -e 'print Locale::TextDomain->options'

       If you want to disable the use of the built-in default keywords, precede the output of the
       above command with '--keyword=""'.  That will reset the keywords for xgettext.

   Why Isn't There A Function N__x(), N__nx(), or N__px()?
       The sole purpose of these functions would be to set proper flags in the output of
       xgettext(1).  You probably thought of something like this:

           xgettext --keyword=N__x --flag=N__x:1:perl-brace-format filename.pl

       First of all, xgettext(1) will always set the flag correctly if the argument to N__()
       looks like a brace format string.

       Second, you can set any flag you want on the PO entry with a source code comment:

          # xgettext: no-perl-brace-format
          my $msg = N__("Placeholders are enclosed in {curly} braces.");

       When xgettext(1) extracts the string, it will appear like this in the .pot file:

           #: filename.pl:2304
           #, no-perl-brace-format
           msgid "Placeholders are enclosed in {curly} braces."
           msgstr ""

       No reason to pollute the namespace with N__x functions.