Provided by: perl-doc_5.40.0-8_all

NAME

       perlclib - Interacting with standard C library functions

DESCRIPTION

       The perl interpreter is written in C; XS code also expands to C.  Inevitably, this code
       will call some functions from the C library, "libc".  This document gives some guidance on
       interfacing with that library.

       One thing Perl porters should note is that perl doesn't tend to use that much of the C
       standard library internally; you'll see very little use of, for example, the ctype.h
       functions in there. This is because Perl tends to reimplement or abstract standard library
       functions, so that we know exactly how they're going to operate.

libc functions to avoid

       There are many, many libc functions.  Most of them are fair game to use, but some are not.
       Some of the possible reasons are:

       •   They likely will interfere with the perl interpreter's functioning, such as its
           bookkeeping, or signal handling, or memory allocation, or any number of harmful
           things.

       •   They aren't implemented on all platforms, but there is an alternative that is.

           Or they may be buggy or deprecated on some or all platforms.

       •   They aren't suitable for multi-threaded operation, but there is an alternative that
           is, and is just as easily usable.

           You may not expect your code to ever be used under threads, but code has a way of
           being adapted beyond our initial expectations.  If it is just as easy to use something
           that can be used under threads, it's better to use that now, just in case.

       •   In functions that deal with strings, complications may arise because the string may be
           encoded in different ways, for example in UTF-8.  For these, it is likely better to
           place the string in a SV and use the Perl SV string handling functions that contain
           extensive logic to deal with this.

       •   In functions that deal with numbers, complications may arise because the numbers get
           too big or small, and what those limits are depends on the current platform.  Again,
           the Perl SV numeric data types have extensive logic to take care of these kinds of
           issues.

       •   They are locale-aware, and your caller may not want this.

       The following commentary and tables give some functions in the first column that shouldn't
       be used in C or XS code, with the preferred alternative (if any) in the second column.

   Conventions
       In the following tables:

       "~"
          marks the function as deprecated; it should not be used regardless.

       "t"
          is a type.

       "p"
          is a pointer.

       "n"
          is a number.

       "s"
          is a string.

       "sv", "av", "hv", etc. represent variables of their respective types.

   File Operations
       Instead of the stdio.h functions, you should use the Perl abstraction layer. Instead of
       "FILE*" types, you need to be handling "PerlIO*" types.  Don't forget that with the new
       PerlIO layered I/O abstraction "FILE*" types may not even be available. See also the
       "perlapio" documentation for more information about the following functions:

         Instead Of:                 Use:

         stdin                       PerlIO_stdin()
         stdout                      PerlIO_stdout()
         stderr                      PerlIO_stderr()

         fopen(fn, mode)             PerlIO_open(fn, mode)
         freopen(fn, mode, stream)   PerlIO_reopen(fn, mode, perlio)
                                       (Deprecated)
         fflush(stream)              PerlIO_flush(perlio)
         fclose(stream)              PerlIO_close(perlio)

   File Input and Output
         Instead Of:                 Use:

         fprintf(stream, fmt, ...)   PerlIO_printf(perlio, fmt, ...)

         [f]getc(stream)             PerlIO_getc(perlio)
         [f]putc(stream, n)          PerlIO_putc(perlio, n)
         ungetc(n, stream)           PerlIO_ungetc(perlio, n)

       Note that the PerlIO equivalents of "fread" and "fwrite" are slightly different from their
       C library counterparts:

         fread(p, size, n, stream)   PerlIO_read(perlio, buf, numbytes)
         fwrite(p, size, n, stream)  PerlIO_write(perlio, buf, numbytes)

         fputs(s, stream)            PerlIO_puts(perlio, s)

       There is no equivalent to "fgets"; one should use "sv_gets" instead:

         fgets(s, n, stream)         sv_gets(sv, perlio, append)

   File Positioning
         Instead Of:                 Use:

         feof(stream)                PerlIO_eof(perlio)
         fseek(stream, n, whence)    PerlIO_seek(perlio, n, whence)
         rewind(stream)              PerlIO_rewind(perlio)

         fgetpos(stream, p)          PerlIO_getpos(perlio, sv)
         fsetpos(stream, p)          PerlIO_setpos(perlio, sv)

         ferror(stream)              PerlIO_error(perlio)
         clearerr(stream)            PerlIO_clearerr(perlio)

   Memory Management and String Handling
         Instead Of:                    Use:

         t* p = malloc(n)               Newx(p, n, t)
         t* p = calloc(n, s)            Newxz(p, n, t)
         p = realloc(p, n)              Renew(p, n, t)
         memcpy(dst, src, n)            Copy(src, dst, n, t)
         memmove(dst, src, n)           Move(src, dst, n, t)
         memcpy(dst, src, sizeof(t))    StructCopy(src, dst, t)
         memset(dst, 0, n * sizeof(t))  Zero(dst, n, t)
         memzero(dst, n)                Zero(dst, n, char)
         free(p)                        Safefree(p)

         strdup(p)                      savepv(p)
         strndup(p, n)                  savepvn(p, n) (Hey, strndup doesn't
                                                       exist!)

         strstr(big, little)            instr(big, little)
         memmem(big, blen, little, len) ninstr(big, bigend, little, little_end)
         strcmp(s1, s2)                 strLE(s1, s2) / strEQ(s1, s2)
                                                      / strGT(s1,s2)
         strncmp(s1, s2, n)             strnNE(s1, s2, n) / strnEQ(s1, s2, n)

         memcmp(p1, p2, n)              memNE(p1, p2, n)
         !memcmp(p1, p2, n)             memEQ(p1, p2, n)

       Notice the different order of arguments to "Copy" and "Move" than used in "memcpy" and
       "memmove".
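
       The argument-order difference is easy to get wrong, so here is a minimal standalone
       sketch.  "COPY_ELEMS" is a hypothetical stand-in written for this example, not Perl's
       actual definition of "Copy"; it only mirrors the documented (src, dst, n, t) order and
       the fact that "n" counts elements of type "t" rather than bytes.

```c
#include <string.h>

/* Hypothetical stand-in for illustration only; not Perl's Copy().
 * Source comes first, and n counts elements of type t, not bytes. */
#define COPY_ELEMS(src, dst, n, t) \
    memcpy((dst), (src), (size_t)(n) * sizeof(t))

/* Copies the first n ints of src into dst and returns their sum. */
long copy_and_sum(const int *src, int *dst, size_t n)
{
    long sum = 0;
    size_t i;

    COPY_ELEMS(src, dst, n, int);   /* same as memcpy(dst, src, n * sizeof(int)) */
    for (i = 0; i < n; i++)
        sum += dst[i];
    return sum;
}
```

       Writing the element type at the call site lets the macro compute the byte count, which
       is one less multiplication for the caller to forget.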

       Most of the time, though, you'll want to be dealing with SVs internally instead of raw
       "char *" strings:

         strlen(s)                   sv_len(sv)
         strcpy(dt, src)             sv_setpv(sv, s)
         strncpy(dt, src, n)         sv_setpvn(sv, s, n)
         strcat(dt, src)             sv_catpv(sv, s)
         strncat(dt, src, n)         sv_catpvn(sv, s, n)
         sprintf(s, fmt, ...)        sv_setpvf(sv, fmt, ...)

       If you do need raw strings, some platforms have safer interfaces, and Perl makes sure a
       version of each of these is available on all platforms:

         strlcat(dt, src, sizeof(dt)) my_strlcat(dt, src, sizeof(dt))
         strlcpy(dt, src, sizeof(dt)) my_strlcpy(dt, src, sizeof(dt))
         strnlen(s, maxlen)           my_strnlen(s, maxlen)

       Note also the existence of "sv_catpvf" and "sv_vcatpvfn", combining concatenation with
       formatting.

       Sometimes instead of zeroing the allocated heap by using Newxz() you should consider
       "poisoning" the data.  This means writing a bit pattern into it that should be illegal as
       pointers (and floating point numbers), and also hopefully surprising enough as integers,
       so that any code attempting to use the data without forethought will break sooner rather
       than later.  Poisoning can be done using the Poison() macros, which have similar arguments
       to Zero():

         PoisonWith(dst, n, t, b)    scribble memory with byte b
         PoisonNew(dst, n, t)        equal to PoisonWith(dst, n, t, 0xAB)
         PoisonFree(dst, n, t)       equal to PoisonWith(dst, n, t, 0xEF)
         Poison(dst, n, t)           equal to PoisonFree(dst, n, t)
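
       As a rough standalone illustration of the idea, the sketch below fills a buffer with a
       recognizable byte and later checks whether anything has overwritten it.  Both
       "poison_with" and "is_still_poisoned" are hypothetical helpers invented for this
       example; in Perl code you would use the macros listed above.

```c
#include <string.h>

/* Hypothetical helper for illustration; Perl's real macros are
 * PoisonWith(), PoisonNew() and PoisonFree(). Fills n elements of
 * elem_size bytes each with the byte b. */
void poison_with(void *dst, size_t n, size_t elem_size, unsigned char b)
{
    memset(dst, b, n * elem_size);
}

/* Returns 1 if every byte of the buffer still holds the poison byte. */
int is_still_poisoned(const void *buf, size_t nbytes, unsigned char b)
{
    const unsigned char *p = (const unsigned char *)buf;
    size_t i;

    for (i = 0; i < nbytes; i++)
        if (p[i] != b)
            return 0;
    return 1;
}
```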

   Character Class Tests
       There are several types of character class tests that Perl implements.  All are more fully
       described in "Character classification" in perlapi and "Character case changing" in
       perlapi.

       The C library routines listed in the table below return values based on the current
       locale.  Use the entries in the final column for that functionality.  The other two
       columns always assume a POSIX (or C) locale.  The entries in the ASCII column are only
       meaningful for ASCII inputs, returning FALSE for anything else.  Use these only when you
       know that is what you want.  The entries in the Latin1 column assume that the non-ASCII
       8-bit characters are as Unicode defines them, the same as ISO-8859-1, often called Latin
       1.

         Instead Of:  Use for ASCII:   Use for Latin1:      Use for locale:

         isalnum(c)  isALPHANUMERIC(c) isALPHANUMERIC_L1(c) isALPHANUMERIC_LC(c)
         isalpha(c)  isALPHA(c)        isALPHA_L1(c)        isALPHA_LC(c)
         isascii(c)  isASCII(c)                             isASCII_LC(c)
         isblank(c)  isBLANK(c)        isBLANK_L1(c)        isBLANK_LC(c)
         iscntrl(c)  isCNTRL(c)        isCNTRL_L1(c)        isCNTRL_LC(c)
         isdigit(c)  isDIGIT(c)        isDIGIT_L1(c)        isDIGIT_LC(c)
         isgraph(c)  isGRAPH(c)        isGRAPH_L1(c)        isGRAPH_LC(c)
         islower(c)  isLOWER(c)        isLOWER_L1(c)        isLOWER_LC(c)
         isprint(c)  isPRINT(c)        isPRINT_L1(c)        isPRINT_LC(c)
         ispunct(c)  isPUNCT(c)        isPUNCT_L1(c)        isPUNCT_LC(c)
         isspace(c)  isSPACE(c)        isSPACE_L1(c)        isSPACE_LC(c)
         isupper(c)  isUPPER(c)        isUPPER_L1(c)        isUPPER_LC(c)
         isxdigit(c) isXDIGIT(c)       isXDIGIT_L1(c)       isXDIGIT_LC(c)

         tolower(c)  toLOWER(c)        toLOWER_L1(c)
         toupper(c)  toUPPER(c)

       For the corresponding functions like iswupper(), etc., use isUPPER_uvchr() for non-locale;
       or isUPPER_LC_uvchr() for locale.  And use toLOWER_uvchr() instead of towlower(), etc.
       There are no direct equivalents for locale; best to put the string into an SV.

       Don't use any of the functions like isalnum_l().  Those are non-portable, and interfere
       with Perl's internal handling.

       To emphasize that you are operating only on ASCII characters, you can append "_A" to each
       of the macros in the ASCII column: "isALPHA_A", "isDIGIT_A", and so on.
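
       To make "ASCII only, locale never consulted" concrete, here is a minimal standalone
       sketch.  "ascii_is_alpha" and "ascii_is_digit" are hypothetical names used only here,
       and the code assumes an ASCII platform; it is not how Perl implements its macros.

```c
/* Hypothetical helpers for illustration; in Perl code use isALPHA_A(),
 * isDIGIT_A(), etc. These assume an ASCII platform and never consult
 * the locale; any input outside the ASCII range yields 0. */
int ascii_is_alpha(int c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

int ascii_is_digit(int c)
{
    return c >= '0' && c <= '9';
}
```

       Unlike isalpha(), the result for a byte such as 0xE9 does not change when the process
       locale changes.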

       (There is no entry in the Latin1 column for "isascii" even though there is an
       "isASCII_L1", which is identical to "isASCII";  the latter name is clearer.  There is no
       entry in the Latin1 column for "toupper" because the result can be non-Latin1.  You have
       to use "toUPPER_uvchr", as described in "Character case changing" in perlapi.)

       Note that the libc caseless comparisons are crippled; Unicode provides a richer set, using
       the concept of folding.  If you need more than equality/non-equality, it's probably best
       to store your strings in an SV and use SV functions to do the comparison.  Similarly for
       collation.
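
       For the simple equality case, an ASCII-only caseless comparison can be sketched as
       follows.  "ascii_fold_eq" is a hypothetical helper for this example and deliberately
       ignores everything outside ASCII; it is not a substitute for Unicode folding or for
       Perl's foldEQ-family functions.

```c
#include <stddef.h>

/* Maps 'A'..'Z' to 'a'..'z'; every other byte is returned unchanged. */
static unsigned char ascii_fold(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c - 'A' + 'a') : c;
}

/* Hypothetical helper: returns 1 if the first len bytes of a and b are
 * equal when ASCII case is ignored, 0 otherwise. No locale is consulted. */
int ascii_fold_eq(const char *a, const char *b, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (ascii_fold((unsigned char)a[i]) != ascii_fold((unsigned char)b[i]))
            return 0;
    return 1;
}
```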

   stdlib.h functions
         Instead Of:                 Use:

         atof(s)                     my_atof(s) or Atof(s)
         atoi(s)                     grok_atoUV(s, &uv, &e)
         atol(s)                     grok_atoUV(s, &uv, &e)
         strtod(s, &p)               Strtod(s, &p)
         strtol(s, &p, n)            Strtol(s, &p, b)
         strtoul(s, &p, n)           Strtoul(s, &p, b)

       But note that these are subject to locale; see "Dealing with locales".

       Typical use is to do range checks on "uv" before casting:

          int i; UV uv;
          const char* end_ptr = input_end;
          if (grok_atoUV(input, &uv, &end_ptr)
              && uv <= INT_MAX)
          {
            i = (int)uv;
            ... /* continue parsing from end_ptr */
          } else {
            ... /* parse error: not a decimal integer in range 0 .. INT_MAX */
          }

       Notice also the "grok_bin", "grok_hex", and "grok_oct" functions in numeric.c for
       converting strings representing numbers in the respective bases into "NV"s.  Note that
       grok_atoUV() doesn't handle negative inputs, or leading whitespace (being purposefully
       strict).
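
       The kind of strictness described here can be sketched in plain C.  "strict_parse_u64"
       is a hypothetical function written for this example; it is a simplified illustration,
       not a reimplementation of grok_atoUV(), and it only mirrors the two rules named above
       (no sign, no leading whitespace) plus an overflow check.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical strict parser for illustration; it is not grok_atoUV().
 * Accepts only ASCII digits: no sign, no leading whitespace.
 * On success stores the value in *out, sets *end to the first unparsed
 * byte, and returns 1. Returns 0 on an empty digit run or on overflow. */
int strict_parse_u64(const char *s, uint64_t *out, const char **end)
{
    uint64_t value = 0;
    const char *p = s;

    if (*p < '0' || *p > '9')
        return 0;                       /* rejects "", " 1", "-1", "+1" */

    while (*p >= '0' && *p <= '9') {
        unsigned digit = (unsigned)(*p - '0');

        if (value > (UINT64_MAX - digit) / 10)
            return 0;                   /* value * 10 + digit would overflow */
        value = value * 10 + digit;
        p++;
    }
    *out = value;
    *end = p;
    return 1;
}
```

       Compared with atoi(), a caller can tell "0" apart from a parse failure and can detect
       overflow instead of getting an unspecified result.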

   Miscellaneous functions
       You should not even want to use setjmp.h functions, but if you think you do, use the
       "JMPENV" stack in scope.h instead.

        ~asctime()              Perl_sv_strftime_tm()
        ~asctime_r()            Perl_sv_strftime_tm()
         chsize()               my_chsize()
        ~ctime()                Perl_sv_strftime_tm()
        ~ctime_r()              Perl_sv_strftime_tm()
        ~cuserid()              DO NOT USE; see its man page
         dirfd()                my_dirfd()
         duplocale()            Perl_setlocale()
        ~ecvt()                 my_snprintf()
        ~endgrent_r()           endgrent()
        ~endhostent_r()         endhostent()
        ~endnetent_r()          endnetent()
        ~endprotoent_r()        endprotoent()
        ~endpwent_r()           endpwent()
        ~endservent_r()         endservent()
        ~endutent()             endutxent()
         exit(n)                my_exit(n)
        ~fcvt()                 my_snprintf()
         freelocale()           Perl_setlocale()
        ~ftw()                  nftw()
         getenv(s)              PerlEnv_getenv(s)
        ~gethostbyaddr()        getnameinfo()
        ~gethostbyname()        getaddrinfo()
        ~getpass()              DO NOT USE; see its man page
        ~getpw()                getpwuid()
        ~getutent()             getutxent()
        ~getutid()              getutxid()
        ~getutline()            getutxline()
        ~gsignal()              DO NOT USE; see its man page
         localeconv()           Perl_localeconv()
         mblen()                mbrlen()
         mbtowc()               mbrtowc()
         newlocale()            Perl_setlocale()
         pclose()               my_pclose()
         popen()                my_popen()
        ~pututline()            pututxline()
        ~qecvt()                my_snprintf()
        ~qfcvt()                my_snprintf()
         querylocale()          Perl_setlocale()
         int rand()             double Drand01()
         srand(n)               { seedDrand01((Rand_seed_t)n);
                                  PL_srand_called = TRUE; }
        ~readdir_r()            readdir()
         realloc()              saferealloc(), Renew() or Renewc()
        ~re_comp()              regcomp()
        ~re_exec()              regexec()
        ~rexec()                rcmd()
        ~rexec_af()             rcmd()
         setenv(s, val)         my_setenv(s, val)
        ~setgrent_r()           setgrent()
        ~sethostent_r()         sethostent()
         setlocale()            Perl_setlocale()
         setlocale_r()          Perl_setlocale()
        ~setnetent_r()          setnetent()
        ~setprotoent_r()        setprotoent()
        ~setpwent_r()           setpwent()
        ~setservent_r()         setservent()
        ~setutent()             setutxent()
         sigaction()            rsignal(signo, handler)
        ~siginterrupt()         rsignal() with the SA_RESTART flag instead
         signal(signo, handler) rsignal(signo, handler)
        ~ssignal()              DO NOT USE; see its man page
         strcasecmp()           a Perl foldEQ-family function
         strerror()             sv_string_from_errnum()
         strerror_l()           sv_string_from_errnum()
         strerror_r()           sv_string_from_errnum()
         strftime()             Perl_sv_strftime_tm()
         strtod()               my_strtod() or Strtod()
         system(s)              Don't. Look at pp_system or use my_popen.
        ~tempnam()              mkstemp() or tmpfile()
        ~tmpnam()               mkstemp() or tmpfile()
         tmpnam_r()             mkstemp() or tmpfile()
         uselocale()            Perl_setlocale()
         vsnprintf()            my_vsnprintf()
         wctob()                wcrtomb()
         wctomb()               wcrtomb()
         wsetlocale()           Perl_setlocale()

       The Perl-furnished alternatives are documented in perlapi, which you should peruse anyway
       to see what all is available to you.

       The lists are incomplete.  Before using an unlisted function, consider whether it seems
       likely to interfere with Perl.

Dealing with locales

       Like it or not, your code will be executed in the context of a locale, as are all C
       language programs.  See perllocale.  Most libc calls are not affected by the locale, but a
       surprising number are:

        addmntent()           getspent_r()        sethostent()
        alphasort()           getspnam()          sethostent_r()
        asctime()             getspnam_r()        setnetent()
        asctime_r()           getwc()             setnetent_r()
        asprintf()            getwchar()          setnetgrent()
        atof()                glob()              setprotoent()
        atoi()                gmtime()            setprotoent_r()
        atol()                gmtime_r()          setpwent()
        atoll()               grantpt()           setpwent_r()
        btowc()               iconv_open()        setrpcent()
        catopen()             inet_addr()         setservent()
        ctime()               inet_aton()         setservent_r()
        ctime_r()             inet_network()      setspent()
        cuserid()             inet_ntoa()         sgetspent_r()
        daylight              inet_ntop()         shm_open()
        dirname()             inet_pton()         shm_unlink()
        dprintf()             initgroups()        snprintf()
        endaliasent()         innetgr()           sprintf()
        endgrent()            iruserok()          sscanf()
        endgrent_r()          iruserok_af()       strcasecmp()
        endhostent()          isalnum()           strcasestr()
        endhostent_r()        isalnum_l()         strcoll()
        endnetent()           isalpha()           strerror()
        endnetent_r()         isalpha_l()         strerror_l()
        endprotoent()         isascii()           strerror_r()
        endprotoent_r()       isascii_l()         strfmon()
        endpwent()            isblank()           strfmon_l()
        endpwent_r()          isblank_l()         strfromd()
        endrpcent()           iscntrl()           strfromf()
        endservent()          iscntrl_l()         strfroml()
        endservent_r()        isdigit()           strftime()
        endspent()            isdigit_l()         strftime_l()
        err()                 isgraph()           strncasecmp()
        error()               isgraph_l()         strptime()
        error_at_line()       islower()           strsignal()
        errx()                islower_l()         strtod()
        fgetwc()              isprint()           strtof()
        fgetwc_unlocked()     isprint_l()         strtoimax()
        fgetws()              ispunct()           strtol()
        fgetws_unlocked()     ispunct_l()         strtold()
        fnmatch()             isspace()           strtoll()
        forkpty()             isspace_l()         strtoq()
        fprintf()             isupper()           strtoul()
        fputwc()              isupper_l()         strtoull()
        fputwc_unlocked()     iswalnum()          strtoumax()
        fputws()              iswalnum_l()        strtouq()
        fputws_unlocked()     iswalpha()          strverscmp()
        fscanf()              iswalpha_l()        strxfrm()
        fwprintf()            iswblank()          swprintf()
        fwscanf()             iswblank_l()        swscanf()
        getaddrinfo()         iswcntrl()          syslog()
        getaliasbyname_r()    iswcntrl_l()        timegm()
        getaliasent_r()       iswdigit()          timelocal()
        getdate()             iswdigit_l()        timezone
        getdate_r()           iswgraph()          tolower()
        getfsent()            iswgraph_l()        tolower_l()
        getfsfile()           iswlower()          toupper()
        getfsspec()           iswlower_l()        toupper_l()
        getgrent()            iswprint()          towctrans()
        getgrent_r()          iswprint_l()        towlower()
        getgrgid()            iswpunct()          towlower_l()
        getgrgid_r()          iswpunct_l()        towupper()
        getgrnam()            iswspace()          towupper_l()
        getgrnam_r()          iswspace_l()        tzname
        getgrouplist()        iswupper()          tzset()
        gethostbyaddr()       iswupper_l()        ungetwc()
        gethostbyaddr_r()     iswxdigit()         vasprintf()
        gethostbyname()       iswxdigit_l()       vdprintf()
        gethostbyname2()      isxdigit()          verr()
        gethostbyname2_r()    isxdigit_l()        verrx()
        gethostbyname_r()     localeconv()        versionsort()
        gethostent()          localtime()         vfprintf()
        gethostent_r()        localtime_r()       vfscanf()
        gethostid()           MB_CUR_MAX          vfwprintf()
        getlogin()            mblen()             vprintf()
        getlogin_r()          mbrlen()            vscanf()
        getmntent()           mbrtowc()           vsnprintf()
        getmntent_r()         mbsinit()           vsprintf()
        getnameinfo()         mbsnrtowcs()        vsscanf()
        getnetbyaddr()        mbsrtowcs()         vswprintf()
        getnetbyaddr_r()      mbstowcs()          vsyslog()
        getnetbyname()        mbtowc()            vwarn()
        getnetbyname_r()      mktime()            vwarnx()
        getnetent()           nan()               vwprintf()
        getnetent_r()         nanf()              warn()
        getnetgrent()         nanl()              warnx()
        getnetgrent_r()       nl_langinfo()       wcrtomb()
        getprotobyname()      openpty()           wcscasecmp()
        getprotobyname_r()    printf()            wcschr()
        getprotobynumber()    psiginfo()          wcscoll()
        getprotobynumber_r()  psignal()           wcsftime()
        getprotoent()         putpwent()          wcsncasecmp()
        getprotoent_r()       putspent()          wcsnrtombs()
        getpw()               putwc()             wcsrchr()
        getpwent()            putwchar()          wcsrtombs()
        getpwent_r()          regcomp()           wcstod()
        getpwnam()            regexec()           wcstof()
        getpwnam_r()          res_nclose()        wcstoimax()
        getpwuid()            res_ninit()         wcstold()
        getpwuid_r()          res_nquery()        wcstombs()
        getrpcbyname_r()      res_nquerydomain()  wcstoumax()
        getrpcbynumber_r()    res_nsearch()       wcswidth()
        getrpcent_r()         res_nsend()         wcsxfrm()
        getrpcport()          rpmatch()           wctob()
        getservbyname()       ruserok()           wctomb()
        getservbyname_r()     ruserok_af()        wctrans()
        getservbyport()       scandir()           wctype()
        getservbyport_r()     scanf()             wcwidth()
        getservent()          setaliasent()       wordexp()
        getservent_r()        setgrent()          wprintf()
        getspent()            setgrent_r()        wscanf()

       (The list doesn't include functions that manipulate the locale, such as setlocale().)

       If any of these functions are called directly or indirectly from your code, you are
       affected by the current locale.

       The first thing to know about this list is that there are better alternatives to many of
       the functions, which it's highly likely that you should be using instead.  See "libc
       functions to avoid" above.  This includes using Perl IO; see perlapio.

       The second thing to know is that Perl is documented to not pay attention to the current
       locale except for code executed within the scope of a "use locale" statement.  If you
       violate that, you may be creating bugs, depending on the application.

       The next thing to know is that many of these functions depend on the locale only with
       regard to numeric values.  Your code is likely to have been written expecting that the
       decimal point (radix) character is a dot (U+002E: FULL STOP), and that strings of integer
       numbers are not separated into groups (1,000,000 in an American locale means a million;
       your code is likely not expecting the commas.)  The good news is that normally (as of Perl
       v5.22), your code will get called with the locale set so those expectations are met.
       Explicit action has to be taken to change this (described a little ways below).  This is
       accomplished by Perl not actually switching into a locale that doesn't conform to these
       expectations, except when explicitly told to do so.  The Perl input/output and formatting
       routines do this switching for you automatically, if appropriate, and then switch back.
       If, for some reason, you need to do it yourself, the easiest way from C and XS code is to
       use the macro "WITH_LC_NUMERIC_SET_TO_NEEDED" in perlapi.  You can wrap this macro
       around an entire block of code that you want to be executed in the correct environment.
       The bottom line is that your code is likely to work as expected in this regard without you
       having to take any action.

       This leaves the remaining functions.  Your code will get called with all but the numeric
       locale portions set to the underlying locale.  Often, the locale is of not much import to
       your code, and you also won't have to take any action; things will just work out.  But you
       should examine the man pages of the ones you use to verify this.  Often, Perl has better
       ways of doing the same functionality.  Consider using SVs and their access routines rather
       than calling the low level functions that, for example, find how many bytes are in a UTF-8
       encoded character.

       You can determine if you have been called from within the scope of a "use locale" by using
       the boolean macro "IN_LOCALE" in perlapi.

       If you need to not be in the underlying locale, you can call "Perl_setlocale" in perlapi
       to change it temporarily to the one you need (likely the "C" locale), and then change it
       back before returning.  This can be very problematic on threaded perls on some platforms.
       See "Dealing with embedded perls and threads".
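
       The save, change, and restore pattern looks roughly like this in a standalone C
       program.  This sketch calls the libc setlocale() directly only so that it can run
       outside of Perl; "parse_double_in_c_locale" is a hypothetical helper, and XS code
       should go through Perl_setlocale() instead.  Like any change to the process locale,
       it is not safe if other threads are running.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical standalone helper; XS code should use Perl_setlocale()
 * rather than calling setlocale() directly. Parses s with LC_NUMERIC
 * temporarily set to "C", then restores the previous setting.
 * Returns 1 on success, 0 if the locale could not be saved or changed. */
int parse_double_in_c_locale(const char *s, double *out)
{
    const char *current = setlocale(LC_NUMERIC, NULL);
    char *saved;

    if (current == NULL)
        return 0;

    /* Copy the name: later setlocale() calls may overwrite the buffer. */
    saved = malloc(strlen(current) + 1);
    if (saved == NULL)
        return 0;
    strcpy(saved, current);

    if (setlocale(LC_NUMERIC, "C") == NULL) {
        free(saved);
        return 0;
    }

    *out = strtod(s, NULL);             /* radix is '.' in the "C" locale */

    setlocale(LC_NUMERIC, saved);       /* change it back before returning */
    free(saved);
    return 1;
}
```

       Copying the locale name before switching matters because the string returned by
       setlocale() may be invalidated by the next call.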

       A problem with changing the locale of a single category is that mojibake can arise on some
       platforms if the "LC_CTYPE" category and the changed one are not the same.  On
       platforms where that isn't an issue, the preprocessor directive
       "LIBC_HANDLES_MISMATCHED_CTYPE" will be defined.  Otherwise, you may have to change more
       than one category to correctly accomplish your task.  And, there will be many locale
       combinations where the mojibake likely won't happen, so you won't be confronted with this
       until the code gets executed in the field by someone who doesn't speak your language very
       well.

       Earlier we mentioned that explicit action is required to have your code get called with
       the numeric portions of the locale not meeting the typical expectations of having a
       dot for the radix character and no punctuation separating groups of digits.  That action
       is to call the function "switch_to_global_locale" in perlapi.

       switch_to_global_locale() was written initially to cope with the "Tk" library, but is
       general enough for other similar situations.  "Tk" changes the global locale to match its
       expectations (later versions of it allow this to be turned off).  This presents a conflict
       with Perl thinking it also controls the locale.  Calling this function tells Perl to yield
       control.  Calling "sync_locale" in perlapi tells Perl to take control again, accepting
       whatever the locale has been changed to in the interim.  If your code is called during
       that interim, all portions of the locale will be the raw underlying values.  Should you
       need to manipulate numbers, you are on your own with regard to the radix character and
       grouping.  If you find yourself in this situation, it is generally best to make the
       interval between the calls to these two functions as short as possible, and avoid
       calculations until after perl has control again.

       It is important for perl to know about all the possible locale categories on the platform,
       even if they aren't apparently used in your program.  Perl knows all of the Linux ones.
       If your platform has others, you can submit an issue at
       <https://github.com/Perl/perl5/issues> for inclusion of it in the next release.  In the
       meantime, it is possible to edit the Perl source to teach it about the category, and then
       recompile.  Search for instances of, say, "LC_PAPER" in the source, and use that as a
       template to add the omitted one.

       There are further complications under multi-threaded operation.  Keep on reading.

Dealing with embedded perls and threads

       It is possible to embed a Perl interpreter within a larger program.  See perlembed.

       MULTIPLICITY is the way this is accomplished internally; it is described in "How multiple
       interpreters and concurrency are supported" in perlguts.  Multiple Perl interpreters may
       be embedded.

       It is also possible to compile perl to support threading.  See perlthrtut.  Perl's
       implementation of threading requires MULTIPLICITY, but not the other way around.

       MULTIPLICITY without threading means that only one thing runs at a time, so there are no
       concurrency issues, but each component or instance can affect the global state,
       potentially interfering with the execution of other instances.  This can happen if one
       instance:

       •   changes the current working directory

       •   changes the process's environment

       •   changes the global locale the process is operating under

       •   writes to shared memory or to a shared file

       •   uses a shared file descriptor (including a database iterator)

       •   raises a signal that functions in other instances are sensitive to

       If your code doesn't do any of these things, nor depends on any of their values, then
       Congratulations!!, you don't have to worry about MULTIPLICITY or threading.  But wait, a
       surprising number of libc functions do depend on data global to the process in some way
       that may not be immediately obvious.  For example, calling strtok(3) changes the global
       state of a process, and thus needs special attention.
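
       The strtok(3) case can be illustrated with a tokenizer that keeps its position in a
       caller-owned variable instead of in hidden process-wide state, similar in spirit to
       POSIX strtok_r().  "next_token" and "count_fields" are hypothetical names for this
       sketch; the nested loops in "count_fields" are exactly the pattern that breaks when
       both levels share strtok()'s single hidden pointer.

```c
#include <string.h>

/* Hypothetical re-entrant tokenizer for illustration. All state lives in
 * *cursor, which the caller owns, so nothing global is touched. Returns
 * the next token (modifying the buffer in place) or NULL when none remain. */
char *next_token(char **cursor, const char *delims)
{
    char *start, *end;

    if (*cursor == NULL)
        return NULL;
    start = *cursor + strspn(*cursor, delims);   /* skip leading delimiters */
    if (*start == '\0') {
        *cursor = NULL;
        return NULL;
    }
    end = start + strcspn(start, delims);        /* find end of token */
    if (*end != '\0') {
        *end = '\0';
        *cursor = end + 1;
    } else {
        *cursor = NULL;
    }
    return start;
}

/* Counts comma-separated fields across semicolon-separated records.
 * The nested loops work because each level has its own cursor. */
int count_fields(char *text)
{
    char *records = text;
    char *record;
    int count = 0;

    while ((record = next_token(&records, ";")) != NULL) {
        char *fields = record;

        while (next_token(&fields, ",") != NULL)
            count++;
    }
    return count;
}
```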

       The section 3 libc uses that we know about that have MULTIPLICITY and/or multi-thread
       issues are:

        addmntent()             getrpcent_r()        re_exec()
        alphasort()             getrpcport()         regcomp()
        asctime()               getservbyname()      regerror()
        asctime_r()             getservbyname_r()    regexec()
        asprintf()              getservbyport()      res_nclose()
        atof()                  getservbyport_r()    res_ninit()
        atoi()                  getservent()         res_nquery()
        atol()                  getservent_r()       res_nquerydomain()
        atoll()                 getspent()           res_nsearch()
        basename()              getspent_r()         res_nsend()
        btowc()                 getspnam()           rexec()
        catgets()               getspnam_r()         rexec_af()
        catopen()               getttyent()          rpmatch()
        clearenv()              getttynam()          ruserok()
        clearerr_unlocked()     getusershell()       ruserok_af()
        crypt()                 getutent()           scandir()
        crypt_gensalt()         getutid()            scanf()
        crypt_r()               getutline()          secure_getenv()
        ctermid()               getutxent()          seed48()
        ctermid_r()             getutxid()           seed48_r()
        ctime()                 getutxline()         setaliasent()
        ctime_r()               getwc()              setcontext()
        cuserid()               getwchar()           setenv()
        daylight                getwchar_unlocked()  setfsent()
        dbm_clearerr()          getwc_unlocked()     setgrent()
        dbm_close()             glob()               setgrent_r()
        dbm_delete()            gmtime()             sethostent()
        dbm_error()             gmtime_r()           sethostent_r()
        dbm_fetch()             grantpt()            sethostid()
        dbm_firstkey()          hcreate()            setkey()
        dbm_nextkey()           hcreate_r()          setlocale()
        dbm_open()              hdestroy()           setlocale_r()
        dbm_store()             hdestroy_r()         setlogmask()
        dirname()               hsearch()            setnetent()
        dlerror()               hsearch_r()          setnetent_r()
        dprintf()               iconv()              setnetgrent()
        drand48()               iconv_open()         setprotoent()
        drand48_r()             inet_addr()          setprotoent_r()
        ecvt()                  inet_aton()          setpwent()
        encrypt()               inet_network()       setpwent_r()
        endaliasent()           inet_ntoa()          setrpcent()
        endfsent()              inet_ntop()          setservent()
        endgrent()              inet_pton()          setservent_r()
        endgrent_r()            initgroups()         setspent()
        endhostent()            initstate_r()        setstate_r()
        endhostent_r()          innetgr()            setttyent()
        endnetent()             iruserok()           setusershell()
        endnetent_r()           iruserok_af()        setutent()
        endnetgrent()           isalnum()            setutxent()
        endprotoent()           isalnum_l()          sgetspent()
        endprotoent_r()         isalpha()            sgetspent_r()
        endpwent()              isalpha_l()          shm_open()
        endpwent_r()            isascii()            shm_unlink()
        endrpcent()             isascii_l()          siginterrupt()
        endservent()            isblank()            sleep()
        endservent_r()          isblank_l()          snprintf()
        endspent()              iscntrl()            sprintf()
        endttyent()             iscntrl_l()          srand48()
        endusershell()          isdigit()            srand48_r()
        endutent()              isdigit_l()          srandom_r()
        endutxent()             isgraph()            sscanf()
        erand48()               isgraph_l()          ssignal()
        erand48_r()             islower()            strcasecmp()
        err()                   islower_l()          strcasestr()
        error()                 isprint()            strcoll()
        error_at_line()         isprint_l()          strerror()
        errx()                  ispunct()            strerror_l()
        ether_aton()            ispunct_l()          strerror_r()
        ether_ntoa()            isspace()            strfmon()
        execlp()                isspace_l()          strfmon_l()
        execvp()                isupper()            strfromd()
        execvpe()               isupper_l()          strfromf()
        exit()                  iswalnum()           strfroml()
        __fbufsize()            iswalnum_l()         strftime()
        fcloseall()             iswalpha()           strftime_l()
        fcvt()                  iswalpha_l()         strncasecmp()
        fflush_unlocked()       iswblank()           strptime()
        fgetc_unlocked()        iswblank_l()         strsignal()
        fgetgrent()             iswcntrl()           strtod()
        fgetpwent()             iswcntrl_l()         strtof()
        fgetspent()             iswdigit()           strtoimax()
        fgets_unlocked()        iswdigit_l()         strtok()
        fgetwc()                iswgraph()           strtol()
        fgetwc_unlocked()       iswgraph_l()         strtold()
        fgetws()                iswlower()           strtoll()
        fgetws_unlocked()       iswlower_l()         strtoq()
        fnmatch()               iswprint()           strtoul()
        forkpty()               iswprint_l()         strtoull()
        __fpending()            iswpunct()           strtoumax()
        fprintf()               iswpunct_l()         strtouq()
        __fpurge()              iswspace()           strverscmp()
        fputc_unlocked()        iswspace_l()         strxfrm()
        fputs_unlocked()        iswupper()           swapcontext()
        fputwc()                iswupper_l()         swprintf()
        fputwc_unlocked()       iswxdigit()          swscanf()
        fputws()                iswxdigit_l()        sysconf()
        fputws_unlocked()       isxdigit()           syslog()
        fread_unlocked()        isxdigit_l()         system()
        fscanf()                jrand48()            tdelete()
        __fsetlocking()         jrand48_r()          tempnam()
        fts_children()          l64a()               tfind()
        fts_read()              lcong48()            timegm()
        ftw()                   lcong48_r()          timelocal()
        fwprintf()              lgamma()             timezone
        fwrite_unlocked()       lgammaf()            tmpnam()
        fwscanf()               lgammal()            tmpnam_r()
        gamma()                 localeconv()         tolower()
        gammaf()                localtime()          tolower_l()
        gammal()                localtime_r()        toupper()
        getaddrinfo()           login()              toupper_l()
        getaliasbyname()        login_tty()          towctrans()
        getaliasbyname_r()      logout()             towlower()
        getaliasent()           logwtmp()            towlower_l()
        getaliasent_r()         lrand48()            towupper()
        getchar_unlocked()      lrand48_r()          towupper_l()
        getcontext()            makecontext()        tsearch()
        getc_unlocked()         mallinfo()           ttyname()
        get_current_dir_name()  MB_CUR_MAX           ttyname_r()
        getdate()               mblen()              ttyslot()
        getdate_r()             mbrlen()             twalk()
        getenv()                mbrtowc()            twalk_r()
        getfsent()              mbsinit()            tzname
        getfsfile()             mbsnrtowcs()         tzset()
        getfsspec()             mbsrtowcs()          ungetwc()
        getgrent()              mbstowcs()           unsetenv()
        getgrent_r()            mbtowc()             updwtmp()
        getgrgid()              mcheck()             utmpname()
        getgrgid_r()            mcheck_check_all()   va_arg()
        getgrnam()              mcheck_pedantic()    valloc()
        getgrnam_r()            mktime()             vasprintf()
        getgrouplist()          mprobe()             vdprintf()
        gethostbyaddr()         mrand48()            verr()
        gethostbyaddr_r()       mrand48_r()          verrx()
        gethostbyname()         mtrace()             versionsort()
        gethostbyname2()        muntrace()           vfprintf()
        gethostbyname2_r()      nan()                vfscanf()
        gethostbyname_r()       nanf()               vfwprintf()
        gethostent()            nanl()               vprintf()
        gethostent_r()          newlocale()          vscanf()
        gethostid()             nftw()               vsnprintf()
        getlogin()              nl_langinfo()        vsprintf()
        getlogin_r()            nrand48()            vsscanf()
        getmntent()             nrand48_r()          vswprintf()
        getmntent_r()           openpty()            vsyslog()
        getnameinfo()           perror()             vwarn()
        getnetbyaddr()          posix_fallocate()    vwarnx()
        getnetbyaddr_r()        printf()             vwprintf()
        getnetbyname()          profil()             warn()
        getnetbyname_r()        psiginfo()           warnx()
        getnetent()             psignal()            wcrtomb()
        getnetent_r()           ptsname()            wcscasecmp()
        getnetgrent()           putchar_unlocked()   wcschr()
        getnetgrent_r()         putc_unlocked()      wcscoll()
        getopt()                putenv()             wcsftime()
        getopt_long()           putpwent()           wcsncasecmp()
        getopt_long_only()      putspent()           wcsnrtombs()
        getpass()               pututline()          wcsrchr()
        getprotobyname()        pututxline()         wcsrtombs()
        getprotobyname_r()      putwc()              wcstod()
        getprotobynumber()      putwchar()           wcstof()
        getprotobynumber_r()    putwchar_unlocked()  wcstoimax()
        getprotoent()           putwc_unlocked()     wcstold()
        getprotoent_r()         pvalloc()            wcstombs()
        getpw()                 qecvt()              wcstoumax()
        getpwent()              qfcvt()              wcswidth()
        getpwent_r()            querylocale()        wcsxfrm()
        getpwnam()              rand()               wctob()
        getpwnam_r()            random_r()           wctomb()
        getpwuid()              rcmd()               wctrans()
        getpwuid_r()            rcmd_af()            wctype()
        getrpcbyname()          readdir()            wcwidth()
        getrpcbyname_r()        readdir64()          wordexp()
        getrpcbynumber()        readdir64_r()        wprintf()
        getrpcbynumber_r()      readdir_r()          wscanf()
        getrpcent()             re_comp()            wsetlocale()

       (If you know of additional functions that are unsafe on some platform or another, please
       notify us by filing a bug report at <https://github.com/Perl/perl5/issues>.)

       Some of these are safe under MULTIPLICITY, problematic only under threading.  If a use
       doesn't appear in the above list, we think it is both MULTIPLICITY-safe and thread-safe
       on all platforms.

       All the uses listed above are function calls, except for these:

        daylight  MB_CUR_MAX  timezone  tzname

       There are three main approaches to coping with issues involving these constructs, each
       suitable for different circumstances:

       •   Don't use them.  Some of them have preferred alternatives.  Use the list above in
           "libc functions to avoid" to replace your uses with ones that are thread-friendly.
           For example, I/O should be done via perlapio.

           If you must use them, many, but not all, of them will be ok as long as their use is
           confined to a single thread that has no interaction with conflicting uses in other
           threads.  You will need to closely examine their man pages for this, and be aware that
           vendor documentation is often imprecise.

       •   Do all your business before any other code can change things.  If you make changes,
           change back before returning.

       •   Save the result of a query of global information to a per-instance area before
           allowing another instance to execute.  Then you can work on it at your leisure.  This
           might be an automatic C variable for non-pointers, or something as described above in
           "Safely Storing Static Data in XS" in perlxs.

       Without threading, you don't have to worry about being interrupted by the system giving
       control to another thread.  With threading, you will have to use mutexes, and be
       concerned with the possibility of deadlock.
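
       The second approach (make a change, do your business, change back before returning)
       might look something like the following for the current working directory.  This is
       only a sketch for the non-threaded case: with_directory() and cwd_equals() are
       hypothetical names, and under threads the directory is still visible to other threads
       while the callback runs.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Run work(arg) with `dir` as the current working directory, then restore
 * the directory that was current on entry.  Returns work()'s result, or -1
 * if the directory could not be saved, entered, or restored. */
int with_directory(const char *dir, int (*work)(void *), void *arg)
{
    assert(dir != NULL && work != NULL);

    int saved = open(".", O_RDONLY);    /* handle on the original directory */
    if (saved < 0)
        return -1;

    int result = -1;
    if (chdir(dir) == 0) {
        result = work(arg);
        if (fchdir(saved) != 0)         /* change back before returning */
            result = -1;
    }
    close(saved);
    return result;
}

/* Example callback: returns 1 if the current directory is the path that
 * `arg` points to, 0 if it is not, and -1 if the directory can't be read. */
int cwd_equals(void *arg)
{
    char buf[4096];
    if (getcwd(buf, sizeof buf) == NULL)
        return -1;
    return strcmp(buf, (const char *)arg) == 0;
}
```

       A caller would write, for example, with_directory("/", cwd_equals, "/"), and find the
       original directory in effect again afterwards.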

   Functions always unsuitable for use under multi-threads
       A few functions are considered totally unsuited for use in a multi-thread environment.
       These must be called only during single-thread operation.

         endusershell()    @getaliasent()      muntrace()   rexec()
         ether_aton()      @getrpcbyname()     profil()     rexec_af()
         ether_ntoa()      @getrpcbynumber()   rcmd()       setusershell()
         fts_children()    @getrpcent()        rcmd_af()    ttyslot()
         fts_read()         getusershell()     re_comp()
        @getaliasbyname()   mtrace()           re_exec()

       "@" above marks the functions for which there are preferred alternatives available on some
       platforms, and those alternatives may be suitable for multi-thread use.

   Functions which must be called at least once before starting threads
       Some functions perform initialization on their first call that must be done while still in
       a single-thread environment, but subsequent calls are thread-safe when executed in a
       critical section.  Therefore, they must be called at least once before switching to multi-
       threads:

        getutent()  getutline()  getutxid()    mallinfo()  valloc()
        getutid()   getutxent()  getutxline()  pvalloc()

   Functions that are thread-safe when called with appropriate arguments
       Some of the functions are thread-safe if called with arguments that comply with certain
       (easily met) restrictions.  These are:

        ctermid()        mbrlen()      mbsrtowcs()  wcrtomb()
        cuserid()        mbrtowc()     tmpnam()     wcsnrtombs()
        error_at_line()  mbsnrtowcs()  va_arg()     wcsrtombs()

       See the man pages of each for details.  (For completeness, the list includes functions
       that you shouldn't be using anyway because of other reasons.)

   Functions vulnerable to signals
       Some functions are vulnerable to asynchronous signals.  These are:

        getlogin()    getutid()    getutxid()    login()   pututline()  updwtmp()
        getlogin_r()  getutline()  getutxline()  logout()  pututxline() wordexp()
        getutent()    getutxent()  glob()        logwtmp() sleep()

       Some libc's implement 'system()' thread-safely.  But in others, it also has signal issues.

   General issues with thread-safety
       Some libc functions use and/or modify a global state, such as a database.  The libc
       functions presume that there is only one instance at a time operating on that database.
       Unpredictable results occur if more than one does, even if the database is not changed.
       For example, typically there is a global iterator for such a database and that iterator
       is maintained by libc, so that each new read from any instance advances it, meaning that
       no instance will see all the entries.  The only way to make these thread-safe is to have
       an exclusive lock on a mutex from the open call through the close.  You are advised to not
       use such databases from more than one instance at a time.
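
       A rough sketch of holding a lock from the open call through the close, using the
       password database as the example, follows.  The mutex passwd_db_lock and the function
       count_passwd_entries() are hypothetical, and the scheme only helps if every user of
       the database in the process takes the same lock.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <pwd.h>
#include <stddef.h>

/* One process-wide lock guarding the libc iterator over the password
 * database.  (Hypothetical: libc itself knows nothing about it.) */
static pthread_mutex_t passwd_db_lock = PTHREAD_MUTEX_INITIALIZER;

/* Count the entries in the password database.  The lock is held from the
 * rewind through the close, so no other cooperating caller can advance
 * the shared iterator part-way through this pass. */
long count_passwd_entries(void)
{
    long n = 0;

    pthread_mutex_lock(&passwd_db_lock);
    setpwent();                     /* "open": rewind the shared iterator */
    while (getpwent() != NULL)      /* each call advances that iterator */
        n++;
    endpwent();                     /* "close" */
    pthread_mutex_unlock(&passwd_db_lock);

    return n;
}
```

       Holding a lock for a whole pass can stall other threads for a long time, which is one
       more reason to follow the advice above and keep such databases to one instance.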

       Other examples of functions that use a global state include pseudo-random number
       generators.  Some libc implementations of 'rand()', for example, may share the data across
       threads; and others may have per-thread data.  The shared ones will have unreproducible
       results, as the threads will vary in their timings and interactions.  This may be what you
       want; or it may not be.  (This particular function is a candidate to be removed from the
       POSIX Standard because of these issues.)

       Functions that output to a stream also are considered thread-unsafe when locking is not
       done.  But the typical consequences are just that the data is output in an unpredictable
       order; that outcome may be totally acceptable to you.
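
       Where the order does matter, the pieces of one logical record can be kept together by
       taking the stream's lock around them.  The sketch below assumes the POSIX
       flockfile(3) and funlockfile(3) calls are available and uses plain stdio for brevity;
       write_record() is a hypothetical name, and in XS code I/O is better done through
       perlapio.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/* Write "key=value\n" to `out` as one uninterrupted unit.  While the
 * stream lock is held, output from other threads that use the same FILE
 * cannot land between the pieces.  Returns 0 on success, -1 on error. */
int write_record(FILE *out, const char *key, const char *value)
{
    assert(out != NULL && key != NULL && value != NULL);

    flockfile(out);
    int ok = fputs(key, out) != EOF
          && fputc('=', out) != EOF
          && fputs(value, out) != EOF
          && fputc('\n', out) != EOF;
    funlockfile(out);

    return ok ? 0 : -1;
}
```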

       Since the current working directory is global to a process, all instances depend on it.
       One instance doing a chdir(2) affects all the other instances.  In a multi-threaded
       environment, any libc call that expects the directory to not change for the duration of
       its execution will have undefined results if another thread interrupts it at just the
       wrong time and changes the directory.  The man pages only list one such call, nftw().  But
       there may be other issues lurking.

   Reentrant equivalent functions
       Some functions that are problematic with regard to MULTIPLICITY have reentrant versions
       (on some or all platforms) that are better suited, with fewer (perhaps no) races when run
       under threads.

       Some of these reentrant functions that are available on all platforms should always be
       used anyway; they are in the lists directly under "libc functions to avoid".

       Others may not be available on some platforms, or have issues that make them undesirable
       to use even when they are available.  Or it may just be more complicated and tedious to
       use the reentrant version.  For these, perl has a mechanism for automatically substituting
       that reentrant version when available and desirable, while hiding the complications from
       your code.  This feature is enabled by default for code in the Perl core and its
       extensions.  To enable it in other XS modules,

          #define PERL_REENTRANT

       It is simpler for you to use the unpreferred version in your code, and rely on this
       feature to do the better thing, in part because no substitution is done if the alternative
       is not available or desirable on the platform, nor if threads aren't enabled.  You just
       write as if there weren't threads, and you get the better behavior without having to think
       about it.

       On some platforms the safer library functions may fail if the result buffer is too small
       (for example the user group databases may be rather large, and the reentrant functions may
       have to carry around a full snapshot of those databases).  Perl will start with a small
       buffer, but keep retrying and growing the result buffer until the result fits.  If this
       limitless growing sounds bad for security or memory consumption reasons you can recompile
       Perl with "PERL_REENTRANT_MAXSIZE" #defined to the maximum number of bytes you will allow.
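
       The grow-and-retry idea can be sketched in plain C as below.  This is not perl's
       implementation, only an illustration of the technique using getpwuid_r(3);
       lookup_user_name() and its max_size parameter (standing in for a limit such as
       "PERL_REENTRANT_MAXSIZE") are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pwd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Copy the login name for `uid` into `name`.  The scratch buffer handed to
 * getpwuid_r() starts small and doubles each time libc reports ERANGE, but
 * never grows beyond `max_size` bytes.
 * Returns 0 on success, 1 if there is no such user, and -1 on any other
 * failure (including hitting the size limit). */
int lookup_user_name(uid_t uid, char *name, size_t name_size, size_t max_size)
{
    assert(name != NULL && name_size > 0);

    size_t size = 64;                   /* deliberately small first try */
    char *buf = NULL;
    int status = -1;

    while (size <= max_size) {
        char *bigger = realloc(buf, size);
        if (bigger == NULL)
            break;
        buf = bigger;

        struct passwd pw, *found = NULL;
        int err = getpwuid_r(uid, &pw, buf, size, &found);
        if (err == ERANGE) {            /* buffer too small: grow, retry */
            size *= 2;
            continue;
        }
        if (err == 0 && found == NULL) {
            status = 1;                 /* lookup worked, no such entry */
        } else if (err == 0 && strlen(found->pw_name) < name_size) {
            strcpy(name, found->pw_name);   /* copy out before freeing buf */
            status = 0;
        }
        break;
    }

    free(buf);
    return status;
}
```

       Note that the result is copied into caller-owned storage before the scratch buffer is
       freed, since the fields of the returned structure point into that buffer.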

       Below is a list of the non-reentrant functions and their reentrant alternatives.  This
       substitution is done even on functions that you shouldn't be using in the first place.
       These are marked by a "*".  You should instead use the alternate given in the lists
       directly under "libc functions to avoid".

       Even so, some of the preferred alternatives are considered obsolete or otherwise unwise to
       use on some platforms.  These are marked with a '?'.  Also, some alternatives aren't
       Perl-defined functions and aren't in the POSIX Standard, so won't be widely available.
       These are marked with '~'.  (Remember that the automatic substitution only happens when
       they are available and desirable, so you can just use the unpreferred alternative.)

        *asctime()             ?asctime_r()
         crypt()               ~crypt_r()
         ctermid()             ~ctermid_r()
        *ctime()               ?ctime_r()
         endgrent()           ?~endgrent_r()
         endhostent()         ?~endhostent_r()
         endnetent()          ?~endnetent_r()
         endprotoent()        ?~endprotoent_r()
         endpwent()           ?~endpwent_r()
         endservent()         ?~endservent_r()
         getgrent()            ~getgrent_r()
         getgrgid()             getgrgid_r()
         getgrnam()             getgrnam_r()
         gethostbyaddr()       ~gethostbyaddr_r()
         gethostbyname()       ~gethostbyname_r()
         gethostent()          ~gethostent_r()
         getlogin()             getlogin_r()
         getnetbyaddr()        ~getnetbyaddr_r()
         getnetbyname()        ~getnetbyname_r()
         getnetent()           ~getnetent_r()
         getprotobyname()      ~getprotobyname_r()
         getprotobynumber()    ~getprotobynumber_r()
         getprotoent()         ~getprotoent_r()
         getpwent()            ~getpwent_r()
         getpwnam()             getpwnam_r()
         getpwuid()             getpwuid_r()
         getservbyname()       ~getservbyname_r()
         getservbyport()       ~getservbyport_r()
         getservent()          ~getservent_r()
         getspnam()            ~getspnam_r()
         gmtime()               gmtime_r()
         localtime()            localtime_r()
         readdir()             ?readdir_r()
         readdir64()           ~readdir64_r()
         setgrent()           ?~setgrent_r()
         sethostent()         ?~sethostent_r()
        *setlocale()          ?~setlocale_r()
         setnetent()          ?~setnetent_r()
         setprotoent()        ?~setprotoent_r()
         setpwent()           ?~setpwent_r()
         setservent()         ?~setservent_r()
        *strerror()             strerror_r()
        *tmpnam()              ~tmpnam_r()
         ttyname()              ttyname_r()

       The Perl-furnished items are documented in perlapi.

       The bottom line is:

       For items marked "*"
           Replace all uses of these with the preferred alternative given in the lists directly
           under "libc functions to avoid".

       For the remaining items
           If you really need to use these functions, you have two choices:

           If you #define PERL_REENTRANT
               Use the function in the first column as-is, and let perl do the work of
               substituting the function in the right column if available on the platform, and it
               is deemed suitable for use.

               You should look at the man pages for both versions to find any other gotchas.

           If you don't enable automatic substitution
               You should examine the application's code to determine if the column 1 function
               presents a real problem under threads given the circumstances it is used in.  You
               can go directly to the column 2 replacement, but beware of the ones that are
               marked.  Some of those may be nonexistent or flaky on some platforms.

   Functions that need the environment to be constant
       Since the environment is global to a process, all instances depend on it.  One instance
       changing the environment affects all the other instances.  Under threads, any libc call
       that expects the environment to not change for the duration of its execution will have
       undefined results if another thread interrupts it at just the wrong time and changes it.
       These are the functions that the man pages list as being sensitive to that.

        catopen()               gethostbyname2()    newlocale()
        ctime()                 gethostbyname2_r()  regerror()
        ctime_r()               gethostbyname_r()   secure_getenv()
        endhostent()            gethostent()        sethostent()
        endhostent_r()          gethostent_r()      sethostent_r()
        endnetent()             gethostid()         setlocale()
        endnetent_r()           getnameinfo()       setlocale_r()
        execlp()                getnetbyname()      setnetent()
        execvp()                getnetent()         setnetent_r()
        execvpe()               getopt()            strftime()
        fnmatch()               getopt_long()       strptime()
        getaddrinfo()           getopt_long_only()  sysconf()
        get_current_dir_name()  getrpcport()        syslog()
        getdate()               glob()              tempnam()
        getdate_r()             gmtime()            timegm()
        getenv()                gmtime_r()          timelocal()
        gethostbyaddr()         localtime()         tzset()
        gethostbyaddr_r()       localtime_r()       vsyslog()
        gethostbyname()         mktime()

       Many of these functions are problematic under threads for other reasons as well.  See the
       man pages for any you use.

       Perl defines the macros "ENV_READ_LOCK" and "ENV_READ_UNLOCK", which take and release a
       lock on the environment, with which to wrap calls to these functions.  You need to
       consider the possibility of deadlock.  It is expected that a
       different mechanism will be in place and preferred for Perl v5.42.
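
       The usual shape of such a wrapped call is to copy what you need into your own storage
       while the lock is held, and to do nothing else inside it.  The sketch below is plain C
       so that it stands alone: the two SKETCH_ macros are do-nothing placeholders marking
       where XS code would use "ENV_READ_LOCK" and "ENV_READ_UNLOCK", and copy_env_value()
       is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Placeholders so this compiles outside of perl.  They do nothing here;
 * in XS code the real ENV_READ_LOCK and ENV_READ_UNLOCK would be used. */
#define SKETCH_ENV_READ_LOCK    ((void)0)
#define SKETCH_ENV_READ_UNLOCK  ((void)0)

/* Copy the value of environment variable `name` into `out`.
 * Returns 0 on success, 1 if the variable is not set, and -1 if the value
 * does not fit in `out_size` bytes.  The pointer returned by getenv() is
 * never used outside the locked region. */
int copy_env_value(const char *name, char *out, size_t out_size)
{
    assert(name != NULL && out != NULL && out_size > 0);
    int status;

    SKETCH_ENV_READ_LOCK;
    const char *value = getenv(name);
    if (value == NULL) {
        status = 1;
    } else if (strlen(value) >= out_size) {
        status = -1;
    } else {
        strcpy(out, value);             /* private copy, safe after unlock */
        status = 0;
    }
    SKETCH_ENV_READ_UNLOCK;

    return status;
}
```

       Keeping the locked region this short, and calling nothing in it that might take
       another lock, is the simplest way to stay clear of deadlock.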

   Locale-specific issues
       C language programs originally had a single locale global to the entire process.  This was
       later found to be inadequate for many purposes, so later extensions changed that, first
       with Windows, and then POSIX 2008.  In Windows, you can change any thread at any time to
       operate either with a per-thread locale, or with the global one, using a special new libc
       function.  In POSIX, the original API operates only on the global locale, but there is an
       entirely new API to manipulate either per-thread locales or the global one.  As with
       Windows (but using the new API), a thread can be switched at any time to operate on the
       global locale, or a per-thread one.
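
       For a standalone C program (not XS code, where perl manages the locale itself), the
       POSIX 2008 API can be sketched as follows.  The example assumes a libc that declares
       newlocale(3), uselocale(3) and freelocale(3) in <locale.h>, as glibc does; some
       platforms need an extra header.  format_in_c_locale() is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <stdio.h>

/* Format `value` with two decimals using the "C" locale for the calling
 * thread only.  The thread's previous locale setting, whether the global
 * locale or a per-thread one, is put back before returning.
 * Returns 0 on success, -1 on failure. */
int format_in_c_locale(char *out, size_t out_size, double value)
{
    assert(out != NULL && out_size > 0);

    locale_t c_locale = newlocale(LC_ALL_MASK, "C", (locale_t)0);
    if (c_locale == (locale_t)0)
        return -1;

    locale_t previous = uselocale(c_locale);    /* affects this thread only */
    int n = snprintf(out, out_size, "%.2f", value);
    uselocale(previous);                        /* change back */
    freelocale(c_locale);

    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

       The original setlocale() API, by contrast, would have changed the decimal point for
       every thread still using the global locale.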

       When one instance changes the global locale, all other instances using the global locale
       are affected.  Almost all the locale-related functions in the list directly under "Dealing
       with embedded perls and threads" have undefined behavior if another thread interrupts
       their execution and changes the locale.  Under threads, another thread could do exactly
       that.

       But, on systems that have per-thread locales, starting with Perl v5.28, perl uses them
       after initialization; the global locale is not used except if XS code has called
       switch_to_global_locale().  Doing so affects only the thread that called it.  If at most
       one instance is using the global locale, no other instances are affected, the locale of
       concurrently executing functions in other threads is not changed, and this becomes a non-
       issue.  The C preprocessor symbol "USE_THREAD_SAFE_LOCALE" will be defined if per-thread
       locales are available and perl has been compiled to use them.  The implementation of per-
       thread locales on some platforms, like most *BSD-based ones, is so buggy that the perl
       hints files for them deliberately turn off the possibility of using them.

       The converse is that on systems with only a global locale, having different threads using
       different locales is not likely to work well; and changing the locale is dangerous, often
       leading to crashes.

       Perl has extensive code to work as well as possible on both types of systems.  You should
       always use Perl_setlocale() to change and query the locale, as it portably works across
       the range of possibilities.

SEE ALSO

       perlapi, perlapio, perlguts, perlxs