Provided by: erlang-manpages_25.3.2.12+dfsg-1ubuntu2_all

NAME

       binary - Library for handling binary data.

DESCRIPTION

       This  module  contains  functions  for  manipulating  byte-oriented binaries. Although the
       majority of functions could be provided using bit-syntax, the functions  in  this  library
       are  highly optimized and are expected to either execute faster or consume less memory, or
       both, than a counterpart written in pure Erlang.

       The module is provided according to Erlang Enhancement Proposal (EEP) 31.

   Note:
       The library handles byte-oriented data. For bitstrings that are not binaries (that is,
       they do not contain whole octets of bits), any function in this module raises a badarg
       exception.

DATA TYPES

       cp()

              Opaque data type representing a compiled search pattern. Guaranteed to be a tuple()
              to allow programs to distinguish it from non-precompiled search patterns.

       part() = {Start :: integer() >= 0, Length :: integer()}

              A  representation  of  a  part (or range) in a binary. Start is a zero-based offset
              into a binary() and Length is the length of that part. As  input  to  functions  in
              this  module,  a reverse part specification is allowed, constructed with a negative
              Length, so that the part of the binary begins at Start  +  Length  and  is  -Length
              long. This is useful for referencing the last N bytes of a binary as {size(Binary),
              -N}. The functions in this module always return part()s with positive Length.
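
              As a minimal sketch of the reverse part specification, the hypothetical module
              below takes the last N bytes of a binary and converts a part() with a negative
              Length into its forward equivalent. The module name part_demo and its function
              names are illustrative only and are not part of this library.

```erlang
%% Hypothetical helper module; names are assumptions for illustration.
-module(part_demo).
-export([last_n/2, normalize/1]).

%% Return the last N bytes of Bin using a reverse part specification.
%% binary:part/2 raises badarg if N exceeds byte_size(Bin).
last_n(Bin, N) when is_binary(Bin), is_integer(N), N >= 0 ->
    binary:part(Bin, {byte_size(Bin), -N}).

%% Convert a part() that may carry a negative Length into the
%% equivalent part() with a non-negative Length.
normalize({Start, Length}) when Length < 0 ->
    {Start + Length, -Length};
normalize({_Start, _Length} = Part) ->
    Part.
```

              With this sketch, part_demo:last_n(<<"erlang">>, 4) returns <<"lang">> and
              part_demo:normalize({6,-4}) returns {2,4}.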

EXPORTS

       at(Subject, Pos) -> byte()

              Types:

                 Subject = binary()
                 Pos = integer() >= 0

              Returns the byte at position Pos (zero-based) in binary Subject as an  integer.  If
              Pos >= byte_size(Subject), a badarg exception is raised.

       bin_to_list(Subject) -> [byte()]

              Types:

                 Subject = binary()

              Same as bin_to_list(Subject, {0,byte_size(Subject)}).

       bin_to_list(Subject, PosLen) -> [byte()]

              Types:

                 Subject = binary()
                 PosLen = part()

              Converts  Subject  to  a  list of byte()s, each representing the value of one byte.
              part() denotes which part of the binary() to convert.

              Example:

              1> binary:bin_to_list(<<"erlang">>, {1,3}).
              "rla"
              %% or [114,108,97] in list notation.

              If PosLen in any way references outside the binary, a badarg exception is raised.

       bin_to_list(Subject, Pos, Len) -> [byte()]

              Types:

                 Subject = binary()
                 Pos = integer() >= 0
                 Len = integer()

              Same as bin_to_list(Subject, {Pos, Len}).

       compile_pattern(Pattern) -> cp()

              Types:

                 Pattern = binary() | [binary()]

              Builds an internal structure representing a compilation of a search pattern,  later
              to  be  used  in  functions  match/3,  matches/3,  split/3,  or replace/4. The cp()
              returned is guaranteed to be a tuple() to allow programs  to  distinguish  it  from
              non-precompiled search patterns.

              When  a  list of binaries is specified, it denotes a set of alternative binaries to
              search for. For example, if [<<"functional">>,<<"programming">>]  is  specified  as
              Pattern, this means either <<"functional">> or <<"programming">>. The pattern is a
              set of alternatives; when only a single binary is specified, the set has  only  one
              element. The order of alternatives in a pattern is not significant.

              The list of binaries used for search alternatives must be flat and proper.

              If  Pattern  is  not  a binary or a flat proper list of binaries with length > 0, a
              badarg exception is raised.
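
              The following sketch shows one way a pattern might be compiled once and reused
              across many subjects. The module name cp_demo and the function count_keywords/1
              are hypothetical; only compile_pattern/1 and matches/2 come from this module.

```erlang
%% Hypothetical helper module; names are assumptions for illustration.
-module(cp_demo).
-export([count_keywords/1]).

%% Count how many times either keyword occurs in a list of binaries.
%% The pattern is compiled once and reused for every line.
count_keywords(Lines) ->
    CP = binary:compile_pattern([<<"functional">>, <<"programming">>]),
    lists:sum([length(binary:matches(Line, CP)) || Line <- Lines]).
```

              For example, cp_demo:count_keywords([<<"functional programming">>,
              <<"imperative programming">>, <<"logic">>]) returns 3.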

       copy(Subject) -> binary()

              Types:

                 Subject = binary()

              Same as copy(Subject, 1).

       copy(Subject, N) -> binary()

              Types:

                 Subject = binary()
                 N = integer() >= 0

              Creates a binary with the content of Subject duplicated N times.

              This function always creates a new binary, even if N = 1.  By  using  copy/1  on  a
              binary  referencing  a larger binary, one can free up the larger binary for garbage
              collection.

          Note:
              By deliberately copying a single binary to avoid referencing a larger  binary,  one
              can,  instead  of freeing up the larger binary for later garbage collection, create
              much more binary data than needed. Sharing binary data is  usually  good.  Only  in
              special cases, when small parts reference large binaries and the large binaries are
              no longer used in any process, can deliberate copying be a good idea.

              If N < 0, a badarg exception is raised.
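
              As an illustration of copy/2, the sketch below left-pads a binary with zero bytes
              up to a given width. The module pad_demo and the function pad_left/2 are
              hypothetical names chosen for this example.

```erlang
%% Hypothetical helper module; names are assumptions for illustration.
-module(pad_demo).
-export([pad_left/2]).

%% Left-pad Bin with zero bytes until it is Width bytes long.
%% Binaries that are already Width bytes or longer are returned unchanged.
pad_left(Bin, Width) when byte_size(Bin) >= Width ->
    Bin;
pad_left(Bin, Width) ->
    Padding = binary:copy(<<0>>, Width - byte_size(Bin)),
    <<Padding/binary, Bin/binary>>.
```

              For example, pad_demo:pad_left(<<1,2>>, 4) returns <<0,0,1,2>>.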

       decode_unsigned(Subject) -> Unsigned

              Types:

                 Subject = binary()
                 Unsigned = integer() >= 0

              Same as decode_unsigned(Subject, big).

       decode_unsigned(Subject, Endianness) -> Unsigned

              Types:

                 Subject = binary()
                 Endianness = big | little
                 Unsigned = integer() >= 0

              Converts the binary digit representation, in big endian  or  little  endian,  of  a
              positive integer in Subject to an Erlang integer().

              Example:

              1> binary:decode_unsigned(<<169,138,199>>,big).
              11111111

       encode_unsigned(Unsigned) -> binary()

              Types:

                 Unsigned = integer() >= 0

              Same as encode_unsigned(Unsigned, big).

       encode_unsigned(Unsigned, Endianness) -> binary()

              Types:

                 Unsigned = integer() >= 0
                 Endianness = big | little

              Converts  a  positive  integer  to the smallest possible representation in a binary
              digit representation, either big endian or little endian.

              Example:

              1> binary:encode_unsigned(11111111, big).
              <<169,138,199>>
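
              The effect of the Endianness argument can be sketched with a small round trip.
              The module endian_demo and the function roundtrip/2 are hypothetical names used
              only for this illustration.

```erlang
%% Hypothetical helper module; names are assumptions for illustration.
-module(endian_demo).
-export([roundtrip/2]).

%% Encode N with the given endianness, then decode it again.
%% Returns the encoded binary together with the decoded integer.
roundtrip(N, Endianness) when is_integer(N), N >= 0 ->
    Bin = binary:encode_unsigned(N, Endianness),
    {Bin, binary:decode_unsigned(Bin, Endianness)}.
```

              Here endian_demo:roundtrip(11111111, little) returns
              {<<199,138,169>>,11111111}, which is the byte order of the big-endian example
              reversed.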

       encode_hex(Bin) -> Bin2

              Types:

                 Bin = binary()
                 Bin2 = <<_:_*16>>

              Encodes a binary into a hex encoded binary.

              Example:

              1> binary:encode_hex(<<"f">>).
              <<"66">>

       decode_hex(Bin) -> Bin2

              Types:

                 Bin = <<_:_*16>>
                 Bin2 = binary()

              Decodes a hex encoded binary into a binary.

              Example:

              1> binary:decode_hex(<<"66">>).
              <<"f">>
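
              A minimal round-trip sketch, assuming an OTP release that provides both
              encode_hex/1 and decode_hex/1. The module hex_demo and the function roundtrip/1
              are hypothetical names.

```erlang
%% Hypothetical helper module; names are assumptions for illustration.
-module(hex_demo).
-export([roundtrip/1]).

%% Check that hex encoding doubles the size and that decoding the
%% encoded binary gives back the original bytes.
roundtrip(Bin) when is_binary(Bin) ->
    Hex = binary:encode_hex(Bin),
    byte_size(Hex) =:= 2 * byte_size(Bin)
        andalso binary:decode_hex(Hex) =:= Bin.
```

              For any binary Bin, hex_demo:roundtrip(Bin) is expected to return true.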

       first(Subject) -> byte()

              Types:

                 Subject = binary()

              Returns the first byte of binary Subject as an integer. If the size of  Subject  is
              zero, a badarg exception is raised.

       last(Subject) -> byte()

              Types:

                 Subject = binary()

              Returns  the  last  byte of binary Subject as an integer. If the size of Subject is
              zero, a badarg exception is raised.

       list_to_bin(ByteList) -> binary()

              Types:

                 ByteList = iolist()

              Works exactly like erlang:list_to_binary/1; added for completeness.

       longest_common_prefix(Binaries) -> integer() >= 0

              Types:

                 Binaries = [binary()]

              Returns the length of the longest common prefix of the binaries in list Binaries.

              Example:

              1> binary:longest_common_prefix([<<"erlang">>, <<"ergonomy">>]).
              2
              2> binary:longest_common_prefix([<<"erlang">>, <<"perl">>]).
              0

              If Binaries is not a flat list of binaries, a badarg exception is raised.
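
              Because the function returns a length rather than the prefix itself, it can be
              combined with part/3 to extract the shared bytes. The sketch below assumes a
              non-empty list; the module prefix_demo and the function common_prefix/1 are
              hypothetical names.

```erlang
%% Hypothetical helper module; names are assumptions for illustration.
-module(prefix_demo).
-export([common_prefix/1]).

%% Return the longest common prefix of a non-empty list of binaries.
common_prefix([First | _] = Binaries) ->
    Len = binary:longest_common_prefix(Binaries),
    binary:part(First, 0, Len).
```

              For example, prefix_demo:common_prefix([<<"erlang">>, <<"ergonomy">>]) returns
              <<"er">>.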

       longest_common_suffix(Binaries) -> integer() >= 0

              Types:

                 Binaries = [binary()]

              Returns the length of the longest common suffix of the binaries in list Binaries.

              Example:

              1> binary:longest_common_suffix([<<"erlang">>, <<"fang">>]).
              3
              2> binary:longest_common_suffix([<<"erlang">>, <<"perl">>]).
              0

              If Binaries is not a flat list of binaries, a badarg exception is raised.

       match(Subject, Pattern) -> Found | nomatch

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Found = part()

              Same as match(Subject, Pattern, []).

       match(Subject, Pattern, Options) -> Found | nomatch

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Found = part()
                 Options = [Option]
                 Option = {scope, part()}
                 part() = {Start :: integer() >= 0, Length :: integer()}

              Searches for the first occurrence of Pattern in Subject and  returns  the  position
              and length.

              The  function  returns  {Pos,  Length}  for  the binary in Pattern, starting at the
              lowest position in Subject.

              Example:

              1> binary:match(<<"abcde">>, [<<"bcde">>, <<"cd">>],[]).
              {1,4}

              Even though <<"cd">>  ends  before  <<"bcde">>,  <<"bcde">>  begins  first  and  is
              therefore  the  first match. If two overlapping matches begin at the same position,
              the longest is returned.

              Summary of the options:

                {scope, {Start, Length}}:
                  Only the specified part is searched. Return values still have offsets from  the
                  beginning of Subject. A negative Length is allowed as described in section Data
                  Types in this manual.

              If none of the strings in Pattern is found, the atom nomatch is returned.

              For a description of Pattern, see function compile_pattern/1.

              If {scope, {Start,Length}} is specified in the options such that Start  >  size  of
              Subject, Start + Length < 0 or Start + Length > size of Subject, a badarg exception
              is raised.
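
              The scope option can be sketched as a "search from offset" helper. The module
              match_demo and the function find_from/3 are hypothetical names; the returned
              position is still relative to the beginning of Subject, as described above.

```erlang
%% Hypothetical helper module; names are assumptions for illustration.
-module(match_demo).
-export([find_from/3]).

%% Find the first occurrence of Pattern at or after byte offset From.
%% The scope covers everything from From to the end of Subject.
find_from(Subject, Pattern, From)
  when is_binary(Subject), From >= 0, From =< byte_size(Subject) ->
    Scope = {From, byte_size(Subject) - From},
    binary:match(Subject, Pattern, [{scope, Scope}]).
```

              With this sketch, match_demo:find_from(<<"banana">>, <<"an">>, 0) returns {1,2},
              the same call with offset 2 returns {3,2}, and with offset 4 it returns nomatch.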

       matches(Subject, Pattern) -> Found

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Found = [part()]

              Same as matches(Subject, Pattern, []).

       matches(Subject, Pattern, Options) -> Found

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Found = [part()]
                 Options = [Option]
                 Option = {scope, part()}
                 part() = {Start :: integer() >= 0, Length :: integer()}

              Like match/2, but Subject is searched until exhausted and a list of all
              non-overlapping parts matching Pattern is returned (in order).

              The  first and longest match is preferred to a shorter, which is illustrated by the
              following example:

              1> binary:matches(<<"abcde">>,
                                [<<"bcde">>,<<"bc">>,<<"de">>],[]).
              [{1,4}]

              The result shows that <<"bcde">> is selected instead of the shorter match  <<"bc">>
              (which would have given rise to one more match, <<"de">>). This corresponds to the
              behavior of  POSIX  regular  expressions  (and  programs  like  awk),  but  is  not
              consistent  with  alternative  matches  in  re  (and  Perl),  where instead lexical
              ordering in the search pattern selects which string matches.

              If none of the strings in a pattern is found, an empty list is returned.

              For a description of Pattern, see compile_pattern/1. For a description of available
              options, see match/3.

              If  {scope,  {Start,Length}}  is specified in the options such that Start > size of
              Subject, Start + Length < 0 or Start + Length > size of Subject, a badarg
              exception is raised.
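
              Since matches/2 returns part()s, the matched bytes can be recovered with part/2.
              The following sketch uses the hypothetical module matches_demo and function
              matched_parts/2 to show this, including the non-overlapping behavior.

```erlang
%% Hypothetical helper module; names are assumptions for illustration.
-module(matches_demo).
-export([matched_parts/2]).

%% Return the sub-binaries of Subject that matched Pattern, in order.
matched_parts(Subject, Pattern) ->
    [binary:part(Subject, PosLen)
     || PosLen <- binary:matches(Subject, Pattern)].
```

              For example, matches_demo:matched_parts(<<"banana">>, <<"an">>) returns
              [<<"an">>,<<"an">>], while the pattern <<"ana">> yields only [<<"ana">>] because
              the second candidate overlaps the first match.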

       part(Subject, PosLen) -> binary()

              Types:

                 Subject = binary()
                 PosLen = part()

              Extracts the part of binary Subject described by PosLen.

              A negative length can be used to extract bytes at the end of a binary:

              1> Bin = <<1,2,3,4,5,6,7,8,9,10>>.
              2> binary:part(Bin, {byte_size(Bin), -5}).
              <<6,7,8,9,10>>

          Note:
              part/2  and  part/3  are  also  available  in  the  erlang  module  under the names
              binary_part/2 and binary_part/3. Those BIFs are allowed in guard tests.

              If PosLen in any way references outside the binary, a badarg exception is raised.

       part(Subject, Pos, Len) -> binary()

              Types:

                 Subject = binary()
                 Pos = integer() >= 0
                 Len = integer()

              Same as part(Subject, {Pos, Len}).

       referenced_byte_size(Binary) -> integer() >= 0

              Types:

                 Binary = binary()

              If a binary references a larger binary (often described as being a  subbinary),  it
              can  be  useful to get the size of the referenced binary. This function can be used
              in a program to trigger the use of copy/1. By copying a binary, one can dereference
              the original, possibly large, binary that a smaller binary is a reference to.

              Example:

              store(Binary, GBSet) ->
                NewBin =
                    case binary:referenced_byte_size(Binary) of
                        Large when Large > 2 * byte_size(Binary) ->
                           binary:copy(Binary);
                        _ ->
                           Binary
                    end,
                gb_sets:insert(NewBin,GBSet).

              In  this  example,  we  chose  to  copy  the  binary content before inserting it in
              gb_sets:set() if it references a binary more than twice the data size  we  want  to
              keep. Of course, the right rule for when to copy differs from program to program.

              Binary sharing occurs whenever binaries are taken apart. This is the fundamental
              reason why binaries are fast: decomposition can always be done with O(1)
              complexity. In rare circumstances, however, this data sharing is undesirable, which
              is why this function, together with copy/1, can be useful when optimizing for
              memory use.

              Example of binary sharing:

              1> A = binary:copy(<<1>>, 100).
              <<1,1,1,1,1 ...
              2> byte_size(A).
              100
              3> binary:referenced_byte_size(A).
              100
              4> <<B:10/binary, C:90/binary>> = A.
              <<1,1,1,1,1 ...
              5> {byte_size(B), binary:referenced_byte_size(B)}.
              {10,10}
              6> {byte_size(C), binary:referenced_byte_size(C)}.
              {90,100}

              In the above example, the small binary B was  copied  while  the  larger  binary  C
              references binary A.

          Note:
              Binary  data  is  shared  among  processes. If another process still references the
              larger binary, copying the part this process uses only  consumes  more  memory  and
              does not free up the larger binary for garbage collection. Use intrusive functions
              of this kind with extreme care, and only if a real problem is detected.

       replace(Subject, Pattern, Replacement) -> Result

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Replacement = Result = binary()

              Same as replace(Subject, Pattern, Replacement,[]).

       replace(Subject, Pattern, Replacement, Options) -> Result

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Replacement = binary()
                 Options = [Option]
                 Option = global | {scope, part()} | {insert_replaced, InsPos}
                 InsPos = OnePos | [OnePos]
                 OnePos = integer() >= 0
                   An integer() =< byte_size(Replacement)
                 Result = binary()

              Constructs a new binary by replacing the parts in Subject matching Pattern with the
              content of Replacement.

              If the matching subpart of Subject giving rise to the replacement is to be
              inserted in the result, option {insert_replaced, InsPos} inserts the matching  part
              into  Replacement  at  the  specified  position  (or  positions)  before  inserting
              Replacement into Subject.

              Example:

              1> binary:replace(<<"abcde">>,<<"b">>,<<"[]">>, [{insert_replaced,1}]).
              <<"a[b]cde">>
              2> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[]">>,[global,{insert_replaced,1}]).
              <<"a[b]c[d]e">>
              3> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[]">>,[global,{insert_replaced,[1,1]}]).
              <<"a[bb]c[dd]e">>
              4> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[-]">>,[global,{insert_replaced,[1,2]}]).
              <<"a[b-b]c[d-d]e">>

              If any position specified in InsPos > size of  the  replacement  binary,  a  badarg
              exception is raised.

              Options global and {scope, part()} work as for split/3. The return type is always a
              binary().

              For a description of Pattern, see compile_pattern/1.
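
              The options global and {scope, part()} can be compared side by side. In the sketch
              below, the module replace_demo and its function names are hypothetical; each
              function replaces commas with semicolons under a different option list.

```erlang
%% Hypothetical helper module; names are assumptions for illustration.
-module(replace_demo).
-export([first/1, all/1, scoped/2]).

%% Replace only the first comma (no options).
first(Subject) ->
    binary:replace(Subject, <<",">>, <<";">>).

%% Replace every comma in Subject.
all(Subject) ->
    binary:replace(Subject, <<",">>, <<";">>, [global]).

%% Replace every comma found inside the part() Scope; bytes outside
%% the scope are kept unchanged in the result.
scoped(Subject, Scope) ->
    binary:replace(Subject, <<",">>, <<";">>, [global, {scope, Scope}]).
```

              For the subject <<"a,b,c">>, first/1 returns <<"a;b,c">>, all/1 returns
              <<"a;b;c">>, and scoped/2 with the scope {2,3} returns <<"a,b;c">>.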

       split(Subject, Pattern) -> Parts

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Parts = [binary()]

              Same as split(Subject, Pattern, []).

       split(Subject, Pattern, Options) -> Parts

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Options = [Option]
                 Option = {scope, part()} | trim | global | trim_all
                 Parts = [binary()]

              Splits Subject into a list of binaries based on Pattern. If option  global  is  not
              specified, only the first occurrence of Pattern in Subject gives rise to a split.

              The parts of Pattern found in Subject are not included in the result.

              Example:

              1> binary:split(<<1,255,4,0,0,0,2,3>>, [<<0,0,0>>,<<2>>],[]).
              [<<1,255,4>>, <<2,3>>]
              2> binary:split(<<0,1,0,0,4,255,255,9>>, [<<0,0>>, <<255,255>>],[global]).
              [<<0,1>>,<<4>>,<<9>>]

              Summary of options:

                {scope, part()}:
                  Works  as  in match/3 and matches/3. Notice that this only defines the scope of
                  the search for matching strings; it does not cut the binary before splitting.
                  The  bytes  before  and after the scope are kept in the result. See the example
                  below.

                trim:
                  Removes trailing empty parts of the result (as does trim in re:split/3).

                trim_all:
                  Removes all empty parts of the result.

                global:
                  Repeats the split until Subject is exhausted. Conceptually option global  makes
                  split  work  on the positions returned by matches/3, while it normally works on
                  the position returned by match/3.
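
              The difference between trim and trim_all can be sketched with a comma-separated
              binary that has empty fields. The module split_demo and the function fields/2 are
              hypothetical names for this illustration.

```erlang
%% Hypothetical helper module; names are assumptions for illustration.
-module(split_demo).
-export([fields/2]).

%% Split Subject on every comma. Extra is a list of additional
%% split/3 options, for example [], [trim], or [trim_all].
fields(Subject, Extra) when is_binary(Subject), is_list(Extra) ->
    binary:split(Subject, <<",">>, [global | Extra]).
```

              For the subject <<",a,,b,,">>, fields/2 with [] returns
              [<<>>,<<"a">>,<<>>,<<"b">>,<<>>,<<>>], with [trim] it returns
              [<<>>,<<"a">>,<<>>,<<"b">>], and with [trim_all] it returns [<<"a">>,<<"b">>].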

              Example of the difference between a  scope  and  taking  the  binary  apart  before
              splitting:

              1> binary:split(<<"banana">>, [<<"a">>],[{scope,{2,3}}]).
              [<<"ban">>,<<"na">>]
              2> binary:split(binary:part(<<"banana">>,{2,3}), [<<"a">>],[]).
              [<<"n">>,<<"n">>]

              The return type is always a list of binaries that are all referencing Subject. This
              means that the data in Subject is not copied to  new  binaries,  and  that  Subject
              cannot  be  garbage  collected  until  the  results  of  the  split  are  no longer
              referenced.

              For a description of Pattern, see compile_pattern/1.