Provided by: erlang-manpages_20.2.2+dfsg-1ubuntu2_all

NAME

       erl_scan - The Erlang token scanner.

DESCRIPTION

       This module contains functions for tokenizing (scanning) characters into Erlang tokens.

DATA TYPES

       category() = atom()

       error_description() = term()

       error_info() =
           {erl_anno:location(), module(), error_description()}

       option() =
           return |
           return_white_spaces |
           return_comments |
           text |
           {reserved_word_fun, resword_fun()}

       options() = option() | [option()]

       symbol() = atom() | float() | integer() | string()

       resword_fun() = fun((atom()) -> boolean())

       token() =
           {category(), Anno :: erl_anno:anno(), symbol()} |
           {category(), Anno :: erl_anno:anno()}

       tokens() = [token()]

       tokens_result() =
           {ok, Tokens :: tokens(), EndLocation :: erl_anno:location()} |
           {eof, EndLocation :: erl_anno:location()} |
           {error,
            ErrorInfo :: error_info(),
            EndLocation :: erl_anno:location()}

EXPORTS

       category(Token) -> category()

              Types:

                 Token = token()

              Returns the category of Token.

       column(Token) -> erl_anno:column() | undefined

              Types:

                 Token = token()

              Returns the column of Token's collection of annotations. If there is no column, undefined is
              returned.

       end_location(Token) -> erl_anno:location() | undefined

              Types:

                 Token = token()

              Returns the end location of the text of Token's collection of annotations. If there is no text,
              undefined is returned.

       format_error(ErrorDescriptor) -> string()

              Types:

                 ErrorDescriptor = error_description()

              Takes an ErrorDescriptor and returns a string that describes the error or warning. This function
              is usually called implicitly when an ErrorInfo structure is processed (see section Error
              Information).

       line(Token) -> erl_anno:line()

              Types:

                 Token = token()

              Returns the line of Token's collection of annotations.

       location(Token) -> erl_anno:location()

              Types:

                 Token = token()

              Returns the location of Token's collection of annotations.

       reserved_word(Atom :: atom()) -> boolean()

              Returns true if Atom is an Erlang reserved word, otherwise false.
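
              As a minimal sketch of how this predicate behaves, the module below classifies a few atoms.
              The module name reserved_word_demo and the function check/0 are hypothetical and exist only
              for illustration.

```erlang
-module(reserved_word_demo).
-export([check/0]).

%% Pairs each atom with the result of erl_scan:reserved_word/1.
%% 'case', 'end', and 'when' are reserved words; foo is an ordinary atom.
check() ->
    [{Atom, erl_scan:reserved_word(Atom)} || Atom <- ['case', 'end', foo, 'when']].
```

              Calling reserved_word_demo:check() should return
              [{'case',true},{'end',true},{foo,false},{'when',true}].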

       string(String) -> Return

       string(String, StartLocation) -> Return

       string(String, StartLocation, Options) -> Return

              Types:

                 String = string()
                 Options = options()
                 Return =
                     {ok, Tokens :: tokens(), EndLocation} |
                     {error, ErrorInfo :: error_info(), ErrorLocation}
                 StartLocation = EndLocation = ErrorLocation = erl_anno:location()

              Takes the list of characters String and tries to scan (tokenize) them. Returns one of the
              following:

                {ok, Tokens, EndLocation}:
                  Tokens are the Erlang tokens from String. EndLocation is the first location after the last
                  token.

                {error, ErrorInfo, ErrorLocation}:
                  An error occurred. ErrorLocation is the first location after the erroneous token.

              string(String) is equivalent to string(String, 1), and string(String, StartLocation) is equivalent
              to string(String, StartLocation, []).

              StartLocation indicates the initial location when scanning starts. If StartLocation is a line,
              then Anno, EndLocation, and ErrorLocation are lines as well. If StartLocation is a pair of a line
              and a column, Anno takes the form of an opaque compound data type, and EndLocation and
              ErrorLocation are pairs of a line and a column. The token annotations contain information about
              the column and the line where the token begins, as well as the text of the token (if option text
              is specified), all of which can be accessed by calling column/1, line/1, location/1, and text/1.
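
              The effect of a line-and-column StartLocation can be sketched as follows. The module name
              scan_location_demo and the function positions/0 are hypothetical. The sketch reads positions
              through the accessor functions rather than by inspecting the opaque annotation directly.

```erlang
-module(scan_location_demo).
-export([positions/0]).

%% Scans a two-line input starting at line 1, column 1, and reports
%% where each token begins as {Symbol, Line, Column}.
positions() ->
    {ok, Tokens, _EndLocation} = erl_scan:string("foo(\n  Bar).", {1, 1}),
    [{erl_scan:symbol(T), erl_scan:line(T), erl_scan:column(T)} || T <- Tokens].
```

              Here positions() should return [{foo,1,1},{'(',1,4},{'Bar',2,3},{')',2,6},{dot,2,7}]. If the
              start location were the plain line number 1, column/1 would return undefined for every token.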

              A token is a tuple containing information about its syntactic category, the token annotations,
              and the terminal symbol. For punctuation characters (such as ; and |) and reserved words, the
              category and the symbol coincide, and the token is represented by a two-tuple. Three-tuples have
              one of the following forms:

                * {atom, Anno, atom()}

                * {char, Anno, char()}

                * {comment, Anno, string()}

                * {float, Anno, float()}

                * {integer, Anno, integer()}

                * {string, Anno, string()}

                * {var, Anno, atom()}

                * {white_space, Anno, string()}
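
              To make the token shapes concrete, the sketch below scans a short input that mixes several
              categories. The module name scan_tokens_demo and the function scan/0 are hypothetical. With the
              default start location 1, each annotation is simply the line number.

```erlang
-module(scan_tokens_demo).
-export([scan/0]).

%% Scans one line containing an atom, a variable, an integer,
%% a float, a character literal, and the terminating dot.
scan() ->
    {ok, Tokens, _EndLocation} = erl_scan:string("foo Bar 42 3.14 $a."),
    Tokens.
```

              The result should be
              [{atom,1,foo},{var,1,'Bar'},{integer,1,42},{float,1,3.14},{char,1,97},{dot,1}]. The final dot is
              a two-tuple because its category and symbol coincide.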

              Valid options:

                {reserved_word_fun, resword_fun()}:
                  A callback function that is called when the scanner has found an unquoted atom. If the
                  function returns true, the unquoted atom itself becomes the category of the token. If the
                  function returns false, atom becomes the category of the unquoted atom.

                return_comments:
                  Return comment tokens.

                return_white_spaces:
                  Return white space tokens. By convention, a newline character, if present, is always the first
                  character of the text (there cannot be more than one newline in a white space token).

                return:
                  Short for [return_comments, return_white_spaces].

                text:
                  Include the token text in the token annotation. The text is the part of the input
                  corresponding to the token.
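
              The following sketch exercises two of these options. The module name scan_options_demo, its
              functions, and the extra reserved word unless are hypothetical. Because the callback replaces
              the default test, the sketch delegates to reserved_word/1 so that the standard reserved words
              keep their meaning.

```erlang
-module(scan_options_demo).
-export([comments/0, custom_reserved/0]).

%% With return_comments, the comment is kept as a token instead of
%% being skipped.
comments() ->
    {ok, Tokens, _EndLocation} =
        erl_scan:string("% note\nfoo.", 1, [return_comments]),
    Tokens.

%% Treats the atom 'unless' as reserved in addition to the standard
%% reserved words (a hypothetical language extension).
custom_reserved() ->
    IsReserved = fun(Atom) ->
                         Atom =:= unless orelse erl_scan:reserved_word(Atom)
                 end,
    {ok, Tokens, _EndLocation} =
        erl_scan:string("unless X end", 1, [{reserved_word_fun, IsReserved}]),
    Tokens.
```

              comments() should return [{comment,1,"% note"},{atom,2,foo},{dot,2}], and custom_reserved()
              should return [{unless,1},{var,1,'X'},{'end',1}].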

       symbol(Token) -> symbol()

              Types:

                 Token = token()

              Returns the symbol of Token.
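
              A small sketch of category/1 and symbol/1 used together; the module name scan_symbol_demo and
              the function pairs/0 are hypothetical. For two-tuple tokens, such as reserved words, both
              accessors return the same atom.

```erlang
-module(scan_symbol_demo).
-export([pairs/0]).

%% Returns {Category, Symbol} for each token of a short input.
pairs() ->
    {ok, Tokens, _EndLocation} = erl_scan:string("case X of"),
    [{erl_scan:category(T), erl_scan:symbol(T)} || T <- Tokens].
```

              pairs() should return [{'case','case'},{var,'X'},{'of','of'}].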

       text(Token) -> erl_anno:text() | undefined

              Types:

                 Token = token()

              Returns the text of Token's collection of annotations. If there is no text, undefined is returned.
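
              The difference between a token's symbol and its text can be sketched as follows; the module
              name scan_text_demo and its functions are hypothetical. The text is only available when the
              scanner was called with option text.

```erlang
-module(scan_text_demo).
-export([with_text/0, without_text/0]).

%% With option text, the annotation keeps the source spelling of the
%% token, while the symbol holds the evaluated value.
with_text() ->
    {ok, [Token], _EndLocation} = erl_scan:string("16#FF", 1, [text]),
    {erl_scan:symbol(Token), erl_scan:text(Token)}.

%% Without option text, no text is stored in the annotation.
without_text() ->
    {ok, [Token], _EndLocation} = erl_scan:string("16#FF"),
    erl_scan:text(Token).
```

              with_text() should return {255,"16#FF"}, and without_text() should return undefined.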

       tokens(Continuation, CharSpec, StartLocation) -> Return

       tokens(Continuation, CharSpec, StartLocation, Options) -> Return

              Types:

                 Continuation = return_cont() | []
                 CharSpec = char_spec()
                 StartLocation = erl_anno:location()
                 Options = options()
                 Return =
                     {done,
                      Result :: tokens_result(),
                      LeftOverChars :: char_spec()} |
                     {more, Continuation1 :: return_cont()}
                 char_spec() = string() | eof
                 return_cont()
                   An opaque continuation.

              This is the re-entrant scanner, which scans characters until either a dot ('.' followed by a white
              space) or eof is reached. It returns:

                {done, Result, LeftOverChars}:
                  Indicates that there is sufficient input data to get a result. Result is:

                  {ok, Tokens, EndLocation}:
                    The scanning was successful. Tokens is the list of tokens including dot.

                  {eof, EndLocation}:
                    End of file was encountered before any more tokens.

                  {error, ErrorInfo, EndLocation}:
                    An error occurred.

                  LeftOverChars contains the remaining characters of the input data, starting from
                  EndLocation.

                {more, Continuation1}:
                  More data is required for building a term. Continuation1 must be passed in a new call to
                  tokens/3,4 when more data is available.

              The CharSpec eof signals end of file. LeftOverChars then takes the value eof as well.

              tokens(Continuation, CharSpec, StartLocation) is equivalent to tokens(Continuation, CharSpec,
              StartLocation, []).

              For a description of the options, see string/3.
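
              One way to drive the re-entrant scanner is sketched below: input arrives in chunks, the first
              call passes [] as the continuation, and eof is fed once the chunks run out. The module name
              scan_chunks_demo and the function scan_chunks/1 are hypothetical.

```erlang
-module(scan_chunks_demo).
-export([scan_chunks/1]).

%% Feeds a list of strings to erl_scan:tokens/3 until one complete
%% dot-terminated form has been scanned. Returns {Result, LeftOverChars}.
scan_chunks(Chunks) ->
    scan_chunks(Chunks, [], 1).

%% The continuation of the first call is []; later calls pass the
%% continuation returned in {more, Continuation1}.
scan_chunks([Chunk | Rest], Continuation, StartLocation) ->
    case erl_scan:tokens(Continuation, Chunk, StartLocation) of
        {more, Continuation1} ->
            scan_chunks(Rest, Continuation1, StartLocation);
        {done, Result, LeftOverChars} ->
            {Result, LeftOverChars}
    end;
scan_chunks([], Continuation, StartLocation) ->
    %% No more input: signal end of file so the scanner can finish.
    {done, Result, LeftOverChars} =
        erl_scan:tokens(Continuation, eof, StartLocation),
    {Result, LeftOverChars}.
```

              For example, scan_chunks_demo:scan_chunks(["foo(", "bar). rest"]) should return
              {{ok,[{atom,1,foo},{'(',1},{atom,1,bar},{')',1},{dot,1}],1},"rest"}. A caller that wants the
              next form would start over with continuation [] and the left-over characters.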

ERROR INFORMATION

       ErrorInfo is the standard ErrorInfo structure that is returned from all I/O modules. The format is as
       follows:

       {ErrorLocation, Module, ErrorDescriptor}

       A string describing the error is obtained with the following call:

       Module:format_error(ErrorDescriptor)
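
       A sketch of this call in context; the module name scan_error_demo and the function describe/1 are
       hypothetical. The result of format_error/1 is flattened here as a precaution, in case it is a deep
       character list.

```erlang
-module(scan_error_demo).
-export([describe/1]).

%% Scans String and, on failure, turns the ErrorInfo structure into
%% {error, ErrorLocation, Message}, where Message is a flat string.
describe(String) ->
    case erl_scan:string(String) of
        {ok, _Tokens, _EndLocation} ->
            ok;
        {error, {ErrorLocation, Module, ErrorDescriptor}, _EndLocation} ->
            Message = lists:flatten(Module:format_error(ErrorDescriptor)),
            {error, ErrorLocation, Message}
    end.
```

       For instance, scan_error_demo:describe("\"abc") scans an unterminated string and should return an
       error tuple for line 1 with a descriptive message, whereas describe("foo.") returns ok.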

NOTES

       The continuation of the first call to the re-entrant input functions must be []. For a complete
       description of how the re-entrant input scheme works, see Armstrong, Virding and Williams: 'Concurrent
       Programming in Erlang', Chapter 13.

SEE ALSO

       erl_anno(3erl), erl_parse(3erl), io(3erl)