Provided by: erlang-manpages_24.3.4.1+dfsg-1_all

NAME

       erl_scan - The Erlang token scanner.

DESCRIPTION

       This module contains functions for tokenizing (scanning) characters into Erlang tokens.

DATA TYPES

       category() = atom()

       error_description() = term()

       error_info() =
           {erl_anno:location(), module(), error_description()}

       option() =
           return | return_white_spaces | return_comments | text |
           {reserved_word_fun, resword_fun()}

       options() = option() | [option()]

       symbol() = atom() | float() | integer() | string()

       resword_fun() = fun((atom()) -> boolean())

       token() =
           {category(), Anno :: erl_anno:anno(), symbol()} |
           {category(), Anno :: erl_anno:anno()}

       tokens() = [token()]

       tokens_result() =
           {ok, Tokens :: tokens(), EndLocation :: erl_anno:location()} |
           {eof, EndLocation :: erl_anno:location()} |
           {error,
            ErrorInfo :: error_info(),
            EndLocation :: erl_anno:location()}

EXPORTS

       category(Token) -> category()

              Types:

                 Token = token()

              Returns the category of Token.

       column(Token) -> erl_anno:column() | undefined

              Types:

                 Token = token()

              Returns the column of Token's collection of annotations. If there is no column,
              undefined is returned.

       end_location(Token) -> erl_anno:location() | undefined

              Types:

                 Token = token()

              Returns the end location of the text of Token's collection of annotations. If there
              is no text, undefined is returned.

       format_error(ErrorDescriptor) -> string()

              Types:

                 ErrorDescriptor = error_description()

              Uses an ErrorDescriptor and returns a string that describes the error  or  warning.
              This function is usually called implicitly when an ErrorInfo structure is processed
              (see section Error Information).

       line(Token) -> erl_anno:line()

              Types:

                 Token = token()

              Returns the line of Token's collection of annotations.

       location(Token) -> erl_anno:location()

              Types:

                 Token = token()

              Returns the location of Token's collection of annotations.

       reserved_word(Atom :: atom()) -> boolean()

              Returns true if Atom is an Erlang reserved word, otherwise false.

       string(String) -> Return

       string(String, StartLocation) -> Return

       string(String, StartLocation, Options) -> Return

              Types:

                 String = string()
                 Options = options()
                 Return =
                     {ok, Tokens :: tokens(), EndLocation} |
                     {error, ErrorInfo :: error_info(), ErrorLocation}
                 StartLocation = EndLocation = ErrorLocation = erl_anno:location()

              Takes the list of characters String and tries to scan (tokenize) them. Returns  one
              of the following:

                {ok, Tokens, EndLocation}:
                  Tokens  are  the  Erlang  tokens from String. EndLocation is the first location
                  after the last token.

                {error, ErrorInfo, ErrorLocation}:
                  An error occurred. ErrorLocation is the  first  location  after  the  erroneous
                  token.

              string(String)   is   equivalent   to   string(String,   1),   and   string(String,
              StartLocation) is equivalent to string(String, StartLocation, []).

              StartLocation indicates the initial location when scanning starts. If StartLocation
              is  a  line,  Anno, EndLocation, and ErrorLocation are lines. If StartLocation is a
              pair of a line and a column, Anno takes the form of an opaque compound  data  type,
              and  EndLocation  and  ErrorLocation  are  pairs  of a line and a column. The token
              annotations contain information about the column  and  the  line  where  the  token
              begins,  as  well  as  the  text of the token (if option text is specified), all of
              which can be accessed by calling column/1, line/1, location/1, and text/1.

              A token is a tuple containing  information  about  syntactic  category,  the  token
              annotations,  and the terminal symbol. For punctuation characters (such as ; and |)
              and reserved words, the  category  and  the  symbol  coincide,  and  the  token  is
              represented by a two-tuple. Three-tuples have one of the following forms:

                * {atom, Anno, atom()}

                * {char, Anno, char()}

                * {comment, Anno, string()}

                * {float, Anno, float()}

                * {integer, Anno, integer()}

                * {var, Anno, atom()}

                * {white_space, Anno, string()}
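
              The two token shapes described above can be seen by scanning a short clause. The
              following is a minimal sketch; the module name scan_demo and the function scan/1
              are hypothetical and not part of erl_scan.

```erlang
-module(scan_demo).
-export([scan/1]).

%% Scan String with the default start location (line 1) and return
%% only the token list. The match fails if the scanner reports an error.
scan(String) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(String),
    Tokens.
```

              Because the start location is a line, every annotation here is a plain line
              number. Calling scan_demo:scan("foo(X) -> 42.") is expected to return
              [{atom,1,foo},{'(',1},{var,1,'X'},{')',1},{'->',1},{integer,1,42},{dot,1}]:
              punctuation and the final dot are two-tuples, the rest are three-tuples.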

              Valid options:

                {reserved_word_fun, resword_fun()}:
                  A callback function that is called when the scanner has found an unquoted atom.
                  If the function returns true, the unquoted atom itself becomes the category  of
                  the  token.  If  the  function  returns false, atom becomes the category of the
                  unquoted atom.

                return_comments:
                  Return comment tokens.

                return_white_spaces:
                  Return white space tokens. By convention, a newline character, if  present,  is
                  always  the  first character of the text (there cannot be more than one newline
                  in a white space token).

                return:
                  Short for [return_comments, return_white_spaces].

                text:
                  Include the token text in the token annotation. The text is  the  part  of  the
                  input corresponding to the token.
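
              As an illustration of the options, the sketch below keeps comment tokens and
              installs a custom reserved-word callback. The module name scan_options_demo and
              both function names are hypothetical. The callback falls back to
              reserved_word/1, on the assumption that a supplied fun replaces the default
              check rather than extending it.

```erlang
-module(scan_options_demo).
-export([with_comments/1, with_extra_keyword/2]).

%% Keep comment tokens, which are dropped unless requested.
with_comments(String) ->
    {ok, Tokens, _End} = erl_scan:string(String, 1, [return_comments]),
    Tokens.

%% Treat Keyword as a reserved word in addition to the standard ones.
with_extra_keyword(String, Keyword) ->
    IsReserved = fun(Atom) ->
                         Atom =:= Keyword orelse erl_scan:reserved_word(Atom)
                 end,
    {ok, Tokens, _End} =
        erl_scan:string(String, 1, [{reserved_word_fun, IsReserved}]),
    Tokens.
```

              With these definitions, scan_options_demo:with_comments("ok. % done") should
              return [{atom,1,ok},{dot,1},{comment,1,"% done"}], and
              scan_options_demo:with_extra_keyword("unless X", unless) should return
              [{unless,1},{var,1,'X'}], where unless is a two-tuple like any other reserved
              word.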

       symbol(Token) -> symbol()

              Types:

                 Token = token()

              Returns the symbol of Token.

       text(Token) -> erl_anno:text() | undefined

              Types:

                 Token = token()

              Returns  the  text  of  Token's  collection  of  annotations.  If there is no text,
              undefined is returned.
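
              The accessor functions are most useful when scanning starts at a line and column
              pair and option text is given, since the annotation is then an opaque compound
              value. A minimal sketch follows; the module name scan_anno_demo and the function
              describe/1 are hypothetical.

```erlang
-module(scan_anno_demo).
-export([describe/1]).

%% Scan String starting at line 1, column 1 with option text, and
%% describe every token through the accessor functions instead of
%% inspecting the annotation directly.
describe(String) ->
    {ok, Tokens, _End} = erl_scan:string(String, {1, 1}, [text]),
    [#{category => erl_scan:category(T),
       symbol => erl_scan:symbol(T),
       location => erl_scan:location(T),
       end_location => erl_scan:end_location(T),
       text => erl_scan:text(T)} || T <- Tokens].
```

              For the input "X = 10.", the first map is expected to hold category var, symbol
              'X', location {1,1}, end_location {1,2}, and text "X". Without option text,
              text/1 and end_location/1 return undefined.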

       tokens(Continuation, CharSpec, StartLocation) -> Return

       tokens(Continuation, CharSpec, StartLocation, Options) -> Return

              Types:

                 Continuation = return_cont() | []
                 CharSpec = char_spec()
                 StartLocation = erl_anno:location()
                 Options = options()
                 Return =
                     {done,
                      Result :: tokens_result(),
                      LeftOverChars :: char_spec()} |
                     {more, Continuation1 :: return_cont()}
                 char_spec() = string() | eof
                 return_cont()
                   An opaque continuation.

              This is the re-entrant scanner, which scans characters until either a dot ('.'
              followed by white space) or eof is reached. It returns:

                {done, Result, LeftOverChars}:
                  Indicates that there is sufficient input data to get a result. Result is:

                  {ok, Tokens, EndLocation}:
                    The scanning was successful. Tokens is the list of tokens including dot.

                  {eof, EndLocation}:
                    End of file was encountered before any more tokens.

                  {error, ErrorInfo, EndLocation}:
                    An  error  occurred.  LeftOverChars  is the remaining characters of the input
                    data, starting from EndLocation.

                {more, Continuation1}:
                  More data is required for building a term. Continuation1 must be  passed  in  a
                  new call to tokens/3,4 when more data is available.

              The  CharSpec  eof  signals  end of file. LeftOverChars then takes the value eof as
              well.

              tokens(Continuation, CharSpec, StartLocation) is equivalent to tokens(Continuation,
              CharSpec, StartLocation, []).

              For a description of the options, see string/3.
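
              The protocol above can be driven by a small loop. The sketch below splits a string
              into one token list per dot-terminated form. It starts each form with the
              continuation [], passes LeftOverChars and EndLocation into the next call, and
              sends eof once the scanner asks for more input. The module name scan_forms_demo
              and the function scan_forms/1 are hypothetical. A real caller would typically
              read further data from a file or socket instead of sending eof immediately.

```erlang
-module(scan_forms_demo).
-export([scan_forms/1]).

%% Return {ok, [Tokens]} with one token list per form, or
%% {error, ErrorInfo} for the first scanning error.
scan_forms(Chars) ->
    scan_forms([], Chars, 1, []).

scan_forms(Cont, Chars, Loc, Acc) ->
    case erl_scan:tokens(Cont, Chars, Loc) of
        {done, {ok, Tokens, EndLoc}, Rest} ->
            %% One form is complete; start afresh on the leftover input.
            scan_forms([], Rest, EndLoc, [Tokens | Acc]);
        {done, {eof, _EndLoc}, _Rest} ->
            {ok, lists:reverse(Acc)};
        {done, {error, ErrorInfo, _EndLoc}, _Rest} ->
            {error, ErrorInfo};
        {more, Cont1} ->
            %% No further input in this sketch, so signal end of file.
            scan_forms(Cont1, eof, Loc, Acc)
    end.
```

              For example, scan_forms_demo:scan_forms("a.\nb.\n") is expected to return
              {ok,[[{atom,1,a},{dot,1}],[{atom,2,b},{dot,2}]]}. If the input ends in the middle
              of a form, the tokens scanned so far are returned as a final list without a dot.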

ERROR INFORMATION

       ErrorInfo  is  the standard ErrorInfo structure that is returned from all I/O modules. The
       format is as follows:

       {ErrorLocation, Module, ErrorDescriptor}

       A string describing the error is obtained with the following call:

       Module:format_error(ErrorDescriptor)
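
       A minimal sketch of this pattern is shown below. The module name scan_error_demo and the
       function check/1 are hypothetical. The result of format_error/1 is flattened on the
       assumption that a callback module may return a deep character list.

```erlang
-module(scan_error_demo).
-export([check/1]).

%% Scan String and turn a scanner error into a readable message.
check(String) ->
    case erl_scan:string(String) of
        {ok, _Tokens, _EndLocation} ->
            ok;
        {error, {ErrorLocation, Module, ErrorDescriptor}, _EndLocation} ->
            Message = lists:flatten(Module:format_error(ErrorDescriptor)),
            {error, ErrorLocation, Message}
    end.
```

       For instance, scan_error_demo:check("'oops") scans an unterminated quoted atom and
       returns an error tuple with a descriptive message, whereas
       scan_error_demo:check("fine.") returns ok.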

NOTES

       The continuation of the first call to the re-entrant input functions must  be  [].  For  a
       complete  description of how the re-entrant input scheme works, see Armstrong, Virding and
       Williams: 'Concurrent Programming in Erlang', Chapter 13.

SEE ALSO

       erl_anno(3erl), erl_parse(3erl), io(3erl)