Provided by: ocaml-man_4.08.1-8_all bug

NAME

       Scanf - Formatted input functions.

Module

       Module   Scanf

Documentation

       Module Scanf
        : sig end

       Formatted input functions.

   Introduction
   Functional input with format strings
       The module Scanf provides formatted input functions or scanners.

       The  formatted  input functions can read from any kind of input, including strings, files,
       or anything that can return characters. The more general source of characters is  named  a
       formatted  input channel (or     scanning buffer) and has type Scanf.Scanning.in_channel .
       The more general formatted input function reads from any  scanning  buffer  and  is  named
       bscanf .

       Generally speaking, the formatted input functions have 3 arguments:

       -the first argument is a source of characters for the input,

       -the second argument is a format string that specifies the values to read,

       -the third argument is a receiver function that is applied to the values read.

       Hence,  a  typical  call to the formatted input function Scanf.bscanf is bscanf ic fmt f ,
       where:

       - ic is a source  of  characters  (typically  a      formatted  input  channel  with  type
       Scanf.Scanning.in_channel ),

       -  fmt  is  a  format string (the same format strings as those used to print material with
       module Printf or Format ),

       - f is a function that has as many arguments as the number of values to read in the  input
       according to fmt .

   A simple example
       As  suggested  above,  the  expression  bscanf  ic %d f reads a decimal integer n from the
       source of characters ic and returns f n .

       For instance,

       -if we use stdin as the source of characters  (  Scanf.Scanning.stdin  is  the  predefined
       formatted input channel that reads from standard input),

       -if we define the receiver f as let f x = x + 1 ,

       then bscanf Scanning.stdin %d f reads an integer n from the standard input and returns f n
       (that is n + 1 ). Thus, if we evaluate bscanf stdin %d f  ,  and  then  enter  41  at  the
       keyboard, the result we get is 42 .

   Formatted input as a functional feature
       The OCaml scanning facility is reminiscent of the corresponding C feature.  However, it is
       also largely different, simpler, and yet more powerful: the formatted input functions  are
       higher-order  functionals and the parameter passing mechanism is just the regular function
       application not the variable assignment based mechanism which  is  typical  for  formatted
       input  in  imperative languages; the OCaml format strings also feature useful additions to
       easily define complex tokens; as expected within a functional  programming  language,  the
       formatted  input  functions also support polymorphism, in particular arbitrary interaction
       with polymorphic user-defined scanners. Furthermore, the OCaml formatted input facility is
       fully type-checked at compile time.

   Formatted input channel
       module Scanning : sig end

   Type of formatted input functions
       type ('a, 'b, 'c, 'd) scanner = ('a, Scanning.in_channel, 'b, 'c, 'a -> 'd, 'd) format6 ->
       'c

       The type of formatted input scanners: ('a, 'b, 'c, 'd) scanner is the type of a  formatted
       input  function  that  reads  from  some  formatted input channel according to some format
       string; more precisely, if scan is some formatted input  function,  then  scan  ic  fmt  f
       applies  f  to all the arguments specified by format string fmt , when scan has read those
       arguments from the Scanf.Scanning.in_channel formatted input channel ic .

       For instance, the Scanf.scanf function below has type ('a, 'b, 'c, 'd) scanner , since  it
       is a formatted input function that reads from Scanf.Scanning.stdin : scanf fmt f applies f
       to the arguments specified by fmt , reading those arguments from stdin as expected.

       If the format fmt has some %r indications, the  corresponding  formatted  input  functions
       must  be  provided  before  receiver  function  f . For instance, if read_elem is an input
       function for values of type t , then bscanf ic %r; read_elem f reads a value v of  type  t
       followed by a ';' character, and returns f v .

       Since 3.10.0

       exception Scan_failure of string

       When  the  input  can  not be read according to the format string specification, formatted
       input functions typically raise exception Scan_failure .

   The general formatted input function
       val bscanf : Scanning.in_channel -> ('a, 'b, 'c, 'd) scanner

       bscanf ic fmt r1 ... rN f reads characters from  the  Scanf.Scanning.in_channel  formatted
       input  channel ic and converts them to values according to format string fmt .  As a final
       step, receiver function f is applied to the values read and gives the result of the bscanf
       call.

       For  instance,  if  f  is the function fun s i -> i + 1 , then Scanf.sscanf x= 1 %s = %i f
       returns 2 .

       Arguments r1 to rN are user-defined input functions that read the  argument  corresponding
       to the %r conversions specified in the format string.

   Format string description
       The format string is a character string which contains three types of objects:

       -plain  characters,  which  are  simply  matched  with the characters of the input (with a
       special case for space and line feed, see Scanf.space ),

       -conversion specifications, each of which causes reading and conversion  of  one  argument
       for the function f (see Scanf.conversion ),

       -scanning indications to specify boundaries of tokens (see scanning Scanf.indication ).

   The space character in format strings
       As  mentioned  above, a plain character in the format string is just matched with the next
       character of the input; however, two characters are special exceptions to this  rule:  the
       space  character  ( ' ' or ASCII code 32) and the line feed character ( '\n' or ASCII code
       10).  A space does not match a single space character, but any amount of  'whitespace'  in
       the  input.  More  precisely,  a space inside the format string matches any number of tab,
       space, line feed and carriage return characters. Similarly, a line feed character  in  the
       format  string  matches  either a single line feed or a carriage return followed by a line
       feed.

       Matching any amount of whitespace, a space in the format string also matches no amount  of
       whitespace  at  all;  hence,  the  call  bscanf  ib Price = %d $ (fun p -> p) succeeds and
       returns 1 when reading an input with various whitespace in it, such as Price = 1 $ , Price
       = 1 $ , or even Price=1$ .

   Conversion specifications in format strings
       Conversion  specifications  consist  in  the % character, followed by an optional flag, an
       optional field width, and followed by one or two conversion characters.

       The conversion characters and their meanings are:

       - d : reads an optionally signed decimal integer ( 0-9 +).

       - i : reads an optionally signed integer (usual input conventions for decimal  (  0-9  +),
       hexadecimal  (  0x[0-9a-f]+ and 0X[0-9A-F]+ ), octal ( 0o[0-7]+ ), and binary ( 0b[0-1]+ )
       notations are understood).

       - u : reads an unsigned decimal integer.

       - x or X : reads an unsigned hexadecimal integer ( [0-9a-fA-F]+ ).

       - o : reads an unsigned octal integer ( [0-7]+ ).

       - s : reads a string argument that spreads  as  much  as  possible,  until  the  following
       bounding condition holds:

       -a whitespace has been found (see Scanf.space ),

       -a scanning indication (see scanning Scanf.indication ) has been encountered,

       -the end-of-input has been reached.

       Hence,  this  conversion  always  succeeds:  it  returns  an  empty string if the bounding
       condition holds when the scan begins.

       - S : reads a delimited string argument (delimiters and special escaped characters  follow
       the lexical conventions of OCaml).

       -  c  :  reads a single character. To test the current input character without reading it,
       specify a null field width, i.e. use specification %0c . Raise Invalid_argument ,  if  the
       field width specification is greater than 1.

       - C : reads a single delimited character (delimiters and special escaped characters follow
       the lexical conventions of OCaml).

       - f , e , E , g , G  :  reads  an  optionally  signed  floating-point  number  in  decimal
       notation, in the style dddd.ddd e/E+-dd .

       - h , H : reads an optionally signed floating-point number in hexadecimal notation.

       -  F  : reads a floating point number according to the lexical conventions of OCaml (hence
       the decimal point is mandatory if the exponent part is not mentioned).

       - B : reads a boolean argument ( true or false ).

       - b : reads a boolean argument (for backward compatibility; do not use in new programs).

       - ld , li , lu , lx , lX , lo : reads an int32 argument to the  format  specified  by  the
       second letter for regular integers.

       -  nd , ni , nu , nx , nX , no : reads a nativeint argument to the format specified by the
       second letter for regular integers.

       - Ld , Li , Lu , Lx , LX , Lo : reads an int64 argument to the  format  specified  by  the
       second letter for regular integers.

       -  [  range ] : reads characters that matches one of the characters mentioned in the range
       of characters range (or not mentioned in it, if the range starts with ^ ). Reads a  string
       that  can  be  empty,  if  the  next  input character does not match the range. The set of
       characters from c1 to c2 (inclusively) is denoted by c1-c2  .   Hence,  %[0-9]  returns  a
       string  representing  a  decimal  number  or an empty string if no decimal digit is found;
       similarly, %[0-9a-f] returns a string of hexadecimal digits.  If a closing bracket appears
       in a range, it must occur as the first character of the range (or just after the ^ in case
       of range negation); hence []] matches a ] character and [^]] matches any character that is
       not ] .  Use %% and %@ to include a % or a @ in a range.

       -  r  :  user-defined reader. Takes the next ri formatted input function and applies it to
       the scanning buffer ib to read the next argument. The input  function  ri  must  therefore
       have type Scanning.in_channel -> 'a and the argument read has type 'a .

       -  {  fmt  %}  : reads a format string argument. The format string read must have the same
       type as the format string specification fmt . For instance, %{  %i  %}  reads  any  format
       string that can read a value of type int ; hence, if s is the string fmt:\ number is %u\""
       , then Scanf.sscanf s fmt: %{%i%} succeeds and returns the format string number is %u .

       - ( fmt %) : scanning sub-format substitution.  Reads a format string  rf  in  the  input,
       then  goes  on  scanning with rf instead of scanning with fmt .  The format string rf must
       have the same type as the format string specification fmt that it replaces.  For instance,
       %(  %i  %)  reads  any  format  string that can read a value of type int .  The conversion
       returns the format string read rf , and then a value read using rf .  Hence, if s  is  the
       string  \  %4d\"1234.00"  ,  then Scanf.sscanf s %(%i%) (fun fmt i -> fmt, i) evaluates to
       ("%4d", 1234) .  This behaviour is not mere  format  substitution,  since  the  conversion
       returns  the  format  string  read  as  additional  argument.  If  you  need  pure  format
       substitution, use special flag _ to discard the extraneous argument: conversion %_( fmt %)
       reads  a  format string rf and then behaves the same as format string rf .  Hence, if s is
       the string \  %4d\"1234.00"  ,  then  Scanf.sscanf  s  %_(%i%)  is  simply  equivalent  to
       Scanf.sscanf 1234.00 %4d .

       - l : returns the number of lines read so far.

       - n : returns the number of characters read so far.

       - N or L : returns the number of tokens read so far.

       - !  : matches the end of input condition.

       - % : matches one % character in the input.

       - @ : matches one @ character in the input.

       - , : does nothing.

       Following  the % character that introduces a conversion, there may be the special flag _ :
       the conversion that follows occurs as usual, but the resulting value  is  discarded.   For
       instance,  if  f  is  the  function  fun  i  ->  i  + 1 , and s is the string x = 1 , then
       Scanf.sscanf s %_s = %i f returns 2 .

       The field width is composed of an optional integer literal indicating the maximal width of
       the  token  to read.  For instance, %6d reads an integer, having at most 6 decimal digits;
       %4f reads a float with at  most  4  characters;  and  %8[\000-\255]  returns  the  next  8
       characters  (or  all  the  characters  still  available,  if  fewer  than 8 characters are
       available in the input).

       Notes:

       -as mentioned above, a %s conversion always succeeds, even if there is nothing to read  in
       the input: in this case, it simply returns  .

       -in  addition  to  the  relevant digits, '_' characters may appear inside numbers (this is
       reminiscent to the usual OCaml lexical conventions). If stricter scanning is desired,  use
       the range conversion facility instead of the number conversions.

       -the  scanf  facility  is  not intended for heavy duty lexical analysis and parsing. If it
       appears not  expressive  enough  for  your  needs,  several  alternative  exists:  regular
       expressions   (module  Str  ),  stream  parsers,  ocamllex  -generated  lexers,  ocamlyacc
       -generated parsers.

   Scanning indications in format strings
       Scanning indications appear just after the string conversions %s and %[ range ] to delimit
       the  end  of  the token. A scanning indication is introduced by a @ character, followed by
       some plain character c . It means that the string token should end just  before  the  next
       matching  c (which is skipped). If no c character is encountered, the string token spreads
       as much as possible. For instance, %s@\t reads a string up to the next tab character or to
       the  end  of  input.  If  a  @ character appears anywhere else in the format string, it is
       treated as a plain character.

       Note:

       -As usual in format strings, % and @ characters must be escaped using %%  and  %@  ;  this
       rule  still  holds  within  range  specifications and scanning indications.  For instance,
       format %s@%% reads a string up to the next % character, and format %s@%@ reads a string up
       to the next @ .

       -The  scanning  indications  introduce  slight  differences  in the syntax of Scanf format
       strings, compared to those used for the Printf module. However, the  scanning  indications
       are similar to those used in the Format module; hence, when producing formatted text to be
       scanned by Scanf.bscanf , it is wise to use printing functions from the Format module (or,
       if  you  need  to  use functions from Printf , banish or carefully double check the format
       strings that contain '@' characters).

   Exceptions during scanning
       Scanners may raise the following exceptions when the input cannot be read according to the
       format string:

       -Raise Scanf.Scan_failure if the input does not match the format.

       -Raise Failure if a conversion to a number is not possible.

       -Raise  End_of_file  if  the  end  of  input is encountered while some more characters are
       needed to read the current conversion specification.

       -Raise Invalid_argument if the format string is invalid.

       Note:

       -as a consequence, scanning a %s conversion never raises exception End_of_file  :  if  the
       end  of input is reached the conversion succeeds and simply returns the characters read so
       far, or  if none were ever read.

   Specialised formatted input functions
       val sscanf : string -> ('a, 'b, 'c, 'd) scanner

       Same as Scanf.bscanf , but reads from the given string.

       val scanf : ('a, 'b, 'c, 'd) scanner

       Same  as  Scanf.bscanf  ,  but  reads  from  the  predefined   formatted   input   channel
       Scanf.Scanning.stdin that is connected to stdin .

       val kscanf : Scanning.in_channel -> (Scanning.in_channel -> exn -> 'd) -> ('a, 'b, 'c, 'd)
       scanner

       Same as Scanf.bscanf , but takes an additional function argument ef that is called in case
       of  error:  if the scanning process or some conversion fails, the scanning function aborts
       and calls the error handling  function  ef  with  the  formatted  input  channel  and  the
       exception that aborted the scanning process as arguments.

       val ksscanf : string -> (Scanning.in_channel -> exn -> 'd) -> ('a, 'b, 'c, 'd) scanner

       Same as Scanf.kscanf but reads from the given string.

       Since 4.02.0

   Reading format strings from input
       val  bscanf_format  : Scanning.in_channel -> ('a, 'b, 'c, 'd, 'e, 'f) format6 -> (('a, 'b,
       'c, 'd, 'e, 'f) format6 -> 'g) -> 'g

       bscanf_format ic fmt f reads a format string token from the formatted input channel  ic  ,
       according  to  the  given format string fmt , and applies f to the resulting format string
       value.  Raise Scanf.Scan_failure if the format string value read does not  have  the  same
       type as fmt .

       Since 3.09.0

       val  sscanf_format  :  string -> ('a, 'b, 'c, 'd, 'e, 'f) format6 -> (('a, 'b, 'c, 'd, 'e,
       'f) format6 -> 'g) -> 'g

       Same as Scanf.bscanf_format , but reads from the given string.

       Since 3.09.0

       val format_from_string : string -> ('a, 'b, 'c, 'd, 'e, 'f) format6 -> ('a,  'b,  'c,  'd,
       'e, 'f) format6

       format_from_string  s  fmt converts a string argument to a format string, according to the
       given format string fmt .  Raise Scanf.Scan_failure if s , considered as a format  string,
       does not have the same type as fmt .

       Since 3.10.0

       val unescaped : string -> string

       unescaped s return a copy of s with escape sequences (according to the lexical conventions
       of  OCaml)  replaced  by  their  corresponding  special   characters.    More   precisely,
       Scanf.unescaped   has   the  following  property:  for  all  string  s  ,  Scanf.unescaped
       (String.escaped s) = s .

       Always return a copy of the argument, even if there is no escape sequence in the argument.
       Raise  Scanf.Scan_failure  if  s  is  not  properly  escaped  (i.e.   s has invalid escape
       sequences  or  special  characters  that  are  not  properly  escaped).    For   instance,
       Scanf.unescaped \" will fail.

       Since 4.00.0

   Deprecated
       val fscanf : in_channel -> ('a, 'b, 'c, 'd) scanner

       Deprecated.

       Scanf.fscanf is error prone and deprecated since 4.03.0.

       This  function  violates the following invariant of the Scanf module: To preserve scanning
       semantics, all scanning  functions  defined  in  Scanf  must  read  from  a  user  defined
       Scanf.Scanning.in_channel formatted input channel.

       If   you   need   to   read  from  a  in_channel  input  channel  ic  ,  simply  define  a
       Scanf.Scanning.in_channel formatted input channel as in let ib = Scanning.from_channel  ic
       , then use Scanf.bscanf ib as usual.

       val kfscanf : in_channel -> (Scanning.in_channel -> exn -> 'd) -> ('a, 'b, 'c, 'd) scanner

       Deprecated.

       Scanf.kfscanf is error prone and deprecated since 4.03.0.