Provided by: haserl_0.9.36-1_amd64

NAME

       haserl - A cgi scripting program for embedded environments

SYNOPSIS

       #!/usr/bin/haserl   [--shell=pathspec]  [--upload-dir=dirspec]  [--upload-handler=handler]
       [--upload-limit=limit] [--accept-all] [--accept-none] [--silent] [--debug]

       [ text ] [ <% shell script %> ] [ text ] ...

DESCRIPTION

       Haserl is a small cgi wrapper that allows "PHP" style cgi programming,  but  uses  a  UNIX
       bash-like  shell  or Lua  as the programming language. It is very small, so it can be used
       in embedded environments, or where something like PHP is too big.

       It combines three features into a small cgi engine:

              It parses POST and GET requests, placing form-elements as name=value pairs into the
              environment for the CGI script to use.  This is somewhat like the uncgi wrapper.

              It opens a shell, and translates all text into printable statements.  All text
              within <% ... %> constructs is passed verbatim to the shell.  This is somewhat
              like writing PHP scripts.

              It  can optionally be installed to drop its permissions to the owner of the script,
              giving it some of the security features of suexec or cgiwrapper.

OPTIONS SUMMARY

       This is a summary of the command-line options.  Please see the OPTIONS section  under  the
       long option name for a complete description.

       -a, --accept-all
       -n, --accept-none
       -d, --debug
       -s, --shell
       -S, --silent
       -U, --upload-dir
       -u, --upload-limit
       -H, --upload-handler

OPTIONS

       --accept-all
              The program normally accepts POST data only when the REQUEST_METHOD is POST, and
              accepts URL data only when the REQUEST_METHOD is GET.  This option
              allows both POST and URL data to be accepted regardless of the REQUEST_METHOD.
              When this option is set, the REQUEST_METHOD takes precedence (e.g.  if  the  method
              is  POST,  FORM_variables  are  taken from COOKIE data, GET data, and POST data, in
              that order.   If the method is GET, FORM_variables are taken from COOKIE data, POST
              data,  and  GET  data.)   The default is not to accept all input methods - just the
              COOKIE data and the REQUEST_METHOD.
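
              The precedence rules above can be sketched in plain shell.  The helper
              source_order is hypothetical and is not part of haserl; it only prints the order
              in which the sources would be parsed, with the last one listed winning.

```shell
# Hypothetical helper (not part of haserl): print the order in which input
# sources are parsed, following the --accept-all description.
# $1 = REQUEST_METHOD (GET or POST), $2 = 1 if --accept-all is set, else 0
source_order() {
    method=$1
    accept_all=$2
    if [ "$accept_all" = 1 ]; then
        # The source named by REQUEST_METHOD is parsed last, so it wins.
        case $method in
            POST) echo "COOKIE GET POST" ;;
            GET)  echo "COOKIE POST GET" ;;
        esac
    else
        # Default: only cookie data and the request method's own data.
        echo "COOKIE $method"
    fi
}

source_order POST 1
source_order GET 0
```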

       --accept-none
              If given, haserl will not parse standard input as http  content  before  processing
              the script.  This is useful if calling a haserl script from another haserl script.

       --debug
              Instead of executing the script, print out the script that would be executed.  If
              the environment variable 'REQUEST_METHOD' is set, the data is sent with the
              text/plain content type.  Otherwise, the shell script is printed verbatim.

       --shell=pathspec
              Specify an alternative bash-like shell to use.  Defaults to "/bin/sh".

              To include shell parameters do not use the --shell=/bin/sh format. Instead, use the
              alternative format without the "=", as in --shell "/bin/bash --norc".  Be  sure  to
              quote the option string to protect any special characters.

              If compiled with Lua libraries, the string "lua" selects an integrated
              Lua vm.  This string is case sensitive.  Example: --shell=lua

              An alternative is "luac".  This causes the haserl and lua parsers to  be  disabled,
              and  the  script is assumed to be a precompiled lua chunk.  See LUAC below for more
              information.

       --silent
              Haserl  normally  prints  an  informational  message  on  error  conditions.   This
              suppresses the error message, so that the use of haserl is not advertised.

       --upload-dir=dirspec
              Defaults to "/tmp".  All uploaded files are created with a temporary filename in this
              directory.  HASERL_xxx_path contains the name of the temporary file.  FORM_xxx_name
              contains the original name of the file, as specified by the client.

       --upload-handler=pathspec
              When  specified,  file  uploads are handled by this handler, rather than written to
              temporary files.  The full pathspec must be given (the PATH is not  searched),  and
              the  upload-handler  is  given  one command-line parameter: The name of the FIFO on
              which the upload file will be  sent.   In  addition,  the  handler  may  receive  3
              environment  variables:  CONTENT_TYPE,  FILENAME, and NAME.  These reflect the MIME
              content-disposition headers for the content. Haserl will fork the handler for  each
              file uploaded, and will send the contents of the upload file to the specified FIFO.
              Haserl will then block until the handler terminates.  This method  is  for  experts
              only.
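
              A minimal handler might look like the sketch below.  It assumes only what is
              stated above (the FIFO name arrives as the first parameter).  DEST_DIR and the
              function name handle_upload are hypothetical, and where the handler's own output
              ends up is not specified here, so the sketch prints nothing.

```shell
#!/bin/sh
# Sketch of an upload handler.  haserl passes the FIFO name as $1 and may
# set CONTENT_TYPE, FILENAME and NAME.  DEST_DIR is a hypothetical setting.
DEST_DIR=${DEST_DIR:-/var/tmp}

handle_upload() {
    fifo=$1
    # Do not use the client-supplied FILENAME as a path; pick our own name.
    dest="$DEST_DIR/upload.$$"
    # Read until the writer closes its end of the FIFO.
    cat "$fifo" > "$dest" || return 1
}

# Run only when a FIFO argument was actually supplied.
if [ -n "${1:-}" ]; then
    handle_upload "$1"
fi
```

              Because haserl blocks until the handler terminates, the handler should exit as
              soon as it has read the FIFO to the end.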

       --upload-limit=limit
              Allow  a  mime-encoded  file up to limit KB to be uploaded.  The default is 0KB (no
              uploads allowed).  Note that mime-encoding adds 33% to the size of the data.
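
              As a rough sizing aid, the limit for a given raw file size can be estimated from
              the 33% figure above.  The helper suggest_limit_kb is hypothetical, and the
              result is only an estimate, since other form fields also add to the request.

```shell
# Hypothetical helper: suggest an --upload-limit value (in KB) for a raw file
# of $1 KB, assuming the encoded upload is about one third larger.
suggest_limit_kb() {
    raw_kb=$1
    # Integer ceiling of raw_kb * 4 / 3.
    echo $(( (raw_kb * 4 + 2) / 3 ))
}

suggest_limit_kb 3072    # prints 4096
```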

OVERVIEW OF OPERATION

       In general, the web server sets up several environment variables, and then  uses  fork  or
       another  method  to  run  the  CGI script.  If the script uses the haserl interpreter, the
       following happens:

              If haserl is installed suid root, then uid/gid is set to the owner of the script.

              The environment is scanned for HTTP_COOKIE, which may have  been  set  by  the  web
              server.   If it exists, the parsed contents are placed in the local environment.

              The  environment  is  scanned  for REQUEST_METHOD, which was set by the web server.
              Based on the request method,  standard  input  is  read  and  parsed.   The  parsed
              contents are placed in the local environment.

              The  script  is  tokenized,  parsing haserl code blocks from raw text.  Raw text is
              converted into "echo" statements, and then all tokens are sent to the sub-shell.

              haserl forks and a sub-shell (typically /bin/sh) is started.

              All tokens are sent to the STDIN of the sub-shell, with a trailing exit command.

              When the sub-shell terminates, the haserl interpreter performs  final  cleanup  and
              then terminates.
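
       The tokenizing and sub-shell steps can be imitated by hand.  The sketch below feeds a
       hand-translated version of a two-line template to a sub-shell on STDIN.  The exact
       statements haserl generates may differ (printf stands in for the "echo" statements
       here); use --debug to see the real output for a given script.

```shell
# Template being imitated (haserl syntax):
#   Hello, <% name=World %><%= $name %>
#
# Hand-written stand-in for the translated script, sent to a sub-shell on
# STDIN with a trailing exit command.
output=$(sh <<'EOF'
printf '%s' 'Hello, '
name=World
echo "$name"
exit
EOF
)
echo "$output"
```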

CLIENT SIDE INPUT

       The haserl interpreter will decode data sent via the HTTP_COOKIE environment variable, and
       the GET or POST method from the client, and store them as environment variables  that  can
       be  accessed  by  haserl.   The name of the variable follows the name given in the source,
       except that a prefix ( FORM_) is prepended.  For example, if the client  sends  "foo=bar",
       the environment variable is FORM_foo=bar.

       For  the  HTTP_COOKIE  method, variables are also stored with the prefix ( COOKIE_) added.
       For  example,  if  HTTP_COOKIE   includes   "foo=bar",   the   environment   variable   is
       COOKIE_foo=bar.

       For the GET method, data sent in the form %xx is translated into the character it
       represents, and variables are also stored with the prefix ( GET_) added.  For example, if
       QUERY_STRING includes "foo=bar", the environment variable is GET_foo=bar.

       For  the  POST  method,  variables  are  also  stored with the prefix ( POST_) added.  For
       example, if the post stream includes "foo=bar", the environment variable is POST_foo=bar.

       Also, for the POST method, if the data is sent  using  multipart/form-data  encoding,  the
       data is automatically decoded.   This is typically used when files are uploaded from a web
       client using <input type=file>.

       NOTE   When a file is uploaded  to  the  web  server,  it  is  stored  in  the  upload-dir
              directory.   FORM_variable_name=  contains  the  name  of  the  file  uploaded  (as
              specified by the client.)  HASERL_variable_path= contains the name of the  file  in
              upload-dir  that  holds  the  uploaded content.   To prevent malicious clients from
              filling up upload-dir on your web server, file uploads are only  allowed  when  the
              --upload-limit option is used to specify how large a file can be uploaded.   Haserl
              automatically deletes the temporary file when the script is finished.  To keep  the
              file, move it or rename it somewhere in the script.

              Note that the filename is stored in HASERL_variable_path.  This is because the FORM_,
              GET_, and POST_ variables are modifiable by the client, and a malicious client can
              set a second variable with the name variable_path=/etc/passwd.  Earlier versions
              did not store the pathspec in the HASERL namespace.  To maintain backward
              compatibility, the name of the temporary file is also stored in FORM_variable= and
              POST_variable=.  This is considered unsafe and should not be used.
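
              Keeping an upload could be done with a small shell function such as the sketch
              below.  The names keep_upload, keep_dir and KEEP_DIR are hypothetical.  The
              destination name is generated locally instead of being taken from the
              client-supplied file name.

```shell
# Hypothetical destination directory; adjust for your system.
keep_dir=${KEEP_DIR:-/var/tmp/kept}

keep_upload() {
    # $1 = temporary path as reported by haserl in HASERL_<name>_path
    [ -f "$1" ] || return 1
    mkdir -p "$keep_dir" || return 1
    # Move the file before the script ends, or haserl will delete it.
    mv -- "$1" "$keep_dir/upload.$$"
}
```

              Inside a haserl script this would be called as
              <% keep_upload "$HASERL_uploadfile_path" %>, where uploadfile is the name of the
              form field.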

       If the client sends data both by POST and GET methods, then haserl  will  parse  only  the
       data  that  corresponds with the REQUEST_METHOD variable set by the web server, unless the
       accept-all option has been set.   For example, a form called via POST method, but having a
       URI  of  some.cgi?foo=bar&otherdata=something  will have the POST data parsed, and the foo
       and otherdata variables are ignored.

       If the web server defines a HTTP_COOKIE environment variable, the cookie data  is  parsed.
       Cookie data is parsed before the GET or POST data, so in the event of two variables of the
       same name, the GET or POST data overwrites the cookie information.

       When multiple instances of  the  same  variable  are  sent  from  different  sources,  the
       FORM_variable  will  be  set  according  to  the  order  in which variables are processed.
       HTTP_COOKIE is always processed first, followed by the REQUEST_METHOD.  If the  accept-all
       option  has  been  set,  then  HTTP_COOKIE  is processed first, followed by the method not
       specified by REQUEST_METHOD, followed by the REQUEST_METHOD.  The  last  instance  of  the
       variable will be used to set FORM_variable.  Note that the variables are also separately
       created as COOKIE_variable, GET_variable and POST_variable.  This allows the use of
       overlapping names from each source.

       When  multiple instances of the same variable are sent from the same source, only the last
       one is saved.  To keep all copies (for multi-selects, for instance), add "[]" to  the  end
       of the variable name.  All results will be returned, separated by newlines.   For example,
       host=Enoch&host=Esther&host=Joshua results in "FORM_host=Joshua".
       host[]=Enoch&host[]=Esther&host[]=Joshua results in "FORM_host=Enoch\nEsther\nJoshua".
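
       A newline-separated value can be processed one entry at a time in the shell.  In the
       sketch below FORM_host is set by hand to stand in for what haserl would place in the
       environment.

```shell
# Stand-in for the value haserl would provide for
# host[]=Enoch&host[]=Esther&host[]=Joshua
FORM_host='Enoch
Esther
Joshua'

# Read one value per line; IFS= and -r keep each value intact.
count=0
while IFS= read -r host; do
    count=$((count + 1))
    echo "$count: $host"
done <<EOF
$FORM_host
EOF
```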

LANGUAGE

       The following language structures are recognized by haserl.

       RUN
              <% [shell script] %>

              Anything  enclosed by <% %> tags is sent to the sub-shell for execution.   The text
              is sent verbatim.

       INCLUDE
              <%in pathspec %>

              Include another file verbatim in this script.  The file is included when the script
              is initially parsed.

       EVAL
              <%= expression %>

              Print the shell expression.  Syntactic sugar for "echo expr".

       COMMENT
              <%# comment %>

              Comment  block.  Anything in a comment block is not parsed.  Comments can be nested
              and can contain other haserl elements.

EXAMPLES

       WARNING
              The examples below are simplified to  show  how  to  use  haserl.   You  should  be
              familiar  with  basic  web scripting security before using haserl (or any scripting
              language) in a production environment.

       Simple Command
              #!/usr/local/bin/haserl
              content-type: text/plain

              <%# This is a sample "env" script %>
              <% env %>

              Prints the results of the env command as a mime-type "text/plain" document. This is
              the haserl version of the common printenv cgi.

       Looping with dynamic output
              #!/usr/local/bin/haserl
              Content-type: text/html

              <html>
              <body>
              <table border=1><tr>
              <% for a in Red Blue Yellow Cyan; do %>
                   <td bgcolor="<% echo -n "$a" %>"><% echo -n "$a" %></td>
                   <% done %>
              </tr></table>
              </body>
              </html>

              Sends a mime-type "text/html" document to the client, with an html table of
              elements labeled with the background color.

       Use Shell defined functions.
              #!/usr/local/bin/haserl
              content-type: text/html

              <% # define a user function
                 table_element() {
                     echo "<td bgcolor=\"$1\">$1</td>"
                  }
                 %>
              <html>
              <body>
              <table border=1><tr>
              <% for a in Red Blue Yellow Cyan; do %>
                   <% table_element $a %>
                   <% done %>
              </tr></table>
              </body>
              </html>

              Same as above, but uses a shell function instead of embedded html.

       Self Referencing CGI with a form
              #!/usr/local/bin/haserl
              content-type: text/html

              <html><body>
              <h1>Sample Form</h1>
              <form action="<% echo -n $SCRIPT_NAME %>" method="GET">
              <% # Do some basic validation of FORM_textfield
                 # To prevent common web attacks
                 FORM_textfield=$( echo "$FORM_textfield" | sed "s/[^A-Za-z0-9 ]//g" )
                 %>
              <input type=text name=textfield
                   Value="<% echo -n "$FORM_textfield" | tr a-z A-Z %>" cols=20>
              <input type=submit value=GO>
              </form></body>
              </html>

              Prints a form.  If the client enters text in the form, the CGI is reloaded (defined
              by  $SCRIPT_NAME)  and  the textfield is sanitized to prevent web attacks, then the
              form is redisplayed with the text the user entered.  The text is uppercased.

       Uploading a File
              #!/usr/local/bin/haserl --upload-limit=4096 --upload-dir=/tmp
              content-type: text/html

              <html><body>
              <form action="<% echo -n $SCRIPT_NAME %>" method=POST enctype="multipart/form-data" >
              <input type=file name=uploadfile>
              <input type=submit value=GO>
              <br>
              <% if test -n "$HASERL_uploadfile_path"; then %>
                      <p>
                      You uploaded a file named <b><% echo -n "$FORM_uploadfile_name" %></b>, and it was
                      temporarily stored on the server as <i><% echo "$HASERL_uploadfile_path" %></i>.  The
                      file was <% cat "$HASERL_uploadfile_path" | wc -c %> bytes long.</p>
                      <% rm -f "$HASERL_uploadfile_path" %><p>Don't worry, the file has just been deleted
                      from the web server.</p>
              <% else %>
                      You haven't uploaded a file yet.
              <% fi %>
              </form>
              </body></html>

              Displays a form that allows for file uploading.  This is accomplished by using the
              --upload-limit option and by setting the form enctype to multipart/form-data.  If the
              client sends a file, then some information regarding the file is printed, and the file
              is then deleted.  Otherwise, the form states that the client has not uploaded a file.

       RFC-2616 Conformance
              #!/usr/local/bin/haserl
              <% echo -en "content-type: text/html\r\n\r\n" %>
              <html><body>
                ...
              </body></html>

              To  fully  comply  with  the HTTP specification, headers should be terminated using
              CR+LF, rather than the normal unix LF line termination only.  The above syntax  can
              be used to produce RFC 2616 compliant headers.
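
              Not every /bin/sh supports the -e and -n options of echo.  As an alternative
              sketch, printf interprets the \r and \n escapes itself.  The function name
              emit_header is hypothetical.

```shell
# Emit the header line and the blank line that ends the header block,
# each terminated with CR+LF.
emit_header() {
    printf 'content-type: text/html\r\n\r\n'
}

# Show the raw bytes to confirm the CR+LF terminators.
emit_header | od -c
```

              In a haserl script the equivalent line would be
              <% printf 'content-type: text/html\r\n\r\n' %>.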

ENVIRONMENT

       In  addition  to  the  environment  variables inherited from the web server, the following
       environment variables are always defined at startup:

       HASERLVER
              haserl version - an informational tag.

       SESSIONID
              A hexadecimal tag that is unique for the life of the CGI (it is generated when  the
              cgi starts; and does not change until another POST or GET query is generated.)

       HASERL_ACCEPT_ALL
              If the --accept-all flag was set, -1, otherwise 0.

       HASERL_SHELL
              The name of the shell haserl started to run sub-shell commands in.

       HASERL_UPLOAD_DIR
              The directory haserl will use to store uploaded files.

       HASERL_UPLOAD_LIMIT
              The number of KB that are allowed to be sent from the client to the server.

       These  variables  can  be  modified  or  overwritten  within the script, although the ones
       starting with "HASERL_" are informational only, and do not affect the running script.

SAFETY FEATURES

       There is much literature regarding the dangers of using  shell  to  program  CGI  scripts.
       haserl contains some protections to mitigate this risk.

       Environment Variables
              The code to populate the environment variables is outside the scope of the
              sub-shell.  It parses on the characters ? and &, so it is harder for a client to do
              "injection" attacks.  As an example, in a naive implementation
              foo.cgi?a=test;cat /etc/passwd could result in a variable being assigned the value
              test and then the results of running cat /etc/passwd being sent to the client.
              Haserl instead assigns the variable the complete value: test;cat /etc/passwd

              It is safe to use this "dangerous" variable in shell scripts  by  enclosing  it  in
              quotes; although validation should be done on all input fields.
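
              The sketch below shows both points in plain shell.  FORM_a is set by hand to the
              value haserl would store for the request above, and the whitelist mirrors the
              one used in the form example.

```shell
# Value as haserl would store it for foo.cgi?a=test;cat /etc/passwd
FORM_a='test;cat /etc/passwd'

# Quoted expansion: the whole value is a single argument, nothing is executed.
printf '%s\n' "$FORM_a"

# Whitelist validation: keep only letters, digits and spaces.
safe=$(printf '%s' "$FORM_a" | tr -cd 'A-Za-z0-9 ')
echo "$safe"    # prints: testcat etcpasswd
```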

       Privilege Dropping
              If installed as a suid script, haserl will set its uid/gid to that of the owner of
              the script.  This can be used to have a set of CGI scripts that run with different
              privileges.  If the haserl binary is not installed suid, then the CGI scripts will
              run with the uid/gid of the web server.

       Reject command line parameters given on the URL
              If the URL does not contain an unencoded "=", then the CGI spec states the options
              are to be used as command-line parameters to the program.  For instance, according
              to the CGI spec, http://192.168.0.1/test.cgi?--upload-limit%3d2000&foo%3dbar
              should set the upload-limit to 2000KB in addition to setting "foo=bar".  To protect
              against clients enabling their own uploads, haserl rejects any command-line options
              beyond argv[2].   If invoked as a #!   script,  the  interpreter  is  argv[0],  all
              command-line  options  listed  in  the  #!  line are combined into argv[1], and the
              script name is argv[2].

LUA

       If compiled with lua support, --shell=lua will enable lua as the script language instead
       of a bash-like shell.  The environment variables (SCRIPT_NAME, SERVER_NAME, etc) are placed in
       the ENV table, and the form variables are placed in the  FORM  table.   For  example,  the
       self-referencing form above can be written like this:

              #!/usr/local/bin/haserl --shell=lua
              content-type: text/html

              <html><body>
              <h1>Sample Form</h1>
              <form action="<% io.write(ENV["SCRIPT_NAME"]) %>" method="GET">
              <% -- Do some basic validation of FORM.textfield
                 -- to prevent common web attacks
                 FORM.textfield=string.gsub(FORM.textfield or "", "[^%a%d]", "")
                 %>
              <input type=text name=textfield
                   Value="<% io.write (string.upper(FORM.textfield)) %>" cols=20>
              <input type=submit value=GO>
              </form></body>
              </html>

       The <%= operator is syntactic sugar for io.write (tostring( ... )).  So, for example, the
       Value= line above could be written: Value="<%= string.upper(FORM.textfield) %>" cols=20>

       haserl lua scripts can use the function  haserl.loadfile(filename)  to  process  a  target
       script as a haserl (lua) script.  The function returns a type of "function".

       For example,

       bar.lsp
              <% io.write ("Hello World" ) %>

              Your message is <%= gvar %>

              -- End of Include file --

       foo.haserl
              #!/usr/local/bin/haserl --shell=lua
              <% m = haserl.loadfile("bar.lsp")
                 gvar = "Run as m()"
                 m()

                 gvar = "Load and run in one step"
                 haserl.loadfile("bar.lsp")()
              %>

       Running foo will produce:

              Hello World
              Your message is Run as m()
              -- End of Include file --
              Hello World
              Your message is Load and run in one step
              -- End of Include file --

              This  function makes it possible to have nested haserl server pages - page snippets
              that are processed by the haserl tokenizer.

LUAC

       The luac "shell" is a precompiled lua chunk, so interactive editing and testing of scripts
       is  not  possible. However, haserl can be compiled with luac support only, and this allows
       lua support even in a small memory environment.  All haserl lua features listed above  are
       still  available.   (If  luac  is the only shell built into haserl, the haserl.loadfile is
       disabled, as the haserl parser is not compiled in.)

       Here is an example of a trivial script, converted into a luac cgi script:

       Given the file test.lua:
              print ("Content-Type: text/plain\n")
              print ("Your UUID for this run is: " .. ENV.SESSIONID)

       It can be compiled with luac:
              luac -o test.luac -s test.lua

       And then the haserl header added to it:
              echo '#!/usr/bin/haserl --shell=luac' | cat - test.luac  >luac.cgi

       Alternatively, it is possible to develop an entire website using the standard  lua  shell,
       and  then  have  haserl  itself  preprocess the scripts for the luac compiler as part of a
       build process.  To do this, use --shell=lua, and develop the website.  When ready to build
       the runtime environment, add the --debug option to the #! line of your lua scripts, and run
       them, writing the results to .lua source files.  For example:

       Given the haserl script test.cgi:
              #!/usr/bin/haserl --shell=lua --debug
              Content-Type: text/plain

              Your UUID for this run is <%= ENV.SESSIONID %>

       Precompile, compile, and add the haserl luac header:
              ./test.cgi > test.lua
              luac -s -o test.luac test.lua
              echo '#!/usr/bin/haserl --shell=luac' | cat - test.luac >luac.cgi

BUGS

       Old versions of haserl used <? ?> as token markers, instead of <% %>.   Haserl  will  fall
       back to using <? ?> if <% does not appear anywhere in the script.

       When files are uploaded using RFC-2388, a temporary file is created.  The name of the file
       is stored in FORM_variable, POST_variable, and HASERL_variable_path.  Only
       HASERL_variable_path should be used - the others can be overwritten by a malicious client.

NAME

       The name "haserl" comes from the Bavarian word for "bunny."  At first glance it may be
       small and cute, but haserl is more like the bunny from Monty Python & The Holy Grail.  In
       the words of Tim the Wizard, "That's the most foul, cruel & bad-tempered rodent you ever
       set eyes on!"

       Haserl can be thought of as the cgi equivalent to netcat.  Both are small, powerful, and have
       very little in the way of extra features.  Like netcat, haserl attempts to do its job with
       the least amount of extra "fluff".

AUTHOR

       Nathan Angelacos <nangel@users.sourceforge.net>

SEE ALSO

       php(http://www.php.net)
       uncgi(http://www.midwinter.com/~koreth/uncgi.html)
       cgiwrapper(http://cgiwrapper.sourceforge.net)

                                           October 2010                                 haserl(1)