Provided by: lire-devel-doc_2.1.1-2.1_all

NAME

       Lire::W3CExtendedLog - Base implementation of a W3C Extended Log parser

SYNOPSIS

       use Lire::W3CExtendedLog;

       my $parser = Lire::W3CExtendedLog->new;

       my $w3c_rec = $parser->parse( $line );

DESCRIPTION

       This module defines objects that parse the W3C Extended Log File Format.  This log
       format is defined at http://www.w3.org/TR/WD-logfile.html

       All attributes of the created object can be overridden, e.g. by modules extending the
       object.  The attributes are:

   type2regex
       type2regex is a hash containing key-value pairs like

        'name' => '([-_.0-9a-zA-Z]+)'

       Keys are all data formats for log file field entries as defined in the W3C specification:
       'integer', 'fixed', 'uri', 'date', 'time' and 'string', along with 'name' and 'address'
       types.

   identifier2type
       identifier2type is a hash containing key-value pairs like

        'dns'        => 'name',
        'uri-query'  => 'uri',
        'ip'         =>

       Keys are the W3C defined Field identifiers, with their prefixes stripped off.

   field2re
       field2re is a subroutine; when called as

        $self->{field2re('c-ip')}

       it will return e.g.

        '(\d+\.\d+\.\d+\.\d+|-)'

       Arguments are as found in the Fields directive, so, in an ideal world, they should be
       identifiers.  It uses type2regex and identifier2type.
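
       The lookup described above can be sketched as follows.  This is a minimal,
       self-contained illustration and not the module's actual code: sketch_field2re, the
       'string' fallback regex and the mapping of 'ip' to 'address' are assumptions made
       for the example (the 'ip' value is elided in this page).  The prefix list is the one
       given in the W3C draft.

```perl
use strict;
use warnings;

# Hypothetical tables, modelled on the examples shown in this page.
my %type2regex = (
    name    => '([-_.0-9a-zA-Z]+)',         # from the type2regex example
    address => '(\d+\.\d+\.\d+\.\d+|-)',    # from the field2re example
    string  => '(\S+)',                     # assumed fallback, not from the module
);
my %identifier2type = (
    dns => 'name',
    ip  => 'address',    # assumption: the real value is not shown in this page
);

# Hypothetical stand-in for field2re: strip the prefix, map the
# identifier to a type, then map the type to a regex.
sub sketch_field2re {
    my ($field) = @_;
    ( my $identifier = $field ) =~ s/^(?:cs|sc|sr|rs|c|s|r|x)-//;
    my $type = $identifier2type{$identifier} || 'string';
    return $type2regex{$type};
}

print sketch_field2re('c-ip'),  "\n";
print sketch_field2re('s-dns'), "\n";
```

       Keeping the two tables separate means that an extending module only has to add an
       identifier to one hash to reuse an existing type.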

   field2decoder
       field2decoder is a subroutine; it returns one of \&uri_decode, \&string_decode or undef,
       depending on, among other things, is_iis.  It is used by build_parser.

   parse
       parse is the preferred interface to this module.  It expects a line as its argument, and
       either returns a reference to a hash (like &w3c_parser) or, for a directive line,
       executes &parse_directive.
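
       To illustrate the two outcomes, here is a minimal sketch of such a dispatch.  It does
       not use this module: sketch_parse and the %state hash are hypothetical, and splitting
       records on whitespace is a simplification.  The sample lines are taken from the
       example in the W3C draft.

```perl
use strict;
use warnings;

# Hypothetical state hash standing in for the parser object.
my %state;

sub sketch_parse {
    my ($line) = @_;
    chomp $line;
    if ( $line =~ /^#\s*([-\w]+):\s*(.*)$/ ) {
        # Directive line: remember its value and return nothing.
        $state{ lc $1 } = $2;
        return;
    }
    # Record line: pair each value with the name from the Fields directive.
    my @names  = split ' ', ( $state{fields} || '' );
    my @values = split ' ', $line;
    my %rec;
    @rec{@names} = @values;
    return \%rec;
}

sketch_parse('#Version: 1.0');
sketch_parse('#Fields: time cs-method cs-uri');
my $rec = sketch_parse('00:34:23 GET /foo/bar.html');
print "$rec->{'cs-method'} $rec->{'cs-uri'}\n";
```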

   parse_directive
       parse_directive expects a directive as its argument and fills in the object's attributes.

   w3c_parser
       w3c_parser is a subroutine; it expects a logline as argument, and returns a reference to a
       hash, mapping $self->{'fields'} entries to their decoded values.  It uses the &field2re
       and &field2decoder routines.  It is built by build_parser.

   build_parser
       build_parser is a subroutine; it builds and returns &w3c_parser.  It is called by
       &parse_directive.
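
       The idea of building a parser from per-field regexes and decoders can be sketched as
       a closure.  This is an illustration under stated assumptions, not the module's
       implementation: sketch_build_parser, sketch_uri_decode, %re_for and %decoder_for are
       hypothetical names, and fields are assumed to be separated by whitespace.

```perl
use strict;
use warnings;

# Hypothetical per-field regexes, one capture group each.
my %re_for = (
    'c-ip'   => '(\d+\.\d+\.\d+\.\d+|-)',
    'cs-uri' => '(\S+)',
);

# Hypothetical decoder: turn %XX escapes back into characters.
sub sketch_uri_decode {
    my ($value) = @_;
    $value =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
    return $value;
}
my %decoder_for = ( 'cs-uri' => \&sketch_uri_decode );

# Build one regex for the whole line and return a closure that
# applies it and decodes each captured value.
sub sketch_build_parser {
    my (@fields) = @_;
    my $pattern  = join '\s+', map { $re_for{$_} } @fields;
    my $re       = qr/^$pattern\s*$/;
    my @decoders = map { $decoder_for{$_} } @fields;
    return sub {
        my ($line) = @_;
        my @values = $line =~ $re or return;    # no match: return nothing
        my %rec;
        for my $i ( 0 .. $#fields ) {
            my $decode = $decoders[$i];
            $rec{ $fields[$i] } = $decode ? $decode->( $values[$i] ) : $values[$i];
        }
        return \%rec;
    };
}

my $line_parser = sketch_build_parser( 'c-ip', 'cs-uri' );
my $rec         = $line_parser->('192.0.2.1 /a%20b');
print "$rec->{'c-ip'} requested '$rec->{'cs-uri'}'\n";
```

       Compiling the regex once when the Fields directive is seen, rather than once per log
       line, is the main reason to build the parser as a closure.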

   log_date and log_time
       log_date and log_time contain strings constructed from the Date directive.

   version and software
       version and software contain strings constructed from the Version and Software directives,
       respectively.

   fields
       fields contains the entire string from the Fields directive.

   is_iis
       is_iis is set when the Software directive contains 'Microsoft Internet' as a substring.
       It is used to enable IIS-specific support.

   tab_sep
       tab_sep is set in case tabs are found in the Fields directive.  We assume these will be
       used in the log itself too, and allow unescaped spaces in the log.
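
       A minimal sketch of this heuristic, assuming a hypothetical helper sketch_split that
       is not part of this module: when the Fields directive contains a tab, records are
       split on tabs only, so a value may contain plain spaces.

```perl
use strict;
use warnings;

# Hypothetical helper: choose the separator from the Fields directive,
# then split a record line with it.
sub sketch_split {
    my ( $fields_directive, $line ) = @_;
    my $tab_sep = $fields_directive =~ /\t/ ? 1 : 0;
    my $sep     = $tab_sep ? qr/\t/ : qr/ +/;
    return split $sep, $line;
}

my @values = sketch_split( "date\ttime\tcs-uri-stem",
                           "2006-07-23\t13:16:30\t/my docs/index.html" );
print scalar(@values), " values, last is '$values[-1]'\n";
```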

       Summarizing:

        &parse --calls--> &parse_directive
               `--calls--> &w3c_parser

        &parse_directive --calls--> &build_parser

        &build_parser --calls--> &field2decoder
                     `--calls--> &field2re
                     `--returns--> &w3c_parser

        &field2decoder --returns--> &uri_decode, &string_decode

        &field2re --uses--> %type2regex
                  `--uses--> %identifier2type

BUILDING INHERITING MODULES

       FIXME: Needs to be written.  Steal from w3c_extended2dlf's Lire::WWW::ExtendedLog, which
       ISA Lire::W3CExtendedLog.

SEE ALSO

       w3c_extended2dlf(1), ms_isa2dlf(1)

AUTHOR

         Francis J. Lacoste <flacoste@logreport.org>

VERSION

       $Id: W3CExtendedLog.pm,v 1.18 2006/07/23 13:16:30 vanbaal Exp $

COPYRIGHT

       Copyright (C) 2001-2002 Stichting LogReport Foundation LogReport@LogReport.org

       This file is part of Lire.

       Lire is free software; you can redistribute it and/or modify it under the terms of the GNU
       General Public License as published by the Free Software Foundation; either version 2 of
       the License, or (at your option) any later version.

       This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY;
       without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
       See the GNU General Public License for more details.

       You should have received a copy of the GNU General Public License along with this program
       (see COPYING); if not, check with http://www.gnu.org/copyleft/gpl.html.