Provided by: awffull_3.10.2-7_amd64

NAME

       AWFFull - A Webalizer Fork, Full o' features

DESCRIPTION

       awffull.conf is the configuration file for awffull(1). awffull.conf is a standard ASCII(7)
       text file that may be created or edited using any standard editor.

       Blank lines and lines that begin with a pound sign ('#') are ignored.

       Any other lines are considered to be configuration  lines,  and  have  the  form  ‘Keyword
       Value’,  where the ‘Keyword’ is one of the currently available configuration keywords, and
       ‘Value’ is the value to assign to that particular option.

       Any text found after the keyword up to the end of the line  is  considered  the  keyword's
       value,  so  you should not include anything after the actual value on the line that is not
       actually part of the  value  being  assigned.  The  file  sample.conf  provided  with  the
       distribution contains lots of useful documentation and examples as well.

       Some ‘Keywords’ will accept a second value. In those situations, the first value may be
       enclosed in double quotes (") to allow for whitespace.

       Keywords are Case Insensitive. Values are Case Sensitive, with some gotchas:  See  Ignore*
       for details.
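
       As an illustration only, the line format described above can be sketched in Python. This
       is not AWFFull's own parser: the names parse_line and split_two_values are hypothetical,
       and tolerating leading whitespace before a keyword or '#' is an assumption.

```python
def parse_line(line):
    """Return (keyword, value) for a configuration line, or None if it is ignored."""
    stripped = line.strip()  # assumption: surrounding whitespace is not significant
    if not stripped or stripped.startswith("#"):
        return None  # blank lines and comment lines are ignored
    parts = stripped.split(None, 1)
    keyword = parts[0].lower()  # keywords are case insensitive
    value = parts[1].strip() if len(parts) > 1 else ""  # rest of the line is the value
    return keyword, value


def split_two_values(value):
    """Split a value into (first, second); the first may be enclosed in double quotes."""
    if value.startswith('"'):
        end = value.find('"', 1)
        if end != -1:
            return value[1:end], value[end + 1:].strip()
    parts = value.split(None, 1)
    first = parts[0] if parts else ""
    second = parts[1].strip() if len(parts) > 1 else ""
    return first, second
```

       Because everything after the keyword is taken as the value, a trailing remark on the same
       line would become part of the value in this sketch as well.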

WILDCARDS

       Wildcards within AWFFull are a little non-standard and may cause some confusion.

       Wildcards are only valid within the Value of certain keywords.

       A  Value  can have either a leading or trailing '*' to signify a wildcard character. If no
       wildcard  is  found,  a  match  can  occur  anywhere  in  the  string.  Given   a   string
       ‘www.yourmama.com’, the values ‘your’, ‘*mama.com’ and ‘www.your*’ will all match.

       Thus  the use of the wildcard signifies that the other end of the Value is anchored at the
       Beginning or End of a field to be searched against.

       eg. A Value of ‘Bot*’ implies that the field (probably UserAgent in this case) MUST  start
       with the letters Bot. Or in the case of a Hostname ‘*.gov.au’ implies a match ONLY against
       Australian Government hostnames.
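
       The matching rules above can be sketched as follows. This Python fragment only illustrates
       the behaviour described in this section and is not AWFFull's implementation;
       value_matches is a hypothetical name.

```python
def value_matches(value, field):
    """Apply the leading/trailing '*' rules described in this section."""
    if value.startswith("*"):
        # Leading wildcard: the rest of the Value is anchored at the end of the field.
        return field.endswith(value[1:])
    if value.endswith("*"):
        # Trailing wildcard: the rest of the Value is anchored at the start of the field.
        return field.startswith(value[:-1])
    # No wildcard: the Value may match anywhere in the field.
    return value in field
```

       With the string ‘www.yourmama.com’, the values ‘your’, ‘*mama.com’ and ‘www.your*’ all
       return True, while ‘Bot*’ only matches fields that start with ‘Bot’.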

RUN OPTIONS

       The Run Options are the generic ones that tell AWFFull where stuff is and how to generally
       operate. Some of these can modify the results that AWFFull will produce.

       OutputDir
              OutputDir is where you want to put the output files. This should be a full
              path name, however relative ones might work as well.  If  no  output  directory  is
              specified, the current directory will be used.

       LogFile
              LogFile defines the web server log file to use. If not specified here or on the
              command line, input will default to STDIN. If the log filename ends in '.gz' (ie: a
              gzip compressed file), it will be decompressed on the fly as it is being read.

       LogType
              LogType  defines  the  log type being processed. Normally, AWFFull expects a CLF or
              Combined web server log as input. Using this option, you can process  ftp  logs  as
              well  (xferlog as produced by wu-ftpd and others), or Squid native logs. Values can
              be 'auto', 'clf', 'combined', 'ftp', 'domino' or 'squid', with 'auto' the default.
              The 'auto' value means that AWFFull will try to work out what log format you are
              sending to it. If it cannot, AWFFull will immediately exit.

       GeoIP  GeoIP enables or disables the  use  of  the  GeoIP  capability  for  more  accurate
              detection  of  countries. Default is ‘no’. NOTE! Do not enable GeoIP if you analyse
              files that have had the IP Address translated to a Fully Qualified Host  Name.  Use
              either  raw IP Addresses and GeoIP, or Names and disable GeoIP. ie. Don't use GeoIP
              AND DNShistory.

       GeoIPDatabase
              GeoIPDatabase  is  the  location  of  the   GeoIP   database   file.   Default   is
              /usr/share/GeoIP/GeoIP.dat,  which  is  where  a default GeoIP install will put it.
              Note   that   the   database   is   updated   monthly.   For   the   details   see:
              ⟨http://www.maxmind.com/app/geoip_country⟩

       Incremental
              Incremental  processing allows multiple partial log files to be used instead of one
              huge one. Useful for large sites that have to rotate their log files more than once
              a month. AWFFull will save its internal state before exiting, and restore it the
              next time it is run, in order to continue processing where it left off. This mode also
              causes  AWFFull to scan for and ignore duplicate records (records already processed
              by a previous run). See the README file for additional information. The  value  may
              be 'yes' or 'no', with a default of 'no'. The file awffull.current is used to store
              the current state data, and is located in  the  output  directory  of  the  program
              (unless  changed  with  the IncrementalName option below). Please read at least the
              section on Incremental processing in the README file before you enable this option.

       TimeMe TimeMe allows you to force  the  display  of  timing  information  at  the  end  of
              processing.  A  value of 'yes' will force the timing information to be displayed. A
              value of 'no' has no effect.

       IgnoreHist
              IgnoreHist should not be used in a standard configuration, but it is  here  because
              it  is  useful  in certain analysis situations. If the history file is ignored, the
              main ‘index.html’ file will only report on the current log file's contents.
              Incremental data (if present) is still processed. Useful when you want to reproduce
              the reports from scratch, for example. USE WITH CAUTION! Valid values are ‘yes’  or
              ‘no’. Default is ‘no’.

       IncrementalName
              IncrementalName  allows you to specify the filename for saving the incremental data
              in. It is similar to the HistoryName option where  the  name  is  relative  to  the
              specified  output  directory, unless an absolute filename is specified. The default
              is a file named ‘awffull.current’ kept in the normal output directory. If you don't
              specify Incremental as 'yes' then this option has no meaning.

       HistoryName
              HistoryName allows you to specify the name of the history file produced by AWFFull.
              The history file keeps the data for up  to  12  months  worth  of  logs,  used  for
              generating   the  main  HTML  page  (index.html).  The  default  is  a  file  named
              awffull.hist, stored in the specified output directory. If  you  specify  just  the
              filename  (without  a  path),  it  will  be kept in the specified output directory.
              Otherwise, the path is relative to the output directory, unless  absolute  (leading
              /).

ANALYSIS OPTIONS

       These  are  the basic analysis options that one can and should modify to start fine tuning
       AWFFull against a given website.

       PageType
              PageType lets you tell AWFFull what types of URLs you consider a 'page'. Most
              people consider html and cgi documents as pages, but not images and audio files.
              If no types are  specified,  defaults  will  be  used  ('htm',  'html',  'cgi'  and
              HTMLExtension  if  different  for  web  logs, 'txt' for ftp logs). Putting the more
              likely page types first in the list should increase the speed of a run.

              Do Not Use Wildcards Here. It will not work.

       NotPageType
              NotPageType is the direct and incompatible opposite of PageType. You can use one
              set or the other, but not both. PageType specifies what *is* a Page, NotPageType
              specifies what *isn't*, and hence, by implication, everything else is a page.
              Neither method is more or less correct than the other; what matters is which is
              more accurate for *your* site. As a general rule, do not add the "." or use any
              wildcards: there are some assumed internal optimisations that may otherwise break.
              Those who understand PCREs would do well to examine the source of parser.c if they
              wish to extract greater flexibility from this option.

       FoldSeqErr
              FoldSeqErr  forces  AWFFull  to ignore sequence errors. This is useful for Netscape
              and other web servers that cache the writing of log records and  do  not  guarantee
              that  they  will  be  in chronological order. The use of the FoldSeqErr option will
              cause out of sequence log records to be treated as if they had the same time  stamp
              as  the  last  valid  record.  The  default action is to ignore out of sequence log
              records.

       SearchEngine
              The SearchEngine keywords allow specification of search  engines  and  their  query
              strings  on  the  URL.  These are used to locate and report what search strings are
              used to find your site. The first word is a substring  to  match  in  the  referrer
              field that identifies the search engine, and the second is the URL variable used by
              that search engine to define its search terms.
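
              A rough Python sketch of how such a pair could be applied to a referrer is given
              here. It is an illustration, not AWFFull's code: search_terms is a hypothetical
              name, and the sample pair ‘google.com’ and ‘q=’ used in the note that follows is
              an assumed example rather than a documented default.

```python
from urllib.parse import urlsplit, parse_qs


def search_terms(referrer, engine, variable):
    """Return the search string found in a referrer, or None if there is none.

    engine   -- substring that identifies the search engine in the referrer
    variable -- URL variable carrying the search terms, with or without a trailing '='
    """
    if engine not in referrer:
        return None
    query = parse_qs(urlsplit(referrer).query)  # decodes '+' and %xx escapes
    values = query.get(variable.rstrip("="))
    return values[0] if values else None
```

              Called with the referrer ‘http://www.google.com/search?q=web+log+analysis’, the
              pair ‘google.com’ and ‘q=’ yields the phrase ‘web log analysis’.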

       VisitTimeout
              VisitTimeout allows you to set the default timeout for a visit (sometimes called  a
              'session').  The default is 30 minutes, which should be fine for most sites. Visits
              are determined by looking at the time of the current request, and the time  of  the
              last request from the site. If the time difference is greater than the VisitTimeout
              value, it is considered a new visit, and visit totals are incremented. Value is the
              number of seconds to timeout (default=1800=30min).
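
              The visit rule can be sketched in Python as below. This is a simplified
              illustration for a single site, not AWFFull's implementation; count_visits is a
              hypothetical name and the timestamps are assumed to be seconds in ascending order.

```python
def count_visits(request_times, timeout=1800):
    """Count visits for one site from its request times (seconds, ascending)."""
    visits = 0
    last = None
    for t in request_times:
        # A first request, or a gap greater than the timeout, starts a new visit.
        if last is None or t - last > timeout:
            visits += 1
        last = t
    return visits
```

              A gap of exactly 1800 seconds stays within the same visit in this sketch, because
              only a difference greater than the timeout starts a new one.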

       TrackPartialRequests
              TrackPartialRequests is used to track 206 (Partial Content) codes. This gives two
              additional columns in the Top URLs tables. The first, next to "Hits", counts the
              number of partial requests. The second, next to "Volume", counts the volume in
              partial requests. This option is mostly of use to those with lots of PDFs.

       MangleAgents
              MangleAgents allows you to specify how much, if any, AWFFull should mangle user
              agent  names.  This  allows  several levels of detail to be produced when reporting
              user agent statistics. There are six levels that can  be  specified,  which  define
              different  levels  of detail suppression. Level 5 shows only the browser name (MSIE
              or Mozilla) and the major version number. Level 4 adds  the  minor  version  number
              (single  decimal  place). Level 3 displays the minor version to two decimal places.
              Level 2 will add any sub-level designation (such as Mozilla/3.01Gold or MSIE 3.0b).
              Level  1  will  attempt to also add the system type if it is specified. The default
              Level 0 displays the full user agent field without modification  and  produces  the
              greatest  amount  of  detail.  User  agent names that can't be mangled will be left
              unmodified.

       AssignToCountry
              AssignToCountry allows a form of override to force given  domains  to  a  specified
              country. Use the standard 2 letter country codes. Can also use org, com, net and so
              on, if more appropriate.  With judicious use of AllSites,  GroupSite  and  'whois',
              this can cover the majority of your users without too much effort.

       IndexAlias
              AWFFull normally strips the string 'index.' off the end of URLs in order to
              consolidate URL totals. For example, the URL /somedir/index.html is turned into
              /somedir/ which is really the same URL. This option allows you to specify
              additional strings to treat in the same way. You don't need to specify 'index.' as
              it is always scanned for by AWFFull; this option is just to specify _additional_
              strings if needed. If you don't need any, don't specify any, as each string will be
              scanned for in EVERY log record, and a bunch of them will degrade performance. Also,
              the string is scanned for anywhere in the URL, so a string of 'home' would turn the
              URL  /somedir/homepages/brad/home.html  into  just  /somedir/ which is probably not
              what was intended.
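
              The pitfall described above can be reproduced with a small Python sketch. It
              follows this description only (truncate the URL at the first place a string is
              found) and may differ from what AWFFull actually does; strip_index_alias is a
              hypothetical name.

```python
def strip_index_alias(url, aliases=()):
    """Truncate a URL where 'index.' or an additional alias string is first found."""
    for alias in ("index.",) + tuple(aliases):  # 'index.' is always scanned for
        pos = url.find(alias)  # the string is looked for anywhere in the URL
        if pos != -1:
            return url[:pos]
    return url
```

              With the alias 'home', the URL /somedir/homepages/brad/home.html is cut at the
              first 'home' and becomes /somedir/, which is the unintended result warned about.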

       IgnoreIndexAlias
              The opposite (in a way) of IndexAlias is IgnoreIndexAlias.  This will STOP any  URL
              variable  stripping,  as well as ignoring the default "index." setting, or any that
              you set above.

IGNORE* OPTIONS

       The Ignore* keywords allow you to completely ignore, or filter away, log records based  on
       hostname,  URL,  user  agent,  referrer  or  user  name.  Use the same syntax as the Hide*
       keywords, where the value can have a leading or trailing wildcard '*'.

       IgnoreURL
              Filters out traffic accessing certain URLs. eg You may wish to avoid seeing traffic
              that  accesses  administration  functions,  thus "IgnoreURL /admin*". URLs are case
              sensitive.

       IgnoreSite
              Ignore sites that visit this website. Ignore by what is presented to awffull - name
              or  IP  Address. Sites are lowercased prior to filtering, so if Ignore'ing by name,
              do use a lowercased Value.

       IgnoreReferrer
              Ignore  specified  referrers.  Very  useful  for  filtering  away  SPAM  Referrers.
              Referrers are partially case sensitive: the host portion is lowercased; the URI
              is case sensitive.

       IgnoreUser
              Ignore specified users. User names are lowercased prior to filtering.

       IgnoreAgent
              Ignore specified user agents. Agents are case sensitive.

INCLUDE* OPTIONS

       The Include* keywords allow you to force the inclusion of log records based  on  hostname,
       URL,  user  agent,  referrer  or user name. The Include* keywords take precedence over the
       Ignore* keywords.

       Note: Using Ignore/Include combinations to selectively process parts  of  a  web  site  is
       _extremely  inefficient_!!!  Avoid  doing so if possible ie: grep or gawk the records to a
       separate file if you really want that kind of report.
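
       The precedence of Include* over Ignore* can be sketched as follows. This Python fragment
       is an illustration under the wildcard rules given in the WILDCARDS section, not
       AWFFull's code; keep_record and value_matches are hypothetical names.

```python
def value_matches(value, field):
    """Leading '*' anchors at the end, trailing '*' at the start, else match anywhere."""
    if value.startswith("*"):
        return field.endswith(value[1:])
    if value.endswith("*"):
        return field.startswith(value[:-1])
    return value in field


def keep_record(field, ignore_values, include_values):
    """Decide whether a log record is kept, given one field such as its URL."""
    if any(value_matches(v, field) for v in include_values):
        return True  # Include* takes precedence over Ignore*
    if any(value_matches(v, field) for v in ignore_values):
        return False
    return True  # records matching neither list are processed normally
```

       For example, ignoring ‘/admin*’ while including ‘/admin/status*’ drops /admin/login but
       keeps /admin/status.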

       IncludeURL

       IncludeSite

       IncludeReferrer

       IncludeUser

       IncludeAgent

SEGMENTING OPTIONS

       Segmenting is a bit like the Ignore* and Include* keywords. Where it differs is in
       "remembering": as a ‘session’ (or ‘visit’) moves away from the original entry
       condition, that session is still tracked. So if you segment on a referral from Google, only
       sessions that were referred to the analysed website from Google will be tracked, even as
       that same session accesses other pages within the website.

       eg. Google -> Site Page 1 -> Site Page 2 -> Site Page 3

       Whereas Ignore/Include would only filter the first interaction. eg.  Google -> Site Page 1

       By "session" (or ‘visit’) it is meant that the time limitation of a session (typically a
       30 minute timeout) applies. So in the above example from Google, if the last step (from
       Page 2 to Page 3) occurred 31+ minutes after the Page 1 to Page 2 transition, then this
       final step would NOT be included. The trail would be:

       Google -> Site Page 1 -> Site Page 2

       Please do be aware that currently AWFFull uses IP Addresses to determine the continuation
       of a given session. This will be most flawed if you have a user population that sits
       behind corporate firewalls or ISP proxies, to mention two major problem areas.
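
       The "remembering" behaviour can be sketched in Python as below. This is a simplified
       model, not AWFFull's implementation: segment_by_referrer is a hypothetical name, records
       are assumed to be sorted by time, and the sketch assumes that only the request that
       opens a session decides whether the session belongs to the segment.

```python
def segment_by_referrer(records, segment_host, timeout=1800):
    """Keep the URLs of sessions whose first request came from segment_host.

    records -- iterable of (timestamp, ip, referrer_host, url), sorted by timestamp
    """
    last_seen = {}  # ip -> timestamp of that address's previous request
    tracked = {}    # ip -> True if the current session belongs to the segment
    kept = []
    for ts, ip, ref_host, url in records:
        new_session = ip not in last_seen or ts - last_seen[ip] > timeout
        if new_session:
            # Only the host name of the referrer is compared, not the full URL.
            tracked[ip] = segment_host in ref_host
        last_seen[ip] = ts
        if tracked[ip]:
            kept.append(url)
    return kept
```

       Sessions are keyed by IP Address here, so the sketch shares the weakness noted above for
       users behind firewalls or proxies.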

       Why do Segmenting?

       ⟨http://judah.webanalyticsdemystified.com/2007/11/a-few-tips-on-web-analytics-segmentation.html⟩

       ‘Segment analysis will tell you different things about your audience than you will realize
       from studying overall population metrics.’

       ‘The  goal  of segmentation is to maximize future value of that segment by optimizing your
       marketing mix.’

       With apologies to Judah for mixing his phrase order around.  :-)

       SegCountry
              Segment by Country: Only track sessions that come  from  the  following  countries.
              This will be determined by:

              1.  Use of AssignToCountry overrides

              2.  GeoIP lookups if so configured and enabled

              3.  Hostname TLD. eg .au

       The  third  option  is generally going to be the worst for accuracy. eg. We have plenty of
       Australian IP addresses that otherwise resolve to .com or .net etc.

       It is strongly advised to enable GeoIP if you wish to use this option.

       SegReferer
              Segment by  Referer:  Only  track  sessions  that  originated  from  the  following
              referers. NOTE!!!! SegReferer only works against the HOST name. Not the full URL.

DISPLAY OPTIONS

       The  Display  Options  modify the resulting output that AWFFull produces. Things like HTML
       Headers and Footers to add on every page. These options  don't  change  the  numbers  that
       AWFFull  will  calculate,  but  may  change  which  ones  appear, giving the illusion of a
       numerical change.

       ReportTitle
              ReportTitle is the text to display as the title. The  hostname  (unless  blank)  is
              appended  to  the end of this string (separated with a space) to generate the final
              full title string. Default is (for English) ‘Usage Statistics for’.

       HostName
              HostName defines the hostname for the report. This is used in  the  title,  and  is
              prepended to the URL table items. This allows clicking on URL's in the report to go
              to the proper location in the event you are running the report on a  'virtual'  web
              server,  or  for  a  server  different  than  the one the report resides on. If not
              specified here, or on the command line, AWFFull will try to get the hostname via  a
              uname system call. If that fails, it will default to ‘localhost’.

       IndexMonths
              This option controls how much data to display on the front summary page, specified
              in months. eg: to display the last 5 years: 5 x 12 = 60

       DailyStats
              DailyStats allows the daily statistics table to be disabled - not displayed. Values
              may be ‘yes’ or ‘no’. Default is ‘yes’ - do display the Daily Statistics table.

       HourlyStats
              HourlyStats allows the hourly statistics table to be disabled (not displayed).
              Values may be "yes" or "no". Default is "yes".

       CSSFilename
              CSSFilename is used to set the name of the CSS file to use in conjunction with  the
              generated html. An existing file is not overwritten, so feel free to make your own
              changes to the default file. The default is awffull.css.

       FlagsLocation
              FlagsLocation will enable the display of  country  flag  pictures  in  the  country
              table.  The  path  is  that  for  a  webserver, not file system. Can be relative or
              complete. The trailing slash is not necessary. The default location is not set  and
              hence will not be used.

       YearlySubtotals
              YearlySubtotals  will  display the subtotal for a given year in the main page. This
              is in addition to the Grand Total of all years.

       GroupShading
              The GroupShading allows grouped rows to be shaded in the report. Useful if you have
              lots  of groups and individual records that intermingle in the report, and you want
              to differentiate the group records a little more. Value can be ‘yes’ or ‘no’,  with
              ‘yes’ being the default.

       GroupHighlight
              GroupHighlight allows the group record to be displayed in BOLD. Can be either ‘yes’
              or ‘no’ with the default being ‘yes’.

       HTMLExtension
              HTMLExtension allows you to specify the filename extension  to  use  for  generated
              HTML  pages.  Normally,  this  defaults to "html", but can be changed for sites who
              need it (like for PHP embedded pages).

       UseHTTPS
              UseHTTPS should be used if the analysis is being run on a secure server, and  links
              to  urls  should use ‘https://’ instead of the default ‘http://’. If you need this,
              set it to ‘yes’. Default is ‘no’. This only changes the behaviour of the ‘Top URLs’
              table.

       Top*   The various ‘Top’ options below define the number of entries for each table. Tables
              may be disabled by using zero (0) for the value.

       TopURLs
              The most accessed URLs or Resources by number of  requests  (hits).  Includes  both
              Pages and Images, for example. Defaults to 30 URLs.

       TopKURLs
              The greatest volume generating URLs. Defaults to 10 URLs.

       TopEntry
              The  most  accessed  initial URLs within a complete Visit. Will also display Single
              Access counts, Stickiness ratio and Popularity ratio. Defaults to 10 URLs.

       TopExit
              The most accessed last URLs within a complete Visit. ie: The last page recorded  of
              a Visit. Also displays the Popularity ratio.  Defaults to 10 URLs.

       Top404Errors
              The  most  seen error requests and a corresponding referring URL. Defaults to 0, ie
              not shown.

       TopSites
              Those Sites that have accessed the most Pages. Default is 30 Sites.

       TopKSites
              Those Sites that have downloaded the greatest Volume. Default is 10 Sites.

       TopReferrers
              Those local and remote URLs that refer the most requests.  Default is 30 Referrers.

       TopSearch
              Those words and phrases used at remote  Search  Engines  to  direct  traffic  here.
              Default is 20 Phrases.

       TopUsers
              Those logged in users who most use the site. Default is 20 Users.

       TopAgents
              The Browser Agents that are busiest against this site. Default is 15 Agents.

       TopCountries
              A view of all traffic against this site via country.

       All*   The  All*  keywords  allow  the  display  of all the below measures.  If enabled, a
              separate HTML page will be created, and a link will be added to the bottom  of  the
              appropriate "Top" table. There are a couple of conditions for this to occur. First,
              there must be more items than will fit in the "Top" table (otherwise it would  just
              be duplicating what is already displayed). Second, the listing will only show those
              items that are normally visible, which means it will not  show  any  hidden  items.
              Grouped  entries  will be listed first, followed by individual items. The value for
              these keywords can be either 'yes' or 'no', with the default being 'no'. Please  be
              aware that these pages can be quite large in size, particularly the sites page, and
              separate pages are generated for each month, which can consume quite a lot of  disk
              space depending on the traffic to your site.

       AllURLs
              All accessed URLs

       AllEntryPages
              All Pages that initialised a Visit

       AllExitPages
              All the last or exit pages in all Visits.

       All404Errors
              All ErrorRequests and the corresponding referral URLs.

       AllSites
              All remote sites that accessed this website.

       AllReferrers
              All local and remote referring URLs

       AllSearchStr
              All Remote Search Engine words and Phrases used to refer traffic here.

       AllUsers
              All users who logged into this website.

       AllAgents
              All Browser Agents used to access this site. Useful for identifying robots.

       GMTTime
              GMTTime  allows reports to show GMT (UTC) time instead of local time. Default is to
              display the time the report was generated in the timezone  of  the  local  machine,
              such as EDT or PST. This keyword allows you to have times displayed in UTC instead.
              Use only if you really have a good reason, since it  will  probably  screw  up  the
              reporting periods by however many hours your local time zone is off of GMT.

       HTMLPre
              HTMLPre  defines  HTML code to insert at the very beginning of the file. Default is
              the DOCTYPE line shown below. Max line length is 80  characters,  so  use  multiple
              HTMLPre lines if you need more.

       HTMLHead
              HTMLHead  defines  HTML  code to insert within the <HEAD></HEAD> block, immediately
              after the <TITLE> line. Maximum line length is 80 characters, so use multiple lines
              if needed.

       HTMLBody
              HTMLBody defines the HTML code to be inserted, starting with the <BODY> tag. If not
              specified, the default is shown below.  If used, you MUST include your  own  <BODY>
              tag  as  the  first  line.  Maximum  line  length is 80 char, use multiple lines if
              needed.

       HTMLPost
              HTMLPost defines the HTML code to insert immediately before the first <HR>  on  the
              document, which is just after the title and "summary period"-"Generated on:" lines.
              If anything, this should be used to clean up in case an  image  was  inserted  with
              HTMLBody.  As  with  HTMLHead, you can define as many of these as you want and they
              will be inserted in the output stream in order of appearance. Max string size is 80
              characters. Use multiple lines if you need to.

       HTMLTail
              HTMLTail  defines  the  HTML  code  to  insert at the bottom of each HTML document,
              usually to include a link back to your home page or insert a small graphic.  It  is
              inserted  as  a  table  data  element  (ie: <TD> your code here </TD>) and is right
              aligned with the page. The maximum string size is 80 characters.

       HTMLEnd
              HTMLEnd defines the HTML code to add at the very end of  the  generated  files.  It
              defaults  to what is shown below. If used, you MUST specify the </BODY> and </HTML>
              closing tags as the last lines. The maximum string length is 80 characters.

GRAPHING OPTIONS

       As distinct from the general Display Options, the Graphing Options focus  on  manipulating
       the various graphs produced.

       CountryGraph
              CountryGraph allows the usage by country graph to be disabled.  Values can be 'yes'
              or 'no', default is 'yes'.

       DailyGraph
              DailyGraph determines if the daily statistics  graph  will  be  displayed  or  not.
              Values may be "yes" or "no". Default is "yes" - do display the daily graph.

       HourlyGraph
              HourlyGraph determines if the hourly statistics graph will be displayed or not.
              Values may be "yes" or "no". Default is "yes" - do display the hourly graph.

       TopURLsbyHitsGraph
              Display a pie chart of the top URLs by HITS

       TopURLsbyVolGraph
              Display a pie chart of the top URLs by volume.

       TopExitPagesGraph
              Display Top Exit Pages Pie Chart. Values may be ‘hits’ or ‘visits’ or "no". Default
              is "no"

              ‘hits’ means order the graph by hits

              ‘visits’ means order the graph by visits

       TopEntryPagesGraph
              Display  Top  Entry  Pages  Pie  Chart.  Values  may be ‘hits’ or ‘visits’ or "no".
              Default is "no"

              ‘hits’ means order the graph by hits

              ‘visits’ means order the graph by visits

       TopSitesbyPagesGraph
              Display a pie chart of the Top Sites by Page Impressions

       TopSitesbyVolGraph
              Display a pie chart of the Top Sites by Volume.

       TopAgentsGraph
              Display a pie chart of the Top User Agents (by pages)

       GraphLegend
              GraphLegend allows the color coded legends to be turned on or off  in  the  graphs.
              The default is for them to be displayed. This only toggles the color coded legends,
              the other legends are not changed. If you think they are hideous and ugly, say 'no'
              here :)

       GraphLines
              GraphLines  allows  you to have index lines drawn behind the graphs. Anything other
              than "no" will enable the lines.

       Graph*X and Graph*Y
              The following Graph*X and Graph*Y options are used to modify the sizes of the
              created charts. The default settings are shown. The defaults are also the minimum
              settings.

              #define GRAPH_INDEX_X  512 /* px. Default X size (512) */
              #define GRAPH_INDEX_Y  256 /* px. Default Y size (256) */
              #define GRAPH_DAILY_X  512 /* px. Daily X size (512) */
              #define GRAPH_DAILY_Y  400 /* px. Daily Y size (400) */
              #define GRAPH_HOURLY_X 512 /* px. Daily X size (512) */
              #define GRAPH_HOURLY_Y 400 /* px. Daily Y size (400) */
              #define GRAPH_PIE_X    512 /* px. Pie X size (512) */
              #define GRAPH_PIE_Y    300 /* px. Pie Y size (300) */

       GraphIndexX
              The main chart on the front page. Summary of all Months.  Default is 512 pixels.

       GraphIndexY
              Default is 256 pixels.

       GraphDailyX
              The Day by Day Summary graph at the start of each Month's Summary. Default is 512
              pixels.

       GraphDailyY
              Default is 400 pixels.

       GraphHourlyX
              The Hourly Average graph within each Month's Summary. Default is 512 pixels.

       GraphHourlyY
              Default is 400 pixels.

       GraphPieX
              All pie charts are the same size. Default is 512 pixels.

       GraphPieY
              Default is 300 pixels.

       Graph and Table Colours
              The custom bar graph and pie Colours can be overridden with these options.  Declare
              them in the standard hexadecimal way - as per HTML but without the '#'. If none are
              given, you will get the default AWFFull colors.

       ColorHit
              Default value is ‘00805C’ (dark green)

       ColorFile
              Default value is ‘0000FF’ (blue)

       ColorSite
              Default value is ‘FF8000’ (orange)

       ColorKbyte
              Default value is ‘FF0000’ (red)

       ColorPage
              Default value is ‘00E0FF’ (cyan)

       ColorVisit
              Default value is ‘FFFF00’ (yellow)

       PieColor1
              Default value is ‘00805C’ (dark green)

       PieColor2
              Default value is ‘0000FF’ (blue)

       PieColor3
              Default value is ‘FF8000’ (orange)

       PieColor4
              Default value is ‘FF0000’ (red)
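
       A minimal sketch of a colour override follows. The hexadecimal values are arbitrary
       choices for illustration, not recommended settings. Note that the values carry no
       leading '#', as described above.

       # Bar graph colours for hits and pages
       ColorHit        336699
       ColorPage       99CCFF

       # First two pie slices
       PieColor1       336699
       PieColor2       CC3333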

GROUP* OPTIONS

       The Group* keywords permit the grouping of similar objects as if they  were  one.  Grouped
       records  are  displayed in the ‘Top’ tables and can optionally be displayed in bold and/or
       shaded. Groups cannot be hidden, and are not  counted  in  the  main  totals.  The  Group*
       options  do  not  hide  the individual items that are members of the Group. If you wish to
       hide the records that match - so just the grouping record is displayed -  follow  with  an
       identical  Hide* keyword with the same value. Or use the single GroupAndHide* keyword that
       matches, instead of the Group* and Hide* combination.

       Group* keywords may have an optional label, which will be displayed instead of the
       keyword's value. The label should be separated from the value by at least one white-space
       character, such as a space or tab.

       The Hide*, Group*, Ignore* and Include* keywords allow you to change the way Sites, URLs,
       Referrers, User Agents and User names are manipulated. The Ignore* keywords cause AWFFull
       to completely ignore matching records, as if they did not exist, so they are not counted
       in the main site totals. The Hide* keywords prevent items from being displayed in the
       'Top' tables, but the items are still counted in the main totals. The Group* keywords
       group similar objects as if they were one, with the display, hiding and labelling
       behaviour described at the start of this section.

       The value can have either a leading or trailing '*' wildcard character. If no wildcard  is
       found,  a  match  can occur anywhere in the string. Given a string ‘www.yourmama.com’, the
       values ‘your’, ‘*mama.com’ and ‘www.your*’ will all match.
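
       The three matching modes can be sketched with Group* lines like the following. The
       values are hypothetical and chosen only to show where the '*' goes. Comments are placed
       on their own lines, because anything after the keyword is treated as part of the value.

       # No wildcard: matches 'yourmama' anywhere in the host name
       GroupSite       yourmama

       # Leading '*': the host name must END with '.gov.au'
       GroupSite       *.gov.au

       # Trailing '*': the user agent must START with 'Bot'
       GroupAgent      Bot*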

       GroupURL

       GroupSite

       GroupReferrer

       GroupUser

       GroupAgent

       GroupDomains
              The GroupDomains keyword allows you to group individual host names into their
              respective domains. The value specifies the level of grouping to perform, and can
              be thought of as 'the number of dots' that will be displayed. For example, if a
              visiting host is named cust1.tnt.mia.uu.net, a domain grouping of 1 will result in
              just "uu.net" being displayed, while a 2 will result in "mia.uu.net". The default
              value of zero disables this feature. Domains will only be grouped if they do not
              match any existing GroupSite records, which allows you to override this feature
              with your own groupings if desired.
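
       As an illustration of labels and of GroupDomains working alongside GroupSite, consider
       the sketch below. The URL pattern, host pattern and label text are assumptions made up
       for this example.

       # Group all CGI requests under one labelled record
       GroupURL        /cgi-bin/*      CGI Scripts

       # Hosts ending in .example.org get their own labelled group ...
       GroupSite       *.example.org   Example Organisation

       # ... and every other host is grouped by domain, one dot deep,
       # so cust1.tnt.mia.uu.net would be shown as uu.net
       GroupDomains    1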

HIDE* OPTIONS

       The  Hide*  keywords  will  prevent  things  from being displayed in the 'Top' tables. The
       hidden items will still be counted in the main totals.

       HideURL
              Hide URLs matching name.

       HideSite
              Hide sites matching name.

       HideReferrer
              Hide referrers matching name.

       HideUser
              Hide user names matching name.

       HideAgent
              Hide user agents matching name.

       HideAllSites
              HideAllSites forces all individual sites to be hidden in the report. This is
              particularly useful in conjunction with the GroupDomains keyword, but can also
              help in other situations, such as when you only want to display grouped sites
              (with the GroupSite keyword). The value can be either 'yes' or 'no'. The default
              is 'no', which allows individual sites to be displayed.
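
       A short sketch of the Hide* keywords, using hypothetical values. The first two lines
       hide individual items from the 'Top' tables; the last two show only domain groups in
       the sites table by combining GroupDomains with HideAllSites.

       # Keep image requests and bots out of the 'Top' tables
       HideURL         /images/*
       HideAgent       Bot*

       # Show grouped domains only, never individual hosts
       GroupDomains    1
       HideAllSites    yes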

GROUPANDHIDE* OPTIONS

       Each Hide* and Group* "name" option can be combined into a single configuration line,
       e.g. GroupAndHideURL. Once you start using the Group* options, you will find that you
       tend to match every Group* option with a corresponding Hide* option. The GroupAndHide*
       options remove this duplication.

       GroupAndHideURL

       GroupAndHideSite

       GroupAndHideReferrer

       GroupAndHideUser

       GroupAndHideAgent
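
       The equivalence can be sketched as follows. The host pattern and label are hypothetical,
       and it is assumed here that GroupAndHide* accepts the same optional label as Group*.

       # Two-line form (commented out): group, then hide the members
       #GroupSite          *.gov.au    Government
       #HideSite           *.gov.au

       # Single-line form intended to have the same effect
       GroupAndHideSite    *.gov.au    Government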

DATA DUMP OPTIONS

       The Dump* keywords allow the dumping of Sites, URLs, Referrers, User Agents, User names
       and Search strings to separate tab delimited text files, suitable for import into most
       database or spreadsheet programs.

       DumpPath
              DumpPath specifies the path to dump the files. If not specified, it will default to
              the current output directory. Do not use a trailing slash ('/').

       DumpHeader
              The  DumpHeader keyword specifies if a header record should be written to the file.
              A header record is the first record of the file, and contains the labels  for  each
              field  written.  Normally,  files  that are intended to be imported into a database
              system will not need a header record, while spreadsheets usually do. Value  can  be
              either 'yes' or 'no', with 'no' being the default.

       DumpExtension
              DumpExtension allows you to specify the filename extension for the dump files. The
              default is "tab", but some programs are picky about the filenames they accept, so
              you may change it here (for example, some people may prefer to use "csv").

       DumpURLs

       DumpEntryPages

       DumpExitPages

       DumpSites

       DumpReferrers

       DumpSearchStr

       DumpUsers

       DumpAgents

       DumpCountries
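
       A possible dump setup for spreadsheet import is sketched below. The directory is a
       made-up path, and the 'yes' values on the individual Dump* keywords are an assumption
       by analogy with DumpHeader, since this page does not list their accepted values.

       # Hypothetical target directory; note: no trailing slash
       DumpPath        /var/lib/awffull/dump

       # Spreadsheets usually want a header row
       DumpHeader      yes
       DumpExtension   csv

       # Assumed yes/no switches for the tables to dump
       DumpURLs        yes
       DumpSites       yes
       DumpReferrers   yes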

EXAMPLES

       Sample Extract of a configuration file:

       # The 'auto' value means that AWFFull will try and work out what log format
       # you are sending to it. If no joy, AWFFull will immediately exit.

       LogType        auto

       # OutputDir is where you want to put the output files.  This should
       # be a full path name, however relative ones might work as well.
       # If no output directory is specified, the current directory will be used.

       OutputDir      .

       Minimal configuration file:

       # Sample *MINIMAL* AWFFull configuration file
       #
       # The below settings are the only ones you *really* need to worry about
       # when configuring AWFFull. See the sample.conf file for all options if
       # the below only serves to whet your appetite.
       #
       # See awffull(1) or sample.conf for full explanations.

       # We can do a little bit each day, or hour...
       Incremental             yes

       # Your server name to display
       HostName                www.my_example.site

       ##---------------------------
       # Use PageType OR NotPageType
       # I personally prefer NotPageType - YMMV!
       PageType                htm
       PageType                html
       PageType                php
       #PageType               pl
       #PageType               cfm
       #PageType               pdf
       #PageType               txt
       #PageType               cgi
       ### OR! ---------------------
       #NotPageType            gif
       #NotPageType            css
       #NotPageType            js
       #NotPageType            jpg
       #NotPageType            ico
       #NotPageType            png
       ##---------------------------

       # Should always fold in Sequence Errors. Logs can be messy...
       FoldSeqErr              yes

       # If you want to see the country flags, uncomment the following.
       # This is the, possibly relative, URL where the flag files are located.
       #FlagsLocation          flags

SEE ALSO

       awffull(1)

BUGS

       None currently known. YMMV....

       Report  bugs  to  ⟨https://bugs.launchpad.net/awffull⟩,  or use the email discussion list:
       <awffull@stedee.id.au>

NOTES

       In case it is not obvious: AWFFull is a play/pun on the word ‘awful’,  and  is  pronounced
       the same way. Yes it was deliberate.

REFERENCES

       [1]   Web  Site  Measurement  Hacks.  Eric  T.  Peterson  (and  others).   O'Reilly.  ISBN
       0-596-00988-7.

                                           2008-Dec-13                            awffull.conf(5)