Provided by: libcflow-perl_0.68-12.1build1_amd64

NAME

       Cflow::find - find "interesting" flows in raw IP flow files

SYNOPSIS

          use Cflow;

          Cflow::verbose(1);
          Cflow::find(\&wanted, <*.flows*>);

          sub wanted { ... }

       or:

          Cflow::find(\&wanted, \&perfile, <*.flows*>);

          sub perfile {
             my $fname = shift;
             ...
          }

BACKGROUND

       This module implements an API for processing IP flow accounting information which has been
       collected from routers and written into flow files by one of the various flow collectors
       listed below.

       It was originally conceived and written for use by FlowScan:

          http://net.doit.wisc.edu/~plonka/FlowScan/

Flow File Sources

       This package is of little use on its own.  It requires input in the form of time-stamped
       raw flow files produced by other software packages.  These "flow sources" either snoop a
       local ethernet (via libpcap) or collect flow information from IP routers that are
       configured to export said information.  The following flow sources are supported:

       argus by Carter Bullard:
              http://www.qosient.com/argus/

       flow-tools by Mark Fullmer (with NetFlow v1, v5, v6, or v7):
              http://www.splintered.net/sw/flow-tools/

       CAIDA's cflowd (with NetFlow v5):
              http://www.caida.org/tools/measurement/cflowd/
              http://net.doit.wisc.edu/~plonka/cflowd/

       lfapd by Steve Premeau (with LFAPv4):
              http://www.nmops.org/

DESCRIPTION

       Cflow::find() will iterate across all the flows in the specified files.  It will call your
       wanted() function once per flow record.  If the file name argument passed to find() is
       specified as "-", flows will be read from standard input.

       The wanted() function does whatever you write it to do.  For instance, it could simply
       print interesting flows or it might maintain byte, packet, and flow counters which could
       be written to a database after the find subroutine completes.
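       The counter-keeping case can be sketched as follows.  This is only an illustration: the
       %total hash and the sample flows are made up, and the loop at the end is a stand-in for
       Cflow::find() so that the sketch runs without the module or any flow files.

```perl
use strict;
use warnings;

# Totals keyed by IP protocol number, filled in by wanted().
my %total;

# A wanted() callback that only counts; it reads the "current" flow
# from the $Cflow::* package variables.
sub wanted {
    my $t = $total{$Cflow::protocol} ||= { flows => 0, pkts => 0, bytes => 0 };
    $t->{flows}++;
    $t->{pkts}  += $Cflow::pkts;
    $t->{bytes} += $Cflow::bytes;
    return 1;    # non-zero: count this flow as "wanted"
}

# STAND-IN for Cflow::find(\&wanted, @files): feed three made-up flows
# (protocol, packets, bytes) through wanted().
for my $flow ([6, 10, 4000], [17, 2, 300], [6, 5, 1500]) {
    local ($Cflow::protocol, $Cflow::pkts, $Cflow::bytes) = @$flow;
    wanted();
}

# Report the totals once all flows have been seen.
for my $proto (sort { $a <=> $b } keys %total) {
    printf("proto %3d: %d flows, %d pkts, %d bytes\n",
           $proto, @{ $total{$proto} }{qw(flows pkts bytes)});
}
```

       With the real module, the stand-in loop would be replaced by a call such as
       Cflow::find(\&wanted, @files), and the report would be produced after it returns.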

       Within your wanted() function, tests on the "current" flow can be performed using the
       following variables:

       $Cflow::unix_secs
           secs since epoch (deprecated)

       $Cflow::exporter
           Exporter IP Address as a host-ordered "long"

       $Cflow::exporterip
           Exporter IP Address as dotted-decimal string

       $Cflow::localtime
           $Cflow::unix_secs interpreted as localtime with this strftime(3) format:

              %Y/%m/%d %H:%M:%S

       $Cflow::srcaddr
           Source IP Address as a host-ordered "long"

       $Cflow::srcip
           Source IP Address as a dotted-decimal string

       $Cflow::dstaddr
           Destination IP Address as a host-ordered "long"

       $Cflow::dstip
           Destination IP Address as a dotted-decimal string

       $Cflow::input_if
           Input interface index

       $Cflow::output_if
           Output interface index

       $Cflow::srcport
           TCP/UDP src port number or equivalent

       $Cflow::dstport
           TCP/UDP dst port number or equivalent

       $Cflow::ICMPType
           high byte of $Cflow::dstport

           Undefined if the current flow is not an ICMP flow.

       $Cflow::ICMPCode
           low byte of $Cflow::dstport

           Undefined if the current flow is not an ICMP flow.

       $Cflow::ICMPTypeCode
           symbolic representation of $Cflow::dstport

           The value is the type-specific ICMP code, if any, followed by the ICMP type.  E.g.

              ECHO
              HOST_UNREACH

           Undefined if the current flow is not an ICMP flow.

       $Cflow::pkts
           Packets sent in Duration

       $Cflow::bytes
           Octets sent in Duration

       $Cflow::nexthop
           Next hop router's IP Address as a host-ordered "long"

       $Cflow::nexthopip
           Next hop router's IP Address as a dotted-decimal string

       $Cflow::startime
           secs since epoch at start of flow

       $Cflow::start_msecs
           fractional portion of startime (in milliseconds)

           This will be zero unless the source is flow-tools or argus.

       $Cflow::endtime
           secs since epoch at last packet of flow

       $Cflow::end_msecs
           fractional portion of endtime (in milliseconds)

           This will be zero unless the source is flow-tools or argus.

       $Cflow::protocol
           IP protocol number (as specified in /etc/protocols, e.g. 1=ICMP, 6=TCP, 17=UDP)

       $Cflow::tos
           IP Type-of-Service

       $Cflow::tcp_flags
           bitwise OR of all TCP flags that were set within packets in the flow; 0x10 for non-TCP
           flows

       $Cflow::TCPFlags
           symbolic representation of $Cflow::tcp_flags.  The value will be a bitwise-or
           expression.  E.g.

              PUSH|SYN|FIN|ACK

           Undefined if the current flow is not a TCP flow.

       $Cflow::raw
           the entire "packed" flow record as read from the input file

           This is useful when the "wanted" subroutine wants to write the flow to another
           FILEHANDLE. E.g.:

              syswrite(FILEHANDLE, $Cflow::raw, length $Cflow::raw)

       $Cflow::reraw
           the entire "re-packed" flow record formatted like $Cflow::raw.

           This is useful when the "wanted" subroutine wants to write a modified flow to another
           FILEHANDLE. E.g.:

              $srcaddr = my_encode($srcaddr);
              $dstaddr = my_encode($dstaddr);
              syswrite(FILEHANDLE, $Cflow::reraw, length $Cflow::raw)

           These flow variables are packed into $Cflow::reraw:

              $Cflow::index, $Cflow::exporter,
              $Cflow::srcaddr, $Cflow::dstaddr,
              $Cflow::input_if, $Cflow::output_if,
              $Cflow::srcport, $Cflow::dstport,
              $Cflow::pkts, $Cflow::bytes,
              $Cflow::nexthop,
              $Cflow::startime, $Cflow::endtime,
              $Cflow::protocol, $Cflow::tos,
              $Cflow::src_as, $Cflow::dst_as,
              $Cflow::src_mask, $Cflow::dst_mask,
              $Cflow::tcp_flags,
              $Cflow::engine_type, $Cflow::engine_id

       $Cflow::Bps
           the minimum bytes per second for the current flow

       $Cflow::pps
           the minimum packets per second for the current flow
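       The relationship between some of the numeric and string variables above can be sketched
       with core Perl.  The helper names (long_to_ip, ip_to_long, icmp_type_code) and the sample
       values are hypothetical; the sketch assumes that a host-ordered "long" is the usual 32-bit
       integer whose most significant byte is the first octet of the address.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# Host-ordered "long" (as in $Cflow::srcaddr) to dotted-decimal string.
sub long_to_ip { return inet_ntoa(pack('N', $_[0])) }

# Dotted-decimal string to host-ordered "long".
sub ip_to_long { return unpack('N', inet_aton($_[0])) }

# Split a $Cflow::dstport value into (type, code): high byte, low byte.
sub icmp_type_code {
    my $dstport = shift;
    return (($dstport >> 8) & 0xff, $dstport & 0xff);
}

# Made-up sample values standing in for the flow variables.
my $srcaddr = 0x0A000001;    # 10.0.0.1
my $dstport = 0x0301;        # ICMP type 3, code 1

print long_to_ip($srcaddr), "\n";
printf("%u\n", ip_to_long('10.0.0.1'));
printf("type=%d code=%d\n", icmp_type_code($dstport));
```

       Comparing the integer forms is convenient for range and mask tests, while the string forms
       are convenient for printing.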

       The following variables are undefined if using NetFlow v1 (which does not contain the
       requisite information):

       $Cflow::src_as
           originating or peer AS of source address

       $Cflow::dst_as
           originating or peer AS of destination address

       The following variables are undefined if using NetFlow v1 or LFAPv4 (which do not contain
       the requisite information):

       $Cflow::src_mask
           source address prefix mask bits

       $Cflow::dst_mask
           destination address prefix mask bits

       $Cflow::engine_type
           type of flow switching engine

       $Cflow::engine_id
           ID of the flow switching engine
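       As an illustration of how the mask bits might be used, the sketch below derives the
       network prefix from an address and a prefix length.  The function name prefix_of and the
       sample values are hypothetical.  Since $Cflow::src_mask and $Cflow::dst_mask can be
       undefined, real code would want to test them with defined() first.

```perl
use strict;
use warnings;
use Socket qw(inet_ntoa);

# Return "network/bits" for a host-ordered address and a prefix length.
sub prefix_of {
    my ($addr, $bits) = @_;
    # A prefix length of 0 means "no mask"; this also avoids shifting by 32.
    my $mask = $bits ? (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF : 0;
    return inet_ntoa(pack('N', $addr & $mask)) . "/$bits";
}

# Made-up values standing in for $Cflow::srcaddr and $Cflow::src_mask:
# 192.168.1.42 with a 24-bit prefix.
print prefix_of(0xC0A8012A, 24), "\n";
```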

       Optionally, a reference to a perfile() function can be passed to Cflow::find as the
       argument following the reference to the wanted() function.  This perfile() function will
       be called once for each flow file.  The argument to the perfile() function will be the
       name of the flow file which is about to be processed.  The purpose of the perfile()
       function is to allow you to periodically report the progress of Cflow::find() and to
       provide an opportunity to periodically reclaim storage used by data objects that may have
       been allocated or maintained by the wanted() function.  For instance, when counting the
       number of active host IP addresses in each time-stamped flow file, perfile() can reset the
       counter to zero and clear the search tree or hash used to remember those IP addresses.
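       The active-hosts case above might look like the following sketch.  The names %seen,
       @report and flush_report are hypothetical, as are the file names and addresses; the loop
       near the end is a stand-in for Cflow::find() so that the sketch runs on its own.

```perl
use strict;
use warnings;

my %seen;       # source IPs seen in the file being processed
my $current;    # name of the file being processed
my @report;     # one "file: N hosts" line per finished file

# Record the finished file's count, then empty the hash to reclaim storage.
sub flush_report {
    push @report, sprintf("%s: %d hosts", $current, scalar keys %seen)
        if defined $current;
    %seen = ();
}

# perfile() is handed the name of the file about to be processed.
sub perfile {
    my $fname = shift;
    flush_report();
    $current = $fname;
}

sub wanted {
    $seen{$Cflow::srcip} = 1;
    return 1;
}

# STAND-IN for Cflow::find(\&wanted, \&perfile, @files), using made-up
# file names and source addresses.
my %fake = (
    'a.flows' => [qw(10.0.0.1 10.0.0.2 10.0.0.1)],
    'b.flows' => [qw(10.0.0.9)],
);
for my $fname (sort keys %fake) {
    perfile($fname);
    for my $ip (@{ $fake{$fname} }) {
        local $Cflow::srcip = $ip;
        wanted();
    }
}
flush_report();    # the last file has no following perfile() call

print "$_\n" for @report;
```

       Because perfile() runs before each file, the count for the final file has to be reported
       after find() returns, which is what the last flush_report() call does here.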

       Since Cflow is an Exporter, you can request that all those scalar flow variables be
       exported (so that you need not use the "Cflow::" prefix):

          use Cflow qw(:flowvars);

       Also, you can request that the symbolic names for the TCP flags, ICMP types, and/or ICMP
       codes be exported:

          use Cflow qw(:tcpflags :icmptypes :icmpcodes);

       The tcpflags are:

          $TH_FIN $TH_SYN $TH_RST $TH_PUSH $TH_ACK $TH_URG

       The icmptypes are:

          $ICMP_ECHOREPLY     $ICMP_DEST_UNREACH $ICMP_SOURCE_QUENCH
          $ICMP_REDIRECT      $ICMP_ECHO         $ICMP_TIME_EXCEEDED
          $ICMP_PARAMETERPROB $ICMP_TIMESTAMP    $ICMP_TIMESTAMPREPLY
          $ICMP_INFO_REQUEST  $ICMP_INFO_REPLY   $ICMP_ADDRESS
          $ICMP_ADDRESSREPLY

       The icmpcodes are:

          $ICMP_NET_UNREACH  $ICMP_HOST_UNREACH $ICMP_PROT_UNREACH
          $ICMP_PORT_UNREACH $ICMP_FRAG_NEEDED  $ICMP_SR_FAILED
          $ICMP_NET_UNKNOWN  $ICMP_HOST_UNKNOWN $ICMP_HOST_ISOLATED
          $ICMP_NET_ANO      $ICMP_HOST_ANO     $ICMP_NET_UNR_TOS
          $ICMP_HOST_UNR_TOS $ICMP_PKT_FILTERED $ICMP_PREC_VIOLATION
          $ICMP_PREC_CUTOFF  $ICMP_UNREACH      $ICMP_REDIR_NET
          $ICMP_REDIR_HOST   $ICMP_REDIR_NETTOS $ICMP_REDIR_HOSTTOS
          $ICMP_EXC_TTL      $ICMP_EXC_FRAGTIME

       Please note that the names above are not necessarily exactly the same as the names of the
       flags, types, and codes as set in the values of the aforementioned $Cflow::TCPFlags and
       $Cflow::ICMPTypeCode flow variables.
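       A typical use of the flag names is a bitwise test against $Cflow::tcp_flags.  In the
       sketch below the $TH_* scalars are local stand-ins, set to the standard TCP header flag
       bits, so that the code runs without the module; with Cflow they would come from
       "use Cflow qw(:tcpflags)".  The function name syn_only is hypothetical.

```perl
use strict;
use warnings;

# STAND-INS for the scalars imported by "use Cflow qw(:tcpflags)";
# the values are the standard TCP header flag bits.
my ($TH_FIN, $TH_SYN, $TH_RST, $TH_PUSH, $TH_ACK, $TH_URG) =
    (0x01, 0x02, 0x04, 0x08, 0x10, 0x20);

# True (1) if the flow saw a SYN but never an ACK, for example an
# unanswered connection attempt seen from the initiator's side.
sub syn_only {
    my $flags = shift;
    my $hit = ($flags & $TH_SYN) && !($flags & $TH_ACK);
    return $hit ? 1 : 0;
}

# Made-up sample values standing in for $Cflow::tcp_flags.
printf("0x%02x -> %d\n", $_, syn_only($_))
    for ($TH_SYN, $TH_SYN | $TH_ACK, $TH_FIN | $TH_ACK);
```

       Since $Cflow::tcp_flags is 0x10 for non-TCP flows, a real wanted() function would
       normally check $Cflow::protocol before interpreting the flags.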

       Lastly, as is usually the case for modules, the subroutine names can be imported, and a
       minimum version of Cflow can be specified:

          use Cflow qw(:flowvars find verbose 1.031);

       Cflow::find() returns a "hit-ratio".  This hit-ratio is a string formatted similarly to
       that of the value of a perl hash when taken in a scalar context.  This hit-ratio indicates
       ((# of "wanted" flows) / (# of scanned flows)).  A flow is considered to have been
       "wanted" if your wanted() function returns non-zero.
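       If the hit-ratio is needed as a number, it can be split apart.  The sketch below assumes
       the string has the form "wanted/scanned" (two integers separated by a slash); that exact
       format is an assumption drawn from the description above, and the function name
       hit_fraction and the sample value are hypothetical.

```perl
use strict;
use warnings;

# Turn a "wanted/scanned" hit-ratio string into a fraction.
sub hit_fraction {
    my $ratio = shift;
    my ($hits, $scanned) = $ratio =~ m{^(\d+)/(\d+)$}
        or return;    # not in the assumed "N/M" form
    return $scanned ? $hits / $scanned : 0;
}

# Made-up value standing in for: my $ratio = Cflow::find(\&wanted, @files);
my $ratio = '25/200';
printf("%s = %.1f%% of flows wanted\n", $ratio, 100 * hit_fraction($ratio));
```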

       Cflow::verbose() takes a single scalar boolean argument which indicates whether or not you
       wish warning messages to be generated to STDERR when "problems" occur.  Verbose mode is
       set by default.

EXAMPLES

       Here's a complete example with a sample wanted function.  It will print all UDP flows that
       involve either a source or destination port of 31337 and a port on the other end that is
       unreserved (1024 or greater):

          use Cflow qw(:flowvars find verbose);

          # Protocol number for UDP, looked up once rather than per flow.
          my $udp = getprotobyname('udp');
          verbose(0);
          find(\&wanted, @ARGV ? @ARGV : <*.flows*>);

          sub wanted {
             # Skip flows where either end uses a reserved port.
             return if ($srcport < 1024 || $dstport < 1024);
             # Keep only UDP flows with port 31337 on at least one end.
             return unless (($srcport == 31337 || $dstport == 31337) &&
                             $udp == $protocol);

             printf("%s %15.15s.%-5hu %15.15s.%-5hu %2hu %10u %10u\n",
                    $localtime,
                    $srcip,
                    $srcport,
                    $dstip,
                    $dstport,
                    $protocol,
                    $pkts,
                    $bytes);
          }

       Here's an example which demonstrates a technique which can be used to pass arbitrary
       arguments to your wanted function by passing a reference to an anonymous subroutine as the
       wanted() function argument to Cflow::find():

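          sub wanted {
             my @params = @_;   # the arguments supplied by the closure below
             # ...
          }

          my @params = ('any', 'arguments');   # placeholder values
          Cflow::find(sub { wanted(@params) }, @files);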

ARGUS NOTES

       Argus uses a bidirectional flow model.  This means that some argus flows represent packets
       not only in the forward direction (from "source" to "destination"), but also in the
       reverse direction (from the so-called "destination" to the "source").  However, this
       module uses a unidirectional flow model, and therefore splits some argus flows into two
       unidirectional flows for the purpose of reporting.

       Currently, using this module's API there is no way to determine if two subsequently
       reported unidirectional flows were really a single argus flow.  This may be addressed in a
       future release of this package.

       Furthermore, for argus flows which represent bidirectional ICMP traffic, this module
       presumes that all the reverse packets were ECHOREPLYs (sic).  This is sometimes incorrect
       as described here:

          http://www.theorygroup.com/Archive/Argus/2002/msg00016.html

       and will be fixed in a future release of this package.

       Timestamps ($startime and $endtime) are sometimes reported incorrectly for bidirectional
       argus flows that represent only one packet in each direction.  This will be fixed in a
       future release.

       Argus flows sometimes contain information which does not map directly to the flow
       variables presented by this module.  For the time being, this information is simply not
       accessible through this module's API.  This may be addressed in a future release.

       Lastly, argus flows produced from observed traffic on a local ethernet do not contain
       enough information to meaningfully set the values of all this module's flow variables.
       For instance, the next-hop and input/output ifIndex numbers are missing.  For the time
       being, all argus flows accessed through this module's API will have both the $input_if
       and $output_if as 42.  Although 42 is the answer to life, the universe, and everything, in
       this context, it is just an arbitrary number.  It is important that $output_if is
       non-zero, however, since existing FlowScan reports interpret an $output_if value of zero to
       mean that the traffic represented by that flow was not forwarded (i.e. dropped).  For
       similar reasons, the $nexthopip for all argus flows is reported as "127.0.0.1".

BUGS

       Currently, only NetFlow version 5 is supported when reading cflowd-format raw flow files.

       When built with support for flow-tools and attempting to read a cflowd format raw flow
       file from standard input, you'll get the error:

          open "-": No such file or directory

       For the time being, the workaround is to write the content to a file and read it
       directly from there rather than from standard input.  (This happens because we can't close
       and re-open file descriptor zero after determining that the content was not in flow-tools
       format.)
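       One way to apply this workaround from within a script is to spool standard input to a
       temporary file first.  The following is a minimal sketch using the core File::Temp
       module; the function name spool_to_tempfile and the template 'cflowXXXXXX' are
       hypothetical, and the call to Cflow::find() is shown only in a comment.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Copy everything from a filehandle into a temporary file; return its name.
# The file is removed automatically when the program exits (UNLINK => 1).
sub spool_to_tempfile {
    my $in = shift;
    my ($out, $fname) = tempfile('cflowXXXXXX', TMPDIR => 1, UNLINK => 1);
    binmode($_) for $in, $out;    # flow records are binary data
    my $buf;
    while (my $n = read($in, $buf, 65536)) {
        print {$out} $buf;
    }
    close($out) or die "close $fname: $!";
    return $fname;
}

# Intended use (not run here, since it needs Cflow and real flow data):
#   my $fname = spool_to_tempfile(\*STDIN);
#   Cflow::find(\&wanted, $fname);
```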

       When built with support for flow-tools and using verbose mode, Cflow::find will generate
       warnings if you process a cflowd format raw flow file.  This happens because it will first
       attempt to open the file as a flow-tools format raw flow file (which will produce a
       warning message), and then revert to handling it as a cflowd format raw flow file.

       Likewise, when built with support for argus and attempting to read a cflowd format raw
       flow file from standard input, you'll get this warning message:

          not Argus-2.0 data stream.

       This is because argus (as of argus-2.0.4) doesn't seem to have a mode in which such
       warning messages are suppressed.

       The $Cflow::raw flow variable contains the flow record in cflowd format, even if it was
       read from a raw flow file produced by flow-tools or argus.  Because cflowd discards the
       fractional portion of the flow start and end time, only the whole seconds portion of these
       times will be retained.  (That is, the raw record in $Cflow::raw does not contain the
       $start_msecs and $end_msecs, so using $Cflow::raw to convert to cflowd format is a lossy
       operation.)

       When used with cflowd, Cflow::find() will generate warnings if the flow data file is
       "invalid" as far as it's concerned.  To avoid this, you must be using Cisco version 5
       flow-export and configure cflowd so that it saves all flow-export data.  This is the default
       behavior when cflowd produces time-stamped raw flow files after being patched as described
       here:

          http://net.doit.wisc.edu/~plonka/cflowd/

NOTES

       The interface presented by this package is a blatant ripoff of File::Find.

AUTHOR

       Dave Plonka <plonka@doit.wisc.edu>

       Copyright (C) 1998-2002  Dave Plonka.  This program is free software; you can redistribute
       it and/or modify it under the terms of the GNU General Public License as published by the
       Free Software Foundation; either version 2 of the License, or (at your option) any later
       version.

VERSION

       The version number is the module file RCS revision number ($Revision: 1.51 $) with the
       minor number printed right justified with leading zeroes to 3 decimal places.  For
       instance, RCS revision 1.1 would yield a package version number of 1.001.

       This is so that revision 1.10 (which is version 1.010), for example, will test greater
       than revision 1.2 (which is version 1.002) when you want to require a minimum version of
       this module.
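       The mapping from RCS revision to version number described above can be sketched as
       follows.  The function name rcs_to_version is hypothetical; this is an illustration of
       the scheme, not the module's own code.

```perl
use strict;
use warnings;

# Turn an RCS revision such as "1.51" into a version string such as "1.051".
sub rcs_to_version {
    my $rev = shift;
    my ($major, $minor) = $rev =~ /(\d+)\.(\d+)/
        or return;
    return sprintf("%d.%03d", $major, $minor);
}

print rcs_to_version('$Revision: 1.51 $'), "\n";
# Revision 1.10 compares numerically greater than revision 1.2.
print rcs_to_version('1.10') > rcs_to_version('1.2') ? "newer\n" : "older\n";
```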

SEE ALSO

       perl(1), Socket, Net::Netmask, Net::Patricia.