Provided by: argus-client_3.0.8.2-6.1ubuntu1_amd64

NAME

       rabins - process argus(8) data within specified bins.

SYNOPSIS

       rabins [-B secs] [-M splitmode [options]] [raoptions] [-- filter-expression]

DESCRIPTION

       Rabins reads argus data from an argus-data source, and adjusts the data so that it is aligned to a set of
       bins, or slots, that are based on either time, input size, or count.   The  resulting  output  is  split,
       modified,  and  optionally  aggregated  so  that  the data fits to the constraints of the specified bins.
       rabins is designed to be a combination of rasplit and racluster, acting on  multiple  contexts  of  argus
       data.

       The  principal  function  of rabins is to align input data to a series of bins, and then process the data
       within the context of each bin.  This is the basis for real-time stream block  processing.   Time  series
       stream block processing is critical for flow data graphing, comparing, analyzing, and correlation.  Fixed
       load stream block processing, based on the number of argus data records ('count'), or a fixed  volume  of
       data  ('size')  allows for control of resources in processing.  While load based options are very useful,
       they are rather esoteric.  See the online examples and rasplit.1 for examples of  using  these  modes  of
       operation.

Time Series Bins

       Time series binning is specified using the -M time option.  Time bins are specified by the size and
       granularity of the time bin.  The granularity, 's'econds, 'm'inutes, 'h'ours, 'd'ays, 'w'eeks,  'M'onths,
       and  'y'ears, dictates where the bin boundaries lie.  To ensure that 0.5d and 12h start on the same point
       in time, second, minute, hour, and day based bins start at midnight, Jan 1st of the year  of  processing.
       Week, month and year bins all start on natural time boundaries, for the period.
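
       The boundary rule above for second, minute, hour, and day bins can be sketched as follows.  This is
       an illustration only, not the rabins implementation: the function name bin_start is hypothetical, and
       the use of UTC is an assumption, since the text does not name a time zone.

```python
from datetime import datetime, timezone

# Seconds per unit for the fixed-length granularities named in the text.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def bin_start(ts, size, unit):
    """Return the start (epoch seconds) of the bin that contains ts.

    Bins are counted from midnight, Jan 1st of the year of ts.
    UTC is an assumption of this sketch, not something the man page states.
    """
    width = size * UNIT_SECONDS[unit]
    when = datetime.fromtimestamp(ts, tz=timezone.utc)
    origin = datetime(when.year, 1, 1, tzinfo=timezone.utc).timestamp()
    return origin + ((ts - origin) // width) * width

ts = datetime(2012, 2, 15, 13, 37, 4, tzinfo=timezone.utc).timestamp()
# 0.5d and 12h bins share the same starting point.
print(bin_start(ts, 0.5, "d") == bin_start(ts, 12, "h"))
# The 10 second bin that holds 13:37:04 starts at 13:37:00.
print(datetime.fromtimestamp(bin_start(ts, 10, "s"), tz=timezone.utc))
```

       Week, month, and year bins follow calendar boundaries and are not covered by this sketch.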

       rabins  provides  a  separate processing context for each bin, so that aggregation and sorting occur only
       within the context of each time period.  Records are placed into bins based on load or  time.   For  load
       based  bins,  input  records  are processed in received order and are not modified. When using time based
       bins, records are placed into bins based on the starting time of the record.  By  default,  records  that
       span  a  time  boundary  are  split into as many records as needed to fit the record into appropriate bin
       sizes, using the algorithms used  by  rasplit.1.   Metrics  are  distributed  uniformly  within  all  the
       appropriate bins.  The result is a series of data and/or fragments that are time aligned, appropriate for
       time series analysis and visualization.
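
       A minimal sketch of the splitting step described above, assuming that "distributed uniformly" means
       each fragment receives a share of the metric proportional to the time it covers.  The function
       split_record and its tuple layout are hypothetical; rasplit.1 defines the actual algorithm, which may
       round counts differently.

```python
def split_record(start, end, pkts, width, origin=0.0):
    """Split one record into per-bin fragments.

    Returns a list of (frag_start, frag_end, frag_pkts) tuples.  The metric
    is shared out in proportion to the time each fragment covers, which is
    one reading of "distributed uniformly" (an assumption of this sketch).
    """
    duration = end - start
    if duration <= 0:
        # A zero-length record cannot span a boundary; keep it whole.
        return [(start, end, pkts)]
    fragments = []
    cursor = start
    while cursor < end:
        # Next bin boundary strictly after the cursor.
        boundary = origin + ((cursor - origin) // width + 1) * width
        frag_end = min(boundary, end)
        share = pkts * (frag_end - cursor) / duration
        fragments.append((cursor, frag_end, share))
        cursor = frag_end
    return fragments

# A 20 second record that crosses two 10 second boundaries.
frags = split_record(start=595.0, end=615.0, pkts=40, width=10.0)
for frag in frags:
    print(frag)
```

       Note that the fragments keep the record's own start and end times at the edges, so they need not
       coincide with the bin boundaries.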

       When a record is split to conform to a time series bin, the resulting starting and ending timestamps  may
       or  may not coincide with the timestamps of the bins themselves. For some applications, this treatment is
       critical to the analytics that are working on the resulting data, such as transaction duration, and  flow
       traffic  burst  behavior.   However,  for  other  analytics,  like  average  load,  and rate analysis and
       reporting, the timestamps need to be modified so that they reflect the time range of the actual time  bin
       boundaries.   Rabins  supports  the optional hard option to specify that timestamps should conform to bin
       boundaries.  One of the results of this is that all durations in the reported records  will  be  the  bin
       duration.  This is extremely important when processing certain time series metrics, like load.
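
       To illustrate why this matters for load, the sketch below compares a rate computed over a fragment's
       own duration with one computed over the bin duration.  The helper name harden is hypothetical, and
       the byte count is made-up illustration data, not output from rabins.

```python
def harden(frag_start, width, origin=0.0):
    """Return the (start, end) of the bin containing frag_start.

    This models the effect of the hard option on timestamps only;
    it is a sketch, not the rabins implementation.
    """
    bin_begin = origin + ((frag_start - origin) // width) * width
    return bin_begin, bin_begin + width

# A fragment covering the last 5 seconds of a 10 second bin.
frag_start, frag_end, frag_bytes = 595.0, 600.0, 1000

soft_rate = frag_bytes / (frag_end - frag_start)   # bytes/sec over the fragment
hard_start, hard_end = harden(frag_start, width=10.0)
hard_rate = frag_bytes / (hard_end - hard_start)   # bytes/sec over the whole bin

print(soft_rate, hard_rate)
```

       With hard boundaries every record in a bin shares the same duration, so per-record rates can be
       summed to give the load for the bin.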

Load Based Bins

       Load based binning is specified using the -M size or -M count options.  Load bins are used to constrain
       the resource used in bin processing.  So much load is input, aggregation is performed on the input  load,
       and when a threshold is reached, the entire aggregation cache is dumped, reinitialized, and reused.
       These can be used effectively to provide realtime data reduction, but within a fixed amount of memory.
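
       The fill, dump, and reuse cycle described above can be sketched for count mode as follows.  The
       record layout (a key and a packet count) and the name count_bins are assumptions made for
       illustration; real argus records and aggregation are considerably richer.

```python
from collections import defaultdict

def count_bins(records, threshold):
    """Yield one aggregated dict for every `threshold` input records.

    records: iterable of (key, packets) pairs, where key stands in for a
    flow key.  Only one cache is alive at a time, so memory use is bounded
    by the number of distinct keys seen within a single bin.
    """
    cache = defaultdict(int)
    seen = 0
    for key, packets in records:
        cache[key] += packets
        seen += 1
        if seen == threshold:
            yield dict(cache)   # dump the whole aggregation cache
            cache.clear()       # reinitialize and reuse it
            seen = 0
    if cache:
        yield dict(cache)       # final, partially filled bin

records = [("tcp", 1), ("udp", 2), ("tcp", 3), ("tcp", 4), ("udp", 5)]
bins = list(count_bins(records, threshold=2))
print(bins)
```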

Output Processing

       rabins has two basic modes of output.  The default holds all output in main memory until EOF is
       encountered on input, at which point each sorted bin is written out.  In the second mode, rabins writes
       out the contents of individual sorted bins periodically, based on a holding time specified using the -B
       secs option.  The secs value should be chosen such that rabins will have seen all the appropriate
       incoming data for that time period.  This is determined by the ARGUS_FLOW_STATUS_INTERVAL used by the
       collection of argus data sources in the input data stream, as well as any time drift that may exist
       among argus data processing elements.  When there is good time sync, and with an
       ARGUS_FLOW_STATUS_INTERVAL of 5 seconds, appropriate secs values are between 5 and 15 seconds.
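
       One way to picture the holding time is as a test applied to each open bin: the bin is released once
       the current time has passed the end of the bin by at least secs.  The sketch below assumes exactly
       that rule; the name ready_bins and the representation of bins by their start times are hypothetical.

```python
def ready_bins(open_bins, now, width, hold):
    """Return the start times of bins whose holding time has expired.

    open_bins: iterable of bin start times in seconds.
    A bin is treated as ready once `now` is at least its end plus `hold`
    (an assumed reading of the -B option, not taken from the source code).
    Results are returned oldest first.
    """
    return sorted(b for b in open_bins if now >= b + width + hold)

# Three 10 second bins, a 5 second holding time, and the clock at t=25.
print(ready_bins([20, 0, 10], now=25, width=10, hold=5))
```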

       The output of rabins when using the -B secs option is appropriate to drive a number of processing
       elements, such as near real-time visualizations, alarms, and reporting.

Output Stream

       Like all ra.1 client programs, the output of rabins.1 is an argus data stream, that  can  be  written  as
       binary  data  to  a file or standard output, or can be printed.  rabins supports all the output functions
       provided by rasplit.1.

       The output file name consists of a prefix, which is specified using the -w ra option, and for all modes
       except time mode, a suffix, which is created for each resulting file.  If no prefix is provided, then
       rabins will use 'x' as the default prefix.  The suffix that is used is determined by the mode of
       operation.  When rabins is using the default count mode or the size mode, the suffix is a group of
       letters 'aa', 'ab', and so on, such that concatenating the output files in sorted order by file name
       produces the original input file.  If rabins will need to create more output files than are allowed by
       the default suffix strategy, more letters will be added, in order to accommodate the needed files.
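
       The letter suffix sequence can be sketched as a base-26 counter over 'a' to 'z'.  The function below
       is a hypothetical illustration with a fixed suffix width; how rabins widens the suffix when it runs
       out of names is not modelled here.

```python
import string

def suffix(index, width=2):
    """Return the letter suffix for the index-th output file (0 -> 'aa')."""
    letters = string.ascii_lowercase
    if index < 0 or index >= len(letters) ** width:
        raise ValueError("index does not fit in a suffix of this width")
    out = []
    for _ in range(width):
        # Peel off base-26 digits, least significant first.
        index, digit = divmod(index, len(letters))
        out.append(letters[digit])
    return "".join(reversed(out))

# With the default prefix 'x', the first three files would be named:
names = ["x" + suffix(i) for i in range(3)]
print(names)
```

       Because the suffixes sort in the order they are generated, sorting the file names reproduces the
       order in which the files were written.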

       When rabins is splitting based on time, rabins uses a default extension of %Y.%m.%d.%h.%m.%s.  This
       default can be overridden by adding a '%' extension to the name provided using the -w option.

       When  standard out is specified, using -w -, rabins will output a single argus-stream with START and STOP
       argus management records inserted appropriately to indicate where the output is split.  See argus(8)  for
       more information on output stream formats.

       When rabins is splitting on output record count (the default), the number of records is specified as an
       ordinal counter; the default is 1000 records.  When rabins is splitting based on the maximum output file
       size, the size is specified as bytes.  The scale of the bytes can be specified by appending 'b', 'k' and
       'm' to the number provided.
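
       A value such as 1m or 500k could be parsed along the lines below.  The multipliers are an assumption
       of this sketch: the text does not say whether 'k' and 'm' are powers of 1000 or of 1024, so the
       table is passed in as a parameter.  The name parse_scaled is hypothetical.

```python
def parse_scaled(text, scale=None):
    """Parse a value such as '1m' or '500k' into an integer.

    The default multipliers are decimal powers, which is an assumption;
    pass a different table to model powers of 1024 instead.
    """
    if scale is None:
        scale = {"b": 1, "k": 1000, "m": 1000000}
    text = text.strip()
    if text and text[-1] in scale:
        return int(text[:-1]) * scale[text[-1]]
    return int(text)   # no suffix: the number is taken as given

print(parse_scaled("1k"))
print(parse_scaled("1m", scale={"b": 1, "k": 1024, "m": 1024 * 1024}))
```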

       When rabins is splitting based on time, the time period is specified with the -M time option, and can be
       any period based in seconds (s), minutes (m), hours (h), days (d), weeks (w), months (M) or years (y).
       Rabins will create and modify records as required to split on prescribed time boundaries.  If any record
       spans a time boundary, the record is split and the metrics are adjusted using a uniform distribution
       model to distribute the statistics between the resulting records.

       See rasplit.1 for specifics.

RABINS SPECIFIC OPTIONS

       rabins, like all ra based clients, supports a number of ra options including remote data access,  reading
       from multiple files and filtering of input argus records through a terminating filter expression.  Rabins
       also provides all the functions of racluster.1 and rasplit.1, for processing and outputting data.  rabins
       specific options are:

       -B secs
            Holding time in seconds before closing a bin and outputting its contents.

       -M splitmode
            Supported splitting modes are:

              time <n[smhdwMy]>
                   bin  records  into  time slots of n size.  This is used for time series analytics, especially
                   graphing.  Records, by default are split, so that their timestamps do not span the time range
                   specified.  Metrics are uniformly distributed among the resulting records.

              count <n[kmb]>
                   bin  records into chunks based on the number of records.  This is used for archive management
                   and parallel processing analytics, to limit the size of data processing to fixed  numbers  of
                   records.

              size <n[kmb]>
                   bin  records  into  chunks  based  on  the  number  of total bytes.  This is used for archive
                   management and parallel processing analytics, to limit the size of data processing  to  fixed
                   byte limitations.

       -M modes
            Supported processing modes are:
              hard split on hard time boundaries.  Each flow record's start and stop times will be the time
                   boundary times.  The default is to use the original start and stop timestamps from the
                   records that make up the resulting aggregation.
              nomodify
                   Do not split the record when including it into a time bin.  This allows a time bin to
                   represent times outside of its definition.  This option should not be used with the 'hard'
                   option, as you will modify metrics and semantics.
       -m aggregation object
            Supported aggregation objects are:
              none           use a null flow key.
              srcid          argus source identifier.
              smac           source mac(ether) addr.
              dmac           destination mac(ether) addr.
              soui           oui portion of the source mac(ether) addr.
              doui           oui portion of the destination mac(ether) addr.
              smpls          source mpls label.
              dmpls          destination mpls label.
              svlan          source vlan label.
              dvlan          destination vlan label.
              saddr/[l|m]    source IP addr/[cidr len | m.a.s.k].
              daddr/[l|m]    destination IP addr/[cidr len | m.a.s.k].
              matrix/l       sorted src and dst IP addr/cidr len.
              proto          transaction protocol.
              sport          source port number. Implies use of 'proto'.
              dport          destination port number. Implies use of 'proto'.
              stos           source TOS byte value.
              dtos           destination TOS byte value.
              sttl           src -> dst TTL value.
              dttl           dst -> src TTL value.
              stcpb          src -> dst TCP base sequence number.
              dtcpb          dst -> src TCP base sequence number.
              inode/[l|m]    intermediate node IP addr/[cidr len | m.a.s.k], source of ICMP mapped events.
              sco            source ARIN country code, if present.
              dco            destination ARIN country code, if present.
              sas            source node origin AS number, if available.
              das            destination node origin AS number, if available.
              ias            intermediate node origin AS number, if available.
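
       As an illustration of the matrix/l object, the sketch below masks both addresses to a CIDR length
       and sorts the pair so that both directions of a conversation map to one key.  The name matrix_key is
       hypothetical, and the numeric ordering used here is an assumption; the text only says the pair is
       sorted.

```python
import ipaddress

def matrix_key(saddr, daddr, cidr_len):
    """Return a direction-independent (network, network) key.

    Both addresses must be of the same IP version, since networks of
    different versions cannot be ordered against each other.
    """
    nets = [
        ipaddress.ip_network(f"{addr}/{cidr_len}", strict=False)
        for addr in (saddr, daddr)
    ]
    return tuple(str(net) for net in sorted(nets))

# Both directions of the same conversation produce the same key.
forward = matrix_key("192.168.0.68", "208.59.201.75", 24)
reverse = matrix_key("208.59.201.75", "192.168.0.68", 24)
print(forward, forward == reverse)
```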

       -P sort field
            Rabins  can  sort its output based on a sort field specification.  Because the -m option is used for
            aggregation fields, -P is used to specify the print priority order.  See rasort(1) for the  list  of
            sortable fields.

       -w filename
            Rabins supports an extended -w option that allows for output record contents to be inserted into the
            output filename.  Specified using '$' (dollar) notation, any printable  field  can  be  used.   Care
            should  be  taken  to  honor any shell escape requirements when specifying on the command line.  See
            ra(1) for the list of printable fields.

            Another extended feature, when using time mode, rabins will  process  the  supplied  filename  using
            strftime(3), so that time fields can be inserted into the resulting output filename.
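
            The effect of strftime(3) processing on a filename can be previewed with Python's time.strftime,
            which accepts the same conversion specifiers.  The helper name output_name is hypothetical, and
            UTC is used here only to make the result reproducible; which time zone rabins applies is not
            stated in the text.

```python
import time

def output_name(template, bin_start):
    """Expand strftime(3) conversions in template for a bin's start time.

    bin_start is in epoch seconds and is interpreted as UTC here
    (an assumption made for this sketch).
    """
    return time.strftime(template, time.gmtime(bin_start))

# 1329313020 is 2012/02/15 13:37:00 UTC.
name = output_name("/matrix/%Y/%m/%d/argus.%H.%M.%S", 1329313020)
print(name)
```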

INVOCATION

       This  invocation aggregates inputfile based on 10 minute time boundaries.  Input is split to fit within a
       10 minute time boundary, and within those boundaries, argus records are aggregated.  The resulting output
       is streamed to a single file.

          rabins -r * -M time 10m -w outputfile

       This next invocation aggregates inputfiles based on 5 minute time boundaries, and the output is written
       to 5 minute files.  Input is split such that all records conform to hard 5 minute time boundaries, and
       within those boundaries, argus records are aggregated, in this case, based on IP address matrix.
       The resulting output is streamed to files that are named relative to the record's output content, with a
       prefix of /matrix/%Y/%m/%d/argus. and a suffix of %H.%M.%S.

          rabins -r * -M hard time 5m -m matrix -w "/matrix/%Y/%m/%d/argus.%H.%M.%S"

       This next invocation aggregates input.stream based on matrix/24 into 10 second time boundaries, holds the
       data  for an additional 5 seconds after the time boundary has passed, and then prints the complete sorted
       contents of each bin to standard output.  The output is printed at 10 second intervals, and the output is
       the  content  of  the previous  10 sec time bin.  This example is meant to provide, every 10 seconds, the
       summary of all Class C subnet activity seen.  It is intended to run indefinitely, printing out aggregated
       summary  records.   By  modifying  the  aggregation  model, using the "-f racluster.conf" option, you can
       achieve a great deal of data reduction with a lot of semantic reporting.

       % rabins -S localhost -m matrix/24 -B 5s -M hard time 10s -p0 -s +1trans - ipv4
                  StartTime  Trans  Proto            SrcAddr   Dir            DstAddr  SrcPkts  DstPkts     SrcBytes     DstBytes State
        2012/02/15.13:37:00      5     ip     192.168.0.0/24   <->     192.168.0.0/24       41       40         2860        12122   CON
        2012/02/15.13:37:00      2     ip     192.168.0.0/24    ->       224.0.0.0/24        2        0          319            0   INT
       [ 10 seconds pass]
        2012/02/15.13:37:10     13     ip     192.168.0.0/24   <->    208.59.201.0/24      269      351        97886       398700   CON
        2012/02/15.13:37:10     14     ip     192.168.0.0/24   <->     192.168.0.0/24       86       92         7814        46800   CON
        2012/02/15.13:37:10      1     ip    17.172.224.0/24   <->     192.168.0.0/24       52       37        68125         4372   CON
        2012/02/15.13:37:10      1     ip     192.168.0.0/24   <->      199.7.55.0/24        7        7          784         2566   CON
        2012/02/15.13:37:10      1     ip     184.85.13.0/24   <->     192.168.0.0/24        6        5         3952         2204   CON
        2012/02/15.13:37:10      2     ip    66.235.132.0/24   <->     192.168.0.0/24        5        6          915         3732   CON
        2012/02/15.13:37:10      1     ip    74.125.226.0/24   <->     192.168.0.0/24        3        4          709          888   CON
        2012/02/15.13:37:10      3     ip       66.39.3.0/24   <->     192.168.0.0/24        3        3          369          198   CON
        2012/02/15.13:37:10      1     ip     192.168.0.0/24   <->     205.188.1.0/24        1        1           54          356   CON
       [ 10 seconds pass]
        2012/02/15.13:37:20      6     ip     192.168.0.0/24   <->    208.59.201.0/24      392      461        60531       623894   CON
        2012/02/15.13:37:20      8     ip     192.168.0.0/24   <->     192.168.0.0/24       95      111         6948        93536   CON
        2012/02/15.13:37:20      3     ip     72.14.204.0/24   <->     192.168.0.0/24       38       32        38568         4414   CON
        2012/02/15.13:37:20      1     ip    17.112.156.0/24   <->     192.168.0.0/24       26       13        21798         7116   CON
        2012/02/15.13:37:20      2     ip    66.235.132.0/24   <->     192.168.0.0/24        6        3         1232         4450   CON
        2012/02/15.13:37:20      1     ip    66.235.133.0/24   <->     192.168.0.0/24        1        2           82          132   CON
       [ 10 seconds pass]
        2012/02/15.13:37:30    117     ip     192.168.0.0/24   <->    208.59.201.0/24      697      663       369769       134382   CON
        2012/02/15.13:37:30     11     ip     192.168.0.0/24   <->     192.168.0.0/24      147      187        11210       193253   CON
        2012/02/15.13:37:30      1     ip     184.85.13.0/24   <->     192.168.0.0/24       13        9        13408         9031   CON
        2012/02/15.13:37:30      2     ip    66.235.132.0/24   <->     192.168.0.0/24        8        7         1920        11563   CON
        2012/02/15.13:37:30      1     ip     192.168.0.0/24   <->    207.46.193.0/24        5        3          802          562   CON
        2012/02/15.13:37:30      1     ip    17.112.156.0/24   <->     192.168.0.0/24        5        2          646         3684   CON
        2012/02/15.13:37:30      2     ip     192.168.0.0/24    ->       224.0.0.0/24        2        0          382            0   REQ
       [ 10 seconds pass]

       This next invocation reads IP argus(8) data from argusfile and processes the argus(8) data stream based
       on an input byte size of no greater than 1 Megabyte.  The resulting output stream is written to a single
       argus.out data file.

          rabins -r argusfile -M size 1m -s +1dur -m proto -w argus.out - ip

       This invocation reads IP argus(8) data from argusfile and aggregates the argus(8) data stream based on
       an input record count of no greater than 1K flows.  The resulting output stream is printed to the screen
       as standard argus records.

          rabins -r argusfile -M count 1k -m proto -s stime dur proto spkts dpkts - ip

COPYRIGHT

       Copyright (c) 2000-2016 QoSient. All rights reserved.

SEE ALSO

       ra(1), racluster(1), rasplit(1), rarc(5), argus(8)

AUTHORS

       Carter Bullard (carter@qosient.com).