Ubuntu Manpage: lmbench - benchmarking toolbox

Provided by: lmbench-doc_3.0-a9-1.1ubuntu0.1_all

NAME

       lmbench - benchmarking toolbox

SYNOPSIS

       #include "lmbench.h"

       typedef u_long iter_t

       typedef void (*benchmp_f)(iter_t iterations, void* cookie)

       void benchmp(benchmp_f  initialize, benchmp_f benchmark, benchmp_f cleanup, int enough, int parallel, int
       warmup, int repetitions, void* cookie)

       uint64    get_n()

       void milli(char *s, uint64 n)

       void micro(char *s, uint64 n)

       void nano(char *s, uint64 n)

       void mb(uint64 bytes)

       void kb(uint64 bytes)

DESCRIPTION

       Creating benchmarks with the lmbench timing harness is easy.  Because measuring performance with lmbench
       takes so little effort, questions that arise during system design, development, or tuning can be answered
       quickly.  For example, image processing

       There are two attributes that are critical for performance, latency and bandwidth, and lmbench's timing
       harness makes it easy to measure and report results for both.  Latency is usually important for
       frequently executed operations, and bandwidth is usually important when moving large chunks of data.
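
       As a rough illustration of the two figures, the sketch below converts one timing interval into a
       latency and a bandwidth.  The helper names are hypothetical and are not part of lmbench, and the
       sketch assumes 1 MB = 2^20 bytes to match the MB macro used in the bcopy example in this page;
       lmbench's own reporting routines may use a different convention.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers (not lmbench API): turn one timing interval
 * into the two figures discussed above. */

/* Latency: nanoseconds spent per operation. */
double ns_per_op(uint64_t elapsed_usecs, uint64_t ops)
{
    assert(ops > 0);
    return (double)elapsed_usecs * 1000.0 / (double)ops;
}

/* Bandwidth: megabytes per second, assuming 1 MB = 2^20 bytes. */
double mb_per_sec(uint64_t elapsed_usecs, uint64_t bytes)
{
    assert(elapsed_usecs > 0);
    double seconds = (double)elapsed_usecs / 1e6;
    return (double)bytes / (1024.0 * 1024.0) / seconds;
}
```

       For instance, one million operations completed in 1000 microseconds give a latency of one nanosecond
       per operation, and 8 MB copied in one second gives a bandwidth of 8 MB per second.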

       There are a number of factors to consider when building benchmarks.

       The timing harness requires that the benchmarked operation be idempotent  so  that  it  can  be  repeated
       indefinitely.
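
       To make the requirement concrete, here is a minimal sketch using a hypothetical fixed-size stack (none
       of these names come from lmbench).  A body that only pops runs out of work after a few iterations,
       whereas a body that pairs every push with a pop leaves the state unchanged and can run for any
       iteration count.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_CAP 16

/* Hypothetical state for illustration only. */
struct stack {
    int    items[STACK_CAP];
    size_t depth;
};

/* Not repeatable: every iteration consumes state, so the loop stops
 * doing real work once the stack is empty.  Returns the number of
 * pops that actually happened. */
unsigned long pop_only(unsigned long iterations, struct stack *s)
{
    unsigned long done = 0;
    while (iterations-- > 0 && s->depth > 0) {
        s->depth--;
        done++;
    }
    return done;
}

/* Repeatable: each iteration pushes and then pops, so the stack is
 * in the same state after every iteration. */
unsigned long push_then_pop(unsigned long iterations, struct stack *s)
{
    unsigned long done = 0;
    assert(s->depth < STACK_CAP);
    while (iterations-- > 0) {
        s->items[s->depth++] = 42;  /* push */
        s->depth--;                 /* pop  */
        done++;
    }
    return done;
}
```

       Only the second body would be suitable as a benchmark function, since the harness decides how many
       iterations to request.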

       The  timing subsystem, benchmp, is passed up to three function pointers.  Some benchmarks may need as few
       as one function pointer (for benchmark).

       void benchmp(initialize, benchmark, cleanup, enough, parallel, warmup, repetitions, cookie)
              measures the performance of benchmark repeatedly and reports the median result.   benchmp  creates
              parallel  sub-processes  which  run  benchmark  in  parallel.   This allows lmbench to measure the
              system's ability to scale as the number of client processes increases.  Each sub-process  executes
              initialize with iterations set to 0 before starting the benchmarking cycle.  It then calls
              initialize, benchmark, and cleanup, each with iterations set to the number of iterations in the
              timing loop, several times in order to collect repetitions results.  The calls to benchmark are
              surrounded by start and stop calls that time how long it takes to do the benchmarked operation
              iterations times.  After all the benchmark results have been collected, cleanup is called with
              iterations set to 0 to clean up any resources which may have been allocated by initialize or
              benchmark.  cookie is a void pointer to a hunk of memory that can be used to store any parameters
              or state that is needed by the benchmark.

       void* benchmp_getstate()
              returns a void pointer to the lmbench-internal state used during benchmarking.  The state  is  not
              to be used or accessed directly by clients, but rather would be passed into benchmp_interval.

       iter_t    benchmp_interval(void* state)
              returns  the  number  of  times the benchmark should execute its benchmark loop during this timing
              interval.  This is used only for weird benchmarks which cannot implement the benchmark body  in  a
              function which can return, such as the page fault handler.  Please see lat_sig.c for sample usage.

       uint64    get_n()
              returns the number of times loop_body was executed during the timing interval.

       void milli(char *s, uint64 n)
              print the time per operation in milliseconds.  n is the number of operations during the timing
              interval, which is passed as a parameter because each loop_body can contain several
              operations.

       void micro(char *s, uint64 n)
              print the time per operation in microseconds.

       void nano(char *s, uint64 n)
              print the time per operation in nanoseconds.

       void mb(uint64 bytes)
              print the bandwidth in megabytes per second.

       void kb(uint64 bytes)
              print the bandwidth in kilobytes per second.
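
       The calling sequence described for benchmp above can be sketched with a toy, single-process harness.
       This is only an illustration of the order of the calls, not lmbench's implementation: toy_benchmp,
       struct call_log, and the log_* callbacks are hypothetical names, the iteration count is fixed by the
       caller instead of being chosen by the harness, no sub-processes are created, and timing uses the
       standard clock() function.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

typedef unsigned long iter_t;
typedef void (*benchmp_f)(iter_t iterations, void *cookie);

/* Hypothetical cookie that records how often each callback ran. */
struct call_log { int setup0, init_n, bench, cleanup_n, teardown0; };

void log_init(iter_t n, void *cookie)
{
    struct call_log *l = (struct call_log *)cookie;
    if (n == 0) l->setup0++; else l->init_n++;
}

void log_bench(iter_t n, void *cookie)
{
    struct call_log *l = (struct call_log *)cookie;
    (void)n;
    l->bench++;
}

void log_cleanup(iter_t n, void *cookie)
{
    struct call_log *l = (struct call_log *)cookie;
    if (n == 0) l->teardown0++; else l->cleanup_n++;
}

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Toy stand-in for the call order only.  Returns the median number of
 * seconds taken by one timed benchmark call (upper middle element when
 * repetitions is even). */
double toy_benchmp(benchmp_f initialize, benchmp_f benchmark,
                   benchmp_f cleanup, iter_t iterations,
                   int repetitions, void *cookie)
{
    assert(benchmark != NULL && repetitions > 0);
    double *results = malloc((size_t)repetitions * sizeof *results);
    if (results == NULL)
        exit(1);

    if (initialize) initialize(0, cookie);        /* first call, iterations == 0 */
    for (int i = 0; i < repetitions; i++) {
        if (initialize) initialize(iterations, cookie);
        clock_t start = clock();                  /* start timing */
        benchmark(iterations, cookie);
        clock_t stop = clock();                   /* stop timing */
        if (cleanup) cleanup(iterations, cookie);
        results[i] = (double)(stop - start) / CLOCKS_PER_SEC;
    }
    if (cleanup) cleanup(0, cookie);              /* last call, iterations == 0 */

    qsort(results, (size_t)repetitions, sizeof *results, cmp_double);
    double median = results[repetitions / 2];
    free(results);
    return median;
}
```

       Calling toy_benchmp(log_init, log_bench, log_cleanup, 100, 5, &log) on a zeroed struct call_log
       leaves one call each with iterations set to 0 and five calls each of initialize, benchmark, and
       cleanup with iterations set to 100.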

USING lmbench

       Here  is  an  example  of  a  simple  benchmark  that measures the latency of the random number generator
       lrand48():

              #include "lmbench.h"

              /* Timed body: call lrand48() once per iteration. */
              void
              benchmark_lrand48(iter_t iterations, void* cookie) {
                   while(iterations-- > 0)
                        lrand48();
              }

              int
              main(int argc, char *argv[])
              {
                   /* No initialize or cleanup step is needed, so pass NULL for both. */
                   benchmp(NULL, benchmark_lrand48, NULL, 0, 1, 0, TRIES, NULL);
                   micro("lrand48()", get_n());
                   exit(0);
              }
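
       The example above performs one operation per loop iteration, so get_n() can be passed to micro
       unchanged.  When the loop body is unrolled, the operation count has to be scaled accordingly.  The
       sketch below is a self-contained illustration with hypothetical names (OPS_PER_ITERATION,
       benchmark_add4, operation_count); it does not use lmbench itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OPS_PER_ITERATION 4

/* Keeps the result observable so the additions are not discarded. */
volatile unsigned long sink;

/* Hypothetical unrolled body: four additions per loop iteration. */
void benchmark_add4(unsigned long iterations, void *cookie)
{
    unsigned long x = 0;
    (void)cookie;
    while (iterations-- > 0) {
        x += 1;
        x += 2;
        x += 3;
        x += 4;
    }
    sink = x;
}

/* Number of operations to report for a given number of loop iterations. */
uint64_t operation_count(uint64_t loop_iterations)
{
    return loop_iterations * OPS_PER_ITERATION;
}
```

       With the real harness, the corresponding report would presumably pass get_n() * OPS_PER_ITERATION as
       the n argument of milli, micro, or nano.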

       Here is a simple benchmark that measures and reports the bandwidth of bcopy:

              #include "lmbench.h"

              #define MB (1024 * 1024)
              #define SIZE (8 * MB)

              struct _state {
                   int size;
                   char* a;
                   char* b;
              };

              void
              initialize_bcopy(iter_t iterations, void* cookie) {
                   struct _state* state = (struct _state*)cookie;

                   /* Skip the call made with iterations set to 0; allocate once per timing interval. */
                   if (!iterations) return;
                   state->a = malloc(state->size);
                   state->b = malloc(state->size);
                   if (state->a == NULL || state->b == NULL)
                        exit(1);
              }

              void
              benchmark_bcopy(iter_t iterations, void* cookie) {
                   struct _state* state = (struct _state*)cookie;

                   while(iterations-- > 0)
                        bcopy(state->a, state->b, state->size);
              }

              void
              cleanup_bcopy(iter_t iterations, void* cookie) {
                   struct _state* state = (struct _state*)cookie;

                   /* Skip the final call made with iterations set to 0; free once per timing interval. */
                   if (!iterations) return;
                   free(state->a);
                   free(state->b);
              }

              int
              main(int argc, char *argv[])
              {
                   struct _state state;

                   state.size = SIZE;
                   benchmp(initialize_bcopy, benchmark_bcopy, cleanup_bcopy,
                        0, 1, 0, TRIES, &state);
                   mb(get_n() * state.size);
                   exit(0);
              }

       A slightly more complex version of the bcopy benchmark might measure bandwidth as a  function  of  memory
       size and parallelism.  The main procedure in this case might look something like this:

              int
              main(int argc, char *argv[])
              {
                   int  size, par;
                   struct _state state;

                   for (size = 64; size <= SIZE; size <<= 1) {
                        for (par = 1; par < 32; par <<= 1) {
                             state.size = size;
                             benchmp(initialize_bcopy, benchmark_bcopy,
                                  cleanup_bcopy, 0, par, 0, TRIES, &state);
                             fprintf(stderr, "%d\t%d\t", size, par);
                             mb(par * get_n() * state.size);
                        }
                   }
                   exit(0);
              }
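
       The sweep above can be reasoned about without running it.  The following self-contained sketch counts
       the (size, par) combinations the nested loops visit and computes the byte total handed to mb.  The
       function names are hypothetical, and the sketch assumes, as the multiplication by par in the example
       suggests, that get_n() counts iterations per process.

```c
#include <assert.h>
#include <stdint.h>

#define MB (1024 * 1024)
#define SIZE (8 * MB)

/* Total bytes moved in one timing interval by `par` processes that
 * each ran the copy `iterations` times on a buffer of `size` bytes. */
uint64_t aggregate_bytes(int par, uint64_t iterations, int size)
{
    assert(par > 0 && size > 0);
    return (uint64_t)par * iterations * (uint64_t)size;
}

/* Count the (size, par) combinations visited by the nested loops:
 * sizes double from 64 bytes up to SIZE, par doubles from 1 up to 16. */
int sweep_points(void)
{
    int points = 0;
    for (int size = 64; size <= SIZE; size <<= 1)
        for (int par = 1; par < 32; par <<= 1)
            points++;
    return points;
}
```

       With SIZE set to 8 MB there are 18 buffer sizes and 5 levels of parallelism, so benchmp is invoked 90
       times.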

VARIABLES

       There  are  three  environment variables that can be used to modify the lmbench timing subsystem: ENOUGH,
       TIMING_O, and LOOP_O.
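
       This page does not say how the variables are interpreted.  Purely as an illustration of reading such a
       setting, the sketch below parses an environment variable as a non-negative integer and falls back to
       a default otherwise.  The helper names parse_long and env_long are hypothetical, and the assumption
       that the values are plain integers is not confirmed by this page.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse text as a non-negative decimal integer.
 * Returns fallback when text is missing, empty, malformed, or negative. */
long parse_long(const char *text, long fallback)
{
    if (text == NULL || *text == '\0')
        return fallback;

    char *end = NULL;
    errno = 0;
    long value = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0' || value < 0)
        return fallback;
    return value;
}

/* Hypothetical helper: look up an environment variable by name and
 * parse it with parse_long. */
long env_long(const char *name, long fallback)
{
    assert(name != NULL);
    return parse_long(getenv(name), fallback);
}
```

       A caller could then write env_long("ENOUGH", default_value), where default_value is whatever the
       program would use when the variable is unset.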

FUTURES

       Development of lmbench is continuing.

AUTHOR

       Carl Staelin and Larry McVoy

       Comments, suggestions, and bug reports are always welcome.

(c)1998-2000 Larry McVoy and Carl Staelin            $Date:$                                          LMBENCH(3)
