Provided by: lmbench-doc_3.0-a7-1_all

NAME

       lmbench - system benchmarks

DESCRIPTION

       lmbench  is  a  series  of  micro  benchmarks intended to measure basic
       operating system and hardware system metrics.  The benchmarks fall into
       three general classes: bandwidth, latency, and ‘‘other’’.

       Most  of the lmbench benchmarks use a standard timing harness described
       in timing(3) and have a few standard options: parallelism, warmup,  and
       repetitions.   Parallelism  specifies the number of benchmark processes
       to run in parallel.   This  is  primarily  useful  when  measuring  the
       performance of SMP or distributed computers and can be used to evaluate
       the system’s performance scalability.  Warmup is the minimum number  of
       microseconds the benchmark should execute  the  benchmarked  capability
       before  it  begins  measuring  performance.   Again,  this is primarily
       useful for SMP or distributed systems, and it is intended to  give  the
       process  scheduler  time  to  "settle"  and  migrate processes to other
       processors.  By measuring  performance  over  various  warmup  periods,
       users    may    evaluate    the    scheduler’s    responsiveness.
       Repetitions  is  the  number  of measurements that the benchmark should
       take.  This allows lmbench to provide  greater  or  lesser  statistical
       strength  to the results it reports.  The default number of repetitions
       is 11.
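
       The warmup and repetitions options can be illustrated with a minimal
       sketch.  It is not the harness from timing(3): the name measure and its
       parameters are hypothetical, Python is used only for brevity, and the
       parallelism option is left out.

```python
import statistics
import time


def measure(op, warmup_us=0, repetitions=11):
    """Time op() after a warmup period; return per-call times in microseconds.

    Hypothetical helper for illustration only, not part of lmbench.
    """
    # Warmup: keep executing the operation for at least warmup_us
    # microseconds before any measurement is recorded.
    deadline = time.perf_counter() + warmup_us / 1e6
    while time.perf_counter() < deadline:
        op()

    # Repetitions: take the requested number of independent measurements.
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        op()
        samples.append((time.perf_counter() - start) * 1e6)
    return samples


if __name__ == "__main__":
    samples = measure(lambda: sum(range(1000)), warmup_us=1000)
    print(f"{len(samples)} samples, median {statistics.median(samples):.2f} us")
```

       Summarizing the samples by their median is a choice made for this
       sketch; more repetitions give the summary greater statistical strength
       at the cost of a longer run.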

BANDWIDTH MEASUREMENTS

       Data movement is  fundamental  to  the  performance  of  most  computer
       systems.   The bandwidth measurements are intended to show how fast the
       system can move data.  The results of the bandwidth  metrics  can    be
       compared,  but care must be taken to understand what is being compared.
       The bandwidth benchmarks  can  be  reduced  to  two  main   components:
       operating  system  overhead  and  memory  speeds.   The    bandwidth
       benchmarks report their results as megabytes moved  per  second,    but
       note  that  the  amount  of data moved is not necessarily the same as the
       memory bandwidth used to move the data.   Consult  the  individual  man
       pages for more information.
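
       The distinction between data moved and memory bandwidth used can be
       sketched with a simple in-memory copy.  This is an illustration under
       stated assumptions, not the bw_mem_cp implementation: the function name
       copy_bandwidth is hypothetical, a megabyte is taken to be 10^6 bytes,
       and the copy is assumed to read every source byte and write every
       destination byte.

```python
import time


def copy_bandwidth(nbytes=8 * 1024 * 1024, repetitions=11):
    """Return (MB moved per second, MB touched per second) for a buffer copy.

    Hypothetical helper for illustration only, not part of lmbench.
    """
    src = bytearray(nbytes)
    dst = bytearray(nbytes)
    best = float("inf")
    for _ in range(repetitions):
        start = time.perf_counter()
        dst[:] = src  # one copy: nbytes read from src, nbytes written to dst
        best = min(best, time.perf_counter() - start)
    best = max(best, 1e-9)  # guard against a zero reading from a coarse clock

    moved = nbytes / 1e6 / best  # the figure a copy benchmark would report
    touched = 2 * moved          # reads plus writes seen by the memory system
    return moved, touched


if __name__ == "__main__":
    moved, touched = copy_bandwidth()
    print(f"moved {moved:.0f} MB/s, touched about {touched:.0f} MB/s")
```

       Under these assumptions a copy that reports N megabytes moved per
       second exercises roughly 2N megabytes per second of memory traffic,
       which is why results from different bandwidth benchmarks need care
       when compared.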

       Each  of the bandwidth benchmarks is listed below with a brief overview
       of the intent of the benchmark.

       bw_file_rd    reading and summing of a file via the read(2)  interface.

       bw_mem_cp     memory copy.

       bw_mem_rd     memory reading and summing.

       bw_mem_wr     memory writing.

       bw_mmap_rd    reading  and  summing  of  a  file via the memory mapping
                     mmap(2) interface.

       bw_pipe       reading of data via a pipe.

       bw_tcp        reading of data via a TCP/IP socket.

       bw_unix       reading data from a UNIX socket.

LATENCY MEASUREMENTS

       Control messages are  also  fundamental  to  the  performance  of  most
       computer  systems.   The  latency measurements are intended to show how
       fast a system can be told to do some operation.   The  results  of  the
       latency  metrics  can  be compared to each other for the most part.  In
       particular, the pipe, rpc, tcp, and udp transactions are all  identical
       benchmarks carried out over different system abstractions.

       Unless noted otherwise, latencies are reported in microseconds per
       operation.

       lat_connect   the time it takes to establish a TCP/IP connection.

       lat_ctx       context  switching;  the  number and size of processes is
                     varied.

       lat_fcntl     fcntl file locking.

       lat_fifo      ‘‘hot potato’’ transaction through a UNIX FIFO.

       lat_fs        creating and deleting small files.

       lat_pagefault the time it takes to fault in a page from a file.

       lat_mem_rd    memory read latency  (accurate  to  the  ~2-5  nanosecond
                     range, reported in nanoseconds).

       lat_mmap      time to set up a memory mapping.

       lat_ops       basic  processor  operations,  such  as integer XOR, ADD,
                     SUB, MUL, DIV, and MOD, and  float  ADD,  MUL,  DIV,  and
                     double ADD, MUL, DIV.

       lat_pipe      ‘‘hot potato’’ transaction through a UNIX pipe.

       lat_proc      process creation times (various sorts).

       lat_rpc       ‘‘hot  potato’’  transaction  through Sun RPC over UDP or
                     TCP.

       lat_select    select latency

       lat_sig       signal installation and catch latencies.  Also protection
                     fault signal latency.

       lat_syscall   nontrivial entry into the system.

       lat_tcp       ‘‘hot potato’’ transaction through TCP.

       lat_udp       ‘‘hot potato’’ transaction through UDP.

       lat_unix      ‘‘hot potato’’ transaction through UNIX sockets.

       lat_unix_connect
                     the  time it takes to establish a UNIX socket connection.
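
       The ‘‘hot potato’’ transactions listed above pass a small token back
       and forth between two processes and time the round trip.  The sketch
       below shows the idea over a pair of pipes.  It is not the lat_pipe
       source: the function name pipe_round_trip_us is hypothetical, it relies
       on fork(2) and so runs only on UNIX-like systems, and interpreter
       overhead makes its numbers unsuitable for comparison with lmbench
       results.

```python
import os
import time


def pipe_round_trip_us(round_trips=1000):
    """Average microseconds for one token round trip between two processes.

    Hypothetical helper for illustration only, not part of lmbench.
    """
    to_child_r, to_child_w = os.pipe()
    to_parent_r, to_parent_w = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: echo each token straight back until the parent closes
        # its write end, which shows up here as end-of-file.
        os.close(to_child_w)
        os.close(to_parent_r)
        while True:
            token = os.read(to_child_r, 1)
            if not token:
                break
            os.write(to_parent_w, token)
        os._exit(0)

    # Parent: close the ends only the child uses, then time the exchange.
    os.close(to_child_r)
    os.close(to_parent_w)
    start = time.perf_counter()
    for _ in range(round_trips):
        os.write(to_child_w, b"x")  # hand the token to the child
        os.read(to_parent_r, 1)     # wait for it to come back
    elapsed = time.perf_counter() - start

    os.close(to_child_w)  # end-of-file tells the child to exit
    os.close(to_parent_r)
    os.waitpid(pid, 0)
    return elapsed / round_trips * 1e6


if __name__ == "__main__":
    print(f"{pipe_round_trip_us():.1f} us per round trip")
```

       Replacing the pipes with a FIFO, a UNIX socket, or a TCP or UDP socket
       gives the same transaction over a different system abstraction, which
       is what makes those latency results comparable to each other.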

OTHER MEASUREMENTS

       mhz           processor cycle time

       tlb           TLB size and TLB miss latency

       line          cache line size (in bytes)

       cache         cache statistics, such as line size, cache sizes,  memory
                     parallelism.

       stream        John McCalpin’s stream benchmark

       par_mem       memory  subsystem parallelism.  How many requests can the
                     memory subsystem service in parallel, which may depend on
                     the location of the data in the memory hierarchy.

       par_ops       basic processor operation parallelism.

SEE ALSO

       bargraph(1),     graph(1),     lmbench(3),    results(3),    timing(3),
       bw_file_rd(8), bw_mem_cp(8), bw_mem_wr(8),  bw_mmap_rd(8),  bw_pipe(8),
       bw_tcp(8),   bw_unix(8),   lat_connect(8),   lat_ctx(8),  lat_fcntl(8),
       lat_fifo(8),  lat_fs(8),   lat_http(8),   lat_mem_rd(8),   lat_mmap(8),
       lat_ops(8),  lat_pagefault(8),  lat_pipe(8),  lat_proc(8),  lat_rpc(8),
       lat_select(8),  lat_sig(8),  lat_syscall(8),  lat_tcp(8),   lat_udp(8),
       lmdd(8),  par_ops(8),  par_mem(8),  mhz(8),  tlb(8), line(8), cache(8),
       stream(8)

ACKNOWLEDGEMENT

       Funding for  the  development  of  these  tools  was  provided  by  Sun
       Microsystems Computer Corporation.

       A   large  number  of  people  have  contributed  to  the  testing  and
       development of lmbench.

COPYING

       The benchmarking code is distributed  under  the  GPL  with  additional
       restrictions; see the COPYING file.

AUTHOR

       Carl Staelin and Larry McVoy

       Comments, suggestions, and bug reports are always welcome.

(c)1994-2000 Larry McVoy and Carl Staelin                           LMBENCH(8)