Provided by: lmbench_3.0-a9-1.1ubuntu0.1_amd64

NAME

       bw_mem - time memory bandwidth

SYNOPSIS

       bw_mem [ -P <parallelism> ] [ -W <warmups> ] [ -N <repetitions> ] size
       rd|wr|rdwr|cp|frd|fwr|fcp|bzero|bcopy [align]

DESCRIPTION

       bw_mem allocates twice the specified amount of memory, zeros it, and then times the copying of the  first
       half to the second half.  Results are reported in megabytes moved per second.
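
       The allocate, zero, and copy sequence described above can be sketched roughly as follows.  This is  not
       lmbench's  own  timing  harness:  the function name copy_mb_per_sec, the use of clock() (which measures
       CPU time rather than wall-clock time), and the choice of 1024 * 1024 bytes per megabyte are assumptions
       made for illustration.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical sketch: allocate 2 * size bytes, zero them, then time
 * `reps` copies of the first half into the second half.
 * Returns megabytes per second, or -1.0 on error or if the elapsed
 * time was too short to measure. */
double copy_mb_per_sec(size_t size, int reps)
{
    char *buf;
    clock_t start, stop;
    double secs;
    volatile char sink;
    int r;

    if (size == 0 || reps <= 0)
        return -1.0;
    buf = malloc(2 * size);
    if (buf == NULL)
        return -1.0;

    memset(buf, 0, 2 * size);           /* zero both halves */

    start = clock();
    for (r = 0; r < reps; r++)
        memcpy(buf + size, buf, size);  /* first half -> second half */
    stop = clock();

    sink = buf[2 * size - 1];           /* keep the copy observable */
    (void)sink;
    free(buf);

    secs = (double)(stop - start) / CLOCKS_PER_SEC;
    if (secs <= 0.0)
        return -1.0;
    /* Assumption: one megabyte is 1024 * 1024 bytes. */
    return ((double)size * reps / (1024.0 * 1024.0)) / secs;
}
```

       A  call  such  as  copy_mb_per_sec(8 * 1024 * 1024, 10) would copy 8 megabytes ten times and return the
       observed rate; small sizes may finish too quickly for clock() to resolve.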

       The  size  specification  may  end  with ``k'' or ``m'' to mean kilobytes (* 1024) or megabytes (* 1024 *
       1024).
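
       As an illustration of the suffix rule, a size string could be converted to a byte count along the lines
       of  the  sketch below.  The helper name parse_size and its error handling (returning 0 on malformed in-
       put) are hypothetical and are not taken from the lmbench sources.

```c
#include <stdlib.h>

/* Hypothetical helper: convert "4096", "512k", or "8m" to bytes.
 * Returns 0 if the string has no digits or has trailing characters
 * other than a single k or m suffix. */
size_t parse_size(const char *spec)
{
    char *end;
    unsigned long long n = strtoull(spec, &end, 10);

    if (end == spec)
        return 0;                       /* no digits at all */

    if (*end == 'k') {
        n *= 1024ULL;                   /* kilobytes */
        end++;
    } else if (*end == 'm') {
        n *= 1024ULL * 1024ULL;         /* megabytes */
        end++;
    }

    if (*end != '\0')
        return 0;                       /* unexpected trailing text */
    return (size_t)n;
}
```

       With this sketch, "8m" yields 8388608 bytes and "512k" yields 524288 bytes.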

OUTPUT

       Output format is "%0.2f %.2f\n", megabytes, megabytes_per_second, i.e.,

       8.00 25.33
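
       The format string above can be exercised with a few lines of C.  The wrapper name format_result is  hy-
       pothetical; only the format string itself comes from this page.

```c
#include <stdio.h>

/* Hypothetical wrapper: write one result line into buf using the
 * documented format. Returns what snprintf returns (the number of
 * characters that the full line requires, excluding the NUL). */
int format_result(char *buf, size_t buflen,
                  double megabytes, double megabytes_per_second)
{
    return snprintf(buf, buflen, "%0.2f %.2f\n",
                    megabytes, megabytes_per_second);
}
```

       Calling format_result with 8.0 and 25.33 produces the example line shown above, followed by a newline.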

       There are nine different memory benchmarks in bw_mem.  They each measure slightly different  methods  for
       reading, writing or copying data.

       rd     measures  the  time  to  read data into the processor.  It computes the sum of an array of integer
              values.  It accesses every fourth word.

       wr     measures the time to write data to memory.  It assigns a constant value to each element of an array
              of integer values.  It accesses every fourth word.

       rdwr   measures the time to read data into the processor and then write data back to the same memory
              location.  For each element in an array it adds the current value to a running sum before
              assigning a new (constant) value to the element.  It accesses every fourth word.

       cp     measures  the  time  to  copy data from one location to another.  It does an array copy: dest[i] =
              source[i].  It accesses every fourth word.

       frd    measures the time to read data into the processor.  It computes the sum of  an  array  of  integer
              values.

       fwr    measures the time to write data to memory.  It assigns a constant value to each element of an array
              of integer values.

       fcp    measures the time to copy data from one location to another.  It does an  array  copy:  dest[i]  =
              source[i].

       bzero  measures how fast the system can bzero memory.

       bcopy  measures how fast the system can bcopy data.
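
       The  rd,  wr,  rdwr and cp access patterns described above can be sketched as simple loops.  These are
       illustrations only, not the loops from the lmbench sources; the function names and the STRIDE macro are
       hypothetical.   The frd, fwr and fcp entries are described without the "every fourth word" sentence; if
       that is read as touching every word, the same loops with a stride of 1 would apply, but  that  reading
       is an assumption.

```c
#include <stddef.h>

#define STRIDE 4   /* "every fourth word", per the descriptions above */

/* rd: sum every fourth element of the array. */
int rd_pass(const int *a, size_t n)
{
    int sum = 0;
    size_t i;

    for (i = 0; i < n; i += STRIDE)
        sum += a[i];
    return sum;
}

/* wr: assign a constant to every fourth element. */
void wr_pass(int *a, size_t n, int value)
{
    size_t i;

    for (i = 0; i < n; i += STRIDE)
        a[i] = value;
}

/* rdwr: add the current value to a running sum, then overwrite it. */
int rdwr_pass(int *a, size_t n, int value)
{
    int sum = 0;
    size_t i;

    for (i = 0; i < n; i += STRIDE) {
        sum += a[i];
        a[i] = value;
    }
    return sum;
}

/* cp: array copy, dest[i] = source[i], for every fourth element. */
void cp_pass(int *dest, const int *source, size_t n)
{
    size_t i;

    for (i = 0; i < n; i += STRIDE)
        dest[i] = source[i];
}
```

       Returning  the  sum  from  the  read loops gives the caller a value to use, which helps keep a compiler
       from discarding the reads as dead code.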

MEMORY UTILIZATION

       This benchmark can move up to three times the requested amount of memory.  bcopy will use 2-3 times as
       much memory bandwidth: there is one read from the source and a write to the destination.  The write
       usually results in a cache line read and then a write back of the cache line at some later point.
       Memory utilization might be reduced by 1/3 if the processor architecture implemented ``load cache
       line'' and ``store cache line'' instructions (as well as ``getcachelinesize'').
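
       The 2-3 times figure can be checked with simple arithmetic.  The sketch below counts the  bytes  moved
       to  and  from memory for a copy of a given size, with and without the extra cache line read on writes.
       The function name bcopy_traffic and its write_allocate flag are hypothetical and only model the reason-
       ing in this section.

```c
/* Hypothetical model of memory traffic for copying `size` bytes.
 * write_allocate != 0 models a cache that reads each destination
 * line before it is written. */
unsigned long long bcopy_traffic(unsigned long long size, int write_allocate)
{
    unsigned long long source_read = size;   /* read from the source */
    unsigned long long write_back  = size;   /* destination written back */
    unsigned long long line_fill   = write_allocate ? size : 0;

    return source_read + line_fill + write_back;
}
```

       For an 8 megabyte copy this gives 24 megabytes of traffic with the cache line read  and  16  megabytes
       without it, which is the 1/3 reduction mentioned above.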

SEE ALSO

       lmbench(8).

AUTHOR

       Carl Staelin and Larry McVoy

       Comments, suggestions, and bug reports are always welcome.

(c)1994-2000 Larry McVoy and Carl Staelin            $Date$                                            BW_MEM(8)