Ubuntu Manpage: lamtrace - Unload LAM trace data.

Provided by: lam-runtime_7.1.4-6build2_amd64

NAME

       lamtrace - Unload LAM trace data.

SYNOPSIS

       lamtrace [-hkvR] [-mpi] [-l listno] [-f #secs] [filename] [nodes] [processes]

OPTIONS

       -h            Print useful information on this command.

       -k            Copy and do not remove trace data.

       -v            Be verbose.

       -R            Delete all trace data from the specified nodes.

       -l listno     Unload only from the given list number.

       -mpi          Unload trace data for an MPI application.

       -f #secs      Signal  target processes to flush trace data to the daemon.  Then wait #secs
                     before unloading.

       filename      Place trace data into this file (default: def.lamtr).

DESCRIPTION

The -t option of mpirun(1) and loadgo(1) allows the application to generate execution
traces. These traces are first stored in a buffer within each application process. When
the buffer is full or the application terminates, the runtime buffer is flushed to
the trace daemon (a structural component within the LAM daemon). The trace daemon
collects data only up to a pre-compiled limit. Beyond this limit, the oldest traces
are discarded in favor of newer ones.

After an application has finished, the record of its execution is stored in the trace
daemons of each node that was running the application. The lamtrace command can be used
to retrieve these traces and store them in one file for display by a performance
visualization tool, such as xmpi(1). If the application was started by xmpi(1), lamtrace
is not normally needed as the equivalent functionality is invoked with a button.

Incomplete trace data can be unloaded while the application is running. The output file
must not exist prior to invoking lamtrace. This is a good situation to use the -k option,
which preserves the trace daemon's contents after unloading. Each subsequent unload then
retrieves the entire run's trace data up to the present time.
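
Because the output file must not already exist, repeated snapshots of a running
application need a fresh file name each time. The following is a minimal sketch, not part
of LAM: the helper names next_snapshot_name and snapshot_traces and the BASE.N.lamtr
naming scheme are assumptions made for illustration. Only the -k and -mpi options come
from this page.

```shell
# Hypothetical helper: print the first name of the form BASE.N.lamtr
# that does not exist yet in the current directory.
next_snapshot_name() {
    base=$1
    n=0
    while [ -e "${base}.${n}.lamtr" ]; do
        n=$((n + 1))
    done
    printf '%s.%s.lamtr\n' "$base" "$n"
}

# Hypothetical helper: take one snapshot of a running MPI application.
# -k keeps the data in the trace daemons, so a later snapshot still
# contains the whole run up to that point.
snapshot_traces() {
    out=$(next_snapshot_name "${1:-run}")
    lamtrace -k -mpi "$out" && printf '%s\n' "$out"
}
```

Calling snapshot_traces run several times during a run would leave run.0.lamtr,
run.1.lamtr and so on, each a superset of the previous one.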

A running process is likely to be holding the most recent trace data in an internal
buffer. A standard LAM signal, LAM_SIGTRACE (see doom(1)), causes trace-enabled processes
to flush the internal trace buffer to the daemon. The -f option tells lamtrace to send
this signal to all target processes before unloading trace data. This creates a race
between the target processes storing trace data to the daemon and the unloading
procedure. lamtrace does not resolve the race itself: the user supplies a delay in
seconds after -f and must choose it long enough for the flush to complete.
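
Since the delay is only a guess at how long the flush takes, it helps to make it an
explicit, validated parameter. The following is a minimal sketch under that assumption.
flush_and_unload is a hypothetical wrapper, not a LAM command; the options it passes
(-k, -f) are the ones described on this page.

```shell
# Hypothetical wrapper: flush in-process buffers, wait, then unload.
# Usage: flush_and_unload SECONDS OUTFILE [nodes] [processes]
flush_and_unload() {
    if [ "$#" -lt 2 ]; then
        echo "usage: flush_and_unload SECONDS OUTFILE [nodes] [processes]" >&2
        return 2
    fi
    secs=$1
    shift
    # Reject anything that is not a non-negative whole number.
    case $secs in
        ''|*[!0-9]*)
            echo "delay must be a whole number of seconds" >&2
            return 2
            ;;
    esac
    # -f SECS: signal target processes to flush, wait SECS, then unload.
    # -k: keep the daemon's copy so a later unload still sees the full run.
    lamtrace -k -f "$secs" "$@"
}
```

For example, flush_and_unload 5 partial.lamtr n30 p21367 would signal process 21367 on
node 30, wait five seconds, and then unload. If the resulting file appears to be missing
recent events, a longer delay is the first thing to try.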

Trace data are organized by node, process identifier and list number. A process can store
traces on any node, although the local node is the obvious, least intrusive choice. The
process can identify itself in any meaningful way (getpid(2) is a good choice). The list
number is also chosen by the process. These values may be set by an instrumented library,
such as libmpi(3), or directly by the application with lam_rtrstore(2). Unloading is as
flexible as storing: the -l option selects the list number, and standard LAM command line
mnemonics select nodes and processes.

Dropping old traces when a pre-compiled volume limit is reached only happens for positive
list numbers. Traces in negatively numbered lists will be collected until the underlying
system runs out of memory. Do not use negative list numbers for high volume trace data.

If no process selection is given on the command line, trace data will be unloaded for all
processes on each specified node.

LAM, its trace daemon and lamtrace are all unaware of the format and meaning of traces.

The -R option does not unload trace data. It causes the target trace daemons to free the
memory occupied by trace data in the given list. If all lists are specified (no -l
option), the trace daemon is effectively reset to its state after initiating LAM.
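
Because -R frees trace data without unloading it, a cautious workflow saves the data
first and clears it only if that step succeeded. The following sketch illustrates the
idea. unload_then_clear is a hypothetical helper, and it assumes that lamtrace returns a
non-zero exit status on failure, which this page does not state.

```shell
# Hypothetical helper: save the traces of one node, then free them.
# Usage: unload_then_clear OUTFILE NODE
unload_then_clear() {
    out=$1
    node=$2
    # -k copies the data and leaves the daemon untouched, so nothing is
    # lost if the unload step fails.
    # ASSUMPTION: lamtrace returns a non-zero status on failure.
    lamtrace -k "$out" "$node" || return 1
    # -R without -l frees every list on the node.
    lamtrace -R "$node"
}
```

For example, unload_then_clear saved.lamtr n30 would copy all trace data from node 30
into saved.lamtr and then reset that node's trace daemon.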

Unloading MPI Trace Data
A special capability, selected by the -mpi option, exists to search for and unload only
the trace data generated by an MPI application. For this purpose, lamtrace is aware of
the particular reserved list numbers that libmpi(3) uses to store traces. It begins by
searching all specified nodes and processes (the whole LAM multicomputer, if nothing is
specified) for a special trace generated by process rank 0 in MPI_COMM_WORLD of an MPI
application. This special trace contains the node and process identifiers of all
processes in that MPI_COMM_WORLD communicator. lamtrace then uses the node / process
information to collect all trace data generated by libmpi(3).

If multiple world communicators exist within LAM's trace daemons, the first one found is
used. Multiple worlds may be present due to multiple concurrent applications, trace data
from a previous run not removed (either with lamtrace or lamclean(1)), or an application
that spawns processes. A particular world communicator can be located by providing
precise node and process location to lamtrace.

The -mpi option is not compatible with the -l option.

EXAMPLES

       lamtrace -v -mpi mytraces
           Unload  trace  data into the file "mytraces" from the first MPI application found in a
           search of the entire LAM multicomputer.  Report on important steps as they are done.

       lamtrace n30 -l 5 p21367
           Unload trace data from list 5 of process ID 21367 on node 30.  Operate silently.

       lamtrace -mpi n30 p21367
           Unload trace data from the MPI application world group whose process rank  0  has  PID
           21367 and is/was running on node 30.

BUGS

       Since  trace data can be unloaded during an application's execution, there should be a way
       to incrementally append to an output file.  This is a bit tricky with -mpi, but it can  be
       done.

FILES

       def.lamtr     default output file