
Provided by: lam-runtime_7.1.4-7_amd64

NAME

       lamtrace - Unload LAM trace data.

SYNOPSIS

       lamtrace [-hkvR] [-mpi] [-l listno] [-f #secs] [filename] [nodes] [processes]

OPTIONS

       -h            Print useful information on this command.

       -k            Copy and do not remove trace data.

       -v            Be verbose.

       -R            Delete all trace data from the specified nodes.

       -l listno     Unload only from the given list number.

       -mpi          Unload trace data for an MPI application.

       -f #secs      Signal  target processes to flush trace data to the daemon.  Then wait #secs
                     before unloading.

       filename      Place trace data into this file (default: def.lamtr).

DESCRIPTION

       The -t option of mpirun(1) and loadgo(1) allows  the  application  to  generate  execution
       traces.   These traces are first stored in a buffer within each application process.  When
       the buffer is full or when the application terminates, the runtime buffer is flushed to
       the trace daemon (a structural component within the LAM daemon).  The trace daemon
       collects data only up to a pre-compiled limit.  Beyond this limit, the oldest traces
       are discarded in favor of newer ones.

       After  an  application  has  finished,  the record of its execution is stored in the trace
       daemons of each node that was running the application.  The lamtrace command can  be  used
       to  retrieve  these  traces  and  store  them  in  one  file  for display by a performance
       visualization tool, such as xmpi(1).  If the application was started by xmpi(1),  lamtrace
       is not normally needed as the equivalent functionality is invoked with a button.

       Incomplete  trace  data can be unloaded while the application is running.  The output file
       must not exist prior to invoking lamtrace.  This is a good situation in which to use the -k
       option, which preserves the trace daemon's contents after unloading.  Each subsequent
       unload will then get the entire run's trace data up to the present time.
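
       A minimal sketch of taking repeated snapshots from a running application.  The function
       name unload_snapshot and the snapshot.N.lamtr naming scheme are illustrative assumptions,
       not part of LAM; only the -k option and the filename, node and process arguments come
       from this page.

```shell
# Hypothetical helper: unload a snapshot into a fresh file each time,
# keeping the data in the trace daemons (-k) so later snapshots are complete.
# $1 = snapshot index; remaining arguments = node/process selection.
unload_snapshot() {
    out="snapshot.$1.lamtr"   # assumed naming scheme, one file per snapshot
    shift
    # lamtrace requires that the output file not exist yet.
    if [ -e "$out" ]; then
        echo "refusing to overwrite $out" >&2
        return 1
    fi
    lamtrace -k "$out" "$@"
}

# Example: unload_snapshot 1 n30
```

       Calling unload_snapshot 1 n30 and later unload_snapshot 2 n30 should leave two files,
       each holding the trace data available on node 30 at the time it was taken.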

       A running process is likely to be holding the  most  recent  trace  data  in  an  internal
       buffer.  A standard LAM signal, LAM_SIGTRACE (see doom(1)), causes trace-enabled processes
       to flush the internal trace buffer to the daemon.  The -f option tells lamtrace to send
       this signal to all target processes before unloading trace data.  A race condition
       develops between the target process storing trace data to the daemon and the unloading
       procedure.  Resolving it is left to the user, who supplies a delay in seconds after -f;
       lamtrace waits that long before unloading.
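
       A hedged sketch of a flush-then-unload call on a running application.  The wrapper name
       flush_and_unload and the file name running.lamtr are hypothetical, and the 5 second
       delay in the example is an arbitrary guess; a suitable delay depends on how long the
       target processes take to flush.

```shell
# Hypothetical wrapper: signal the target processes to flush their buffers,
# wait the given number of seconds, then unload while keeping the daemon data.
# $1 = delay in seconds, $2 = output file, remaining = node/process selection.
flush_and_unload() {
    delay=$1
    out=$2
    shift 2
    lamtrace -k -f "$delay" "$out" "$@"
}

# Example: flush_and_unload 5 running.lamtr n30
```

       If the unloaded data still looks stale, the delay was probably too short for the flush
       to complete.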

       Trace data are organized by node, process identifier and list number.  A process can store
       traces on any node, although the local node is the obvious, least intrusive  choice.   The
       process can identify itself in any meaningful way (getpid(2) is a good idea).  The list
       number is also chosen by the process.  These values may be set by an instrumented library,
       such as libmpi(3), or directly by the application with lam_rtrstore(2).  Unloading is as
       flexible as storing: the -l option selects the list number, and standard LAM command line
       mnemonics select nodes and processes.

       Dropping  old traces when a pre-compiled volume limit is reached only happens for positive
       list numbers.  Traces in negatively numbered lists will be collected until the  underlying
       system runs out of memory.  Do not use negative list numbers for high volume trace data.

       If  no process selection is given on the command line, trace data will be unloaded for all
       processes on each specified node.

       LAM, its trace daemon and lamtrace are all unaware of the format and meaning of traces.

       The -R option does not unload trace data.  It causes the target trace daemons to free  the
       memory  occupied  by  trace  data  in  the  given list.  If all lists are specified (no -l
       option), the trace daemon is effectively reset to its state after initiating LAM.
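
       The two forms of -R described above can be sketched as follows.  The helper name
       purge_traces and its --list argument are illustrative assumptions, not part of
       lamtrace; only -R, -l and the node arguments come from this page.

```shell
# Hypothetical helper: free trace memory held by the trace daemons.
# purge_traces --list N nodes...  frees only list N on the given nodes.
# purge_traces nodes...           frees every list (resets the trace daemon).
purge_traces() {
    if [ "$1" = "--list" ]; then
        listno=$2
        shift 2
        lamtrace -R -l "$listno" "$@"
    else
        lamtrace -R "$@"
    fi
}

# Example: purge_traces --list 5 n30
```

       Because -R does not unload anything, any data still wanted should be unloaded to a file
       before purging.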

   Unloading MPI Trace Data
       A special capability, selected by the -mpi option, exists to search for  and  unload  only
       the  trace  data  generated by an MPI application.  For this purpose, lamtrace is aware of
       the particular reserved list numbers that libmpi(3) uses to store traces.   It  begins  by
       searching  all  specified  nodes and processes (the whole LAM multicomputer, if nothing is
       specified) for a special trace generated by process rank 0 in  MPI_COMM_WORLD  of  an  MPI
       application.   This  special  trace  contains  the  node  and  process  identifiers of all
       processes in that MPI_COMM_WORLD communicator.  lamtrace then  uses  the  node  /  process
       information to collect all trace data generated by libmpi(3).

       If  multiple  world communicators exist within LAM's trace daemons, the first one found is
       used.  Multiple worlds may be present due to multiple concurrent applications, trace  data
       from  a  previous run not removed (either with lamtrace or lamclean(1)), or an application
       that spawns processes.  A particular  world  communicator  can  be  located  by  providing
       precise node and process location to lamtrace.

       The -mpi option is not compatible with the -l option.
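
       A script that builds lamtrace command lines could guard against this combination before
       invoking the command.  The following is only a sketch: the function name
       check_lamtrace_args and its error message are assumptions, and how lamtrace itself
       reacts to the combination is not specified here.

```shell
# Hypothetical pre-flight check: reject argument lists that contain
# both -mpi and -l, which this page says are not compatible.
check_lamtrace_args() {
    have_mpi=0
    have_l=0
    for arg in "$@"; do
        case $arg in
            -mpi) have_mpi=1 ;;
            -l)   have_l=1 ;;
        esac
    done
    if [ "$have_mpi" -eq 1 ] && [ "$have_l" -eq 1 ]; then
        echo "lamtrace: -mpi and -l cannot be combined" >&2
        return 1
    fi
    return 0
}

# Example: check_lamtrace_args "$@" && lamtrace "$@"
```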

EXAMPLES

       lamtrace -v -mpi mytraces
           Unload  trace  data into the file "mytraces" from the first MPI application found in a
           search of the entire LAM multicomputer.  Report on important steps as they are done.

       lamtrace n30 -l 5 p21367
           Unload trace data from list 5 of process ID 21367 on node 30.  Operate silently.

       lamtrace -mpi n30 p21367
           Unload trace data from the MPI application world group whose process rank  0  has  PID
           21367 and is/was running on node 30.

BUGS

       Since  trace data can be unloaded during an application's execution, there should be a way
       to incrementally append to an output file.  This is a bit tricky with -mpi, but it can  be
       done.

FILES

       def.lamtr     default output file

SEE ALSO

       mpirun(1), loadgo(1), lam_rtrstore(2), lamclean(1), libmpi(3), xmpi(1)