Provided by: systemtap_0.0.20080705-2ubuntu1_i386


       stap - systemtap script translator/driver


       stap [ OPTIONS ] - [ ARGUMENTS ]
       stap [ OPTIONS ] -e SCRIPT [ ARGUMENTS ]
       stap [ OPTIONS ] -l PROBE [ ARGUMENTS ]


       The  stap  program  is the front-end to the Systemtap tool.  It accepts
       probing  instructions  (written  in  a  simple   scripting   language),
       translates  those  instructions  into C code, compiles this C code, and
       loads the resulting kernel  module  into  a  running  Linux  kernel  to
       perform the requested system trace/probe functions.  You can supply the
       script in a named file, from standard input, or from the command  line.
       The program runs until it is interrupted by the user, until the script
       voluntarily invokes the exit() function, or until a sufficient number
       of soft errors has occurred.
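
       As a minimal sketch of the second case, the following script prints a
       line and then ends the session itself by calling exit(), rather than
       waiting for an interrupt.  It could be saved in a file (the name
       hello.stp is only an illustration) and run as "stap hello.stp", or
       fed to "stap -" on standard input.
              probe begin {
                println ("hello world")
                exit ()
              }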

       The language, which is described in a later section, is strictly typed,
       declaration free, procedural, and inspired by awk.   It  allows  source
       code  points  or  events  in the kernel to be associated with handlers,
       which are subroutines that are executed synchronously.  It is  somewhat
       similar conceptually to "breakpoint command lists" in the gdb debugger.

       This manual corresponds to version 0.7.


       The systemtap translator supports the  following  options.   Any  other
       option prints a list of supported options.

       -v     Increase verbosity.  Produce a larger volume of informative (?)
              output each time the option is repeated.

       -h     Show help message.

       -V     Show version message.

       -k     Keep the temporary directory after all processing.  This may  be
              useful in order to examine the generated C code, or to reuse the
              compiled kernel object.

       -g     Guru mode.  Enable parsing  of  unsafe  expert-level  constructs
              like embedded C.

       -P     Prologue-searching  mode.   Activate  heuristics  to work around
              incorrect debugging information for $target variables.

       -u     Unoptimized mode.  Disable unused code elision during
              elaboration.

       -w     Suppressed  warnings  mode.  Disable warning messages for elided
              code in user script.

       -b     Use bulk mode (percpu files) for kernel-to-user data transfer.

       -t     Collect timing information on the number of times each probe
              executes and the average amount of time spent in each probe.

       -sNUM  Use NUM megabyte buffers for kernel-to-user data transfer.  On a
              multiprocessor in bulk mode, this is a per-processor amount.

       -p NUM Stop after pass  NUM.   The  passes  are  numbered  1-5:  parse,
              elaborate,  translate, compile, run.  See the PROCESSING section
              for details.

       -I DIR Add the given directory to the tapset search directory.  See the
              description of pass 2 for details.

       -D NAME=VALUE
              Add  the  given C preprocessor directive to the module Makefile.
              These can be used to override limit parameters described  below.

       -R DIR Look for the systemtap runtime sources in the given directory.

       -r RELEASE
              Build for given kernel release instead of currently running one.

       -m MODULE
              Use the given name  for  the  generated  kernel  object  module,
              instead  of  a  unique  randomized  name.   The generated kernel
              object module is copied to the current directory.

       -o FILE
              Send standard output to named file. In bulk mode,  percpu  files
              will start with FILE_ followed by the cpu number.

       -c CMD Start the probes, run CMD, and exit when CMD finishes.

       -x PID Sets  target()  to  PID.  This allows scripts to be written that
              filter on a specific process.

       -l PROBE
              Instead of running a probe script, just list all available probe
              points  matching  the  given  pattern.   The pattern may include
              wildcards and aliases.

       --kelf For names and addresses  of  functions  to  probe,  consult  the
              symbol  tables in the kernel and modules.  This can be useful if
              your kernel  and/or  modules  were  compiled  without  debugging
              information,  or  the  function  you  want  to  probe  is  in an
              assembly-language file built without debugging information.  See
              the MAKING DO WITH SYMBOL TABLES section for more information.

              For  names  and  addresses of kernel functions to probe, consult
              the symbol table in the indicated text  file.   The  default  is
              /boot/   The  contents of this file should be
              in the form of the default output from nm(1).  Only  symbols  of
              type  T  or  t  are used.  If you specify /proc/kallsyms or some
              other file in  that  format,  where  lines  for  module  symbols
              contain  a fourth column, reading of the symbol table stops with
              the first module symbol (which should be right  after  the  last
              kernel  symbol).   As  with  --kelf,  the  symbol  table in each
              module’s .ko file will also be consulted.   See  the  MAKING  DO
              WITH SYMBOL TABLES section for more information.

              For  testing,  act  as  though  neither  the uncompressed kernel
              (vmlinux) nor the kernel debugging information can be found.

              For testing, act as though vmlinux and modules lack debugging
              information.


       Any  additional  arguments on the command line are passed to the script
       parser for substitution.  See below.


       The systemtap script  language  resembles  awk.   There  are  two  main
       outermost  constructs:  probes and functions.  Within these, statements
       and expressions use C-like operator syntax and precedence.

       Whitespace is ignored.  Three forms of comments are supported:
              # ... shell style, to the end of line, except for $# and @#
              // ... C++ style, to the end of line
              /* ... C style ... */
       Literals are either strings enclosed in double-quotes (passing  through
       the  usual  C  escape codes with backslashes), or integers (in decimal,
       hexadecimal, or octal, using the same notation as in C).   All  strings
       are  limited  in length to some reasonable value (a few hundred bytes).
       Integers are 64-bit signed quantities, although the parser also accepts
       (and wraps around) values above positive 2**63.
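
       A short sketch of the comment and literal forms.  The three integer
       literals are intended to denote the same value in decimal,
       hexadecimal, and octal notation:
              probe begin {
                # shell-style comment
                // C++-style comment
                /* C-style comment */
                printf ("%d %d %d\n", 255, 0xff, 0377)
                println ("a tab\tand a \"quoted\" word")
                exit ()
              }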

       In  addition, script arguments given at the end of the command line may
       be inserted.  Use $1 ... $<NN> for insertion unquoted, @1 ... @<NN> for
       insertion as a string literal.  The number of arguments may be accessed
       through $# (as an unquoted number) or through @# (as a quoted  number).
       These  may be used at any place a token may begin, including within the
       preprocessing stage.  Reference to an argument number beyond  what  was
       actually given is an error.
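
       For illustration, assume the script below is saved as args.stp (a
       hypothetical name) and invoked as "stap args.stp 42 hello".  $1 is
       then pasted in as the bare token 42, while @2 is pasted in as the
       string literal "hello":
              probe begin {
                printf ("%d arguments: number %d, string %s\n", $#, $1, @2)
                exit ()
              }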

       A  simple  conditional preprocessing stage is run as a part of parsing.
       The general form is similar to the cond ? exp1 : exp2 ternary operator:
              %( CONDITION %? TRUE-TOKENS %)
              %( CONDITION %? TRUE-TOKENS %: FALSE-TOKENS %)
       The CONDITION is either an expression whose format is determined by its
       first keyword, or a comparison of two string literals or of two numeric
       literals.

       If  the  first part is the identifier kernel_vr or kernel_v to refer to
       the kernel  version  number,  with  ("2.6.13-1.322FC3smp")  or  without
       ("2.6.13")  the release code suffix, then the second part is one of the
       six standard numeric comparison operators <, <=, ==, !=, >, and >=, and
       the  third part is a string literal that contains an RPM-style version-
       release value.  The condition is deemed satisfied if the version of the
       target  kernel  (as optionally overridden by the -r option) compares to
       the given version string.  The comparison is  performed  by  the  glibc
       function  strverscmp.  As a special case, if the operator is for simple
       equality (==), or inequality (!=), and  the  third  part  contains  any
       wildcard  characters (* or ? or [), then the expression is treated as a
       wildcard (mis)match as evaluated by fnmatch.

       If, on the other hand, the first part is the identifier arch to refer
       to the processor architecture, then the second part is one of the two
       string comparison operators == or !=, and the third part is a string
       literal for matching it.  This comparison is a wildcard (mis)match.

       Otherwise, the CONDITION is expected to be  a  comparison  between  two
       string  literals  or two numeric literals.  In this case, the arguments
       are the only variables usable.

       The TRUE-TOKENS and FALSE-TOKENS are zero or more general parser tokens
       (possibly  including  nested preprocessor conditionals), and are pasted
       into the input stream if the condition is true or false.  For  example,
       the  following  code  induces  a  parse  error unless the target kernel
       version is newer than 2.6.5:
              %( kernel_v <= "2.6.5" %? **ERROR** %) # invalid token sequence
       The following code might adapt to hypothetical kernel version drift:
              probe kernel.function (
                %( kernel_v <= "2.6.12" %? "__mm_do_fault" %:
                   %( kernel_vr == "2.6.13*smp" %? "do_page_fault" %:
                      UNSUPPORTED %) %)
              ) { /* ... */ }

              %( arch == "ia64" %?
                 probe syscall.vliw = kernel.function("vliw_widget") {}
              %)
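
       A further sketch, covering the case where the CONDITION compares
       numeric literals and the only usable variables are the script
       arguments.  Depending on whether any argument was given, one of the
       two statements is pasted into the probe handler:
              probe begin {
                %( $# > 0 %? println ("arguments were given")
                          %: println ("no arguments were given") %)
                exit ()
              }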

       Identifiers for variables and functions are an  alphanumeric  sequence,
       and  may  include  "_"  and  "$" characters.  They may not start with a
       plain digit, as in C.  Each variable is by default local to the probe
       or function statement block within which it is mentioned, and therefore
       its scope and lifetime are limited to a particular probe or function
       invocation.

       Scalar variables are implicitly typed as either string or integer.
       Associative arrays also have a string or integer value, and a tuple
       of strings and/or integers serving as a key.  Here are a few basic
       expressions:
              var1 = 5
              var2 = "bar"
              array1 [pid()] = "name"     # single numeric key
              array2 ["foo",4,i++] += 5   # vector of string/num/num keys
              if (["hello",5,4] in array2) println ("yes")  # membership test

       The translator performs type inference on  all  identifiers,  including
       array  indexes  and function parameters.  Inconsistent type-related use
       of identifiers signals an error.
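
       A small sketch of what the inference means in practice.  The variable
       names are arbitrary, and the commented-out line shows a use that the
       translator is expected to reject as inconsistent:
              probe begin {
                name = "cpu"                 # inferred as string
                n = 3                        # inferred as integer
                label = name . sprint (n)    # sprint turns n into a string
                println (label)
                # n = "three"                # error: n is already an integer
                exit ()
              }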

       Variables may be declared global, so that they are shared  amongst  all
       probes  and live as long as the entire systemtap session.  There is one
       namespace for all global variables, regardless  of  which  script  file
       they  are  found  within.   A  global declaration may be written at the
       outermost level anywhere, not within a block of  code.   The  following
       declaration marks a few variables as global.  The translator will infer
       for each its value type, and if it is used as an array, its key  types.
       Optionally, scalar globals may be initialized with a string or number
       literal.
              global var1, var2, var3=4

       Arrays are limited in size by the MAXMAPENTRIES  variable  --  see  the
       SAFETY AND SECURITY section for details.  Optionally, global arrays may
       be declared with a maximum size in brackets,  overriding  MAXMAPENTRIES
       for  that array only.  Note that this doesn’t indicate the type of keys
       for the array, just the size.
              global tiny_array[10], normal_array, big_array[50000]
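
       A sketch of global variables shared between probes.  The timer.s
       probe point is assumed to be available as described in stapprobes(5),
       and the array size of 100 is arbitrary:
              global ticks = 0
              global seen[100]

              probe timer.s(1) {
                ticks++
                seen[ticks] = "tick"
                if (ticks >= 3) exit ()
              }

              probe end {
                printf ("%d timer events recorded\n", ticks)
              }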

       Statements enable procedural  control  flow.   They  may  occur  within
       functions  and probe handlers.  The total number of statements executed
       in response to any single probe event is limited to some number defined
       by a macro in the translated C code, and is in the neighbourhood of
       1000.

       EXP    Execute the string- or integer-valued expression and throw  away
              the value.

       { STMT1 STMT2 ... }
              Execute  each  statement  in  sequence in this block.  Note that
              separators or terminators are generally not necessary between
              statements.

       ;      Null  statement,  do  nothing.   It  is  useful  as  an optional
              separator between statements to improve  syntax-error  detection
              and to handle certain grammar ambiguities.

       if (EXP) STMT1 [ else STMT2 ]
              Compare  integer-valued  EXP  to  zero.  Execute the first (non-
              zero) or second STMT (zero).

       while (EXP) STMT
              While integer-valued EXP evaluates to non-zero, execute STMT.

       for (EXP1; EXP2; EXP3) STMT
              Execute EXP1 as initialization.  While EXP2 is non-zero, execute
              STMT, then the iteration expression EXP3.

       foreach (VAR in ARRAY [ limit EXP ]) STMT
              Loop  over  each  element  of  the named global array, assigning
              current key to VAR.  The array may not be  modified  within  the
              statement.   By adding a single + or - operator after the VAR or
              the ARRAY identifier, the iteration will  proceed  in  a  sorted
              order,  by  ascending  or  descending index or value.  Using the
              optional limit keyword limits the number of loop  iterations  to
              EXP times.  EXP is evaluated once at the beginning of the loop.

       foreach ([VAR1, VAR2, ...] in ARRAY [ limit EXP ]) STMT
              Same  as  above,  used when the array is indexed with a tuple of
              keys.  A sorting suffix may be used on at most one VAR or ARRAY
              identifier.

       break, continue
              Exit  or  iterate  the  innermost  nesting loop (while or for or
              foreach) statement.

       return EXP
              Return EXP value from enclosing  function.   If  the  function’s
              value  is  not  taken  anywhere,  then a return statement is not
              needed, and the function will have a special "unknown" type with
              no return value.

       next   Return now from enclosing probe handler.

       delete ARRAY[INDEX1, INDEX2, ...]
              Remove from ARRAY the element specified by the index tuple.  The
              value will no longer be  available,  and  subsequent  iterations
              will  not  report  the element.  It is not an error to delete an
              element that does not exist.

       delete ARRAY
              Remove all elements from ARRAY.

       delete SCALAR
              Removes the value of SCALAR.  Integers and strings  are  cleared
              to  0  and  ""  respectively,  while statistics are reset to the
              initial empty state.
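
       The following sketch combines several of these statements.  The array
       name and its contents are made up for illustration.  The foreach loop
       visits entries by descending value and stops after two iterations,
       and the array is modified only after the loop has finished:
              global score

              probe begin {
                score["ann"] = 30
                score["bob"] = 10
                score["cat"] = 20
                foreach (name in score- limit 2)
                  printf ("%s %d\n", name, score[name])
                delete score["bob"]
                if (!("bob" in score)) println ("bob removed")
                delete score
                exit ()
              }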

       Systemtap supports a number of operators that  have  the  same  general
       syntax,  semantics,  and  precedence  as  in  C and awk.  Arithmetic is
       performed as per typical C rules for signed integers.  Division by zero
       or overflow is detected and results in an error.

       binary numeric operators
              * / % + - >> << & ^ | && ||

       binary string operators
              .  (string concatenation)

       numeric assignment operators
              = *= /= %= += -= >>= <<= &= ^= |=

       string assignment operators
              = .=

       unary numeric operators
              + - ! ~ ++ --

       binary numeric or string comparison operators
              < > <= >= == !=

       ternary operator
              cond ? exp1 : exp2

       grouping operator
              ( exp )

       function call
              fn ([ arg1, arg2, ... ])

       array membership check
              exp in array
              [exp1, exp2, ...] in array
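
       A brief sketch that exercises a few of these operators together.  The
       variable names are arbitrary:
              probe begin {
                n = 7
                parity = (n % 2 == 0) ? "even" : "odd"
                msg = "n is " . parity    # string concatenation
                msg .= "!"                # string append-assignment
                n <<= 2                   # numeric shift-assignment
                printf ("%s (shifted: %d)\n", msg, n)
                exit ()
              }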

       The main construct in the scripting language identifies probes.  Probes
       associate abstract events with a statement block ("probe handler") that
       is  to  be executed when any of those events occur.  The general syntax
       is as follows:
              probe PROBEPOINT [, PROBEPOINT] { [STMT ...] }

       Events are specified in a special syntax called "probe points".   There
       are  several  varieties  of probe points defined by the translator, and
       tapset scripts may define further ones using aliases.  These are listed
       in the stapprobes(5) manual pages.

       The probe handler is interpreted relative to the context of each event.
       For events associated  with  kernel  code,  this  context  may  include
       variables  defined  in  the  source  code  at that spot.  These "target
       variables" are presented to the script as  variables  whose  names  are
       prefixed  with "$".  They may be accessed only if the kernel’s compiler
       preserved them despite optimization.  This is the same constraint  that
       a  debugger  user  faces  when working with optimized code.  Some other
       events have very little context.
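
       As a sketch of one handler attached to two probe points, the script
       below counts calls per file descriptor for about five seconds.  It
       assumes that the target kernel has functions named sys_read and
       sys_write, that both have a parameter called fd, and that kernel
       debugging information is installed.  None of this is guaranteed for
       a given kernel.
              global hits

              probe kernel.function("sys_read"),
                    kernel.function("sys_write") {
                hits[$fd]++
              }

              probe timer.s(5) { exit () }

              probe end {
                foreach (fd in hits)
                  printf ("fd %d: %d calls\n", fd, hits[fd])
              }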

       New probe points may be defined using "aliases".  A probe point alias
       looks similar to a probe definition, but instead of activating a probe
       at the given point, it just defines a new probe point name as an alias
       to an existing one.  There are two types of alias, the prologue style
       and the epilogue style, which are identified by "=" and "+="
       respectively.

       For a prologue style alias, the statement block that follows the alias
       definition is implicitly added as a prologue to any probe that refers
       to the alias.  For an epilogue style alias, the statement block is
       instead added as an epilogue to any such probe.  For example:

              probe = kernel.function("sys_read") {
                fildes = $fd
                if (execname() == "init") next  # skip rest of probe
              }
       defines   a   new   probe   point,   which   expands  to
       kernel.function("sys_read"), with the given statement  as  a  prologue,
       which  is  useful to predefine some variables for the alias user and/or
       to skip probe processing entirely based on some conditions.  And
              probe += kernel.function("sys_read") {
                if (tracethis) println ($fd)
              }
       defines a new probe point with the  given  statement  as  an  epilogue,
       which  is  useful to take actions based upon variables set or left over
       by the alias user.

       An alias is used just like a built-in probe type.
              probe {
                printf("reading fd=%d\n", fildes)
                if (fildes > 10) tracethis = 1
              }

       Systemtap scripts may define subroutines to  factor  out  common  work.
       Functions  take any number of scalar (integer or string) arguments, and
       must return a single scalar (integer or string).  An  example  function
       declaration looks like this:
              function thisfn (arg1, arg2) {
                 return arg1 + arg2
              }
       Note  the  general  absence  of  type  declarations,  which are instead
       inferred by the translator.  However, if desired, a function definition
       may  include explicit type declarations for its return value and/or its
       arguments.  This is especially helpful for  embedded-C  functions.   In
       the following example, the type inference engine need only infer the
       type of arg2 (a string).
              function thatfn:string (arg1:long, arg2) {
                 return sprint(arg1) . arg2
              }
       Functions may call others or themselves  recursively,  up  to  a  fixed
       nesting  limit.   This  limit is defined by a macro in the translated C
       code and is in the neighbourhood of 10.
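
       A sketch of a recursive function with explicit type declarations.
       The function name is made up, and the argument is kept small so that
       the recursion stays well below the nesting limit:
              function fact:long (n:long) {
                if (n <= 1) return 1
                return n * fact (n - 1)
              }

              probe begin {
                printf ("5! = %d\n", fact (5))
                exit ()
              }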

       There is a set of function names that are specially treated by the
       translator.   They format values for printing to the standard systemtap
       output stream in a more convenient way.  The  sprint*  variants  return
       the formatted string instead of printing it.

       print, sprint
              Print one or more values of any type, concatenated directly
              together.

       println, sprintln
              Print values like print and sprint, but also append a newline.

       printd, sprintd
              Take a string delimiter and two or more values of any type,  and
              print  the  values with the delimiter interposed.  The delimiter
              must be a literal string constant.

       printdln, sprintdln
              Print values with a delimiter like printd and sprintd, but  also
              append a newline.

       printf, sprintf
              Take a formatting string and a number of values of corresponding
              types, and print them all.  The format must be a literal string
              constant.

       The printf formatting directives are similar to those of C, except that
       they are fully type-checked by the translator.
                   x = sprintf("take %d steps forward, %d steps back\n", 3, 2)
                   printf("take %d steps forward, %d steps back\n", 3+1, 2*2)
                   bob = "bob"
                   alice = "alice"
                   printf("%s phoned %s %.4x times\n", bob, alice . bob, 3456)
                   printf("%s except after %s\n",
                        sprintf("%s before %s",
                             sprint(1), sprint(3)),
                        sprint("C"))
                   id[bob] = 1234
                   id[alice] = 5678
                   foreach (name in id)
                        printdln("|", strlen(name), name, id[name])

       It is often desirable to collect statistics in a way that avoids the
       penalties of repeatedly taking exclusive locks on the global variables
       those numbers are being put into.  Systemtap provides a solution using a
       special  operator to accumulate values, and several pseudo-functions to
       extract the statistical aggregates.

       The aggregation operator is <<<, and resembles an assignment, or a  C++
       output-streaming  operation.   The  left  operand specifies a scalar or
       array-index lvalue, which must be declared global.  The  right  operand
       is  a  numeric  expression.   The  meaning  is intuitive: add the given
       number to the pile of numbers to compute statistics of.  (The  specific
       list of statistics to gather is given separately, by the extraction
       functions.)
                  foo <<< 1
                  stats[pid()] <<< memsize

       The extraction functions are also special.  For each  appearance  of  a
       distinct  extraction  function  operating  on  a  given identifier, the
       translator arranges to compute a set of  statistics  that  satisfy  it.
       The  statistics  system  is  thereby "on-demand".  Each execution of an
       extraction function causes the aggregation  to  be  computed  for  that
       moment across all processors.

       Here  is the set of extractor functions.  The first argument of each is
       the same style of lvalue used on the left hand side of  the  accumulate
       operation.  The @count(v), @sum(v), @min(v), @max(v), @avg(v) extractor
       functions  compute  the  number/total/minimum/maximum/average  of   all
       accumulated values.  The resulting values are all simple integers.

       Histograms  are  also  available, but are more complicated because they
       have      a      vector      rather      than       scalar       value.
       @hist_linear(v,start,stop,interval)  represents a linear histogram from
       "start" to "stop" by increments of "interval".  The  interval  must  be
       positive.  Similarly,  @hist_log(v)  represents  a  base-2  logarithmic
       histogram. Printing a histogram with  the  print  family  of  functions
       renders a histogram object as a tabular "ASCII art" bar chart.
              global x
              probe foo {
                x <<< $value
              }
              probe end {
                printf ("avg %d = sum %d / count %d\n",
                        @avg(x), @sum(x), @count(x))
                print (@hist_log(x))
              }
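
       A self-contained sketch that needs no target variables: it feeds the
       integers 1 through 100 into a global statistic from a begin probe and
       then extracts a few aggregates and a linear histogram.  The bucket
       bounds 0, 100 and 20 are arbitrary choices.
              global v

              probe begin {
                for (i = 1; i <= 100; i++)
                  v <<< i
                printf ("count %d min %d max %d avg %d\n",
                        @count(v), @min(v), @max(v), @avg(v))
                print (@hist_linear(v, 0, 100, 20))
                exit ()
              }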

       When  in guru mode, the translator accepts embedded code in the script.
       Such code is enclosed between %{ and %}  markers,  and  is  transcribed
       verbatim,  without  analysis,  in  some  sequence, into the generated C
       code.  At the outermost level, this  may  be  useful  to  add  #include
       instructions, and any auxiliary definitions for use by other embedded
       code.

       The other place where embedded code is permitted is as a function body.
       In  this case, the script language body is replaced entirely by a piece
       of C code enclosed again between %{ and %} markers.  This C code may do
       anything  reasonable  and safe.  There are a number of undocumented but
       complex  safety  constraints  on   atomicity,   concurrency,   resource
       consumption, and run time limits, so this is an advanced technique.

       The  memory  locations  set  aside for input and output values are made
       available to it using a macro THIS.  Here are some examples:
              function add_one (val) %{
                THIS->__retvalue = THIS->val + 1;
              %}
              function add_one_str (val) %{
                strlcpy (THIS->__retvalue, THIS->val, MAXSTRINGLEN);
                strlcat (THIS->__retvalue, "one", MAXSTRINGLEN);
              %}
       The function argument and return value types have to be inferred by the
       translator  from  the  call  sites in order for this to work.  The user
       should examine C code generated for ordinary script-language  functions
       in order to write compatible embedded-C ones.

       A  set of builtin functions and probe point aliases are provided by the
       scripts  installed  under  the  /usr/share/systemtap/tapset  directory.
       These are described in the stapfuncs(5) and stapprobes(5) manual pages.


       The translator begins pass 1 by parsing the given input script, and all
       scripts   (files  named  *.stp)  found  in  a  tapset  directory.   The
       directories listed with -I are processed in sequence, each processed in
       "guru  mode".   For each directory, a number of subdirectories are also
       searched.  These subdirectories are derived from  the  selected  kernel
       version (the -r option), in order to allow more kernel-version-specific
       scripts to override less specific ones.   For  example,  for  a  kernel
       version  2.6.12-23.FC3  the  following  patterns  would be searched, in
       sequence: 2.6.12-23.FC3/*.stp,  2.6.12/*.stp,  2.6/*.stp,  and  finally
       *.stp.  Stopping the translator after pass 1 causes it to print the
       parse trees.

       In pass 2, the translator analyzes the input script to resolve  symbols
       and  types.  References to variables, functions, and probe aliases that
       are unresolved internally are satisfied by searching through the parsed
       tapset scripts.  If any tapset script is selected because it defines an
       unresolved symbol, then the entirety of that script  is  added  to  the
       translator’s resolution queue.  This process iterates until all symbols
       are resolved and a subset of tapset scripts is selected.
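
       As a sketch of this resolution step, suppose a private tapset
       directory holds one script that defines a function.  The directory
       name mytapset, the file names, and the function greet are all
       hypothetical.  A user script that calls greet without defining it
       would then pull in the tapset script during pass 2:
              # mytapset/greet.stp
              function greet (who) {
                return "hello, " . who
              }

              # hello.stp, run as: stap -I mytapset hello.stp
              probe begin {
                println (greet ("world"))
                exit ()
              }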

       Next, all probe point  descriptions  are  validated  against  the  wide
       variety  supported  by the translator.  Probe points that refer to code
       locations ("synchronous probe points") require the  appropriate  kernel
       debugging  information  to  be  installed.   In  the  associated  probe
       handlers, target-side variables (whose names begin with "$") are  found
       and have their run-time locations decoded.

       Next,   all   probes   and  functions  are  analyzed  for  optimization
       opportunities, in order to remove variables, expressions, and functions
       that have no useful value and no side-effect.  Embedded-C functions are
       assumed to have side-effects  unless  they  include  the  magic  string
       /* pure */.   Since  this optimization can hide latent code errors such
       as type mismatches or invalid $target variables, it  sometimes  may  be
       useful to disable the optimizations with the -u option.

       Finally,  all variable, function, parameter, array, and index types are
       inferred  from  context  (literals  and   operators).    Stopping   the
       translator  after  pass  2 causes it to list all the probes, functions,
       and variables, along with all  inferred  types.   Any  inconsistent  or
       unresolved types cause an error.

       In  pass 3, the translator writes C code that represents the actions of
       all selected script files, and creates a Makefile to build that into  a
       kernel  object.   These  files  are  placed into a temporary directory.
       Stopping the translator at this point causes it to print  the  contents
       of the C file.

       In  pass  4,  the  translator  invokes the Linux kernel build system to
       create the actual kernel object file.  This involves  running  make  in
       the  temporary  directory,  and  requires  a kernel module build system
       (headers, config and Makefiles) to  be  installed  in  the  usual  spot
       /lib/modules/VERSION/build.   Stopping  the  translator after pass 4 is
       the last chance before running the kernel object.  This may  be  useful
       if you want to archive the file.

       In pass 5, the translator invokes the systemtap auxiliary program
       staprun for the given kernel object.  This program arranges to
       load  the module then communicates with it, copying trace data from the
       kernel into temporary files, until the user sends an interrupt  signal.
       Any  run-time  error encountered by the probe handlers, such as running
       out of memory, division by zero, exceeding nesting or  runtime  limits,
       results in a soft error indication.  Soft errors in excess of MAXERRORS
       block all subsequent probes, and terminate the session.  Finally,
       staprun unloads the module, and cleans up.


       See the stapex(5) manual page for a collection of samples.


       The  systemtap  translator  caches  the  pass 3 output (the generated C
       code) and the pass 4 output (the compiled  kernel  module)  if  pass  4
       completes  successfully.   This  cached  output  is  reused if the same
       script is translated again assuming the  same  conditions  exist  (same
       kernel version, same systemtap version, etc.).  Cached files are stored
       in  the  $SYSTEMTAP_DIR/cache  directory,  which  may  be  periodically
       cleaned/erased by the user.


       Systemtap  is  an administrative tool.  It exposes kernel internal data
       structures and  potentially  private  user  information.   It  acquires
       either root privileges

       To actually run the kernel objects it builds, a user must be one of the
       following:

       ·   the root user;

       ·   a member of the stapdev group; or

       ·   a member of the stapusr group.  Members of the  stapusr  group  can
           only  use  modules  located  in  the /lib/modules/VERSION/systemtap
           directory.  This directory must be owned by root and not be world
           writable.

       The kernel modules generated by the stap program are run by the staprun
       program.  The latter is a part of the Systemtap package,  dedicated  to
       module  loading and unloading (but only in the white zone), and kernel-
       to-user data transfer.  Since staprun does not perform  any  additional
       security  checks  on the kernel objects it is given, it would be unwise
       for a system administrator to add untrusted users  to  the  stapdev  or
       stapusr groups.
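
       The three cases above can be checked for the current user with a
       short shell sketch.  The function name stap_access_level is
       hypothetical; only the group names stapdev and stapusr come from
       this manual.  The sketch does not check the ownership or permissions
       of the module directory.

```shell
# Classify a user by the three cases listed above.
# $1 = numeric uid, $2 = space-separated group names.
stap_access_level() {
    if [ "$1" -eq 0 ]; then
        echo root
        return
    fi
    case " $2 " in
        *" stapdev "*) echo stapdev ;;
        *" stapusr "*) echo stapusr ;;
        *)             echo none ;;
    esac
}

# Classify the invoking user; -u and -nG are standard id(1) options.
stap_access_level "$(id -u)" "$(id -nG)"
```

       A result of "none" suggests that the user cannot run the generated
       kernel objects.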

       The  translator  asserts certain safety constraints.  It aims to ensure
       that no handler routine can run for very long, allocate memory, perform
       unsafe operations, or unintentionally interfere with the kernel.
       Use of script global variables is suitably locked  to  protect  against
       manipulation by concurrent probe handlers.  Use of guru mode constructs
       such as embedded C can violate these constraints, leading to a kernel
       crash or data corruption.

       The  resource  use  limits  are  set by macros in the generated C code.
       These may be overridden with the -D flag.  A selection of these follows:

              Maximum number of recursive function call levels, default 10.

              Maximum length of strings, default 128.

              Maximum  number  of  iterations  to  wait  for  locks  on global
              variables before declaring possible deadlock  and  skipping  the
              probe, default 1000.

              Maximum  number of statements to execute during any single probe
              hit (with interrupts disabled), default 1000.

              Maximum number of statements to execute during any single  probe
              hit which is executed with interrupts enabled (such as begin/end
              probes), default (MAXACTION * 10).

              Maximum number of rows in any single global array, default 2048.

              Maximum  number  of  soft  errors  before  an exit is triggered,
              default 0, which means that the first error will exit the
              script.

              Maximum  number  of  skipped  reentrant probes before an exit is
              triggered, default 100.

              Minimum number of free kernel stack bytes required in  order  to
              run  a probe handler, default 1024.  This number should be large
              enough for the probe handler’s own needs, plus a safety  margin.
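
       As an illustration of overriding limits with -D, the sketch below
       composes a command line without running it.  The helper name
       stap_limit_flags, the values, and the file name script.stp are
       placeholders; MAXACTION and MAXERRORS are the macro names mentioned
       elsewhere in this manual.

```shell
# Turn NAME=VALUE pairs into -D options for the stap command line.
stap_limit_flags() {
    flags=""
    for pair in "$@"; do
        flags="$flags${flags:+ }-D$pair"
    done
    printf '%s\n' "$flags"
}

# Compose (but do not run) a command that raises two limits.
# script.stp is a placeholder file name.
echo "stap $(stap_limit_flags MAXACTION=5000 MAXERRORS=10) script.stp"
```

       Raising a limit loosens one of the translator's safety constraints,
       so values are best kept as small as the script allows.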

       Multiple scripts can write data into a relay buffer concurrently.  A
       host script provides an interface for accessing its relay buffer to
       guest scripts.  The output of the guests is then merged into the
       output of the host.  To run a script as a host, execute stap with
       the -DRELAYHOST[=name] option.  The name identifies your host script
       among several hosts.  While the host is running, execute stap with
       -DRELAYGUEST[=name] to add a guest script to the host.  Note that you
       must unload guests before unloading a host.  If any guests are still
       connected to the host, unloading the host will fail.
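
       The host and guest invocations can be sketched as follows.  The
       commands are printed rather than executed; relay_command, the relay
       name "shared", and the script file names are all hypothetical.

```shell
# Print the stap invocation for a relay host or guest.
# $1 = role (host or guest), $2 = relay name, $3 = script file.
relay_command() {
    case "$1" in
        host)  echo "stap -DRELAYHOST=$2 $3" ;;
        guest) echo "stap -DRELAYGUEST=$2 $3" ;;
        *)     echo "usage: relay_command host|guest NAME SCRIPT" >&2
               return 1 ;;
    esac
}

# Start order: host first, then guests.  Stop order: guests first.
relay_command host  shared host.stp
relay_command guest shared guest.stp
```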

       In  case  something  goes  wrong with stap or staprun after a probe has
       already started running, one may safely kill both user  processes,  and
       remove  the  active  probe kernel module with rmmod.  Any pending trace
       messages may be lost.
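
       A leftover probe module can be located before removing it with
       rmmod.  This sketch assumes that generated modules carry a "stap_"
       name prefix, which this manual does not state; list_stap_modules is
       a hypothetical helper that only prints candidate rmmod commands.

```shell
# Print names of loaded modules that look like systemtap probe modules.
# Reads /proc/modules-style lines on stdin (module name in field 1).
# Assumption: generated modules are named with a "stap_" prefix.
list_stap_modules() {
    awk '$1 ~ /^stap_/ { print $1 }'
}

# Show the rmmod command for each leftover module without running it.
if [ -r /proc/modules ]; then
    list_stap_modules < /proc/modules | while read -r mod; do
        echo "rmmod $mod"
    done
fi
```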

       In addition to the methods outlined above, the generated kernel  module
       also  uses  overload  processing to make sure that probes can’t run for
       too  long.   If  more  than  STP_OVERLOAD_THRESHOLD   cycles   (default
       500000000) have been spent in all the probes on a single cpu during the
       last STP_OVERLOAD_INTERVAL cycles (default 1000000000), the probes have
       overloaded the system and an exit is triggered.

       By  default,  overload processing is turned on for all modules.  If you
       would like to disable overload processing, define STP_NO_OVERLOAD.
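
       The overload rule amounts to a simple comparison.  The sketch below
       models it in shell arithmetic using the default values quoted above;
       probes_overloaded is a hypothetical name, and the real check is
       performed inside the generated kernel module.

```shell
# Default values quoted in the text above.
STP_OVERLOAD_THRESHOLD=500000000
STP_OVERLOAD_INTERVAL=1000000000

# Succeed (exit status 0) when the cycles spent in probes on one cpu
# during the last interval exceed the threshold.
# $1 = cycles spent in probes during that interval.
probes_overloaded() {
    [ "$1" -gt "$STP_OVERLOAD_THRESHOLD" ]
}

# With the defaults, the threshold is this percentage of the interval.
echo "$((STP_OVERLOAD_THRESHOLD * 100 / STP_OVERLOAD_INTERVAL))%"
```

       With the defaults, an exit is triggered once probes consume more
       than half of the cycles in an interval on a single cpu.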


       Systemtap performs best when it has access to the debugging information
       associated  with your kernel and modules.  However, if this information
       is not available, systemtap  can  still  support  probing  of  function
       entries  and returns using symbols read from vmlinux and/or the modules
       in /lib/modules.  Systemtap can also read the kernel symbol table  from
       a  text  file  such  as  /boot/  or  /proc/kallsyms.  See the
       --kelf and --kmap options.

       If systemtap finds relevant debugging information, it will use it  even
       if you specify --kelf or --kmap.

       Without  debugging  information, systemtap cannot support the following
       types of language constructs:

       ·   probe specifications that refer to source files or line numbers

       ·   probe specifications that refer to inline functions

       ·   statements that refer to $target variables

       ·   tapset-defined variables defined using any of the above constructs.
           In  particular,  at  this  writing, the prologue blocks for certain
           aliases in the syscall tapset contain "if"
           statements  that refer to $target variables.  If your script refers
           to any such aliases, systemtap must have  access  to  the  kernel’s
           debugging information.

       Most  T  and t symbols correspond to function entry points, but some do
       not.  Based only  on  the  symbol  table,  systemtap  cannot  tell  the
       difference.   Placing return probes on symbols that aren’t entry points
       will most likely lead to kernel stack corruption.
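
       The T and t symbols in a symbol table can be listed with a short
       filter.  This sketch assumes the "address type name" layout used by
       /proc/kallsyms; text_symbols is a hypothetical helper.  As noted
       above, not every symbol it prints is a function entry point.

```shell
# Print the names of text symbols (type T or t) from a symbol table in
# the "address type name" layout used by /proc/kallsyms.
text_symbols() {
    awk '$2 == "T" || $2 == "t" { print $3 }'
}

# Count candidate symbols on the running kernel, if the file is readable.
if [ -r /proc/kallsyms ]; then
    text_symbols < /proc/kallsyms | wc -l
fi
```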


              Systemtap data directory  for  cached  systemtap  files,  unless
              overridden by the SYSTEMTAP_DIR environment variable.

              Temporary  directory for systemtap files, including translated C
              code and kernel object.

              The automatic tapset search directory, unless overridden by  the
              SYSTEMTAP_TAPSET environment variable.

              The  runtime sources, unless overridden by the SYSTEMTAP_RUNTIME
              environment variable.

              The location of kernel module building infrastructure.

              The location of kernel debugging information when packaged  into
              the    kernel-debuginfo    RPM,   unless   overridden   by   the
              SYSTEMTAP_DEBUGINFO_PATH  environment  variable.   The   default
              value for this variable is -:.debug:/usr/lib/debug.  Elfutils
              searches this path for vmlinux, and also interprets the path
              as a base directory whose subdirectories are searched for
              modules.

              The auxiliary program supervising module  loading,  interaction,
              and unloading.
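
       The effective value of SYSTEMTAP_DEBUGINFO_PATH can be examined one
       component at a time.  This is a sketch only: debuginfo_path_entries
       is a hypothetical helper, and it merely splits the string on colons
       without interpreting the entries the way elfutils does.

```shell
# Print each component of the debuginfo search path on its own line,
# falling back to the default value quoted above when the variable
# is unset or empty.
debuginfo_path_entries() {
    default_path='-:.debug:/usr/lib/debug'
    printf '%s\n' "${SYSTEMTAP_DEBUGINFO_PATH:-$default_path}" |
        tr ':' '\n'
}

debuginfo_path_entries
```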


       stapprobes(5), stapfuncs(5), stapvars(5), stapex(5), awk(1), gdb(1)


       Use the Bugzilla link off of the project web page or our mailing list.