Provided by: systemtap-doc_2.3-1ubuntu1_all


       stapprobes - systemtap probe points


       The  following  sections  enumerate the variety of probe points supported by the systemtap
       translator, and some of the additional aliases defined by standard tapset  scripts.   Many
       are individually documented in the 3stap manual section, with the probe:: prefix.


              probe PROBEPOINT [, PROBEPOINT] { [STMT ...] }

       A  probe  declaration  may list multiple comma-separated probe points in order to attach a
       handler to all of the named events.  Normally, the handler statements are run whenever any
       of the events occur.

       The  syntax  of  a  single probe point is a general dotted-symbol sequence.  This allows a
       breakdown of the event namespace into parts, somewhat like the Domain Name System does  on
       the  Internet.   Each  component  identifier  may  be  parametrized  by a string or number
       literal, with a syntax like a function call.  A component may include a "*" character,  to
       expand  to  a  set  of  matching probe points.  It may also include "**" to match multiple
       sequential components at once.  Probe aliases likewise expand to other probe points.

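       Probe aliases can be given on their own, or with a suffix.  The suffix attaches to the
       underlying probe point that the alias is expanded to.  For example, when an alias is
       written with a trailing maxactive(10) component, the alias is expanded to its underlying
       probe point, with the component maxactive(10) being recognized as a suffix and attached
       to the result.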
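
       As a minimal sketch of an alias used with a suffix: the probe below assumes that the
       standard tapset defines syscall.read as an alias and that its underlying probe point
       accepts the .return and .maxactive components.  The handler body is illustrative only.

```systemtap
# The alias syscall.read is expanded by the translator; the trailing
# .return.maxactive(10) is carried over onto the expanded probe point.
probe syscall.read.return.maxactive(10) {
    printf("%s returned %s\n", name, retstr)
}
```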

       Normally,  each and every probe point resulting from wildcard- and alias-expansion must be
       resolved to some low-level  system  instrumentation  facility  (e.g.,  a  kprobe  address,
       marker, or a timer configuration), otherwise the elaboration phase will fail.

       However,  a  probe  point  may  be  followed  by  a  "?" character, to indicate that it is
       optional, and that no error should result if it fails  to  resolve.   Optionalness  passes
       down  through  all  levels of alias/wildcard expansion.  Alternately, a probe point may be
       followed by a "!" character, to indicate that it is both optional and sufficient.   (Think
       vaguely  of  the Prolog cut operator.) If it does resolve, then no further probe points in
       the same comma-separated list will be resolved.  Therefore, the "!"  sufficiency mark only
       makes sense in a list of probe point alternatives.

       Additionally, a probe point may be followed by an "if (expr)" statement, in order to
       enable/disable the probe point on-the-fly.  With the "if" statement, if the "expr" is
       false when the probe point is hit, the whole probe body, including any alias bodies, is
       skipped.  The condition is stacked up through all levels of alias/wildcard expansion, so
       the final condition becomes the logical-and of the conditions of all expanded
       aliases/wildcards.  The expressions are necessarily restricted to global variables.

       These are all syntactically valid probe points.  (They are generally semantically invalid,
       depending on the contents of the tapsets, and the versions of kernel/user software
       installed.)

              kernel.function("no_such_function") ?
              module("awol").function("no_such_function") !
              signal.*? if (switch)
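
       The "?", "!" and "if" markers described above might be combined as in the following
       sketch.  The kernel function names (vfs_read, sys_read, vfs_write) and the global
       variable names are assumptions chosen for illustration; they may not resolve on every
       kernel.

```systemtap
global tracing = 1
global hits

# Optional: no error is raised if this function cannot be resolved.
probe kernel.function("no_such_function") ? {
    println("unexpectedly resolved")
}

# Alternatives: if vfs_read resolves, the "!" mark stops resolution of the
# rest of the list; otherwise sys_read is tried.
probe kernel.function("vfs_read") !, kernel.function("sys_read") {
    hits++
}

# Conditional: the handler is skipped whenever the global "tracing" is 0.
probe kernel.function("vfs_write") if (tracing) {
    hits++
}

probe timer.s(5)  { tracing = 0 }
probe timer.s(10) { printf("%d hits\n", hits); exit() }
```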

       Probes may be broadly classified into "synchronous" and "asynchronous".   A  "synchronous"
       event  is  deemed  to  occur  when  any  processor  executes an instruction matched by the
       specification.  This gives these probes a reference point (instruction address) from which
       more  contextual  data  may  be  available.   Other  families  of  probe  points  refer to
       "asynchronous" events such as timers/counters  rolling  over,  where  there  is  no  fixed
       reference  point  that  is  related.   Each  probe  point specification may match multiple
       locations (for example, using wildcards or aliases), and all of them are then probed.  A
       probe  declaration  may  also contain several comma-separated specifications, all of which
       are probed.


       Resolving some probe points requires DWARF debuginfo or "debug symbols" for  the  specific
       part  being  instrumented.  For some others, DWARF is automatically synthesized on the fly
       from source code header files.  For others, it is not needed at all.   Since  a  systemtap
       script may use any mixture of probe points together, the union of their DWARF requirements
       has to be met on the computer where script  compilation  occurs.   (See  the  --use-server
       option  and  the  stap-server(8)  man  page  for  information about the remote compilation
       facility, which allows these requirements to be met on a different machine.)

       The following table lists many of the available probe point families, to classify them
       with respect to their need for DWARF debuginfo.

       DWARF                          NON-DWARF

       kernel.function, .statement    kernel.mark
       module.function, .statement    process.mark, process.plt
       process.function, .statement   begin, end, error, never
       process.mark (backup)          timer
       AUTO-DWARF                     kernel.statement.absolute
       kernel.trace                   kprobe.function
                                      process.begin, .end, .error


       The  probe  points  begin  and  end  are defined by the translator to refer to the time of
       session startup and shutdown.  All "begin" probe  handlers  are  run,  in  some  sequence,
       during  the startup of the session.  All global variables will have been initialized prior
       to this point.  All "end" probes are run, in some sequence, during the normal shutdown  of
       a  session,  such as in the aftermath of an exit () function call, or an interruption from
       the user.  In the case of an error-triggered shutdown, "end" probes are  not  run.   There
       are no target variables available in either context.

       If the order of execution among "begin" or "end" probes is significant, then an optional
       sequence number N may be provided as a parameter to the probe point.


       The number N may be positive or negative.  The probe handlers are run in increasing order,
       and the order between handlers with the same sequence number is unspecified.  When "begin"
       or "end" are given without a sequence, they are effectively sequence zero.
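
       A short sketch of sequenced startup and shutdown probes follows.  The sequence numbers
       and messages are arbitrary illustrations.

```systemtap
# Handlers run in increasing order of their sequence number.
probe begin(-1) { println("begin, sequence -1") }
probe begin     { println("begin, sequence 0 (implicit)") }
probe begin(1)  { println("begin, sequence 1"); exit() }

probe end(10)   { println("end, sequence 10") }
probe end       { println("end, sequence 0 (implicit)") }
```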

       The error probe point is similar to the end probe, except that each such probe handler is
       run when the session ends after errors have occurred.  In such cases, "end" probes are
       skipped, but each "error" probe is still attempted.  This kind of probe can be used to
       clean up or emit a "final gasp".  It may also be numerically parametrized to set a
       sequence.
       The probe point never is specially defined by the translator to mean "never".   Its  probe
       handler  is never run, though its statements are analyzed for symbol / type correctness as
       usual.  This probe point may be useful in conjunction with optional probes.
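
       The following sketch contrasts the end, error and never probe points.  It assumes the
       tapset function error() is used to force an error-triggered shutdown; the messages are
       illustrative.

```systemtap
# Force an error so that the session shuts down abnormally.
probe begin { error("forced failure") }

# Skipped, because the shutdown was triggered by an error.
probe end   { println("normal shutdown") }

# Still attempted after errors have occurred.
probe error { println("final gasp: cleaning up") }

# Type-checked by the translator, but the handler is never run.
probe never { println("unreachable") }
```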

       The syscall.*  aliases define several hundred probes, too many to detail here.   They  are
       of the general form:


       Generally, two probes are defined for each normal system call as listed in the syscalls(2)
       manual page, one for entry and one for return.  Those system calls that  never  return  do
       not have a corresponding .return probe.

       Each probe alias provides a variety of variables.  Looking at the tapset source code is
       the most reliable way to find them.  Generally, each variable listed in the standard
       manual page is made available as a script-level variable, so, for example, the alias for
       the open system call exposes filename, flags, and mode.  In addition, a standard suite of
       variables is available at most aliases:

       argstr A pretty-printed form of the entire argument list, without parentheses.

       name   The name of the system call.

       retstr For return probes, a pretty-printed form of the system-call result.

       As usual for probe aliases, these variables are  all  simply  initialized  once  from  the
       underlying  $context  variables,  so  that  later  changes  to  $context variables are not
       automatically reflected.  Not all probe aliases obey  all  of  these  general  guidelines.
       Please report any bothersome ones you encounter as a bug.

       If  debuginfo  availability  is  a  problem, you may try using the non-DWARF syscall probe
       aliases instead.  Use the nd_syscall.   prefix  instead  of  syscall.   The  same  context
       variables are available, as far as possible.
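
       A minimal sketch of the syscall aliases, using the variables listed above.  It assumes
       the tapset provides syscall.open and nd_syscall.open on the target kernel; the nd_
       variant is marked optional in case it does not resolve.

```systemtap
# Entry probe: name and argstr are set by the alias.
probe syscall.open {
    printf("%s(%s) by %s\n", name, argstr, execname())
}

# Return probe: retstr is a pretty-printed form of the result.
probe syscall.open.return {
    printf("%s = %s\n", name, retstr)
}

# Non-DWARF variant, for systems without kernel debuginfo.
probe nd_syscall.open ? {
    printf("nd: %s(%s)\n", name, argstr)
}

probe timer.s(5) { exit() }
```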

       Intervals  defined  by  the  standard  kernel "jiffies" timer may be used to trigger probe
       handlers asynchronously.  Two probe point variants are supported by the translator:


       The probe handler is run every N jiffies (a kernel-defined unit of time, typically between
       1  and 60 ms).  If the "randomize" component is given, a linearly distributed random value
       in the range [-M..+M] is added to N every time the handler is run.  N is restricted  to  a
       reasonable range (1 to around a million), and M is restricted to be smaller than N.  There
       are no target variables provided in either context.  It is possible for such probes to  be
       run concurrently on a multi-processor computer.

       Alternatively,  intervals  may  be  specified in units of time.  There are two probe point
       variants similar to the jiffies timer:


       Here, N and M are specified in milliseconds, but the full options for  units  are  seconds
       (s/sec),  milliseconds (ms/msec), microseconds (us/usec), nanoseconds (ns/nsec), and hertz
       (hz).  Randomization is not supported for hertz timers.

       The actual resolution of the timers depends on the target kernel.  For  kernels  prior  to
       2.6.17,  timers  are  limited  to  jiffies  resolution, so intervals are rounded up to the
       nearest jiffies interval.  From 2.6.17 onward, the implementation uses hrtimers for tighter
       precision,  though  the  actual resolution will be arch-dependent.  In either case, if the
       "randomize" component is given, then the random value will be added to the interval before
       any rounding occurs.
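
       The timer variants described above can be sketched as follows.  The intervals are
       arbitrary values chosen for illustration.

```systemtap
# Every 100 jiffies.
probe timer.jiffies(100) { println("jiffies tick") }

# Every 100 jiffies, plus a random offset in [-10..+10].
probe timer.jiffies(100).randomize(10) { println("randomized jiffies tick") }

# Time-unit variants: milliseconds with randomization, and a 10 Hz timer
# (randomization is not supported for hertz timers).
probe timer.ms(200).randomize(50) { println("ms tick") }
probe timer.hz(10)                { println("hz tick") }

# Stop the session after five seconds.
probe timer.s(5) { exit() }
```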

       Profiling timers are also available to provide probes that execute on all CPUs at the rate
       of the system tick (CONFIG_HZ).  This probe takes no parameters.  On some kernels, this is
       a  one-concurrent-user-only  or  disabled  facility, resulting in error -16 (EBUSY) during
       probe registration.


       Full context information of the  interrupted  process  is  available,  making  this  probe
       suitable for a time-based sampling profiler.

       It  is  recommended  to use the tapset probe timer.profile rather than timer.profile.tick.
       This  probe  point  behaves  identically  to  timer.profile.tick   when   the   underlying
       functionality  is  available,  and  falls  back  to using perf.sw.cpu_clock on some recent
       kernels which lack the corresponding profile timer facility.
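
       As a hedged sketch of a time-based sampling profiler built on timer.profile: the global
       array name and the ten-second run time are assumptions for illustration.

```systemtap
global samples

# Runs on every CPU at the system tick rate; execname() identifies the
# interrupted process.
probe timer.profile {
    samples[execname()]++
}

# After ten seconds, print the ten most frequently sampled processes.
probe timer.s(10) {
    foreach (name in samples- limit 10)
        printf("%-20s %d\n", name, samples[name])
    exit()
}
```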

       This  family  of  probe  points  uses  symbolic  debugging  information  for  the   target
       kernel/module/program,  as  may  be  found  in  unstripped  executables,  or  the separate
       debuginfo packages.  They allow placement of probes logically into the execution  path  of
       the  target  program,  by specifying a set of points in the source or object code.  When a
       matching statement executes on any processor, the probe handler is run in that context.

       Points in the kernel are identified by module, source file, line number, function name,
       or some combination of these.

       Here  is a list of probe point families currently supported.  The .function variant places
       a probe near the beginning of the named function, so  that  parameters  are  available  as
       context variables.  The .return variant places a probe at the moment after the return from
       the named function, so the return value is available as the  "$return"  context  variable.
       The  .inline  modifier  for  .function  filters  the  results to include only instances of
       inlined functions.  The  .call  modifier  selects  the  opposite  subset.   The  .exported
       modifier  filters the results to include only exported functions.  Inline functions do not
       have an identifiable return point, so .return is not  supported  on  .inline  probes.  The
       .statement  variant  places a probe at the exact spot, exposing those local variables that
       are visible there.
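
       The .call, .return and .inline modifiers might be used as in this sketch.  The function
       name vfs_read and the source file fs/read_write.c are assumptions about the target
       kernel, and kernel debuginfo is assumed to be installed.

```systemtap
# Entry to a non-inlined instance of the function.
probe kernel.function("vfs_read").call {
    printf("-> %s\n", pp())
}

# Return from the function; $return holds the return value.
probe kernel.function("vfs_read").return {
    printf("<- returned %d\n", $return)
}

# Only inlined instances; optional, since there may be none.
probe kernel.function("*@fs/read_write.c").inline ? {
    printf("inline: %s\n", pp())
}

probe timer.s(5) { exit() }
```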


       (See the USER-SPACE section below for more information on the process probes.)

       In the above list, MPATTERN stands for a string literal that aims to identify  the  loaded
       kernel  module  of interest and LPATTERN stands for a source program label.  Both MPATTERN
       and LPATTERN may include the "*", "[]", and "?" wildcards.  PATTERN  stands  for  a  string
       literal that aims to identify a point in the program.  It is made up of three parts:

       ·   The  first part is the name of a function, as would appear in the nm program's output.
           This part may use the "*" and "?" wildcarding operators to match multiple names.

       ·   The second part is optional and begins with the "@" character.  It is followed by  the
           path to the source file containing the function, which may include a wildcard pattern,
           such as mm/slab*.  If it does not match as is, an implicit "*/"  is  optionally  added
           before  the  pattern,  so  that  a  script need only name the last few components of a
           possibly long source directory path.

       ·   Finally, the third part is optional if the file name part was  given,  and  identifies
           the  line  number  in  the source file preceded by a ":" or a "+".  The line number is
           assumed to be an absolute line number if preceded by a ":", or relative to  the  entry
           of  the  function  if preceded by a "+".  All the lines in the function can be matched
           with ":*".  A range of lines x through y can be matched with ":x-y".
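
       The three parts of PATTERN can be illustrated with the sketch below.  The function,
       file and module names (vfs_*, fs/read_write.c, ext4) are assumptions about the target
       system, so each probe point is marked optional.

```systemtap
# Function name with a wildcard, restricted to one source file.
probe kernel.function("vfs_*@fs/read_write.c") ? {
    printf("function: %s\n", pp())
}

# Every line of one function, using the ":*" form.
probe kernel.statement("vfs_read@fs/read_write.c:*") ? {
    printf("statement: %s\n", pp())
}

# MPATTERN selects a loaded module; PATTERN selects functions within it.
probe module("ext4").function("ext4_file_*") ? {
    printf("module function: %s\n", pp())
}

probe timer.s(5) { exit() }
```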

       As an alternative, PATTERN may be a numeric constant,  indicating  an  address.   Such  an
       address  may  be  found from symbol tables of the appropriate kernel / module object file.
       It is verified against known statement code boundaries, and will be relocated for  use  at
       run time.

       In  guru  mode only, absolute kernel-space addresses may be specified with the ".absolute"
       suffix.   Such  an  address  is  considered  already  relocated,  as  if  it   came   from
       /proc/kallsyms, so it cannot be checked against statement/instruction boundaries.

       Many  of  the source-level context variables, such as function parameters, locals, globals
       visible in the compilation unit, may be visible to probe  handlers.   They  may  refer  to
       these  variables  by  prefixing  their  name  with "$" within the scripts.  In addition, a
       special syntax allows limited traversal of structures, pointers, and arrays.  More  syntax
       allows pretty-printing of individual variables or their groups.  See also @cast.

       $var   refers  to  an  in-scope  variable "var".  If it's an integer-like type, it will be
              cast to a 64-bit int for systemtap script use.  String-like pointers (char  *)  may
              be  copied  to  systemtap  string  values  using  the  kernel_string or user_string
              functions.

       @var("varname")
              an alternative syntax for $varname

       @var("varname@src/file.c")
              refers to the global (either file local or external) variable varname defined  when
              the  file  src/file.c was compiled. The CU in which the variable is resolved is the
              first CU in the module of the probe point which matches the given file name at  the
              end  and  has the shortest file name path (e.g. given @var("foo@bar/baz.c") and CUs
              with file name paths src/sub/module/bar/baz.c and src/bar/baz.c the second CU  will
              be chosen to resolve the (file) global variable foo).

       $var->field traversal via a structure's or a pointer's field.  This
              generalized  indirection operator may be repeated to follow more levels.  Note that
              the .  operator is not used for plain structure members, only -> for both purposes.
              (This is because "." is reserved for string concatenation.)

       $return
              is available in return probes only for functions that are declared with a return
              value.

       $var[N]
              indexes into an array.  The index may be given as a literal number or even an
              arbitrary numeric expression.

       A number of operators exist for such basic context variable expressions:

       $$vars expands to a character string that is equivalent to
              sprintf("parm1=%x ... parmN=%x var1=%x ... varN=%x",
                      parm1, ..., parmN, var1, ..., varN)
       for each variable in scope at the probe point.  Some values may be printed as =?  if their
       run-time location cannot be found.

       $$locals
              expands to a subset of $$vars for only local variables.

       $$parms
              expands to a subset of $$vars for only function parameters.

       $$return
              is available in return probes only.  It expands to a string that is  equivalent  to
              sprintf("return=%x", $return) if the probed function has a return value, or else an
              empty string.

       & $EXPR
              expands to the address of the given context variable expression, if it is
              addressable.

       @defined($EXPR)
              expands to 1 or 0 according to whether the given context variable expression is
              resolvable, for use in conditionals such as
              @defined($foo->bar) ? $foo->bar : 0

       $EXPR$ expands to a string with all of $EXPR's members, equivalent to
              sprintf("{.a=%i, .b=%u, .c={...}, .d=[...]}",
                       $EXPR->a, $EXPR->b)

       $EXPR$$
              expands to a string with all of $EXPR's members and submembers, equivalent to
              sprintf("{.a=%i, .b=%u, .c={.x=%p, .y=%c}, .d=[%i, ...]}",
                      $EXPR->a, $EXPR->b, $EXPR->c->x, $EXPR->c->y, $EXPR->d[0])
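
       The operators above might be exercised as in the following sketch.  It assumes that
       vfs_read exists with parameters named file and count, which depends on the kernel
       version; @defined guards each access for that reason.

```systemtap
probe kernel.function("vfs_read") {
    # Pretty-printed groups of variables.
    printf("parms:  %s\n", $$parms)
    printf("locals: %s\n", $$locals)

    # Pretty-print one structure, only if the variable is resolvable.
    if (@defined($file))
        printf("file:   %s\n", $file$)

    # Plain integer-like variable, cast to a 64-bit int.
    if (@defined($count))
        printf("count:  %d\n", $count)

    exit()
}
```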

       For the  kernel  ".return"  probes,  only  a  certain  fixed  number  of  returns  may  be
       outstanding.   The  default  is a relatively small number, on the order of a few times the
       number of physical CPUs.  If many different threads concurrently call  the  same  blocking
       function,  such  as  futex(2)  or  read(2),  this  limit  could  be  exceeded, and skipped
       "kretprobes" would be reported by "stap -t".  To work around this, specify a
              probe FOO.return.maxactive(NNN)
       suffix, with a large enough NNN  to  cover  all  expected  concurrently  blocked  threads.
       Alternately, use the
              stap -DKRETACTIVE=NNNN
       stap command line macro setting to override the default for all ".return" probes.

       For  ".return"  probes, context variables other than the "$return" may be accessible, as a
       convenience for a script programmer wishing to access function parameters.   These  values
       are  snapshots  taken  at the time of function entry.  Local variables within the function
       are not generally accessible, since those variables did not exist in allocated/initialized
       form at the snapshot moment.

       In addition, arbitrary entry-time expressions can also be saved for ".return" probes using
       the @entry(expr) operator.  For example, one can compute the elapsed time of a function:
              probe kernel.function("do_filp_open").return {
                  println( gettimeofday_us() - @entry(gettimeofday_us()) )
              }

       The following table  summarizes  how  values  related  to  a  function  parameter  context
       variable, a pointer named addr, may be accessed from a .return probe.

       at-entry value   past-exit value

       $addr            not available
       $addr->x->y      @cast(@entry($addr),"struct zz")->x->y
       $addr[0]         {kernel,user}_{char,int,...}(& $addr[0])
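
       A hedged sketch of comparing an entry-time value with the return value in a .return
       probe.  The function vfs_read and its parameter name count are assumptions about the
       target kernel.

```systemtap
probe kernel.function("vfs_read").return {
    # @entry() snapshots the expression at function entry; $return is
    # only available here, at function return.
    printf("%s: requested %d, returned %d\n",
           execname(), @entry($count), $return)
}

probe timer.s(5) { exit() }
```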

       In the absence of debugging information, entry and exit points of kernel and module
       functions can be probed using the "kprobe" family of probes.  However, these do not permit
       looking up the arguments or local variables of the function.  The following constructs
       are supported:

       Probes  of  type  function  are  recommended  for kernel functions, whereas probes of type
       module are recommended for probing  functions  of  the  specified  module.   In  case  the
       absolute  address  of  a  kernel  or  module  function  is  known, statement probes can be
       used.

       Note that FUNCTION and MODULE names must not contain wildcards, or the probe will  not  be
       registered.  Also, statement probes may be run only in guru mode.
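
       A minimal sketch of the kprobe family, which needs no debuginfo.  The function name
       vfs_read is an assumption about the target kernel; no wildcards are used, as required.

```systemtap
# Entry point of a kernel function, without DWARF.
probe kprobe.function("vfs_read") {
    printf("enter vfs_read from %s\n", execname())
}

# Exit point of the same function; no $return is available here.
probe kprobe.function("vfs_read").return {
    printf("leave vfs_read\n")
}

probe timer.s(5) { exit() }
```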

       Support  for  user-space  probing  is  available  for kernels that are configured with the
       utrace extensions, or have the uprobes facility  in  Linux  3.5.   (Various  kernel  build
       configuration options need to be enabled; systemtap will advise if these are missing.)

       There are several forms.  First, a non-symbolic probe point:
              process(PID).statement(ADDRESS).absolute
       is  analogous  to  kernel.statement(ADDRESS).absolute  in  that  both use raw (unverified)
       virtual addresses and provide no $variables.  The target PID  parameter  must  identify  a
       running  process, and ADDRESS should identify a valid instruction address.  All threads of
       that process will be probed.

       Second, non-symbolic user-kernel interface events handled by utrace may be probed:

       A .begin probe gets called when a new process described by PID or FULLPATH gets created.
       A .thread.begin probe gets called when a new thread described by PID or FULLPATH gets
       created.  A .end probe gets called when a process described by PID or FULLPATH dies.  A
       .thread.end probe gets called when a thread described by PID or FULLPATH dies.  A .syscall
       probe gets called when a thread described by PID or FULLPATH makes  a  system  call.   The
       system  call  number  is  available  in  the  $syscall  context  variable, and the first 6
       arguments of the system call are available in the $argN (ex. $arg1,  $arg2,  ...)  context
       variable.   A .syscall.return probe gets called when a thread described by PID or FULLPATH
       returns from a system call.  The system call number is available in the  $syscall  context
       variable,  and  the  return  value  of the system call is available in the $return context
       variable.  A .insn probe gets called for every single-stepped instruction of  the  process
       described  by  PID  or  FULLPATH.  A .insn.block probe gets called for every block-stepped
       instruction of the process described by PID or FULLPATH.
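
       The utrace-based process events can be sketched as below.  The path /bin/ls is only an
       illustrative FULLPATH; any executable could be substituted.

```systemtap
probe process("/bin/ls").begin {
    printf("%s[%d] started\n", execname(), pid())
}

# $syscall is the system call number; $arg1 is its first argument.
probe process("/bin/ls").syscall {
    printf("syscall %d arg1=0x%x\n", $syscall, $arg1)
}

probe process("/bin/ls").syscall.return {
    printf("syscall %d returned %d\n", $syscall, $return)
}

probe process("/bin/ls").end {
    printf("%s[%d] exited\n", execname(), pid())
}
```

       Such a script would typically be run alongside the target, for example with the -c
       option naming the command to execute.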

       If a process probe is specified without a PID  or  FULLPATH,  all  user  threads  will  be
       probed.   However, if systemtap was invoked with the -c or -x options, then process probes
       are restricted to the process hierarchy associated with the target process.  If a  process
       probe  is  specified without a PID or FULLPATH, but with the -c option, the PATH of the -c
       cmd will be heuristically filled into the process PATH.

       Third, symbolic static instrumentation compiled into programs and shared libraries may  be
       probed:

       A  .mark  probe  gets  called  via  a  static probe which is defined in the application by
       STAP_PROBE1(PROVIDER,LABEL,arg1), one of the macros defined in sys/sdt.h.  The PROVIDER  is
       an  arbitrary application identifier, LABEL is the marker site identifier, and arg1 is the
       integer-typed argument.  STAP_PROBE1 is used for probes with 1  argument,  STAP_PROBE2  is
       used  for probes with 2 arguments, and so on.  The arguments of the probe are available in
       the context variables $arg1, $arg2, ...  An alternative to using the STAP_PROBE macros  is
       to  use the dtrace script to create custom macros.  Additionally, the variables $$name and
       $$provider are available as parts of the probe point  name.   The  sys/sdt.h  macro  names
       DTRACE_PROBE* are available as aliases for STAP_PROBE*.
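
       A sketch of a script-side .mark probe follows.  The executable path /usr/bin/myapp and
       the label mylabel are hypothetical; they stand for an application that was built with a
       matching STAP_PROBE1 site.

```systemtap
# Fires whenever the application reaches its static marker "mylabel".
# $arg1 is the marker's first argument; $$provider and $$name come from
# the probe point name.
probe process("/usr/bin/myapp").mark("mylabel") {
    printf("%s:%s arg1=%d\n", $$provider, $$name, $arg1)
}
```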

       Finally, full symbolic source-level probes in user-space programs and shared libraries are
       supported.  These are exactly analogous to the symbolic DWARF-based  kernel/module  probes
       described  above.   They  expose  the  same  sorts  of  context  $variables  for  function
       parameters, local variables, and so on.

       Note that for all process probes, PATH names refer to executables that are searched for the
       same  way  shells  do:  relative to the working directory if they contain a "/" character,
       otherwise in $PATH.  If PATH names refer to scripts, the actual interpreters (specified in
       the  script  in  the first line after the #! characters) are probed.  If PATH is a process
       component parameter referring to shared libraries  then  all  processes  that  map  it  at
       runtime would be selected for probing.  If PATH is a library component parameter referring
       to shared libraries then the process specified by the process component would be selected.

       A .plt probe will probe functions in the program linkage table corresponding to  the  rest
       of  the probe point.  .plt can be specified as a shorthand for .plt("*").  The symbol name
       is available as a $$name context variable; function arguments  are  not  available,  since
       PLTs are processed without debuginfo.
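
       A minimal sketch of .plt probes.  The path /bin/ls and the symbol name malloc are
       assumptions for illustration.

```systemtap
# Every entry in the program linkage table; shorthand for .plt("*").
probe process("/bin/ls").plt {
    printf("plt call: %s\n", $$name)
}

# One named PLT entry.
probe process("/bin/ls").plt("malloc") {
    printf("malloc called via PLT\n")
}
```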

       If  the  PATH string contains wildcards as in the MPATTERN case, then standard globbing is
       performed to find all matching paths.  In this case, the $PATH environment variable is not
       used.

       If  systemtap was invoked with the -c or -x options, then process probes are restricted to
       the process hierarchy associated with the target process.

       Support for probing Java methods is available using Byteman as a backend.  Byteman  is  an
       instrumentation tool from the JBoss project which systemtap can use to monitor invocations
       for a specific method or line in a Java program.

       Systemtap does so by generating a Byteman script listing the probes to instrument and then
       invoking the Byteman bminstall utility.

       This Java instrumentation support is currently a prototype feature with major limitations.
       Moreover, Java probing currently does not work across users; the stap script must run
       (with appropriate permissions) as the same user as the Java process being probed.
       (Thus a stap script under root currently cannot probe Java methods in a non-root-user Java
       process.)

       The first probe type refers to Java processes by the name of the Java process:
       The PNAME argument must refer to a pre-existing JVM process, and be identifiable via a
       jps listing.

       The  PATTERN  parameter specifies the signature of the Java method to probe. The signature
       must consist of the exact name of the method, followed by a bracketed list of the types of
       the arguments, for instance "myMethod(int,double,Foo)". Wildcards are not supported.

       The  probe  can be set to trigger at a specific line within the method by appending a line
       number with colon, just as in other types of probes: "myMethod(int,double,Foo):245".

       The CLASSNAME parameter identifies the Java class the method belongs to, either with or
       without the package qualification.  By default, the probe only triggers on descendants of
       the class that do not override the method definition of the original class.  However,
       CLASSNAME can take an optional caret prefix, as in ^MyClass, which specifies that the
       probe should also trigger on all descendants of MyClass that override the original
       method.  For instance, every method with signature foo(int) in a program can be probed at
       once by using ^java.lang.Object as the CLASSNAME, since every Java class descends from
       java.lang.Object.

       The second probe type works analogously, but refers to Java processes by PID:
       (PIDs for an already running process can be obtained using the jps(1) utility.)

       Context variables defined within java probes include $arg1 through $arg10 (for up  to  the
       first 10 arguments of a method), represented as integers or strings.
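
       As a hedged sketch only: the probe below assumes the java("PNAME").class("CLASSNAME")
       .method("PATTERN") form of the probe point, and the process name org.example.MyApp is
       hypothetical.  It uses the caret prefix on java.lang.Object so that overriding
       descendants are included.

```systemtap
# Fires for foo(int) in any class of the named Java process.
probe java("org.example.MyApp").class("^java.lang.Object").method("foo(int)") {
    # $arg1 is the first method argument, represented as an integer here.
    printf("foo called with %d\n", $arg1)
}
```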

       These probe points allow procfs "files" in /proc/systemtap/MODNAME to be created, read and
       written using a permission that may be modified using  the  proper  umask  value.  Default
       permissions  are 0400 for read probes, and 0200 for write probes. If both a read and write
       probe are being used on the same file, a default permission of 0600 will be  used.   Using
       procfs.umask(0040).read  would  result in a 0404 permission set for the file.  (MODNAME is
       the name of the systemtap module). The proc filesystem is  a  pseudo-filesystem  which  is
       used as an interface to kernel data structures.  There are several probe point variants
       supported by the translator:


       PATH is the file name (relative to /proc/systemtap/MODNAME) to be created.  If no PATH  is
       specified (as in the last two variants above), PATH defaults to "command".

       When  a  user  reads  /proc/systemtap/MODNAME/PATH, the corresponding procfs read probe is
       triggered.  The string data to be read should be assigned to a variable named $value, like
       this:

              procfs("PATH").read { $value = "100\n" }

       When a user writes into /proc/systemtap/MODNAME/PATH, the corresponding procfs write probe
       is triggered.  The data the user wrote is available in the string variable  named  $value,
       like this:

              procfs("PATH").write { printf("user wrote: %s", $value) }
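
       Read and write probes on the same file can be combined so that a script value is both
       visible and adjustable from user space.  In this sketch the file name "limit" and the
       global variable are illustrative, and strtol() is assumed to be the tapset function for
       converting a string to a number.

```systemtap
global limit = 100

# cat /proc/systemtap/MODNAME/limit
probe procfs("limit").read {
    $value = sprintf("%d\n", limit)
}

# echo 250 > /proc/systemtap/MODNAME/limit
probe procfs("limit").write {
    limit = strtol($value, 10)
    printf("limit is now %d\n", limit)
}
```

       Because both probes use the same file, the default permission described above would be
       0600.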

       MAXSIZE  is  the  size of the procfs read buffer.  Specifying MAXSIZE allows larger procfs
       output.  If no MAXSIZE is specified, the procfs read buffer defaults to STP_PROCFS_BUFSIZE
       (which  defaults  to MAXSTRINGLEN, the maximum length of a string).  If setting the procfs
       read buffers for more than one  file  is  needed,  it  may  be  easiest  to  override  the
       STP_PROCFS_BUFSIZE definition.  Here's an example of using MAXSIZE:

              probe procfs.read.maxsize(1024) {   # buffer size chosen for illustration
                  $value = "long string..."
                  $value .= "another long string..."
                  $value .= "another long string..."
                  $value .= "another long string..."
              }

       These  probe  points allow observation of network packets using the netfilter mechanism. A
       netfilter probe in systemtap corresponds to a netfilter  hook  function  in  the  original
       netfilter  probes  API.  It  is  probably more convenient to use tapset::netfilter(3stap),
       which wraps the  primitive  netfilter  hooks  and  does  the  work  of  extracting  useful
       information from the context variables.

       There are several probe point variants supported by the translator:


       PROTOCOL_F  is  the  protocol  family  to  listen  for,  currently  one  of  NFPROTO_IPV4,
       NFPROTO_IPV6, NFPROTO_ARP, or NFPROTO_BRIDGE.

       HOOKNAME is the point, or 'hook', in the protocol stack at which to intercept the  packet.
       The  available  hook names for each protocol family are taken from the kernel header files
       <linux/netfilter_ipv4.h>,    <linux/netfilter_ipv6.h>,     <linux/netfilter_arp.h>     and
       <linux/netfilter_bridge.h>.  For  instance,  allowable  hook  names  for  NFPROTO_IPV4 are

       PRIORITY  is  an  integer  priority  giving  the  order in which the probe point should be
       triggered relative to any other netfilter hook functions which trigger on the same packet.
       Hook  functions  execute  on each packet in order from smallest priority number to largest
       priority number. If no PRIORITY is specified (as in the first  two  probe  point  variants
       above), PRIORITY defaults to "0".

       There  are  a number of predefined priority names of the form NF_IP_PRI_* and NF_IP6_PRI_*
       which  are   defined   in   the   kernel   header   files   <linux/netfilter_ipv4.h>   and
       <linux/netfilter_ipv6.h>  respectively.  The  script  is permitted to use these instead of
       specifying an integer priority. (The  probe  points  for  NFPROTO_ARP  and  NFPROTO_BRIDGE
       currently  do not expose any named hook priorities to the script writer.)  Thus, allowable
       ways to specify the priority include:

              priority("255")
              priority("NF_IP_PRI_SELINUX_LAST")

       A script using guru mode is permitted to specify any identifier or number as the parameter
       for  hook, pf, and priority. This feature should be used with caution, as the parameter is
       inserted verbatim into the C code generated by systemtap.
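
       As a rough sketch of the probe point syntax with a named priority, the following script
       counts IPv4 packets at the pre-routing hook and asks to run ahead of other hook
       functions.  It assumes that NF_IP_PRI_FIRST is provided by <linux/netfilter_ipv4.h> on
       the target kernel; the variable name "packets" and the 10-second run time are arbitrary
       choices made for illustration.

```systemtap
global packets

# Count every IPv4 packet seen at the pre-routing hook.
probe netfilter.pf("NFPROTO_IPV4").hook("NF_INET_PRE_ROUTING").priority("NF_IP_PRI_FIRST") {
    packets++
}

# Stop after 10 seconds (an arbitrary duration) and let the end probe report.
probe timer.s(10) {
    exit()
}

probe end {
    printf("%d IPv4 packets seen\n", packets)
}
```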

       The netfilter probe points define the following context variables:

       $skb   The address of the sk_buff struct representing the packet. See <linux/skbuff.h> for
              details   on   how   to   use   this   struct,  or  alternatively  use  the  tapset
              tapset::netfilter(3stap) for easy access to key information.

       $in    The address of the net_device struct representing the network device on  which  the
              packet  was  received  (if  any). May be 0 if the device is unknown or undefined at
              that stage in the protocol stack.

       $out   The address of the net_device struct representing the network device on  which  the
              packet  will  be  sent  (if any). May be 0 if the device is unknown or undefined at
              that stage in the protocol stack.

       $verdict
              (Guru mode only.) Assigning one of the verdict values defined in
              <linux/netfilter.h> to this variable alters the further progress of the packet
              through the protocol stack. For instance, the following guru mode script forces all
              IPv6 network packets to be dropped:

              probe netfilter.pf("NFPROTO_IPV6").hook("NF_IP6_PRE_ROUTING") {
                $verdict = 0 /* nf_drop */
              }

       For  convenience,  unlike the primitive probe points discussed here, the probes defined in
       tapset::netfilter(3stap) export the lowercase names of the verdict constants (e.g. NF_DROP
       becomes nf_drop) as local variables.
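
       For comparison, a primitive (non-tapset) probe might simply log the raw context
       variables.  This is a minimal sketch only; the choice of the NF_INET_LOCAL_IN hook and the
       output format are illustrative assumptions, and the values printed are bare kernel
       addresses.

```systemtap
# Print the raw pointers for each IPv4 packet arriving at the local-in hook.
# $in or $out may print as 0 when the device is unknown at this stage.
probe netfilter.pf("NFPROTO_IPV4").hook("NF_INET_LOCAL_IN") {
    printf("skb=%p in=%p out=%p\n", $skb, $in, $out)
}
```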

       This family of probe points hooks up to static probing markers inserted into the kernel or
       modules.  These markers are special macro calls inserted  by  kernel  developers  to  make
       probing  faster  and more reliable than with DWARF-based probes.  Further, DWARF debugging
       information is not required to probe markers.

       Marker  probe  points  begin  with  kernel.   The  next  part  names  the  marker  itself:
       mark("name").  The marker name string, which may contain the usual wildcard characters, is
       matched against the names given to the marker macros when the  kernel  and/or  module  was
       compiled.     Optionally,  you can specify format("format").  Specifying the marker format
       string allows differentiation between two markers with the same name but different  marker
       format strings.

       The  handler  associated  with  a  marker-based  probe  may  read  the optional parameters
       specified at the macro call site.  These are named $arg1 through $argNN, where NN  is  the
       number  of parameters supplied by the macro.  Number and string parameters are passed in a
       type-safe manner.

       The marker format string associated with a marker is available in $format, and the
       marker name string is available in $name.
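
       A minimal sketch of a marker probe follows.  It assumes a kernel that was actually built
       with static markers (many kernels are not); the wildcard is an illustrative choice and
       may match a large number of markers.

```systemtap
# Report the name and format string of every kernel marker that fires.
probe kernel.mark("*") {
    printf("%s: %s\n", $name, $format)
}
```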

       This  family  of  probe  points  hooks  up to static probing tracepoints inserted into the
       kernel or modules.  As with markers, these tracepoints are special macro calls inserted by
       kernel  developers  to make probing faster and more reliable than with DWARF-based probes,
       and DWARF debugging information is not required to probe tracepoints.  Tracepoints have an
       extra advantage of more strongly-typed parameters than markers.

       Tracepoint  probes  begin  with  kernel.   The  next  part  names  the  tracepoint itself:
       trace("name").   The  tracepoint  name  string,  which  may  contain  the  usual  wildcard
       characters,  is  matched  against  the  names  defined  by  the  kernel  developers in the
       tracepoint header files.

       The handler associated with a tracepoint-based probe  may  read  the  optional  parameters
       specified  at  the  macro  call site.  These are named according to the declaration by the
       tracepoint  author.   For  example,  the  tracepoint  probe   kernel.trace("sched_switch")
       provides  the parameters $rq, $prev, and $next.  If the parameter is a complex type, as in
       a struct pointer, then a script can access fields with the same syntax  as  DWARF  $target
       variables.  Tracepoint parameters themselves cannot be modified, but in guru mode a
       script may modify fields of those parameters.

       The name of the tracepoint is available in $$name, and a string of  name=value  pairs  for
       all parameters of the tracepoint is available in $$vars or $$parms.
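
       As an illustration, the sketch below logs each context switch.  It assumes that the
       sched_switch tracepoint on the target kernel exposes $prev and $next as task_struct
       pointers; the exact parameter list varies between kernel versions.

```systemtap
# Log each context switch with the outgoing and incoming task PIDs.
# Field access uses the same syntax as DWARF $target variables.
probe kernel.trace("sched_switch") {
    printf("%s: %d -> %d\n", $$name, $prev->pid, $next->pid)
}
```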

       This family of probes is used to set hardware watchpoints for a given (global) kernel
       symbol. The probes take three components as inputs:

       1. The virtual address or name of the kernel symbol to be traced is supplied as the
       argument to this class of probes. (Only data segment variables can be probed; local
       variables of a function cannot.)

       2. The nature of the access to be probed:
       a. A .write probe is triggered when a write happens at the specified address or symbol
       name.
       b. An .rw probe is triggered when either a read or a write happens.

       3. .length (optional): Users may specify the address interval to be probed using the
       "length" construct. The user-specified length is approximated to the closest address
       length that the architecture can support. If the specified length exceeds the limits
       imposed by the architecture, an error message is reported and probe registration fails.
       Where "length" is not specified, the translator requests a hardware breakpoint probe of
       length 1. Note that the "length" construct is not valid with symbol names.

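       The following constructs are supported:

              probe kernel.data(ADDRESS).write
              probe kernel.data(ADDRESS).rw
              probe kernel.data(ADDRESS).length(LEN).write
              probe kernel.data(ADDRESS).length(LEN).rw
              probe kernel.data("SYMBOL_NAME").write
              probe kernel.data("SYMBOL_NAME").rw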

       This set of probes makes use of the debug registers of the processor, which are a scarce
       resource (4 on x86, 1 on powerpc). The script translation flags a warning if a user
       requests more hardware breakpoint probes than the limit set by the architecture. For
       example, a pass-2 warning is issued when an input script requests 5 hardware breakpoint
       probes on an x86 system, since the x86 architecture supports a maximum of 4 breakpoints.
       Users are cautioned to set probes judiciously.
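
       A hardware breakpoint probe on a symbol name might look like the sketch below.  It
       assumes the kernel.data("SYMBOL").write probe form and that pid_max is a global data
       symbol in the running kernel; the message text is illustrative.

```systemtap
# Report which process writes to the kernel's pid_max variable.
# This consumes one of the processor's debug registers while the script runs.
probe kernel.data("pid_max").write {
    printf("pid_max written by %s (pid %d)\n", execname(), pid())
}
```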

       This prototype family of probe points interfaces to the kernel "perf event" infrastructure
       for controlling hardware performance counters.  The events being attached to are described
       by the "type" and "config" fields of the perf_event_attr structure, and are sampled at an
       interval governed by the "sample_period" field.

       These fields are made available to systemtap scripts using the following syntax:
              probe perf.type(NN).config(MM).sample(XX)
              probe perf.type(NN).config(MM)
              probe perf.type(NN).config(MM).process("PROC")
              probe perf.type(NN).config(MM).counter("COUNTER")
              probe perf.type(NN).config(MM).process("PROC").counter("COUNTER")
       The systemtap probe handler is called once per XX increments of the underlying performance
       counter.  The default sampling count is  1000000.   The  range  of  valid  type/config  is
       described  by  the  perf_event_open(2)  system  call,  and/or the linux/perf_event.h file.
       Invalid combinations or exhausted hardware  counter  resources  result  in  errors  during
       systemtap  script  startup.   Systemtap does not sanity-check the values: it merely passes
       them through to the kernel for error- and safety-checking.   By  default  the  perf  event
       probe  is systemwide unless .process is specified, which will bind the probe to a specific
       task.  If the name is omitted then it is inferred from the  stap  -c  argument.    A  perf
       event  can  be read on demand using .counter.  The body of the perf probe handler will not
       be invoked for a .counter probe; instead, the counter is read in a user space probe via:

          process("PROCESS").statement("func@file") {stat <<< @perf("NAME")}
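
       As a hedged sketch of a systemwide sampling probe, the script below counts samples per
       process name.  It assumes that type 0 and config 0 select hardware CPU cycles
       (PERF_TYPE_HARDWARE and PERF_COUNT_HW_CPU_CYCLES in linux/perf_event.h) and that the
       hardware counter is available; the array name and the 5-second duration are arbitrary.

```systemtap
global samples

# Fire once per 1000000 CPU cycles and attribute the sample to the
# process that is currently running.
probe perf.type(0).config(0).sample(1000000) {
    samples[execname()]++
}

# Stop after 5 seconds (an arbitrary duration).
probe timer.s(5) {
    exit()
}

# Print the ten process names with the most samples, highest first.
probe end {
    foreach (name in samples- limit 10)
        printf("%-20s %d\n", name, samples[name])
}
```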


       Here are some example probe points, defining the associated events.

       begin, end, end
              refers to the startup and normal shutdown  of  the  session.   In  this  case,  the
              handler would run once during startup and twice during shutdown.

       timer.jiffies(1000).randomize(200)
              refers to a periodic interrupt, every 1000 +/- 200 jiffies.

       kernel.function("*init*"), kernel.function("*exit*")
              refers to all kernel functions with "init" or "exit" in the name.

       kernel.function("*@kernel/time.c:240")
              refers to any functions within the "kernel/time.c" file that span line 240.  Note
              that this is not a probe at the statement at that line number.  Use the
              kernel.statement probe instead.

       kernel.mark("getuid")
              refers to an STAP_MARK(getuid, ...) macro call in the kernel.

       module("usb*").function("*sync*").return
              refers to the moment of return from all functions with "sync" in the name in any of
              the USB drivers.

              refers to the first byte of the statement whose compiled instructions  include  the
              given address in the kernel.

       kernel.statement("*@kernel/time.c:296")
              refers to the statement of line 296 within "kernel/time.c".

       kernel.statement("bio_init@fs/bio.c+3")
              refers to the statement at line bio_init+3 within "fs/bio.c".

       kernel.data("pid_max").write
              refers to a hardware breakpoint of type "write" set on pid_max.

              refers to the group of probe aliases with any name in the third position


       stap(1), probe::*(3stap), tapset::*(3stap)