Provided by: pdsh_2.31-3build2_amd64 bug

NAME

       pdsh - issue commands to groups of hosts in parallel

SYNOPSIS

       pdsh [options]... command

DESCRIPTION

       pdsh is a variant of the rsh(1) command. Unlike rsh(1), which runs commands on a single remote host, pdsh
       can run multiple remote commands in parallel. pdsh uses a "sliding window"  (or  fanout)  of  threads  to
       conserve resources on the initiating host while allowing some connections to time out.

       When  pdsh  receives  SIGINT (ctrl-C), it lists the status of current threads. A second SIGINT within one
       second terminates the program. Pending threads may be canceled by issuing ctrl-Z  within  one  second  of
       ctrl-C.   Pending  threads  are  those  that  have not yet been initiated, or are still in the process of
       connecting to the remote host.

       If a remote command is not specified on the command line, pdsh runs interactively, prompting for commands
       and  executing  them  when terminated with a carriage return. In interactive mode, target nodes that time
       out on the first command are not contacted  for  subsequent  commands,  and  commands  prefixed  with  an
       exclamation point will be executed on the local system.

       The  core  functionality  of  pdsh  may  be supplemented by dynamically loadable modules. The modules may
       provide a new connection protocol (replacing the standard rcmd(3) protocol  used  by  rsh(1)),  filtering
       options  (e.g. removing hosts that are "down" from the target list), and/or host selection options (e.g.,
       -a selects all hosts from a configuration file.). By default, pdsh must have at least one  "rcmd"  module
       loaded. See the RCMD MODULES section for more information.

RCMD MODULES

       The  method  by  which  pdsh runs commands on remote hosts may be selected at runtime using the -R option
       (See OPTIONS below).  This functionality is ultimately implemented via dynamically loadable modules,  and
       so  the list of available options may be different from installation to installation. A list of currently
       available rcmd modules is printed when using any of the -h, -V, or -L options. The  default  rcmd  module
       will also be displayed with the -h and -V options.

       A list of rcmd modules currently distributed with pdsh follows.

       rsh     Uses  an  internal,  thread-safe implementation of BSD rcmd(3) to run commands using the standard
               rsh(1) protocol.

       exec    Executes an arbitrary command for each target host. The first of the pdsh remote arguments is the
               local  command  to  execute,  followed  by  any  further  arguments.  Some  simple parameters are
               substitued on the command line, including %h for the target hostname, %u for the remote username,
               and  %n  for the remote rank [0-n] (To get a literal % use %%).  For example, the following would
               duplicate using the ssh module to run hostname(1) across the hosts foo[0-10]:

                  pdsh -R exec -w foo[0-10] ssh -x -l %u %h hostname

               and this command line would run grep(1) in parallel across the files console.foo[0-10]:

                  pdsh -R exec -w foo[0-10] grep BUG console.%h

       ssh     Uses a variant of popen(3) to run multiple copies of the ssh(1) command.

       mrsh    This module uses the mrsh(1) protocol to execute jobs on remote hosts.  The mrsh protocol uses  a
               credential  based authentication, forgoing the need to allocate reserved ports. In other aspects,
               it acts just like rsh. Remote nodes must be running mrshd(8) in order  for  the  mrsh  module  to
               work.

       qsh     Allows pdsh to execute MPI jobs over QsNet. Qshell propagates the current working directory, pdsh
               environment, and Elan capabilities to the remote process. The following environment variable  are
               also  appended  to the environment: RMS_RANK, RMS_NODEID, RMS_PROCID, RMS_NNODES, and RMS_NPROCS.
               Since pdsh needs to run setuid  root  for  qshell  support,  qshell  does  not  directly  support
               propagation  of  LD_LIBRARY_PATH  and  LD_PREOPEN.  Instead the QSHELL_REMOTE_LD_LIBRARY_PATH and
               QSHELL_REMOTE_LD_PREOPEN environment  variables  will  may  be  used  and  will  be  remapped  to
               LD_LIBRARY_PATH and LD_PREOPEN by the qshell daemon if set.

       mqsh    Similar to qshell, but uses the mrsh protocol instead of the rsh protocol.

       krb4    The  krb4  module  allows users to execute remote commands after authenticating with kerberos. Of
               course, the remote rshd daemons must be kerberized.

       xcpu    The xcpu module uses the xcpu service to execute remote commands.

OPTIONS

       The list of available options is determined at runtime by supplementing the list of standard pdsh options
       with  any  options  provided by loaded rcmd and misc modules.  In some cases, options provided by modules
       may conflict with each other. In these cases, the modules are incompatible and the  first  module  loaded
       wins.

Standard target nodelist options

       -w TARGETS,...
              Target and or filter the specified list of hosts. Do not use with any other node selection options
              (e.g. -a, -g, if they  are  available).  No  spaces  are  allowed  in  the  comma-separated  list.
              Arguments  in  the TARGETS list may include normal host names, a range of hosts in hostlist format
              (See HOSTLIST EXPRESSIONS), or a single `-' character to read the list of hosts on stdin.

              If a host or hostlist is preceded by a `-' character, this causes those  hosts  to  be  explicitly
              excluded.  If  the  argument  is preceded by a single `^' character, it is taken to be the path to
              file containing a list of hosts, one per line. If the item begins with  a  `/'  character,  it  is
              taken   as a regular expression on which to filter the list of hosts (a regex argument may also be
              optionally trailed by another '/', e.g.  /node.*/). A regex or file  name  argument  may  also  be
              preceeded by a minus `-' to exclude instead of include thoses hosts.

              A  list  of  hosts  may  also  be  preceded by "user@" to specify a remote username other than the
              default, or "rcmd_type:" to specify an alternate rcmd connection type for these hosts.  When  used
              together,  the  rcmd type must be specified first, e.g. "ssh:user1@host0" would use ssh to connect
              to host0 as user "user1."

       -x host,host,...
              Exclude the specified hosts. May be specified in conjunction with other target node  list  options
              such  as  -a  and  -g  (when available). Hostlists may also be specified to the -x option (see the
              HOSTLIST EXPRESSIONS section below). Arguments to -x may also be preceeded by the  filename  (`^')
              and  regex  ('/') characters as described above, in which case the resulting hosts are excluded as
              if they had been given to -w and preceeded with the minus `-' character.

Standard pdsh options

       -S     Return the largest of the remote command return values.

       -h     Output usage menu and quit. A list of available rcmd modules will also be printed at  the  end  of
              the usage message.

       -s     Only on AIX, separate remote command stderr and stdout into two sockets.

       -q     List option values and the target nodelist and exit without action.

       -b     Disable ctrl-C status feature so that a single ctrl-C kills parallel job. (Batch Mode)

       -l user
              This  option may be used to run remote commands as another user, subject to authorization. For BSD
              rcmd, this means the invoking user and system must be listed in the userĀ“s .rhosts file (even  for
              root).

       -t seconds
              Set the connect timeout. Default is 10 seconds.

       -u seconds
              Set  a  limit  on the amount of time a remote command is allowed to execute.  Default is no limit.
              See note in LIMITATIONS if using -u with ssh.

       -f number
              Set the maximum number of simultaneous remote commands to number.  The default is 32.

       -R name
              Set rcmd module to name. This option may also be set via the PDSH_RCMD_TYPE environment  variable.
              A  list of available rcmd modules may be obtained via the -h, -V, or -L options.  The default will
              be listed with -h or -V.

       -M name,...
              When multiple misc modules provide the same options to pdsh, the first module  initialized  "wins"
              and  subsequent  modules  are  not loaded.  The -M option allows a list of modules to be specified
              that will be force-initialized before all  others,  in-effect  ensuring  that  they  load  without
              conflict   (unless   they  conflict  with  eachother).  This  option  may  also  be  set  via  the
              PDSH_MISC_MODULES environment variable.

       -L     List info on all loaded pdsh modules and quit.

       -N     Disable hostname: prefix on lines of output.

       -d     Include more complete thread status when SIGINT is received, and display connect and command  time
              statistics on stderr when done.

       -V     Output pdsh version information, along with list of currently loaded modules, and exit.

qsh/mqsh module options

       -n tasks_per_node
              Set the number of tasks spawned per node. Default is 1.

       -m block | cyclic
              Set block versus cyclic allocation of processes to nodes. Default is block.

       -r railmask
              Set the rail bitmask for a job on a multirail system. The default railmask is 1, which corresponds
              to rail 0 only. Each bit set in the argument to -r corresponds to a rail on the system, so a value
              of 2 would correspond to rail 1 only, and 3 would indicate to use both rail 1 and rail 0.

machines module options

       -a     Target all nodes from machines file.

genders module options

       In addition to the genders options presented below, the genders attribute pdsh_rcmd_type may also be used
       in the genders database to specify an alternate rcmd connect type than the pdsh default  for  hosts  with
       this attribute. For example, the following line in the genders file

         host0 pdsh_rcmd_type=ssh

       would cause pdsh to use ssh to connect to host0, even if rsh were the default.  This can be overridden on
       the commandline with the "rcmd_type:host0" syntax.

       -A     Target all nodes in genders database. The -A option will target every host listed in genders -- if
              you want to omit some hosts by default, see the -a option below.

       -a     Target  all  nodes  in  genders  database except those with the "pdsh_all_skip" attribute. This is
              shorthand for running "pdsh -A -X pdsh_all_skip ..."

       -g attr[=val][,attr[=val],...]
              Target nodes that match any of the specified genders attributes (with optional values).  Conflicts
              with  the  -a  option.  If  used  in combination with other node selection options like -w, the -g
              option will select from the supplied node list, instead of from  the  genders  file  as  a  whole.
              Otherwise,  This option targets the alternate hostnames in the genders database by default. The -i
              option provided by the genders module may be used to translate  these  to  the  canonical  genders
              hostnames.  If  the  installed  version of genders supports it, attributes supplied to -g may also
              take the form of genders queries. Genders queries will query the genders database for  the  union,
              intersection, difference, or complement of genders attributes and values.  The set operation union
              is represented by  two  pipe  symbols  ('||'),  intersection  by  two  ampersand  symbols  ('&&'),
              difference  by two minus symbols ('--'), and complement by a tilde ('~').  Parentheses may be used
              to change the order of operations. See the nodeattr(1) manpage for examples of genders queries.

       -X attr[=val][,attr[=val],...]
              Exclude nodes that match any of the specified genders attributes (optionally with  values).   This
              option  may  be used in combination with any other of the node selection options (e.g. -w, -g, -a,
              -X may also take the form of genders queries. Please see documentation for the genders  -g  option
              for more information about genders queries.

       -i     Request translation between canonical and alternate hostnames.

       -F filename
              Read  genders  information  from  filename instead of the system default genders file. If filename
              doesn't specify an absolute path then it is taken to be relative to the directory specified by the
              PDSH_GENDERS_DIR  environment  variable  (/etc  by default). An alternate genders file may also be
              specified via the PDSH_GENDERS_FILE environment variable.

nodeupdown module options

       -v     Eliminate target nodes that are considered "down" by libnodeupdown.

slurm module options

       The slurm module allows pdsh to target nodes based on currently running SLURM jobs. The slurm  module  is
       typically  called  after  all other node selection options have been processed, and if no nodes have been
       selected, the module will attempt to read a running  jobid  from  the  SLURM_JOBID  environment  variable
       (which  is  set when running under a SLURM allocation). If SLURM_JOBID references an invalid job, it will
       be silently ignored.

       -j jobid[,jobid,...]
              Target list of nodes allocated to the SLURM job jobid. This option may be used multiple  times  to
              target  multiple  SLURM  jobs.  The special argument "all" can be used to target all nodes running
              SLURM jobs, e.g.  -j all.

       -P partition[,partition,...]
              Target list of nodes containing in the  SLURM  partition  partition.   This  option  may  be  used
              multiple  times  to  target  multiple  SLURM partitions and/or partitions may be given in a comma-
              delimited list.

torque module options

       The torque module allows pdsh to target nodes based on currently running Torque/PBS jobs. Similar to  the
       slurm  module,  the  torque  module  is typically called after all other node selection options have been
       processed, and if no nodes have been selected, the module will attempt to read a running jobid  from  the
       PBS_JOBID environment variable (which is set when running under a Torque allocation).

       -j jobid[,jobid,...]
              Target  list of nodes allocated to the Torque job jobid. This option may be used multiple times to
              target multiple Torque jobs.

rms module options

       The rms module allows pdsh to target nodes based on an RMS resource. The rms module is  typically  called
       after  all  other node selection options, and if no nodes have been selected, the module will examine the
       RMS_RESOURCEID environment variable and attempt to set the target list of hosts to the nodes in  the  RMS
       resource. If an invalid resource is denoted, the variable is silently ignored.

SDR module options

       The SDR module supports targeting hosts via the System Data Repository on IBM SPs.

       -a     Target  all  nodes  in  the  SDR. The list is generated from the "reliable hostname" in the SDR by
              default.

       -i     Translate hostnames between reliable and initial in the SDR, when applicable.   If  the  a  target
              hostname  matches  either  the initial or reliable hostname in the SDR, the alternate name will be
              substitued. Thus a list composed of initial hostnames will instead be  replaced  with  a  list  of
              reliable  hostnames.   For  example, when used with -a above, all initial hostnames in the SDR are
              targeted.

       -v     Do not target nodes that are marked as not responding in the SDR on the targeted interface. (If  a
              hostname does not appear in the SDR, then that name will remain in the target hostlist.)

       -G     In combination with -a, include all partitions.

nodeattr module options

       The  nodeattr module supports access to the genders database via the nodeattr(1) command. See the genders
       section above for a list of support options with this module. The option usage with the  nodeattr  module
       is  the  same  as  genders,  above, with the exception that the -i option may only be used with -a or -g.
       NOTE: This module will only work with very old releases of genders where the nodeattr(1) command supports
       the  -r option, and before the libgenders API was available. Users running newer versions of genders will
       need to use the genders module instead.

dshgroup module options

       The dshgroup module allows pdsh to use dsh (or Dancer's shell) style group files from /etc/dsh/group/  or
       ~/.dsh/group/.  The  default search path may be overridden with the DSHGROUP_PATH environment variable, a
       colon-separated list of directories to search. The default value for DSHGROUP_PATH is /etc/dsh/group.

       -g groupname,...
              Target  nodes  in  dsh  group  file  "groupname"  found  in   either   ~/.dsh/group/groupname   or
              /etc/dsh/group/groupname.

       -X groupname,...
              Exclude nodes in dsh group file "groupname."

       As  an  enhancement  in  pdsh,  dshgroup  files may optionally include other dshgroup files via a special
       #include STRING syntax.  The argument to #include may be either a file path, or a group  name,  in  which
       case the path used to search for the group file is the same as if the group had been specified to -g.

netgroup module options

       The  netgroup  module  allows  pdsh  to  use  standard  netgroup  entries to build lists of target hosts.
       (/etc/netgroup or NIS)

       -g groupname,...
              Target nodes in netgroup "groupname."

       -X groupname,...
              Exclude nodes in netgroup "groupname."

ENVIRONMENT VARIABLES

       PDSH_RCMD_TYPE
              Equivalent to the -R option, the value of this environment  variable  will  be  used  to  set  the
              default rcmd module for pdsh to use (e.g. ssh, rsh).

       PDSH_SSH_ARGS
              Override  the  standard arguments that pdsh passes to the ssh(1) command ("-2 -a -x -l%u %h"). The
              use of the parameters %u, %h, and %n (as documented in the rcmd/exec section above)  is  optional.
              If  these  parameters  are  missing,  pdsh  will  append them to the ssh commandline because it is
              assumed they are mandatory.

       PDSH_SSH_ARGS_APPEND
              Append  additional  options   to   the   ssh(1)   command   invoked   by   pdsh.    For   example,
              PDSH_SSH_ARGS_APPEND="-q"  would  run  ssh  in quiet mode, or "-v" would increase the verbosity of
              ssh. (Note: these arguments are actually prepended to the ssh commandline to  ensure  they  appear
              before any target hostname argument to ssh.)

       WCOLL  If no other node selection option is used, the WCOLL environment variable may be set to a filename
              from which a list of target hosts will be read. The file should contain a list of hosts,  one  per
              line  (though  each  line  may  contain  a  hostlist expression.  See HOSTLIST EXPRESSIONS section
              below).

       DSHPATH
              If set, the path in DSHPATH will be used as the PATH for the remote processes.

       FANOUT Set the pdsh fanout (See description of -f above).

HOSTLIST EXPRESSIONS

       As noted in sections above pdsh accepts lists of hosts the general form: prefix[n-m,l-k,...], where n < m
       and  l  <  k,  etc.,  as an alternative to explicit lists of hosts. This form should not be confused with
       regular expression character classes (also denoted by ``[]''). For example, foo[19] does not represent an
       expression matching foo1 or foo9, but rather represents the degenerate hostlist: foo19.

       The  hostlist  syntax is meant only as a convenience on clusters with a "prefixNNN" naming convention and
       specification of ranges should not be considered necessary -- this foo1,foo9 could be specified as  such,
       or by the hostlist foo[1,9].

       Some examples of usage follow:

       Run command on foo01,foo02,...,foo05
           pdsh -w foo[01-05] command

       Run command on foo7,foo9,foo10
            pdsh -w foo[7,9-10] command

       Run command on foo0,foo4,foo5
            pdsh -w foo[0-5] -x foo[1-3] command

       A suffix on the hostname is also supported:

       Run command on foo0-eth0,foo1-eth0,foo2-eth0,foo3-eth0
          pdsh -w foo[0-3]-eth0 command

       As  a  reminder  to  the  reader, some shells will interpret brackets ('[' and ']') for pattern matching.
       Depending on your shell, it may be necessary to enclose ranged lists  within  quotes.   For  example,  in
       tcsh, the first example above should be executed as:

            pdsh -w "foo[01-05]" command

ORIGIN

       Originally  a  rewrite of IBM dsh(1) by Jim Garlick <garlick@llnl.gov> on LLNL's ASCI Blue-Pacific IBM SP
       system. It is now used on Linux clusters at LLNL.

LIMITATIONS

       When using ssh for remote execution, expect the stderr of ssh to be folded in with  that  of  the  remote
       command.  When  invoked  by  pdsh, it is not possible for ssh to prompt for passwords if RSA/DSA keys are
       configured properly, etc..  For ssh implementations that suppport a connect timeout option, pdsh attempts
       to  use  that  option  to  enforce  the  timeout (e.g. -oConnectTimeout=T for OpenSSH), otherwise connect
       timeouts are not supported when using ssh.  Finally, there is no reliable way for  pdsh  to  ensure  that
       remote  commands  are  actually  terminated  when  using  a  command timeout. Thus if -u is used with ssh
       commands may be left running on remote hosts even after timeout has killed local ssh processes.

       Output from multiple processes per node may be interspersed when using qshell or mqshell rcmd modules.

       The number of nodes that pdsh can simultaneously execute remote jobs on is limited by the maximum  number
       of threads that can be created concurrently, as well as the availability of reserved ports in the rsh and
       qshell rcmd modules. On systems that implement Posix threads, the  limit  is  typically  defined  by  the
       constant PTHREADS_THREADS_MAX.

FILES

SEE ALSO

       rsh(1), ssh(1), dshbak(1), pdcp(1)