Provided by: slurm-llnl_2.2.7-1_i386

NAME

       srun - Run parallel jobs

SYNOPSIS

       srun            [OPTIONS...]  executable [args...]

DESCRIPTION

       Run  a  parallel  job  on cluster managed by SLURM.  If necessary, srun
       will first create a resource allocation in which to  run  the  parallel
       job.

OPTIONS

       -A, --account=<account>
              Charge  resources  used  by  this job to specified account.  The
              account is an arbitrary string. The account name may be  changed
              after job submission using the scontrol command.

       --acctg-freq=<seconds>
              Define  the  job accounting sampling interval.  This can be used
              to override  the  JobAcctGatherFrequency  parameter  in  SLURM's
               configuration  file,  slurm.conf.  A value of zero disables the
               periodic job sampling and provides accounting information  only
               on  job  termination  (reducing  SLURM  interference  with  the
               job).

       -B, --extra-node-info=<sockets[:cores[:threads]]>
              Request a specific allocation of resources with  details  as  to
              the number and type of computational resources within a cluster:
              number of sockets (or physical processors) per node,  cores  per
              socket,  and  threads  per  core.  The total amount of resources
              being requested is the product of all of the terms.  Each  value
              specified  is considered a minimum.  An asterisk (*) can be used
              as a placeholder indicating that all available resources of that
              type  are  to be utilized.  As with nodes, the individual levels
              can also be specified in separate options if desired:
                  --sockets-per-node=<sockets>
                  --cores-per-socket=<cores>
                  --threads-per-core=<threads>
              If  task/affinity  plugin  is  enabled,   then   specifying   an
              allocation  in this manner also sets a default --cpu_bind option
              of threads if the -B option specifies a thread count,  otherwise
              an  option  of  cores if a core count is specified, otherwise an
              option   of   sockets.    If   SelectType   is   configured   to
              select/cons_res,   it   must   have   a  parameter  of  CR_Core,
              CR_Core_Memory, CR_Socket, or CR_Socket_Memory for  this  option
              to be honored.  This option is not supported on BlueGene systems
              (select/bluegene plugin is configured).  If not  specified,  the
              scontrol show job will display 'ReqS:C:T=*:*:*'.
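               The total request per node is the product of the three  terms.
               A  minimal  sketch  (the 2:4:1 node layout and program name are
               hypothetical; submitting it requires a SLURM cluster):

```shell
# Request nodes with at least 2 sockets, 4 cores per socket,
# and 1 thread per core (hypothetical layout; each value is a minimum):
#   srun -B 2:4:1 -N2 ./a.out
# Total CPUs requested per node is the product of the terms:
sockets=2; cores=4; threads=1
echo $((sockets * cores * threads))
```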

       --begin=<time>
              Defer  initiation  of  this  job  until  the specified time.  It
              accepts times of the form HH:MM:SS to run a job  at  a  specific
              time  of  day  (seconds are optional).  (If that time is already
              past, the next day is assumed.)  You may also specify  midnight,
              noon,  or  teatime (4pm) and you can have a time-of-day suffixed
              with AM or PM for running in the morning or  the  evening.   You
               can  also say what day the job will be run, by specifying a date
               of the form MMDDYY, MM/DD/YY, or YYYY-MM-DD.  Combine date  and
               time using the format YYYY-MM-DD[THH:MM[:SS]].   You  can  also
              give times like now + count time-units, where the time-units can
              be seconds (default), minutes, hours, days, or weeks and you can
              tell SLURM to run the job today with the keyword  today  and  to
              run  the  job tomorrow with the keyword tomorrow.  The value may
              be changed after job submission using the scontrol command.  For
              example:
                 --begin=16:00
                 --begin=now+1hour
                 --begin=now+60           (seconds by default)
                 --begin=2010-01-20T12:34:00

              Notes on date/time specifications:
               -   Although   the   'seconds'   field  of  the  HH:MM:SS  time
              specification is allowed by the code, note that the poll time of
              the  SLURM scheduler is not precise enough to guarantee dispatch
              of the job on the exact second.  The job  will  be  eligible  to
              start  on  the next poll following the specified time. The exact
              poll interval depends on the SLURM scheduler (e.g.,  60  seconds
              with the default sched/builtin).
               -   If   no  time  (HH:MM:SS)  is  specified,  the  default  is
              (00:00:00).
               - If a date is specified without a year (e.g., MM/DD) then  the
              current  year  is  assumed,  unless the combination of MM/DD and
              HH:MM:SS has already passed for that year,  in  which  case  the
              next year is used.

       --checkpoint=<time>
              Specifies  the  interval between creating checkpoints of the job
              step.  By  default,  the  job  step  will  have  no  checkpoints
              created.     Acceptable    time   formats   include   "minutes",
              "minutes:seconds",    "hours:minutes:seconds",     "days-hours",
              "days-hours:minutes" and "days-hours:minutes:seconds".

       --checkpoint-dir=<directory>
              Specifies  the  directory  into  which  the  job  or  job step's
              checkpoint should be written (used by  the  checkpoint/blcr  and
              checkpoint/xlch plugins only).  The default value is the current
              working  directory.   Checkpoint  files  will  be  of  the  form
              "<job_id>.ckpt"  for  jobs and "<job_id>.<step_id>.ckpt" for job
              steps.
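               The resulting file names can be sketched as follows (the 30-
               minute interval, directory, program name, and job/step IDs are
               all hypothetical):

```shell
# Hypothetical: checkpoint every 30 minutes into a shared directory
#   srun --checkpoint=30 --checkpoint-dir=/scratch/ckpt ./long_run
# Checkpoint file name produced for job 1234, job step 0:
job=1234; step=0
echo "${job}.${step}.ckpt"
```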

       --comment=<string>
              An arbitrary comment.

       -C, --constraint=<list>
              Specify a list of constraints.   The  constraints  are  features
              that have been assigned to the nodes by the slurm administrator.
              The list of constraints may include multiple features  separated
              by  ampersand  (AND)  and/or  vertical  bar (OR) operators.  For
              example:             --constraint="opteron&video"             or
              --constraint="fast|faster".   In  the  first example, only nodes
              having both the feature "opteron" AND the feature  "video"  will
              be  used.   There  is  no mechanism to specify that you want one
              node with  feature  "opteron"  and  another  node  with  feature
              "video" in case no node has both features.  If only one of a set
              of possible options should be used for all allocated nodes, then
              use  the  OR  operator  and  enclose  the  options within square
              brackets.  For example: "--constraint=[rack1|rack2|rack3|rack4]"
              might  be  used to specify that all nodes must be allocated on a
              single rack of the cluster, but any of those four racks  can  be
              used.   A  request  can  also specify the number of nodes needed
              with some feature by appending an asterisk and count  after  the
              feature      name.      For     example     "srun     --nodes=16
              --constraint=graphics*4 ..."  indicates that the job requires 16
               nodes and that at least four of those nodes must have the feature
              "graphics."  Constraints with node counts may only  be  combined
              with  AND  operators.   If no nodes have the requested features,
              then the job will be rejected by the  slurm  job  manager.  This
              option  is  used  for  job allocations, but ignored for job step
              allocations.

       --contiguous
              If set, then the allocated nodes must  form  a  contiguous  set.
              Not honored with the topology/tree or topology/3d_torus plugins,
              both of which can modify the node ordering.  Not honored  for  a
              job step's allocation.

       --cores-per-socket=<cores>
              Restrict  node  selection  to  nodes with at least the specified
              number of cores per socket.  See additional information under -B
              option above when task/affinity plugin is enabled.

       --cpu_bind=[{quiet,verbose},]type
              Bind  tasks  to CPUs. Used only when the task/affinity plugin is
              enabled.   The  configuration  parameter   TaskPluginParam   may
              override  these  options.   For  example,  if TaskPluginParam is
              configured to bind to cores, your job will not be able  to  bind
              tasks  to  sockets.   NOTE:  To  have SLURM always report on the
              selected CPU binding for all commands executed in a  shell,  you
              can   enable   verbose   mode   by  setting  the  SLURM_CPU_BIND
              environment variable value to "verbose".

              The following informational environment variables are  set  when
              --cpu_bind is in use:
                      SLURM_CPU_BIND_VERBOSE
                      SLURM_CPU_BIND_TYPE
                      SLURM_CPU_BIND_LIST

              See  the  ENVIRONMENT  VARIABLE  section  for  a  more  detailed
              description of the individual SLURM_CPU_BIND* variables.

              When using --cpus-per-task to run multithreaded tasks, be  aware
              that  CPU  binding  is inherited from the parent of the process.
              This means that the multithreaded task should either specify  or
              clear  the CPU binding itself to avoid having all threads of the
              multithreaded  task  use  the  same  mask/CPU  as  the   parent.
              Alternatively,  fat  masks  (masks  which  specify more than one
              allowed CPU) could be used for the tasks  in  order  to  provide
              multiple CPUs for the multithreaded tasks.

              By  default, a job step has access to every CPU allocated to the
              job.  To ensure that distinct CPUs are  allocated  to  each  job
              step, use the --exclusive option.

              If  the job step allocation includes an allocation with a number
              of sockets, cores, or threads equal to the number of tasks to be
              started  then  the  tasks  will  by  default  be  bound  to  the
              appropriate  resources.   Disable  this  mode  of  operation  by
               explicitly setting "--cpu_bind=none".

              Note  that a job step can be allocated different numbers of CPUs
              on each node or be allocated CPUs not starting at location zero.
              Therefore  one  of  the options which automatically generate the
              task binding is  recommended.   Explicitly  specified  masks  or
              bindings  are  only honored when the job step has been allocated
              every available CPU on the node.

              Binding a task to a NUMA locality domain means to bind the  task
              to  the  set  of CPUs that belong to the NUMA locality domain or
              "NUMA node".  If  NUMA  locality  domain  options  are  used  on
              systems  with  no NUMA support, then each socket is considered a
              locality domain.

              Supported options include:

              q[uiet]
                     Quietly bind before task runs (default)

              v[erbose]
                     Verbosely report binding before task runs

              no[ne] Do not bind tasks to CPUs (default)

              rank   Automatically bind by task rank.  Task zero is  bound  to
                     socket  (or  core  or  thread)  zero, etc.  Not supported
                     unless the entire node is allocated to the job.

              map_cpu:<list>
                     Bind by mapping CPU  IDs  to  tasks  as  specified  where
                     <list>  is  <cpuid1>,<cpuid2>,...<cpuidN>.   CPU  IDs are
                     interpreted as decimal values unless  they  are  preceded
                     with   '0x'   in  which  case  they  are  interpreted  as
                     hexadecimal values.  Not supported unless the entire node
                     is allocated to the job.

              mask_cpu:<list>
                     Bind  by  setting  CPU  masks on tasks as specified where
                     <list>  is  <mask1>,<mask2>,...<maskN>.   CPU  masks  are
                     always  interpreted  as  hexadecimal  values  but  can be
                     preceded with an optional '0x'.  Not supported unless the
                     entire node is allocated to the job.

              rank_ldom
                     Bind to a NUMA locality domain by rank

              map_ldom:<list>
                     Bind  by  mapping  NUMA  locality  domain IDs to tasks as
                     specified  where  <list>  is  <ldom1>,<ldom2>,...<ldomN>.
                     The locality domain IDs are interpreted as decimal values
                     unless they are preceded with '0x' in which case they are
                     interpreted  as hexadecimal values.  Not supported unless
                     the entire node is allocated to the job.

              mask_ldom:<list>
                     Bind by setting NUMA locality domain masks  on  tasks  as
                     specified  where  <list>  is  <mask1>,<mask2>,...<maskN>.
                     NUMA locality domain  masks  are  always  interpreted  as
                     hexadecimal  values  but can be preceded with an optional
                     '0x'.  Not supported unless the entire node is  allocated
                     to the job.

              sockets
                     Automatically  generate  masks  binding tasks to sockets.
                     Only the CPUs on the socket which have been allocated  to
                     the  job  will  be  used.  If the number of tasks differs
                     from the number of allocated sockets this can  result  in
                     sub-optimal binding.

              cores  Automatically  generate masks binding tasks to cores.  If
                     the number of tasks differs from the number of  allocated
                     cores this can result in sub-optimal binding.

              threads
                     Automatically  generate  masks  binding tasks to threads.
                     If the  number  of  tasks  differs  from  the  number  of
                     allocated threads this can result in sub-optimal binding.

              ldoms  Automatically   generate  masks  binding  tasks  to  NUMA
                     locality domains.  If the number of  tasks  differs  from
                     the  number of allocated locality domains this can result
                     in sub-optimal binding.

              help   Show help message for cpu_bind
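               A mask_cpu list can be built by OR-ing together one bit per
               allowed CPU.  A hedged sketch (the two-task layout and program
               name are hypothetical; the srun line requires a SLURM cluster):

```shell
# Build hexadecimal CPU masks: CPUs 0-1 for task 0, CPUs 2-3 for task 1,
# then bind two tasks with them, reporting the binding verbosely:
#   srun -n2 --cpu_bind=verbose,mask_cpu:0x3,0xc ./a.out
# Each mask sets one bit per allowed CPU:
printf '0x%x,0x%x\n' $(( (1<<0) | (1<<1) )) $(( (1<<2) | (1<<3) ))
```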

       -c, --cpus-per-task=<ncpus>
              Request that ncpus be allocated per process. This may be  useful
              if  the  job is multithreaded and requires more than one CPU per
              task for  optimal  performance.  The  default  is  one  CPU  per
              process.   If  -c is specified without -n, as many tasks will be
              allocated  per  node  as  possible  while  satisfying   the   -c
              restriction.  For  instance on a cluster with 8 CPUs per node, a
              job request for 4 nodes and 3 CPUs per task may be  allocated  3
              or  6  CPUs  per  node  (1  or  2 tasks per node) depending upon
              resource consumption by other jobs. Such a job may be unable  to
              execute  more  than a total of 4 tasks.  This option may also be
              useful to spawn tasks without allocating resources  to  the  job
              step  from  the job's allocation when running multiple job steps
              with the --exclusive option.

              WARNING:  There  are  configurations  and  options   interpreted
              differently  by  job  and  job step requests which can result in
              inconsistencies  for  this  option.   For   example   srun   -c2
              --threads-per-core=1  prog  may  allocate two cores for the job,
              but if each  of  those  cores  contains  two  threads,  the  job
              allocation  will include four CPUs. The job step allocation will
              then launch two threads per CPU for a total of two tasks.

              WARNING: When srun is executed from  within  salloc  or  sbatch,
              there  are  configurations  and  options  which  can  result  in
              inconsistent allocations when -c has a value greater than -c  on
              salloc or sbatch.
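               The tasks-per-node arithmetic above can be sketched as follows
               (the 8-CPU nodes and program name are hypothetical):

```shell
# Hypothetical cluster with 8 CPUs per node, 3 CPUs per task:
#   srun -N4 -c3 ./mt_app
# At most floor(8 / 3) tasks fit on each node:
cpus_per_node=8; cpus_per_task=3
echo $((cpus_per_node / cpus_per_task))
```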

       -d, --dependency=<dependency_list>
               Defer  the  start  of  this job until the specified dependencies
               have been satisfied.  <dependency_list>  is  of  the  form
              <type:job_id[:job_id][,type:job_id[:job_id]]>.   Many  jobs  can
              share the same dependency and these  jobs  may  even  belong  to
              different  users. The  value may be changed after job submission
              using the scontrol command.

              after:job_id[:jobid...]
                     This job can begin execution  after  the  specified  jobs
                     have begun execution.

              afterany:job_id[:jobid...]
                     This  job  can  begin  execution after the specified jobs
                     have terminated.

              afternotok:job_id[:jobid...]
                     This job can begin execution  after  the  specified  jobs
                     have terminated in some failed state (non-zero exit code,
                     node failure, timed out, etc).

              afterok:job_id[:jobid...]
                     This job can begin execution  after  the  specified  jobs
                     have  successfully  executed  (ran  to completion with an
                     exit code of zero).

              singleton
                     This  job  can  begin  execution  after  any   previously
                     launched  jobs  sharing  the  same job name and user have
                     terminated.
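               A chained submission can be sketched as follows (the script and
               program names and the job id are hypothetical; the parsing step
               assumes sbatch's classic "Submitted batch job <id>" output
               line):

```shell
# Hypothetical chain: run ./solve only if pre.sh exits with code 0
#   jobid=$(sbatch pre.sh | awk '{print $4}')
#   srun --dependency=afterok:$jobid ./solve
# Extracting the job id from sbatch's "Submitted batch job <id>" line:
echo "Submitted batch job 1234" | awk '{print $4}'
```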

       -D, --chdir=<path>
              have the remote processes do a chdir to  path  before  beginning
              execution.  The  default  is  to  chdir  to  the current working
              directory of the srun process.

       -e, --error=<mode>
              Specify  how  stderr  is  to  be  redirected.  By   default   in
              interactive  mode,  srun  redirects  stderr  to the same file as
              stdout, if one is specified. The --error option is  provided  to
              allow stdout and stderr to be redirected to different locations.
              See IO Redirection below for more  options.   If  the  specified
              file already exists, it will be overwritten.

       -E, --preserve-env
              Pass  the  current  values of environment variables SLURM_NNODES
              and  SLURM_NTASKS  through  to  the  executable,   rather   than
              computing them from commandline parameters.

       --epilog=<executable>
              srun will run executable just after the job step completes.  The
              command line arguments for executable will be  the  command  and
              arguments  of  the  job  step.  If executable is "none", then no
              epilog will be run.  This  parameter  overrides  the  SrunEpilog
              parameter in slurm.conf.

       --exclusive
              When  used  to  initiate  a job, the job allocation cannot share
               nodes with other running jobs.  This is the opposite of --share;
               whichever  option  is  seen  last  on the command line will win.
              (The  default  shared/exclusive  behavior  depends   on   system
              configuration.)

               This option can also be used when initiating more than one  job
               step  within  an  existing  resource allocation and you want
               separate processors to be dedicated to each job step.  If
               sufficient processors are not available to initiate the job
               step, it will be deferred.  This can be thought of as providing
               resource management for the job within its allocation. Note that all
              CPUs  allocated  to  a job are available to each job step unless
              the --exclusive option is used plus task affinity is configured.
              Since resource management is provided by processor, the --ntasks
              option must be specified, but the following options  should  NOT
              be  specified --relative, --distribution=arbitrary.  See EXAMPLE
              below.
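               Running dedicated job steps inside one allocation can be
               sketched as follows (the allocation size and executable names
               are hypothetical; running it requires a SLURM cluster):

```shell
# Hypothetical: two job steps on disjoint CPUs inside one 8-CPU allocation;
# --ntasks (-n) must be given for each step:
#   salloc -n8
#   srun --exclusive -n4 ./stepA &
#   srun --exclusive -n4 ./stepB &
#   wait
# With 8 allocated CPUs and 4 per step, steps that can run concurrently:
echo $((8 / 4))
```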

       --gid=<group>
              If srun is run as root, and the --gid option is used, submit the
              job  with  group's  group  access permissions.  group may be the
              group name or the numerical group ID.

       --gres=<list>
              Specifies  a  comma  delimited  list   of   generic   consumable
              resources.    The   format   of   each  entry  on  the  list  is
              "name[:count[*cpu]]".   The  name  is  that  of  the  consumable
              resource.   The  count  is  the number of those resources with a
              default value of 1.  The specified resources will  be  allocated
              to  the job on each node allocated unless "*cpu" is appended, in
              which case the resources will be allocated on a per  cpu  basis.
               The  available  generic consumable resources are configurable by
              the  system  administrator.   A  list   of   available   generic
              consumable  resources  will be printed and the command will exit
              if the option argument  is  "help".   Examples  of  use  include
              "--gres=gpus:2*cpu,disk=40G" and "--gres=help".

       -H, --hold
              Specify  the job is to be submitted in a held state (priority of
               zero).  A held job can now be released using scontrol  to  reset
               its priority (e.g. "scontrol update jobid=<id> priority=1").

       -h, --help
              Display help information and exit.

       --hint=<type>
              Bind tasks according to application hints

              compute_bound
                     Select  settings  for compute bound applications: use all
                     cores in each socket, one thread per core

              memory_bound
                     Select settings for memory bound applications:  use  only
                     one core in each socket, one thread per core

              [no]multithread
                     [don't]  use  extra  threads with in-core multi-threading
                     which can benefit communication intensive applications

              help   show this help message

       -I, --immediate[=<seconds>]
              exit if resources are  not  available  within  the  time  period
              specified.  If no argument is given, resources must be available
              immediately for the request to succeed.  By default, --immediate
              is  off,  and  the  command  will  block  until resources become
              available.

       -i, --input=<mode>
               Specify how stdin is to be redirected. By default, srun
               redirects  stdin from the terminal to all tasks. See IO
               Redirection below for
              more options.  For OS X, the poll() function  does  not  support
              stdin, so input from a terminal is not possible.

       -J, --job-name=<jobname>
              Specify a name for the job. The specified name will appear along
              with the job id number when querying running jobs on the system.
              The  default  is  the  supplied executable program's name. NOTE:
              This information may be written to the  slurm_jobacct.log  file.
              This  file  is  space  delimited  so  if  a space is used in the
               job name it will cause problems in  properly  displaying  the
              contents of the slurm_jobacct.log file when the sacct command is
              used.

       --jobid=<jobid>
               Initiate a job step under an already allocated job with  job  id
               <jobid>.  Using this option will cause srun to behave exactly
               as if
              the SLURM_JOB_ID environment variable was set.

       -K, --kill-on-bad-exit
              Immediately terminate a job if any task exits  with  a  non-zero
              exit  code.   Note:  The  -K,  --kill-on-bad-exit  option  takes
              precedence over -W, --wait to terminate the job immediately if a
              task exits with a non-zero exit code.

       -k, --no-kill
               Do  not automatically terminate a job if one of the nodes it has
              been allocated fails.  This option is only recognized on  a  job
              allocation, not for the submission of individual job steps.  The
              job will assume all responsibilities for fault-tolerance.  Tasks
               launched using this option will not be considered terminated (e.g.
              -K, --kill-on-bad-exit and  -W,  --wait  options  will  have  no
              effect  upon  the job step).  The active job step (MPI job) will
              likely suffer a fatal error, but subsequent job steps may be run
              if this option is specified.  The default action is to terminate
              the job upon node failure.

       -l, --label
              prepend task number to lines of stdout/err. Normally, stdout and
              stderr from remote tasks is line-buffered directly to the stdout
              and stderr of srun.  The --label option will  prepend  lines  of
              output with the remote task id.

       -L, --licenses=<license>
              Specification  of  licenses (or other resources available on all
              nodes of the cluster) which  must  be  allocated  to  this  job.
              License  names  can  be  followed  by an asterisk and count (the
              default count is one).  Multiple license names should  be  comma
              separated (e.g.  "--licenses=foo*4,bar").

       -m, --distribution=
              <block|cyclic|arbitrary|plane=<options>[:block|cyclic]>

              Specify  alternate  distribution  methods  for remote processes.
              This option controls the assignment of tasks  to  the  nodes  on
              which  resources  have  been  allocated, and the distribution of
              those resources to tasks for binding (task affinity). The  first
              distribution  method  (before the ":") controls the distribution
              of resources across  nodes.  The  optional  second  distribution
              method  (after  the  ":") controls the distribution of resources
              across sockets within a node.  Note that  with  select/cons_res,
              the  number  of  cpus  allocated  on each socket and node may be
              different.  Refer  to  the  mc_support.html  document  for  more
              information  on  resource  allocation,  assignment  of  tasks to
              nodes, and binding of tasks to CPUs.

              First distribution method:

              block  The block distribution method will distribute tasks to  a
                     node  such  that  consecutive  tasks  share  a  node. For
                     example, consider an allocation of three nodes each  with
                     two  cpus.  A  four-task  block distribution request will
                     distribute those tasks to the nodes with  tasks  one  and
                     two on the first node, task three on the second node, and
                     task four on the third node.  Block distribution  is  the
                     default  behavior  if  the  number  of  tasks exceeds the
                     number of allocated nodes.

              cyclic The cyclic distribution method will distribute tasks to a
                     node  such  that  consecutive  tasks are distributed over
                     consecutive  nodes  (in  a  round-robin   fashion).   For
                     example,  consider an allocation of three nodes each with
                     two cpus. A four-task cyclic  distribution  request  will
                     distribute  those  tasks  to the nodes with tasks one and
                     four on the first node, task two on the second node,  and
                     task  three on the third node.  Note that when SelectType
                     is select/cons_res, the same number of CPUs  may  not  be
                     allocated   on  each  node.  Task  distribution  will  be
                     round-robin among all the  nodes  with  CPUs  yet  to  be
                     assigned  to  tasks.   Cyclic distribution is the default
                     behavior if the number of tasks is  no  larger  than  the
                     number of allocated nodes.

              plane  The  tasks are distributed in blocks of a specified size.
                     The options include a number representing the size of the
                     task   block.    This   is   followed   by   an  optional
                     specification of the task distribution  scheme  within  a
                     block of tasks and between the blocks of tasks.  For more
                     details (including examples and diagrams), please see
                     https://computing.llnl.gov/linux/slurm/mc_support.html
                     and
                     https://computing.llnl.gov/linux/slurm/dist_plane.html.

              arbitrary
                     The  arbitrary  method  of  distribution  will   allocate
                     processes  in-order  as  listed in file designated by the
                      environment variable SLURM_HOSTFILE.  If this variable
                      is set, it will override any other method specified.  If
                      not set, the method will default to block.  The hostfile
                      must contain at a minimum the number of hosts requested,
                      listed one per line or comma separated.   If
                     specifying  a  task  count  (-n, --ntasks=<number>), your
                     tasks will be laid out on the nodes in the order  of  the
                     file.
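               An arbitrary-distribution run can be sketched as follows (the
               node names and program name are hypothetical; the srun line
               itself requires a SLURM cluster):

```shell
# Hypothetical explicit node order for 4 tasks:
cat > hosts.txt <<'EOF'
node3
node1
node1
node2
EOF
export SLURM_HOSTFILE=hosts.txt
#   srun -n4 -m arbitrary ./a.out
# The file must list at least as many hosts as tasks requested:
grep -c '' hosts.txt
```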

              Second distribution method:

              block  The  block  distribution  method will distribute tasks to
                     sockets such that consecutive tasks share a socket.

              cyclic The cyclic distribution method will distribute  tasks  to
                     sockets  such that consecutive tasks are distributed over
                     consecutive sockets (in a round-robin fashion).
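               The block versus cyclic layouts described above can be sketched
               as follows (the 3-node allocation and program name are
               hypothetical):

```shell
# block:  4 tasks on 3 nodes -> tasks 0,1 on node 0; task 2 on node 1;
#         task 3 on node 2
#   srun -N3 -n4 -m block ./a.out
# cyclic: task i goes to node (i mod 3), round-robin:
#   srun -N3 -n4 -m cyclic ./a.out
for i in 0 1 2 3; do echo "task $i -> node $((i % 3))"; done
```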

       --mail-type=<type>
              Notify user by email when certain event types occur.  Valid type
              values  are  BEGIN,  END,  FAIL,  REQUEUE,  and  ALL  (any state
              change). The user to be notified is indicated with --mail-user.

       --mail-user=<user>
              User to receive email notification of state changes  as  defined
              by --mail-type.  The default value is the submitting user.

       --mem=<MB>
              Specify the real memory required per node in MegaBytes.  Default
              value is DefMemPerNode and the maximum value  is  MaxMemPerNode.
               If configured, both parameters can be seen using  the  scontrol
              show config command.  This parameter would generally be used  if
              whole  nodes  are  allocated to jobs (SelectType=select/linear).
              Also see --mem-per-cpu.  --mem and  --mem-per-cpu  are  mutually
              exclusive.

       --mem-per-cpu=<MB>
               Minimum memory required per allocated CPU in MegaBytes.  Default
              value is DefMemPerCPU and the maximum value is MaxMemPerCPU (see
               exception below). If configured, both  parameters can be seen
              using the scontrol show config command.  Note that if the  job's
              --mem-per-cpu  value  exceeds  the configured MaxMemPerCPU, then
              the user's limit will be treated as a  memory  limit  per  task;
              --mem-per-cpu  will  be  reduced  to  a  value  no  larger  than
              MaxMemPerCPU;  --cpus-per-task  will  be  set   and   value   of
              --cpus-per-task  multiplied  by the new --mem-per-cpu value will
              equal the original --mem-per-cpu value specified  by  the  user.
              This  parameter would generally be used if individual processors
              are allocated to jobs  (SelectType=select/cons_res).   Also  see
              --mem.  --mem and --mem-per-cpu are mutually exclusive.
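               The MaxMemPerCPU adjustment described above can be sketched
               as follows (illustrative only; the real logic lives in the
               SLURM controller, and the parameter values here are made up):

```python
# Hedged sketch of the --mem-per-cpu adjustment: if the requested
# value exceeds MaxMemPerCPU, treat the request as a per-task limit
# and raise --cpus-per-task so that cpus_per_task * new_mem_per_cpu
# still covers the original request.
import math

def adjust(requested_mb, max_mem_per_cpu_mb):
    if requested_mb <= max_mem_per_cpu_mb:
        return 1, requested_mb
    cpus_per_task = math.ceil(requested_mb / max_mem_per_cpu_mb)
    new_mem_per_cpu = math.ceil(requested_mb / cpus_per_task)
    return cpus_per_task, new_mem_per_cpu

# e.g. requesting 5000 MB per CPU when MaxMemPerCPU is 2048 MB:
print(adjust(5000, 2048))  # (3, 1667): 3 CPUs x 1667 MB >= 5000 MB
```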

       --mem_bind=[{quiet,verbose},]type
              Bind tasks to memory. Used only when the task/affinity plugin is
              enabled and the NUMA memory functions are available.  Note  that
              the  resolution  of  CPU  and  memory binding may differ on some
              architectures. For example, CPU binding may be performed at  the
              level  of the cores within a processor while memory binding will
              be performed at the level of  nodes,  where  the  definition  of
              "nodes"  may  differ  from system to system. The use of any type
              other than "none" or "local" is not recommended.   If  you  want
              greater control, try running a simple test code with the options
              "--cpu_bind=verbose,none --mem_bind=verbose,none"  to  determine
              the specific configuration.

              NOTE: To have SLURM always report on the selected memory binding
              for all commands executed in a shell,  you  can  enable  verbose
              mode by setting the SLURM_MEM_BIND environment variable value to
              "verbose".

              The following informational environment variables are  set  when
              --mem_bind is in use:

                      SLURM_MEM_BIND_VERBOSE
                      SLURM_MEM_BIND_TYPE
                      SLURM_MEM_BIND_LIST

              See  the  ENVIRONMENT  VARIABLES  section  for  a  more detailed
              description of the individual SLURM_MEM_BIND* variables.

              Supported options include:

              q[uiet]
                     quietly bind before task runs (default)

              v[erbose]
                     verbosely report binding before task runs

              no[ne] don't bind tasks to memory (default)

              rank   bind by task rank (not recommended)

              local  Use memory local to the processor in use

              map_mem:<list>
                     bind by mapping a node's memory  to  tasks  as  specified
                     where  <list>  is <cpuid1>,<cpuid2>,...<cpuidN>.  CPU IDs
                      are interpreted as decimal values  unless  they  are
                      preceded with '0x' in which case they are interpreted
                      as hexadecimal values (not recommended)

              mask_mem:<list>
                     bind by setting memory masks on tasks as specified  where
                      <list> is <mask1>,<mask2>,...<maskN>.  Memory masks are
                     always interpreted  as  hexadecimal  values.   Note  that
                     masks  must  be  preceded with a '0x' if they don't begin
                     with [0-9] so they are seen as numerical values by srun.

              help   show this help message
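               As a hedged sketch (not SLURM source), the <list> arguments
               for map_mem and mask_mem are interpreted roughly like this:

```python
# Sketch of <list> parsing as described above: map_mem IDs are
# decimal unless prefixed with '0x'; mask_mem masks are always hex.
def parse_map_list(spec):
    """Parse a map_mem-style list, e.g. '0,1,0x2' -> [0, 1, 2]."""
    return [int(tok, 16) if tok.lower().startswith("0x") else int(tok)
            for tok in spec.split(",")]

def parse_mask_list(spec):
    """Parse a mask_mem-style list; masks are always hex, e.g. '0x3,0xC'."""
    return [int(tok, 16) for tok in spec.split(",")]

print(parse_map_list("0,1,0x2"))   # [0, 1, 2]
print(parse_mask_list("0x3,0xC"))  # [3, 12]
```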

       --mincpus=<n>
              Specify a minimum number of logical cpus/processors per node.

       --msg-timeout=<seconds>
              Modify the job launch message timeout.   The  default  value  is
              MessageTimeout  in  the  SLURM  configuration  file  slurm.conf.
              Changes to this are typically  not  recommended,  but  could  be
              useful to diagnose problems.

       --mpi=<mpi_type>
              Identify  the  type  of  MPI  to  be  used. May result in unique
              initiation procedures.

              list   Lists available mpi types to choose from.

              lam    Initiates one 'lamd' process  per  node  and  establishes
                     necessary environment variables for LAM/MPI.

              mpich1_shmem
                     Initiates  one process per node and establishes necessary
                     environment variables for  mpich1  shared  memory  model.
                     This also works for mvapich built for shared memory.

              mpichgm
                     For use with Myrinet.

              mvapich
                     For use with Infiniband.

              openmpi
                     For use with OpenMPI.

              none   No  special MPI processing. This is the default and works
                     with many other versions of MPI.

       --multi-prog
              Run a job with different programs and  different  arguments  for
              each  task.  In  this  case, the executable program specified is
              actually a configuration  file  specifying  the  executable  and
              arguments  for  each  task.  See  MULTIPLE PROGRAM CONFIGURATION
              below for details on the configuration file contents.
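               As an illustration, such a configuration file pairs task
               ranks with commands, one per line (the programs here are
               made up; see MULTIPLE PROGRAM CONFIGURATION for the full
               syntax, including the %t task-number substitution):

```
# task rank(s)    executable and arguments
0                 hostname
1-3               echo task:%t
```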

       -N, --nodes=<minnodes[-maxnodes]>
              Request that a minimum of minnodes nodes be  allocated  to  this
              job.   The  scheduler  may decide to launch the job on more than
              minnodes nodes.  A limit  on  the  maximum  node  count  may  be
              specified  with  maxnodes (e.g. "--nodes=2-4").  The minimum and
              maximum node count may be the same to specify a specific  number
              of  nodes  (e.g.  "--nodes=2-2"  will  ask  for two and ONLY two
              nodes).  The partition's node limits supersede those of the job.
              If  a  job's  node limits are outside of the range permitted for
              its associated partition, the job will  be  left  in  a  PENDING
              state.   This  permits  possible execution at a later time, when
              the partition limit is changed.  If a job node limit exceeds the
              number  of  nodes  configured  in the partition, the job will be
              rejected.  Note that the environment variable SLURM_NNODES  will
              be  set to the count of nodes actually allocated to the job. See
              the ENVIRONMENT VARIABLES  section for more information.  If  -N
              is  not  specified,  the  default behavior is to allocate enough
              nodes to satisfy the requirements of the -n and -c options.  The
              job will be allocated as many nodes as possible within the range
              specified and without delaying the initiation of the job.

       -n, --ntasks=<number>
              Specify the number of tasks to run. Request that  srun  allocate
              resources  for  ntasks tasks.  The default is one task per node,
              but note  that  the  --cpus-per-task  option  will  change  this
              default.

       --network=<type>
              Specify  the  communication protocol to be used.  This option is
              supported on AIX systems.  Since POE is used  to  launch  tasks,
              this  option  is  not  normally  used  or is specified using the
              SLURM_NETWORK environment variable.  The interpretation of  type
              is system dependent.  For systems with an IBM Federation switch,
              the following comma-separated and  case  insensitive  types  are
              recognized:  IP  (the default is user-space), SN_ALL, SN_SINGLE,
              BULK_XFER and adapter names  (e.g. SNI0  and  SNI1).   For  more
              information,  on  IBM  systems  see  poe  documentation  on  the
              environment variables MP_EUIDEVICE and  MP_USE_BULK_XFER.   Note
               that only four job steps may be active at once on a node with
              the BULK_XFER option due to limitations in the Federation switch
              driver.

       --nice[=adjustment]
              Run  the  job with an adjusted scheduling priority within SLURM.
              With no adjustment value the scheduling priority is decreased by
              100.  The  adjustment range is from -10000 (highest priority) to
              10000 (lowest priority). Only privileged  users  can  specify  a
              negative  adjustment.  NOTE: This option is presently ignored if
              SchedulerType=sched/wiki or SchedulerType=sched/wiki2.

       --ntasks-per-core=<ntasks>
              Request the maximum ntasks be invoked on each core.  Meant to be
              used  with  the  --ntasks  option.  Related to --ntasks-per-node
              except at the core level instead of the node level.  Masks  will
               automatically be generated to bind the tasks to specific cores
              unless --cpu_bind=none is specified.  NOTE: This option  is  not
              supported       unless      SelectTypeParameters=CR_Core      or
              SelectTypeParameters=CR_Core_Memory is configured.

       --ntasks-per-socket=<ntasks>
              Request the maximum ntasks be invoked on each socket.  Meant  to
              be  used with the --ntasks option.  Related to --ntasks-per-node
              except at the socket level instead of  the  node  level.   Masks
              will  automatically  be  generated to bind the tasks to specific
              sockets unless --cpu_bind=none is specified.  NOTE: This  option
              is   not   supported  unless  SelectTypeParameters=CR_Socket  or
              SelectTypeParameters=CR_Socket_Memory is configured.

       --ntasks-per-node=<ntasks>
              Request the maximum ntasks be invoked on each node.  Meant to be
              used   with   the   --nodes   option.    This   is   related  to
              --cpus-per-task=ncpus, but does not  require  knowledge  of  the
              actual  number  of cpus on each node.  In some cases, it is more
              convenient to be able to request that no more  than  a  specific
              number  of  tasks  be  invoked  on  each node.  Examples of this
              include submitting a hybrid MPI/OpenMP app where  only  one  MPI
              "task/rank"  should  be assigned to each node while allowing the
              OpenMP portion to utilize all of the parallelism present in  the
              node,  or  submitting  a  single setup/cleanup/monitoring job to
              each node of a pre-existing allocation as one step in  a  larger
              job script.

       -O, --overcommit
              Overcommit resources. Normally, srun will not allocate more than
              one  process  per  CPU.  By  specifying  --overcommit  you   are
              explicitly  allowing  more  than one process per CPU. However no
              more than MAX_TASKS_PER_NODE tasks are permitted to execute  per
              node.   NOTE:  MAX_TASKS_PER_NODE is defined in the file slurm.h
              and is not a variable, it is set at SLURM build time.

       -o, --output=<mode>
              Specify  the  mode  for  stdout  redirection.  By   default   in
              interactive  mode,  srun collects stdout from all tasks and line
              buffers this output to  the  attached  terminal.  With  --output
              stdout  may be redirected to a file, to one file per task, or to
              /dev/null. See section IO  Redirection  below  for  the  various
              forms of mode.  If the specified file already exists, it will be
              overwritten.

              If --error is not also  specified  on  the  command  line,  both
               stdout and stderr will be directed to the file specified by
              --output.

       --open-mode=<append|truncate>
              Open the output and error files using append or truncate mode as
              specified.   The  default  value  is  specified  by  the  system
              configuration parameter JobFileAppend.

       -p, --partition=<partition_names>
              Request a specific partition for the  resource  allocation.   If
               not specified, the default behavior is  to  allow  the  slurm
              controller to select the default partition as designated by  the
              system   administrator.  If  the  job  can  use  more  than  one
               partition, specify their names in a comma separated list  and
              one offering earliest initiation will be used.

       --prolog=<executable>
              srun  will  run  executable  just before launching the job step.
              The command line arguments for executable will  be  the  command
              and arguments of the job step.  If executable is "none", then no
              prolog will be run.  This  parameter  overrides  the  SrunProlog
              parameter in slurm.conf.

       --propagate[=rlimits]
              Allows  users to specify which of the modifiable (soft) resource
              limits to propagate to the compute  nodes  and  apply  to  their
              jobs.   If  rlimits  is  not specified, then all resource limits
              will be propagated.  The following rlimit names are supported by
              Slurm  (although  some  options  may  not  be  supported on some
              systems):

              ALL       All limits listed below

               AS        The maximum address space for a process

              CORE      The maximum size of core file

              CPU       The maximum amount of CPU time

              DATA      The maximum size of a process's data segment

              FSIZE     The maximum size of files created

              MEMLOCK   The maximum size that may be locked into memory

              NOFILE    The maximum number of open files

              NPROC     The maximum number of processes available

              RSS       The maximum resident set size

              STACK     The maximum stack size

       --pty  Execute  task  zero  in  pseudo   terminal.    Implicitly   sets
              --unbuffered.  Implicitly sets --error and --output to /dev/null
              for all tasks except task zero, which may cause those  tasks  to
              exit immediately (e.g. shells will typically exit immediately in
              that situation).  Not currently supported on AIX platforms.

       -Q, --quiet
              Suppress informational messages from srun. Errors will still  be
              displayed.

       -q, --quit-on-interrupt
              Quit  immediately  on single SIGINT (Ctrl-C). Use of this option
              disables  the  status  feature  normally  available  when   srun
              receives  a single Ctrl-C and causes srun to instead immediately
              terminate the running job.

       --qos=<qos>
              Request a quality of service for the job.   QOS  values  can  be
              defined  for  each user/cluster/account association in the SLURM
              database.  Users will be limited to their association's  defined
              set   of   qos's   when   the   SLURM  configuration  parameter,
               AccountingStorageEnforce, includes "qos" in its definition.

       -r, --relative=<n>
              Run a job step relative to node n  of  the  current  allocation.
              This  option  may  be used to spread several job steps out among
              the nodes of the current job. If -r is  used,  the  current  job
              step  will  begin at node n of the allocated nodelist, where the
              first node is considered node 0.  The -r option is not permitted
              with  -w  or -x option and will result in a fatal error when not
              running within a prior allocation (i.e. when SLURM_JOB_ID is not
              set).  The  default  for n is 0. If the value of --nodes exceeds
              the number of nodes identified with  the  --relative  option,  a
              warning  message  will be printed and the --relative option will
              take precedence.

       --resv-ports
              Reserve communication ports for this job.  Used for OpenMPI.

       --reservation=<name>
              Allocate resources for the job from the named reservation.

       --restart-dir=<directory>
              Specifies the  directory  from  which  the  job  or  job  step's
               checkpoint should  be  read  (used  by  the  checkpoint/blcr  and
              checkpoint/xlch plugins only).

       -s, --share
              The job can share nodes with other running jobs. This may result
              in  faster  job  initiation  and  higher system utilization, but
              lower application performance.

       --signal=<sig_num>[@<sig_time>]
              When a job is within sig_time seconds of its end time,  send  it
              the  signal sig_num.  Due to the resolution of event handling by
              SLURM, the signal may be sent up  to  60  seconds  earlier  than
              specified.   sig_num may either be a signal number or name (e.g.
              "10" or "USR1").  sig_time must have integer value between  zero
              and  65535.   By default, no signal is sent before the job's end
              time.  If a sig_num  is  specified  without  any  sig_time,  the
              default time will be 60 seconds.
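               A sketch (not SLURM source) of how the <sig_num>[@<sig_time>]
               argument can be parsed, including the 60-second default:

```python
# Hypothetical parser for --signal=<sig_num>[@<sig_time>].
# sig_num may be a number ("10") or a name ("USR1"); a missing
# sig_time defaults to 60 seconds, and must fit in 0..65535.
import signal

def parse_signal_opt(value):
    sig_part, _, time_part = value.partition("@")
    sig_time = int(time_part) if time_part else 60  # default 60 seconds
    if not (0 <= sig_time <= 65535):
        raise ValueError("sig_time must be between 0 and 65535")
    if sig_part.isdigit():
        sig_num = int(sig_part)
    else:
        # Accept names like "USR1" by looking up signal.SIG<name>.
        sig_num = int(getattr(signal, "SIG" + sig_part))
    return sig_num, sig_time

print(parse_signal_opt("10"))        # (10, 60): default sig_time
print(parse_signal_opt("USR1@120"))  # on Linux: (10, 120)
```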

       --slurmd-debug=<level>
              Specify  a  debug  level  for slurmd(8). level may be an integer
              value between  0  [quiet,  only  errors  are  displayed]  and  4
              [verbose  operation].   The  slurmd  debug information is copied
              onto  the  stderr  of  the  job.  By  default  only  errors  are
              displayed.

       --sockets-per-node=<sockets>
              Restrict  node  selection  to  nodes with at least the specified
              number of sockets.  See additional information under  -B  option
              above when task/affinity plugin is enabled.

       -T, --threads=<nthreads>
              Allows  limiting  the  number of concurrent threads used to send
              the job request from the srun process to the slurmd processes on
              the  allocated nodes. Default is to use one thread per allocated
              node up to a maximum of 60 concurrent threads.  Specifying  this
              option limits the number of concurrent threads to nthreads (less
              than or equal to 60).  This should only be used  to  set  a  low
              thread count for testing on very small memory computers.

       -t, --time=<time>
              Set  a  limit  on the total run time of the job or job step.  If
              the requested time limit for a job exceeds the partition's  time
              limit,  the  job  will  be  left  in  a  PENDING state (possibly
              indefinitely).  If the requested  time  limit  for  a  job  step
              exceeds  the  partition's  time  limit, the job step will not be
              initiated.  The default  time  limit  is  the  partition's  time
              limit.  When the time limit is reached, the job's tasks are sent
              SIGTERM followed by SIGKILL. If the time limit is for  the  job,
              all  job  steps  are signaled. If the time limit is for a single
              job step within an existing job allocation, only that  job  step
               will be affected. A job time limit supersedes all job step time
              limits. The interval between SIGTERM and SIGKILL is specified by
              the  SLURM  configuration  parameter  KillWait.  A time limit of
              zero requests that no time limit be  imposed.   Acceptable  time
              formats        include       "minutes",       "minutes:seconds",
              "hours:minutes:seconds", "days-hours", "days-hours:minutes"  and
              "days-hours:minutes:seconds".

       --task-epilog=<executable>
              The  slurmstepd  daemon will run executable just after each task
              terminates.  This  will  be  executed  before   any   TaskEpilog
              parameter  in slurm.conf is executed. This is meant to be a very
              short-lived program. If it  fails  to  terminate  within  a  few
              seconds, it will be killed along with any descendant processes.

       --task-prolog=<executable>
              The  slurmstepd daemon will run executable just before launching
              each task. This will be executed after any TaskProlog  parameter
              in  slurm.conf  is  executed.   Besides  the  normal environment
              variables, this has SLURM_TASK_PID  available  to  identify  the
              process ID of the task being started.  Standard output from this
              program of the form "export NAME=value"  will  be  used  to  set
              environment variables for the task being spawned.

       --test-only
              Returns  an  estimate  of  when  a job would be scheduled to run
              given the current job queue and all  the  other  srun  arguments
              specifying  the job.  This limits srun's behavior to just return
              information; no job is actually submitted.

       --threads-per-core=<threads>
              Restrict node selection to nodes with  at  least  the  specified
              number of threads per core.  See additional information under -B
              option above when task/affinity plugin is enabled.

       --time-min=<time>
              Set a minimum time limit on the job allocation.   If  specified,
               the job may have its --time limit lowered to a value  no  lower
              than --time-min if doing so permits the job to  begin  execution
              earlier  than otherwise possible.  The job's time limit will not
              be changed after  the  job  is  allocated  resources.   This  is
              performed   by  a  backfill  scheduling  algorithm  to  allocate
              resources  otherwise  reserved   for   higher   priority   jobs.
              Acceptable  time  formats  include "minutes", "minutes:seconds",
              "hours:minutes:seconds", "days-hours", "days-hours:minutes"  and
              "days-hours:minutes:seconds".

       --tmp=<MB>
              Specify a minimum amount of temporary disk space.

       -u, --unbuffered
              Do  not line buffer stdout from remote tasks. This option cannot
              be used with --label.

       --usage
              Display brief help message and exit.

       --uid=<user>
              Attempt to submit and/or run  a  job  as  user  instead  of  the
              invoking  user  id. The invoking user's credentials will be used
              to check access permissions for the target partition. User  root
              may  use  this option to run jobs as a normal user in a RootOnly
              partition for example. If  run  as  root,  srun  will  drop  its
              permissions  to  the  uid  specified  after  node  allocation is
              successful. user may be the user name or numerical user ID.

       -V, --version
              Display version information and exit.

       -v, --verbose
              Increase  the  verbosity  of  srun's   informational   messages.
              Multiple  -v's  will  further  increase  srun's  verbosity.   By
              default only errors will be displayed.

       -W, --wait=<seconds>
              Specify how long to wait after the first task terminates  before
              terminating  all  remaining  tasks.  A  value  of 0 indicates an
              unlimited wait (a warning will be issued after 60 seconds).  The
              default  value  is  set  by  the WaitTime parameter in the slurm
              configuration file  (see  slurm.conf(5)).  This  option  can  be
               useful to ensure that a job is terminated in a timely fashion in
              the event that one or more tasks terminate  prematurely.   Note:
              The  -K,  --kill-on-bad-exit  option  takes  precedence over -W,
              --wait to terminate the job immediately if a task exits  with  a
              non-zero exit code.

       -w, --nodelist=<host1,host2,... or filename>
              Request  a specific list of hosts. The job will contain at least
              these hosts. The list may be specified as a comma-separated list
              of  hosts,  a range of hosts (host[1-5,7,...] for example), or a
              filename.  The host list will be assumed to be a filename if  it
               contains a "/" character.  If you specify  a  maximum  node
               count (e.g. -N1-2) and there are more than 2 hosts  in  the
               file, only the first 2 nodes will be used  in  the  request
               list.
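               Expansion of a bracketed host range such as host[1-5,7] can
               be sketched as follows (illustrative only; real SLURM host
               lists also support zero-padded numbers, which this ignores):

```python
# Expand a bracketed host range, e.g. "host[1-3,7]" ->
# ["host1", "host2", "host3", "host7"].  A plain name passes through.
import re

def expand_hosts(spec):
    m = re.fullmatch(r"(\w+)\[([\d,\-]+)\]", spec)
    if not m:
        return [spec]                    # plain host name
    prefix, ranges = m.groups()
    hosts = []
    for part in ranges.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            hosts += [prefix + str(i) for i in range(int(lo), int(hi) + 1)]
        else:
            hosts.append(prefix + part)
    return hosts

print(expand_hosts("host[1-3,7]"))  # ['host1', 'host2', 'host3', 'host7']
print(expand_hosts("node5"))        # ['node5']
```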

       --wckey=<wckey>
              Specify  wckey  to be used with job.  If TrackWCKey=no (default)
              in the slurm.conf this value is ignored.

       -X, --disable-status
              Disable the display of task status when srun receives  a  single
              SIGINT  (Ctrl-C).  Instead immediately forward the SIGINT to the
              running job.  Without this option a second Ctrl-C in one  second
              is  required  to  forcibly  terminate  the  job  and  srun  will
              immediately exit. May also be set via the  environment  variable
              SLURM_DISABLE_STATUS.

       -x, --exclude=<host1,host2,... or filename>
              Request  that  a  specific  list of hosts not be included in the
              resources allocated to this job. The host list will  be  assumed
               to be a filename if it contains a "/" character.

       -Z, --no-allocate
              Run  the  specified  tasks  on a set of nodes without creating a
              SLURM "job" in the SLURM queue structure, bypassing  the  normal
              resource  allocation  step.  The list of nodes must be specified
              with the -w, --nodelist option.  This  is  a  privileged  option
              only available for the users "SlurmUser" and "root".

       The  following options support Blue Gene systems, but may be applicable
       to other systems as well.

       --blrts-image=<path>
              Path to blrts image for bluegene block.  BGL only.  Default from
               bluegene.conf if not set.

       --cnload-image=<path>
              Path  to  compute  node  image  for  bluegene  block.  BGP only.
               Default from bluegene.conf if not set.

       --conn-type=<type>
              Require the partition connection type to be of a  certain  type.
               On Blue Gene the acceptable values of type are MESH, TORUS  and
               NAV.  If
              NAV, or if not set, then SLURM will try  to  fit  a  TORUS  else
              MESH.   You  should  not  normally  set this option.  SLURM will
              normally allocate a TORUS if possible for a given geometry.   If
               running on a BGP system and wanting to run in HTC mode (only
               for 1 midplane and below), you can use HTC_S for SMP,  HTC_D
               for Dual, HTC_V for virtual node mode, and  HTC_L  for  Linux
               mode.

       -g, --geometry=<XxYxZ>
              Specify the geometry requirements for the job. The three numbers
              represent the required geometry giving dimensions in  the  X,  Y
              and  Z  directions.  For example "--geometry=2x3x4", specifies a
              block of nodes having 2 x 3  x  4  =  24  nodes  (actually  base
              partitions on Blue Gene).

       --ioload-image=<path>
              Path  to  io  image for bluegene block.  BGP only.  Default from
               bluegene.conf if not set.

       --linux-image=<path>
              Path to linux image for bluegene block.  BGL only.  Default from
               bluegene.conf if not set.

       --mloader-image=<path>
              Path   to  mloader  image  for  bluegene  block.   Default  from
               bluegene.conf if not set.

       -R, --no-rotate
              Disables rotation of the job's requested geometry  in  order  to
              fit an appropriate block.  By default the specified geometry can
              rotate in three dimensions.

       --ramdisk-image=<path>
              Path to ramdisk image for bluegene block.   BGL  only.   Default
               from bluegene.conf if not set.

       --reboot
              Force the allocated nodes to reboot before starting the job.

       srun  will  submit  the  job  request to the slurm job controller, then
       initiate all processes on the remote nodes. If the  request  cannot  be
       met  immediately,  srun  will block until the resources are free to run
       the job.  If  the  -I  (--immediate)  option  is  specified  srun  will
       terminate if resources are not immediately available.

       When  initiating  remote  processes  srun  will  propagate  the current
       working directory, unless --chdir=<path> is specified,  in  which  case
       path will become the working directory for the remote processes.

       The  -n,  -c,  and  -N  options  control  how  CPUs   and nodes will be
       allocated to the job. When specifying only the number of  processes  to
       run  with  -n,  a  default  of  one  CPU  per  process is allocated. By
       specifying the number of CPUs required per task (-c), more than one CPU
       may  be allocated per process. If the number of nodes is specified with
       -N, srun will  attempt  to  allocate  at  least  the  number  of  nodes
       specified.

       Combinations  of  the  above  three  options  may be used to change how
       processes are distributed across  nodes  and  cpus.  For  instance,  by
       specifying both the number of processes and number of nodes on which to
        run, the number of processes per node is  implied.  However,  if  the
        number of CPUs per process is more important, then  the  number  of
        processes (-n) and the number of CPUs per  process  (-c)  should  be
        specified.

       srun will refuse to  allocate more than  one  process  per  CPU  unless
       --overcommit (-O) is also specified.

       srun will attempt to meet the above specifications "at a minimum." That
       is, if 16 nodes are requested for 32 processes, and some nodes  do  not
       have 2 CPUs, the allocation of nodes will be increased in order to meet
       the demand for CPUs. In other words, a minimum of 16  nodes  are  being
       requested.  However,  if  16 nodes are requested for 15 processes, srun
       will consider this an error, as  15  processes  cannot  run  across  16
       nodes.
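        The "at a minimum" rule above can be sketched as follows (an
        illustration of the worked example, not SLURM source code):

```python
# Sketch of the "at a minimum" rule: the node count grows if the
# requested nodes cannot supply one CPU per task, and requesting
# more nodes than tasks is an error.
import math

def nodes_needed(ntasks, requested_nodes, cpus_per_node):
    if requested_nodes > ntasks:
        raise ValueError("%d tasks cannot run across %d nodes"
                         % (ntasks, requested_nodes))
    # Grow the allocation until it can hold one CPU per task.
    return max(requested_nodes, math.ceil(ntasks / cpus_per_node))

print(nodes_needed(32, 16, 1))  # 32: single-CPU nodes force a larger allocation
print(nodes_needed(32, 16, 2))  # 16: two CPUs per node suffice
```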

       IO Redirection

       By  default, stdout and stderr will be redirected from all tasks to the
       stdout and stderr of srun,  and  stdin  will  be  redirected  from  the
       standard  input  of  srun  to all remote tasks.  If stdin is only to be
       read by a subset of the spawned tasks, specifying a file to  read  from
       rather than forwarding stdin from the srun command may be preferable as
       it avoids moving and storing data that will never be read.  For  OS  X,
       the poll() function does not support stdin, so input from a terminal is
       not possible.  This behavior may be changed with the --output, --error,
       and --input (-o, -e, -i) options. Valid format specifications for these
       options are

        all       stdout and stderr are redirected from all tasks  to  srun.
                  stdin is  broadcast  to  all  remote  tasks.  (This is the
                  default behavior.)

       none      stdout and stderr are not received from any task.   stdin  is
                 not sent to any task (stdin is closed).

       taskid    stdout  and/or  stderr are redirected from only the task with
                 relative id equal to taskid, where  0  <=  taskid  <  ntasks,
                 where  ntasks is the total number of tasks in the current job
                 step.  stdin is redirected from the stdin  of  srun  to  this
                 same  task.   This file will be written on the node executing
                 the task.

       filename  srun will redirect stdout and/or stderr  to  the  named  file
                 from all tasks.  stdin will be redirected from the named file
                 and broadcast to all tasks in the job.  filename refers to  a
                 path  on the host that runs srun.  Depending on the cluster's
                 file system layout, this may result in the  output  appearing
                 in  different  places  depending on whether the job is run in
                 batch mode.

       format string
                 srun allows for a format string to be used  to  generate  the
                 named  IO  file described above. The following list of format
                 specifiers may be used in the format  string  to  generate  a
                 filename  that will be unique to a given jobid, stepid, node,
                 or task. In each case, the appropriate number  of  files  are
                 opened and associated with the corresponding tasks. Note that
                 any format string  containing  %t,  %n,  and/or  %N  will  be
                 written  on  the node executing the task rather than the node
                 where srun executes.

                 %J     jobid.stepid of the running job. (e.g. "128.0")

                 %j     jobid of the running job.

                 %s     stepid of the running job.

                 %N     short hostname. This will create a  separate  IO  file
                        per node.

                 %n     Node  identifier  relative to current job (e.g. "0" is
                        the first node of the running job) This will create  a
                        separate IO file per node.

                 %t     task  identifier  (rank) relative to current job. This
                        will create a separate IO file per task.

                 A number placed between  the  percent  character  and  format
                 specifier  may  be  used  to  zero-pad  the  result in the IO
                 filename. This number is  ignored  if  the  format  specifier
                 corresponds to  non-numeric data (%N for example).

                 Some  examples  of  how the format string may be used for a 4
                 task job step with a Job ID of 128  and  step  id  of  0  are
                 included below:

                 job%J.out      job128.0.out

                 job%4j.out     job0128.out

                 job%j-%2t.out  job128-00.out, job128-01.out, ...
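                 For instance, per-task output files with zero-padded names can
                 be requested as sketched below (hostname stands in for a  real
                 application; the zero-padding behaves like  printf's  0  flag
                 with the given field width):

```shell
# Each task writes its stdout to its own file; for job ID 128,
#   srun -n4 -o job%j-%2t.out hostname
# produces job128-00.out, job128-01.out, job128-02.out, job128-03.out.

# The zero-padding matches printf's 0-flag with the given width:
printf 'job%04d.out\n' 128        # job0128.out, as with "job%4j.out"
printf 'job128-%02d.out\n' 0 1    # job128-00.out and job128-01.out
```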

INPUT ENVIRONMENT VARIABLES

       Some  srun  options  may  be  set  via  environment  variables.   These
       environment variables, along  with  their  corresponding  options,  are
       listed  below.   Note:  Command line options will always override these
       settings.
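       For example (a sketch; hostname stands in for a real application):

```shell
# Request 8 tasks via the environment rather than the -n option:
export SLURM_NTASKS=8
# srun hostname        # would launch 8 tasks
# srun -n4 hostname    # the command line overrides: launches 4 tasks
```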

       PMI_FANOUT            This is used exclusively  with  PMI  (MPICH2  and
                             MVAPICH2)   and   controls  the  fanout  of  data
                             communications. The srun command  sends  messages
                             to application programs (via the PMI library) and
                             those applications may be called upon to  forward
                             that  data  to  up  to  this number of additional
                             tasks. Higher values offload work from  the  srun
                             command  to  the applications and likely increase
                             the vulnerability to failures.  The default value
                             is 32.

       PMI_FANOUT_OFF_HOST   This  is  used  exclusively  with PMI (MPICH2 and
                             MVAPICH2)  and  controls  the  fanout   of   data
                             communications.   The srun command sends messages
                             to application programs (via the PMI library) and
                             those  applications may be called upon to forward
                             that data to additional tasks. By  default,  srun
                             sends  one  message per host and one task on that
                             host forwards the data to  other  tasks  on  that
                             host up to PMI_FANOUT.  If PMI_FANOUT_OFF_HOST is
                             defined, the user task may be required to forward
                             the  data  to  tasks  on  other  hosts.   Setting
                             PMI_FANOUT_OFF_HOST  may  increase   performance.
                             Since  more  work is performed by the PMI library
                             loaded by the user application, failures also can
                             be more common and more difficult to diagnose.

       PMI_TIME              This  is  used  exclusively  with PMI (MPICH2 and
                             MVAPICH2)   and    controls    how    much    the
                             communications  from  the  tasks  to the srun are
                             spread out in time in order to avoid overwhelming
                             the  srun command with work. The default value is
                             500 (microseconds) per task. On  relatively  slow
                             processors  or  systems with very large processor
                             counts (and large PMI data sets),  higher  values
                             may be required.

       SLURM_CONF            The location of the SLURM configuration file.

       SLURM_ACCOUNT         Same as -A, --account

       SLURM_ACCTG_FREQ      Same as --acctg-freq

       SLURM_CHECKPOINT      Same as --checkpoint

       SLURM_CHECKPOINT_DIR  Same as --checkpoint-dir

       SLURM_CONN_TYPE       Same as --conn-type

       SLURM_CPU_BIND        Same as --cpu_bind

       SLURM_CPUS_PER_TASK   Same as -c, --cpus-per-task

       SLURM_DEBUG           Same as -v, --verbose

       SLURMD_DEBUG          Same as -d, --slurmd-debug

       SLURM_DEPENDENCY      Same as -P, --dependency=<jobid>

       SLURM_DISABLE_STATUS  Same as -X, --disable-status

       SLURM_DIST_PLANESIZE  Same as -m plane

       SLURM_DISTRIBUTION    Same as -m, --distribution

       SLURM_EPILOG          Same as --epilog

       SLURM_EXCLUSIVE       Same as --exclusive

       SLURM_EXIT_ERROR      Specifies  the  exit  code generated when a SLURM
                             error occurs (e.g. invalid options).  This can be
                             used  by a script to distinguish application exit
                             codes from various SLURM error conditions.   Also
                             see SLURM_EXIT_IMMEDIATE.

       SLURM_EXIT_IMMEDIATE  Specifies   the  exit  code  generated  when  the
                             --immediate option is used and resources are  not
                             currently  available.   This  can  be  used  by a
                             script to distinguish application exit codes from
                             various   SLURM   error   conditions.   Also  see
                             SLURM_EXIT_ERROR.
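       A minimal sketch of a script using these two variables (the codes 200
       and 201 and the application name my_app are  arbitrary  choices  for
       this example):

```shell
#!/bin/sh
# Choose distinctive exit codes for SLURM error conditions.
export SLURM_EXIT_ERROR=200
export SLURM_EXIT_IMMEDIATE=201

classify() {
    # Map an exit code to its meaning under the settings above.
    case "$1" in
        "$SLURM_EXIT_ERROR")     echo "slurm-error" ;;
        "$SLURM_EXIT_IMMEDIATE") echo "no-resources" ;;
        *)                       echo "application:$1" ;;
    esac
}

# srun --immediate -n4 my_app   # hypothetical application binary
# classify $?
classify 201    # prints "no-resources"
classify 1      # prints "application:1"
```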

       SLURM_GEOMETRY        Same as -g, --geometry

       SLURM_JOB_NAME        Same as -J, --job-name except within an  existing
                             allocation,  in which case it is ignored to avoid
                             using the batch job's name as the  name  of  each
                             job step.

       SLURM_LABELIO         Same as -l, --label

       SLURM_MEM_BIND        Same as --mem_bind

       SLURM_MPI_TYPE        Same as --mpi

       SLURM_NETWORK         Same as --network

       SLURM_NNODES          Same as -N, --nodes

       SLURM_NTASKS_PER_CORE Same as --ntasks-per-core

       SLURM_NTASKS_PER_NODE Same as --ntasks-per-node

       SLURM_NTASKS_PER_SOCKET
                             Same as --ntasks-per-socket

       SLURM_NO_ROTATE       Same as -R, --no-rotate

       SLURM_NTASKS          Same as -n, --ntasks

       SLURM_OPEN_MODE       Same as --open-mode

       SLURM_OVERCOMMIT      Same as -O, --overcommit

       SLURM_PARTITION       Same as -p, --partition

       SLURM_PMI_KVS_NO_DUP_KEYS
                             If  set,  then  PMI  key-pairs  will  contain  no
                             duplicate keys.  This is the case for MPICH2  and
                             reduces  overhead  in  testing for duplicates for
                              improved performance.

       SLURM_PROLOG          Same as --prolog

       SLURM_QOS             Same as --qos

       SLURM_REMOTE_CWD      Same as -D, --chdir=

       SLURM_RESTART_DIR     Same as --restart-dir

       SLURM_RESV_PORTS      Same as --resv-ports

       SLURM_SIGNAL          Same as --signal

       SLURM_STDERRMODE      Same as -e, --error

       SLURM_STDINMODE       Same as -i, --input

       SLURM_STDOUTMODE      Same as -o, --output

       SLURM_TASK_EPILOG     Same as --task-epilog

       SLURM_TASK_PROLOG     Same as --task-prolog

       SLURM_THREADS         Same as -T, --threads

       SLURM_TIMELIMIT       Same as -t, --time

       SLURM_UNBUFFEREDIO    Same as -u, --unbuffered

       SLURM_WAIT            Same as -W, --wait

       SLURM_WCKEY           Same as -W, --wckey

       SLURM_WORKING_DIR     Same as -D, --chdir

OUTPUT ENVIRONMENT VARIABLES

       srun will set some environment variables  in  the  environment  of  the
       executing  tasks  on  the  remote  compute  nodes.   These  environment
       variables are:

       BASIL_RESERVATION_ID  The  reservation  ID  on  Cray  systems   running
                             ALPS/BASIL only.

       SLURM_CHECKPOINT_IMAGE_DIR
                             Directory  into which checkpoint images should be
                             written if specified on the execute line.

       SLURM_CPU_BIND_VERBOSE
                             --cpu_bind verbosity (quiet,verbose).

       SLURM_CPU_BIND_TYPE   --cpu_bind type (none,rank,map_cpu:,mask_cpu:)

       SLURM_CPU_BIND_LIST   --cpu_bind map or mask  list  (<list  of  IDs  or
                             masks for this node>)

       SLURM_CPUS_ON_NODE    Count  of processors available to the job on this
                             node.  Note the  select/linear  plugin  allocates
                             entire  nodes to jobs, so the value indicates the
                             total  count  of  CPUs  on  the  node.   For  the
                             select/cons_res plugin, this number indicates the
                             number of cores on this  node  allocated  to  the
                             job.
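                              A common use is sizing a threaded application to
                              its allocation;  a  sketch  (my_openmp_app  is  a
                              hypothetical binary):

```shell
#!/bin/sh
# Run under srun; SLURM_CPUS_ON_NODE is set in each task's
# environment.  Default to 1 when running outside of SLURM.
OMP_NUM_THREADS=${SLURM_CPUS_ON_NODE:-1}
export OMP_NUM_THREADS
echo "using $OMP_NUM_THREADS OpenMP threads"
# exec ./my_openmp_app    # hypothetical application binary
```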

       SLURM_GTIDS           Global  task  IDs  running  on  this  node.  Zero
                             origin and comma separated.

       SLURM_JOB_DEPENDENCY  Set to value of the --dependency option.

       SLURM_JOB_ID (and SLURM_JOBID for backwards compatibility)
                             Job id of the executing job

       SLURM_LAUNCH_NODE_IPADDR
                             IP address of the node from which the task launch
                             was initiated (where the srun command ran from)

       SLURM_LOCALID         Node local task ID for the process within a job

       SLURM_MEM_BIND_VERBOSE
                             --mem_bind verbosity (quiet,verbose).

       SLURM_MEM_BIND_TYPE   --mem_bind type (none,rank,map_mem:,mask_mem:)

       SLURM_MEM_BIND_LIST   --mem_bind  map  or  mask  list  (<list of IDs or
                             masks for this node>)

       SLURM_NNODES          Total number  of  nodes  in  the  job's  resource
                             allocation

       SLURM_NODEID          The relative node ID of the current node

       SLURM_NODELIST        List of nodes allocated to the job

       SLURM_NTASKS          Total number of processes in the current job

       SLURM_PRIO_PROCESS    The  scheduling priority (nice value) at the time
                             of job submission.  This value is  propagated  to
                             the spawned processes.

       SLURM_PROCID          The  MPI  rank  (or  relative  process ID) of the
                             current process

       SLURM_STEPID          The step ID of the current job

       SLURM_SUBMIT_DIR      The directory from which srun was invoked.

       SLURM_TASK_PID        The process ID of the task being started.

       SLURM_TASKS_PER_NODE  Number of tasks to be  initiated  on  each  node.
                             Values  are comma separated and in the same order
                             as SLURM_NODELIST.  If two  or  more  consecutive
                             nodes are to have the same task count, that count
                             is followed by "(x#)" where "#" is the repetition
                              count.     For     example,     the     value
                              "SLURM_TASKS_PER_NODE=2(x3),1" indicates that
                              the first three nodes will  each  execute  two
                              tasks and the fourth node will execute one task.
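       The compressed notation can be expanded with  ordinary  shell  string
       handling; a sketch (the function name is our own):

```shell
# Expand e.g. "2(x3),1" into the per-node list "2 2 2 1".
expand_tasks_per_node() {
    out=""
    oldifs=$IFS; IFS=','
    for field in $1; do
        case "$field" in
            *"(x"*)                            # count(xrepetitions)
                count=${field%%"("*}
                reps=${field#*"(x"}; reps=${reps%")"}
                i=0
                while [ "$i" -lt "$reps" ]; do
                    out="$out $count"
                    i=$((i + 1))
                done ;;
            *)  out="$out $field" ;;
        esac
    done
    IFS=$oldifs
    echo "${out# }"
}

expand_tasks_per_node "2(x3),1"    # prints "2 2 2 1"
```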

       SLURM_TOPOLOGY_ADDR   This  is  set  only  if  the   system   has   the
                             topology/tree  plugin configured.  The value will
                              be set to the names of the network switches  that
                             involved  in  the  job's  communications from the
                             system's top level switch down to the leaf switch
                             and  ending  with  node name. A period is used to
                             separate each hardware component name.

       SLURM_TOPOLOGY_ADDR_PATTERN
                             This  is  set  only  if  the   system   has   the
                             topology/tree  plugin configured.  The value will
                              be  set  to  the  component  types  listed   in
                             SLURM_TOPOLOGY_ADDR.    Each  component  will  be
                             identified  as  either  "switch"  or  "node".   A
                             period   is   used   to  separate  each  hardware
                             component type.

       MPIRUN_NOALLOCATE     Do not allocate a  block  on  Blue  Gene  systems
                             only.

       MPIRUN_NOFREE         Do not free a block on Blue Gene systems only.

       MPIRUN_PARTITION      The block name on Blue Gene systems only.

SIGNALS AND ESCAPE SEQUENCES

       Signals  sent  to  the  srun command are automatically forwarded to the
       tasks it is controlling with a  few  exceptions.  The  escape  sequence
       <control-c> will report the state of all tasks associated with the srun
       command. If <control-c> is entered twice within one  second,  then  the
       associated  SIGINT  signal  will be sent to all tasks and a termination
       sequence will be entered sending SIGCONT, SIGTERM, and SIGKILL  to  all
       spawned  tasks.   If  a third <control-c> is received, the srun program
       will be terminated without waiting for remote tasks to  exit  or  their
       I/O to complete.
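       A task script can catch the forwarded SIGTERM to clean up  before  the
       final SIGKILL arrives; a minimal sketch (my_app  is  a  hypothetical
       binary):

```shell
#!/bin/sh
# Handle the SIGTERM that srun forwards during its termination
# sequence (SIGCONT, SIGTERM, then SIGKILL).
cleanup() {
    echo "caught SIGTERM, cleaning up" >&2
    exit 143                 # conventional 128 + 15 for SIGTERM
}
trap cleanup TERM
# ./my_app &                 # hypothetical application binary
# wait $!
```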

       The escape sequence <control-z> is presently ignored. Our intent is for
       this to put the srun command into a mode where  various  special  actions
       may be invoked.

MPI SUPPORT

       MPI  use  depends  upon  the  type  of MPI being used.  There are three
       fundamentally different modes of operation used by  these  various  MPI
       implementations.

       1.  SLURM  directly  launches  the tasks and performs initialization of
       communications (Quadrics MPI, MPICH2, MPICH-GM, MVAPICH,  MVAPICH2  and
       some MPICH1 modes). For example: "srun -n16 a.out".

       2.  SLURM  creates  a  resource  allocation for the job and then mpirun
       launches tasks using SLURM's infrastructure (OpenMPI,  LAM/MPI,  HP-MPI
       and some MPICH1 modes).

       3.  SLURM  creates  a  resource  allocation for the job and then mpirun
       launches tasks using some mechanism other than SLURM, such  as  SSH  or
       RSH (BlueGene MPI and some MPICH1 modes).  These  tasks  are  initiated
       outside of SLURM's monitoring or  control.  SLURM's  epilog  should  be
       configured   to   purge  these  tasks  when  the  job's  allocation  is
       relinquished.

       See  https://computing.llnl.gov/linux/slurm/mpi_guide.html   for   more
       information on the use of these various MPI implementations with SLURM.

MULTIPLE PROGRAM CONFIGURATION

       Comments  in the configuration file must have a "#" in column one.  The
       configuration file contains the following  fields  separated  by  white
       space:

       Task rank
              One  or  more  task  ranks  to use this configuration.  Multiple
              values may be comma separated.  Ranges may be indicated with two
              numbers separated with a '-' with the smaller number first (e.g.
              "0-4" and not "4-0").  To indicate all tasks, specify a rank  of
              '*'  (in  which  case  you  probably  should  not  be using this
              option).  If an attempt is made to initiate a task for which  no
              executable  program is defined, the following error message will
              be produced "No executable program specified for this task".

       Executable
              The name of the program to  execute.   May  be  fully  qualified
              pathname if desired.

       Arguments
              Program  arguments.   The  expression "%t" will be replaced with
              the task's number.  The expression "%o" will  be  replaced  with
              the task's offset within this range (e.g. a configured task rank
              value of "1-5" would  have  offset  values  of  "0-4").   Single
              quotes   may  be  used  to  avoid  having  the  enclosed  values
              interpreted.  This field is optional.

       For example:
       ###################################################################
       # srun multiple program configuration file
       #
       # srun -n8 -l --multi-prog silly.conf
       ###################################################################
       4-6       hostname
       1,7       echo  task:%t
       0,2-3     echo  offset:%o

       > srun -n8 -l --multi-prog silly.conf
       0: offset:0
       1: task:1
       2: offset:1
       3: offset:2
       4: linux15.llnl.gov
       5: linux16.llnl.gov
       6: linux17.llnl.gov
       7: task:7

EXAMPLES

       This simple example demonstrates the execution of the command  hostname
       in  eight tasks. At least eight processors will be allocated to the job
       (the same as the task count) on however  many  nodes  are  required  to
       satisfy the request. The output of each task will be preceded  by  its
       task number.  (The machine "dev" in the example below has  a  total  of
       two CPUs per node)

       > srun -n8 -l hostname
       0: dev0
       1: dev0
       2: dev1
       3: dev1
       4: dev2
       5: dev2
       6: dev3
       7: dev3

       The  srun -r option is used within a job script to run two job steps on
       disjoint nodes in the  following  example.  The  script  is  run  using
       allocate mode instead of as a batch job in this case.

       > cat test.sh
       #!/bin/sh
       echo $SLURM_NODELIST
       srun -lN2 -r2 hostname
       srun -lN2 hostname

       > salloc -N4 test.sh
       dev[7-10]
       0: dev9
       1: dev10
       0: dev7
       1: dev8

       The following script runs two job steps in parallel within an allocated
       set of nodes.

       > cat test.sh
       #!/bin/bash
       srun -lN2 -n4 -r 2 sleep 60 &
       srun -lN2 -r 0 sleep 60 &
       sleep 1
       squeue
       squeue -s
       wait

       > salloc -N4 test.sh
         JOBID PARTITION     NAME     USER  ST      TIME  NODES NODELIST
         65641     batch  test.sh   grondo   R      0:01      4 dev[7-10]

       STEPID     PARTITION     USER      TIME NODELIST
       65641.0        batch   grondo      0:01 dev[7-8]
       65641.1        batch   grondo      0:01 dev[9-10]

       This example demonstrates how one executes a simple MPICH job.  We  use
       srun  to  build  a list of machines (nodes) to be used by mpirun in its
       required format. A sample command line and the script  to  be  executed
       follow.

       > cat test.sh
       #!/bin/sh
       MACHINEFILE="nodes.$SLURM_JOB_ID"

       # Generate Machinefile for mpich such that hosts are in the same
       #  order as if run via srun
       #
       srun -l /bin/hostname | sort -n | awk '{print $2}' > $MACHINEFILE

       # Run using generated Machine file:
       mpirun -np $SLURM_NTASKS -machinefile $MACHINEFILE mpi-app

       rm $MACHINEFILE

       > salloc -N2 -n4 test.sh

       This  simple  example  demonstrates  the execution of different jobs on
       different nodes in the same srun.  You can do this for  any  number  of
       nodes or any number of jobs.  The executables are  selected  according
       to the SLURM_NODEID environment variable, which starts at 0  and  runs
       up to the number of nodes specified on the srun command line.

       > cat test.sh
       case $SLURM_NODEID in
           0) echo "I am running on "
              hostname ;;
           1) hostname
              echo "is where I am running" ;;
       esac

       > srun -N2 test.sh
       dev0
       is where I am running
       I am running on
       dev1

       This  example  demonstrates use of multi-core options to control layout
       of tasks.  We request that four sockets per  node  and  two  cores  per
       socket be dedicated to the job.

       > srun -N2 -B 4-4:2-2 a.out

       This  example shows a script in which Slurm is used to provide resource
       management for a job by executing the various job steps  as  processors
       become available for their dedicated use.

       > cat my.script
       #!/bin/bash
       srun --exclusive -n4 prog1 &
       srun --exclusive -n3 prog2 &
       srun --exclusive -n1 prog3 &
       srun --exclusive -n1 prog4 &
       wait

COPYING

       Copyright  (C)  2006-2007  The Regents of the University of California.
       Copyright (C) 2008-2010 Lawrence Livermore National Security.  Produced
       at   Lawrence   Livermore   National   Laboratory   (cf,   DISCLAIMER).
       CODE-OCEC-09-009. All rights reserved.

       This file is  part  of  SLURM,  a  resource  management  program.   For
       details, see <https://computing.llnl.gov/linux/slurm/>.

       SLURM  is free software; you can redistribute it and/or modify it under
       the terms of the GNU General Public License as published  by  the  Free
       Software  Foundation;  either  version  2  of  the License, or (at your
       option) any later version.

       SLURM is distributed in the hope that it will be  useful,  but  WITHOUT
       ANY  WARRANTY;  without even the implied warranty of MERCHANTABILITY or
       FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General  Public  License
       for more details.

SEE ALSO

       salloc(1),  sattach(1),  sbatch(1), sbcast(1), scancel(1), scontrol(1),
       squeue(1), slurm.conf(5), sched_setaffinity(2), numa(3), getrlimit(2)