Provided by: slurm-llnl_2.6.5-1_amd64

NAME

       srun - Run parallel jobs

SYNOPSIS

       srun [OPTIONS...]  executable [args...]

DESCRIPTION

       Run a parallel job on a cluster managed by SLURM.  If necessary, srun will first create a
       resource allocation in which to run the parallel job.

       The following document describes the influence of various options on the allocation of
       cpus to jobs and tasks.
       http://slurm.schedmd.com/cpu_management.html

OPTIONS

       -A, --account=<account>
              Charge  resources  used  by  this  job  to  specified  account.   The account is an
              arbitrary string. The account name may be changed after job  submission  using  the
              scontrol command.

       --acctg-freq
              Define  the  job  accounting and profiling sampling intervals.  This can be used to
              override  the  JobAcctGatherFrequency  parameter  in  SLURM's  configuration  file,
              slurm.conf.  The supported format is as follows:

              --acctg-freq=<datatype>=<interval>
                          where  <datatype>=<interval>  specifies  the task sampling interval for
                          the jobacct_gather plugin or a sampling interval for a  profiling  type
                          by    the   acct_gather_profile   plugin.   Multiple,   comma-separated
                          <datatype>=<interval> intervals may be specified.  Supported  datatypes
                          are as follows:

                          task=<interval>
                                 where  <interval>  is  the task sampling interval in seconds for
                                 the  jobacct_gather  plugins  and  for  task  profiling  by  the
                                 acct_gather_profile plugin.

                          energy=<interval>
                                 where  <interval> is the sampling interval in seconds for energy
                                 profiling using the acct_gather_energy plugin.

                          network=<interval>
                                 where  <interval>  is  the  sampling  interval  in  seconds  for
                                 InfiniBand profiling using the acct_gather_infiniband plugin.

                          filesystem=<interval>
                                 where  <interval>  is  the  sampling  interval  in  seconds  for
                                 filesystem profiling using the acct_gather_filesystem plugin.

              The default value for the task sampling interval is 30 seconds.  The default value for
              all other intervals is 0.  An interval of 0 disables sampling of the specified type.
              If the task sampling interval is 0, accounting information is collected only at job
              termination (reducing SLURM interference with the job).  Smaller (non-zero) values
              have a greater impact upon job performance, but a value of 30 seconds is not likely to
              be noticeable for applications having fewer than 10,000 tasks.

       -B, --extra-node-info=<sockets[:cores[:threads]]>
              Request  a  specific allocation of resources with details as to the number and type
              of computational resources  within  a  cluster:  number  of  sockets  (or  physical
              processors)  per node, cores per socket, and threads per core.  The total amount of
              resources being requested is the product of all of the terms.  Each value specified
              is  considered  a minimum.  An asterisk (*) can be used as a placeholder indicating
              that all available resources of that type are to be utilized.  As with  nodes,  the
              individual levels can also be specified in separate options if desired:
                  --sockets-per-node=<sockets>
                  --cores-per-socket=<cores>
                  --threads-per-core=<threads>
              If the task/affinity plugin is enabled, then specifying an allocation in this manner
              also sets a default --cpu_bind option of threads if the -B option specifies a thread
              count, otherwise an option of cores if a core count is specified, otherwise an option
              of sockets.  If SelectType is configured to select/cons_res, it must have a parameter
              of CR_Core, CR_Core_Memory, CR_Socket, or CR_Socket_Memory for this option to be
              honored.  This option is not supported on BlueGene systems (select/bluegene plugin is
              configured).  If not specified, the scontrol show job command will display
              'ReqS:C:T=*:*:*'.
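
              The product rule described above can be checked with a little shell arithmetic.
              This is only a sketch: cpus_requested is a hypothetical helper, not part of SLURM,
              and it assumes all three levels are given as plain integers (no asterisk
              placeholder).  Because each value is a minimum, the result is the minimum amount of
              resources requested.

```shell
# cpus_requested SOCKETS:CORES:THREADS
# Hypothetical helper: prints the product of the three terms of a -B value.
cpus_requested() {
    spec=$1
    sockets=${spec%%:*}     # text before the first colon
    rest=${spec#*:}         # text after the first colon
    cores=${rest%%:*}       # text before the next colon
    threads=${rest#*:}      # text after it
    echo $(( sockets * cores * threads ))
}

cpus_requested 2:4:2        # 2 sockets x 4 cores x 2 threads, prints 16
```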

       --begin=<time>
              Defer initiation of this job until the specified time.  It  accepts  times  of  the
              form  HH:MM:SS  to run a job at a specific time of day (seconds are optional).  (If
              that time is already past,  the  next  day  is  assumed.)   You  may  also  specify
              midnight, noon, or teatime (4pm) and you can have a time-of-day suffixed with AM or
              PM for running in the morning or the evening.  You can also say what day the job
              will be run, by specifying a date of the form MMDDYY, MM/DD/YY, or YYYY-MM-DD.
              Combine date and time using the following format: YYYY-MM-DD[THH:MM[:SS]].  You can
              also give times like now + count time-units, where the time-units can be seconds
              (default), minutes, hours, days, or weeks.  You can tell SLURM to run the job today
              with the keyword today, and to run the job tomorrow with the keyword tomorrow.
              The value may be changed after job submission  using  the  scontrol  command.   For
              example:
                 --begin=16:00
                 --begin=now+1hour
                 --begin=now+60           (seconds by default)
                 --begin=2010-01-20T12:34:00

              Notes on date/time specifications:
               -  Although  the  'seconds' field of the HH:MM:SS time specification is allowed by
              the code, note that the poll time of the SLURM scheduler is not precise  enough  to
              guarantee  dispatch  of  the  job on the exact second.  The job will be eligible to
              start on the next poll following  the  specified  time.  The  exact  poll  interval
              depends on the SLURM scheduler (e.g., 60 seconds with the default sched/builtin).
               - If no time (HH:MM:SS) is specified, the default is (00:00:00).
               -  If  a  date  is specified without a year (e.g., MM/DD) then the current year is
              assumed, unless the combination of MM/DD and HH:MM:SS has already passed  for  that
              year, in which case the next year is used.

       --checkpoint=<time>
              Specifies  the  interval between creating checkpoints of the job step.  By default,
              the job step will have no checkpoints created.   Acceptable  time  formats  include
              "minutes",      "minutes:seconds",      "hours:minutes:seconds",      "days-hours",
              "days-hours:minutes" and "days-hours:minutes:seconds".

       --checkpoint-dir=<directory>
              Specifies the directory into which the job  or  job  step's  checkpoint  should  be
              written  (used  by  the  checkpoint/blcr  and  checkpoint/xlch  plugins only).  The
              default value is the current working directory.  Checkpoint files will  be  of  the
              form "<job_id>.ckpt" for jobs and "<job_id>.<step_id>.ckpt" for job steps.
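
              As a small illustration of the naming scheme just described, the sketch below
              prints the file name for a job or a job step.  ckpt_name is a hypothetical helper
              and the IDs are made up; only the two name patterns come from the text above.

```shell
# ckpt_name JOB_ID [STEP_ID]
# Hypothetical helper: prints the checkpoint file name for a job,
# or for a job step when a step ID is also given.
ckpt_name() {
    if [ $# -eq 2 ]; then
        printf '%s.%s.ckpt\n' "$1" "$2"
    else
        printf '%s.ckpt\n' "$1"
    fi
}

ckpt_name 1234          # job:      1234.ckpt
ckpt_name 1234 0        # job step: 1234.0.ckpt
```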

       --comment=<string>
              An arbitrary comment.

       -C, --constraint=<list>
              Nodes  can  have  features  assigned to them by the SLURM administrator.  Users can
              specify which of these features are required by  their  job  using  the  constraint
              option.   Only  nodes  having features matching the job constraints will be used to
              satisfy the request.  Multiple constraints may be specified with AND, OR, exclusive
              OR, resource counts, etc.  Supported constraint options include:

              Single Name
                     Only  nodes  which  have  the  specified feature will be used.  For example,
                     --constraint="intel"

              Node Count
                     A request can specify the number  of  nodes  needed  with  some  feature  by
                     appending  an  asterisk  and  count  after  the  feature  name.  For example
                     "--nodes=16 --constraint=graphics*4 ..."  indicates that the job requires 16
                     nodes and that at least four of those nodes must have the feature "graphics."

              AND    Only nodes with all of the specified features will be used.  The ampersand is
                     used for an AND operator.  For example, --constraint="intel&gpu"

              OR     Only nodes with at least one of the specified features will be used.  The
                     vertical bar is used for an OR operator.  For example,
                     --constraint="intel|amd"

              Exclusive OR
                     If only one of a set of possible options should be used  for  all  allocated
                     nodes,  then  use  the  OR  operator  and  enclose the options within square
                     brackets.  For example:  "--constraint=[rack1|rack2|rack3|rack4]"  might  be
                     used  to  specify  that  all nodes must be allocated on a single rack of the
                     cluster, but any of those four racks can be used.

              Multiple Counts
                     Specific counts of multiple resources may be  specified  by  using  the  AND
                     operator  and  enclosing  the  options within square brackets.  For example:
                     "--constraint=[rack1*2&rack2*4]" might be used to  specify  that  two  nodes
                     must be allocated from nodes with the feature of "rack1" and four nodes must
                     be allocated from nodes with the feature "rack2".

       WARNING: When srun is executed from within salloc or sbatch, the constraint value can only
       contain a single feature name. None of the other operators are currently supported for job
       steps.
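
              The ampersand, vertical bar, asterisk, and square brackets are also special
              characters in most shells, so it is safest to quote the constraint value.  A brief
              sketch, reusing the feature names from the examples above; ./my_app is a
              placeholder program and the commands are only printed, not run:

```shell
# Single quotes keep the shell from interpreting & | * [ ] itself.
and_constraint='intel&gpu'
xor_constraint='[rack1|rack2|rack3|rack4]'

# Print the command lines instead of running them; ./my_app is hypothetical.
echo "srun --constraint='${and_constraint}' ./my_app"
echo "srun --nodes=4 --constraint='${xor_constraint}' ./my_app"
```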

       --contiguous
              If set, then the allocated nodes must form a contiguous set.  Not honored with  the
              topology/tree  or  topology/3d_torus  plugins,  both  of  which can modify the node
              ordering.  Not honored for a job step's allocation.

       --cores-per-socket=<cores>
              Restrict node selection to nodes with at least the specified number  of  cores  per
              socket.  See additional information under -B option above when task/affinity plugin
              is enabled.

       --cpu_bind=[{quiet,verbose},]type
              Bind tasks to CPUs.  Used only when the  task/affinity  or  task/cgroup  plugin  is
              enabled.   The  configuration parameter TaskPluginParam may override these options.
              For example, if TaskPluginParam is configured to bind to cores, your job  will  not
              be  able  to  bind  tasks  to  sockets.   NOTE:  To have SLURM always report on the
              selected CPU binding for all commands executed in a shell, you can  enable  verbose
              mode by setting the SLURM_CPU_BIND environment variable value to "verbose".

              The  following  informational  environment  variables are set when --cpu_bind is in
              use:
                   SLURM_CPU_BIND_VERBOSE
                   SLURM_CPU_BIND_TYPE
                   SLURM_CPU_BIND_LIST

              See the ENVIRONMENT VARIABLES section  for  a  more  detailed  description  of  the
              individual SLURM_CPU_BIND* variables.

              When using --cpus-per-task to run multithreaded tasks, be aware that CPU binding is
              inherited from the parent of the process.  This means that the  multithreaded  task
              should  either  specify or clear the CPU binding itself to avoid having all threads
              of the multithreaded task use the same mask/CPU as the parent.  Alternatively,  fat
              masks  (masks  which specify more than one allowed CPU) could be used for the tasks
              in order to provide multiple CPUs for the multithreaded tasks.

              By default, a job step has access to every CPU allocated to  the  job.   To  ensure
              that distinct CPUs are allocated to each job step, use the --exclusive option.

              If  the job step allocation includes an allocation with a number of sockets, cores,
              or threads equal to the number of tasks to  be  started  then  the  tasks  will  by
              default  be bound to the appropriate resources (auto binding). Disable this mode of
              operation by explicitly setting "--cpu_bind=none".

              Note that a job step can be allocated different numbers of CPUs on each node or  be
              allocated  CPUs  not  starting at location zero. Therefore one of the options which
              automatically generate the task binding is recommended.  Explicitly specified masks
              or  bindings  are only honored when the job step has been allocated every available
              CPU on the node.

              Binding a task to a NUMA locality domain means to bind the task to the set of  CPUs
              that  belong  to  the NUMA locality domain or "NUMA node".  If NUMA locality domain
              options are used on systems with no NUMA support, then each socket is considered  a
              locality domain.

              Supported options include:

              q[uiet]
                     Quietly bind before task runs (default)

              v[erbose]
                     Verbosely report binding before task runs

              no[ne] Do not bind tasks to CPUs (default unless auto binding is applied)

              rank   Automatically  bind  by task rank.  Task zero is bound to socket (or core or
                     thread) zero, etc.  Not supported unless the entire node is allocated to the
                     job.

              map_cpu:<list>
                     Bind   by   mapping   CPU   IDs  to  tasks  as  specified  where  <list>  is
                     <cpuid1>,<cpuid2>,...<cpuidN>.  CPU IDs are interpreted  as  decimal  values
                     unless  they  are  preceded  with '0x' in which case they are interpreted as
                     hexadecimal values.  Not supported unless the entire node  is  allocated  to
                     the  job.   This  option  is  currently  only supported by the task/affinity
                     plugin.

              mask_cpu:<list>
                     Bind  by  setting  CPU  masks  on  tasks  as  specified  where   <list>   is
                     <mask1>,<mask2>,...<maskN>.  CPU masks are always interpreted as hexadecimal
                     values but can be preceded with an optional '0x'.  Not supported unless  the
                     entire  node  is  allocated  to  the  job.   This  option  is currently only
                     supported by the task/affinity plugin.

              rank_ldom
                     Bind to a NUMA locality domain by rank

              map_ldom:<list>
                     Bind by mapping NUMA locality domain IDs to tasks as specified where  <list>
                     is  <ldom1>,<ldom2>,...<ldomN>.   The locality domain IDs are interpreted as
                     decimal values unless they are preceded with '0x' in  which  case  they  are
                     interpreted  as hexadecimal values.  Not supported unless the entire node is
                     allocated to the job.

              mask_ldom:<list>
                     Bind by setting NUMA locality domain  masks  on  tasks  as  specified  where
                     <list> is <mask1>,<mask2>,...<maskN>.  NUMA locality domain masks are always
                     interpreted as hexadecimal values but can be preceded with an optional '0x'.
                     Not supported unless the entire node is allocated to the job.

              sockets
                     Automatically generate masks binding tasks to sockets.  Only the CPUs on the
                     socket which have been allocated to the job will be used.  If the number  of
                     tasks  differs  from  the  number  of  allocated  sockets this can result in
                     sub-optimal binding.

              cores  Automatically generate masks binding tasks to cores.  If the number of tasks
                     differs  from  the  number of allocated cores this can result in sub-optimal
                     binding.

              threads
                     Automatically generate masks binding tasks to threads.   If  the  number  of
                     tasks  differs  from  the  number  of  allocated  threads this can result in
                     sub-optimal binding.

              ldoms  Automatically generate masks binding tasks to NUMA locality domains.  If the
                     number  of  tasks differs from the number of allocated locality domains this
                     can result in sub-optimal binding.

              help   Show help message for cpu_bind
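
              A CPU mask for mask_cpu has one bit set per allowed CPU, with bit N standing for
              CPU ID N.  The sketch below shows the arithmetic; cpu_mask is a hypothetical helper,
              not a SLURM command, and it assumes CPU IDs small enough for the shell's integer
              arithmetic.

```shell
# cpu_mask CPU_ID...
# Hypothetical helper: prints a hexadecimal mask with one bit set per CPU ID.
cpu_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))
    done
    printf '0x%x\n' "$mask"
}

cpu_mask 0 1            # CPUs 0 and 1, prints 0x3
cpu_mask 2 3            # CPUs 2 and 3, prints 0xc
```

              Under these assumptions, the two results could be passed as
              --cpu_bind=mask_cpu:0x3,0xc so that the first mask covers CPUs 0-1 and the second
              covers CPUs 2-3, which is one way to build the "fat masks" mentioned above for
              multithreaded tasks.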

       --cpu-freq=<requested frequency in kilohertz>

              Request that the job step initiated by this srun be run at the requested  frequency
              if possible, on the cpus selected for the step on the compute node(s).  In addition
              to specifying a numerical frequency in kilohertz,  the  request  can  specify  low,
              medium,  or  high  for the value. "Low" will select the lowest available frequency,
              "high" will select the highest available frequency, while "medium" attempts to  set
              a  frequency  in  the middle of the available range. If the numeric value specified
              does not exactly match a legal available frequency, SLURM will attempt  to  pick  a
              legal frequency close to the request.

              The  following  informational  environment  variable  is  set  in the job step when
              --cpu-freq option is requested.
                      SLURM_CPU_FREQ_REQ

              This environment variable can also  be  used  to  supply  the  value  for  the  cpu
              frequency  request  if it is set when the 'srun' command is issued.  The --cpu-freq
              on the command  line  will  override  the  environment  variable  value.   See  the
              ENVIRONMENT VARIABLES section for a description of the SLURM_CPU_FREQ_REQ variable.

              NOTE: This parameter is treated as a request, not a requirement.  If the job step's
              node does not support setting the cpu frequency, or the requested value is  outside
              the  bounds  of  the  legal  frequencies,  an  error is logged, but the job step is
              allowed to continue.

              NOTE: Setting the frequency for just the cpus of the  job  step  implies  that  the
              tasks    are    confined    to    those   cpus.    If   task   confinement   (i.e.,
              TaskPlugin=task/affinity  or  TaskPlugin=task/cgroup  with   the   "ConstrainCores"
              option) is not configured, this parameter is ignored.

       -c, --cpus-per-task=<ncpus>
              Request  that  ncpus  be  allocated  per  process. This may be useful if the job is
              multithreaded and requires more than one CPU per task for optimal performance.  The
              default  is one CPU per process.  If -c is specified without -n, as many tasks will
              be allocated per node as possible while satisfying the -c restriction. For instance
              on  a  cluster  with 8 CPUs per node, a job request for 4 nodes and 3 CPUs per task
              may be allocated 3 or 6 CPUs per node (1  or  2  tasks  per  node)  depending  upon
              resource consumption by other jobs. Such a job may be unable to execute more than a
              total of 4 tasks.  This option may also be useful to spawn tasks without allocating
              resources to the job step from the job's allocation when running multiple job steps
              with the --exclusive option.

              WARNING: There are configurations and options interpreted differently  by  job  and
              job step requests which can result in inconsistencies for this option.  For example
              srun -c2 --threads-per-core=1 prog may allocate two cores for the job, but if  each
              of those cores contains two threads, the job allocation will include four CPUs. The
              job step allocation will then launch two threads per CPU for a total of two tasks.

              WARNING:  When  srun  is  executed  from  within  salloc  or  sbatch,   there   are
              configurations and options which can result in inconsistent allocations when -c has
              a value greater than -c on salloc or sbatch.
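
              The arithmetic behind the 8-CPU example above: the number of tasks that fit on a
              node is the number of CPUs available there divided by ncpus, rounded down.  A
              sketch with a hypothetical helper (tasks_per_node is not part of SLURM):

```shell
# tasks_per_node AVAILABLE_CPUS CPUS_PER_TASK
# Hypothetical helper: shell integer division rounds down, which is
# exactly the "as many tasks as possible" rule for one node.
tasks_per_node() {
    echo $(( $1 / $2 ))
}

tasks_per_node 8 3      # idle 8-CPU node: 2 tasks (6 CPUs used)
tasks_per_node 5 3      # 3 CPUs busy with other jobs: 1 task (3 CPUs used)
```

              With four nodes, the first case gives up to 8 tasks and the second only 4, which
              is why such a job may be unable to execute more than a total of 4 tasks.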

       -d, --dependency=<dependency_list>
              Defer the start of this job until the specified dependencies have been satisfied.
              <dependency_list> is of the form <type:job_id[:job_id][,type:job_id[:job_id]]>.  Many
              jobs can share the same dependency and these jobs may even belong to different
              users.  The value may be changed after job submission using the scontrol command.

              after:job_id[:jobid...]
                     This job can begin execution after the specified jobs have begun execution.

              afterany:job_id[:jobid...]
                     This job can begin execution after the specified jobs have terminated.

              afternotok:job_id[:jobid...]
                     This job can begin execution after the specified  jobs  have  terminated  in
                     some failed state (non-zero exit code, node failure, timed out, etc).

              afterok:job_id[:jobid...]
                     This  job  can  begin  execution  after the specified jobs have successfully
                     executed (ran to completion with an exit code of zero).

              expand:job_id
                     Resources allocated to this job should be used to expand the specified  job.
                     The  job  to  expand  must  share  the  same  QOS  (Quality  of Service) and
                     partition.  Gang scheduling of  resources  in  the  partition  is  also  not
                     supported.

              singleton
                     This  job can begin execution after any previously launched jobs sharing the
                     same job name and user have terminated.
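
              The dependency list syntax can be assembled mechanically.  The following sketch
              uses a hypothetical helper, dep_spec, and made-up job IDs; ./my_app is a
              placeholder and the resulting command is only printed, not run.

```shell
# dep_spec TYPE JOB_ID...
# Hypothetical helper: joins a dependency type and job IDs with colons.
dep_spec() {
    kind=$1
    shift
    spec=$kind
    for id in "$@"; do
        spec="$spec:$id"
    done
    printf '%s\n' "$spec"
}

ok=$(dep_spec afterok 1001 1002)      # afterok:1001:1002
any=$(dep_spec afterany 1003)         # afterany:1003

# Separate the individual dependencies with commas.
echo "srun --dependency=${ok},${any} ./my_app"
```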

       -D, --chdir=<path>
              Have the remote processes do a chdir to path before beginning execution. The
              default is to chdir to the current working directory of the srun process.

       -e, --error=<mode>
              Specify  how  stderr  is  to  be  redirected.  By default in interactive mode, srun
              redirects stderr to the same file as stdout,  if  one  is  specified.  The  --error
              option  is  provided  to  allow  stdout  and  stderr  to be redirected to different
              locations.  See IO Redirection below for  more  options.   If  the  specified  file
              already exists, it will be overwritten.

       -E, --preserve-env
              Pass  the  current  values  of  environment variables SLURM_NNODES and SLURM_NTASKS
              through to the executable, rather than computing them from command line parameters.

       --epilog=<executable>
              srun will run executable just after the  job  step  completes.   The  command  line
              arguments  for  executable  will  be the command and arguments of the job step.  If
              executable is "none", then no srun epilog will be run. This parameter overrides the
              SrunEpilog  parameter  in slurm.conf. This parameter is completely independent from
              the Epilog parameter in slurm.conf.

       --exclusive
              This option has two slightly different meanings for job and job  step  allocations.
              When  used  to  initiate  a  job,  the job allocation cannot share nodes with other
              running jobs.  This is the opposite of --share; whichever option is seen last on
              the command line will win. The default shared/exclusive behavior depends on system
              configuration, and the partition's Shared option takes precedence over the job's
              option.

              This  option  can  also  be  used  when initiating more than one job step within an
              existing resource allocation, where you want separate processors to be dedicated to
              each job step. If sufficient processors are not available to initiate the job step,
              it will be deferred. This can be thought of as providing  resource  management  for
              the job within its allocation. Note that all CPUs allocated to a job are available
              to each job step unless the --exclusive option is used plus task affinity is
              configured. Since resource management is provided by processor, the --ntasks option
              must be specified, but the following options should NOT be specified: --relative,
              --distribution=arbitrary.  See EXAMPLE below.

       --gid=<group>
              If  srun  is run as root, and the --gid option is used, submit the job with group's
              group access permissions.  group may be the group name or the numerical group ID.

       --gres=<list>
              Specifies a comma delimited list of generic consumable resources.   The  format  of
              each  entry  on  the  list  is  "name[:count]".  The name is that of the consumable
              resource.  The count is the number of those resources with a default  value  of  1.
              The  specified  resources will be allocated to the job on each node.  The available
              generic consumable resources is configurable by the system administrator.   A  list
              of available generic consumable resources will be printed and the command will exit
              if the option argument is "help".  Examples of use include "--gres=gpu:2,mic:1" and
              "--gres=help".  NOTE: By default, a job step is allocated all of the generic
              resources that have been allocated to the job. To change the behavior so that each job
              step  is  allocated  no  generic  resources,  explicitly set the value of --gres to
              specify zero counts for each generic resource  OR  set  "--gres=none"  OR  set  the
              SLURM_STEP_GRES environment variable to "none".

       -H, --hold
              Specify  the job is to be submitted in a held state (priority of zero).  A held job
              can now be released using scontrol to reset its priority  (e.g.  "scontrol  release
              <job_id>").

       -h, --help
              Display help information and exit.

       --hint=<type>
              Bind tasks according to application hints

              compute_bound
                     Select  settings  for  compute  bound  applications:  use  all cores in each
                     socket, one thread per core

              memory_bound
                     Select settings for memory bound applications: use only  one  core  in  each
                     socket, one thread per core

              [no]multithread
                     [don't]  use  extra  threads  with in-core multi-threading which can benefit
                     communication intensive applications

              help   show this help message

       -I, --immediate[=<seconds>]
              Exit if resources are not available within the time period specified.  If no
              argument  is  given,  resources  must  be  available immediately for the request to
              succeed.  By default,  --immediate  is  off,  and  the  command  will  block  until
              resources  become  available.  Since this option's argument is optional, for proper
              parsing the single letter option must be followed immediately with  the  value  and
              not include a space between them. For example "-I60" and not "-I 60".

       -i, --input=<mode>
              Specify how stdin is to be redirected.  By default, srun redirects stdin from the
              terminal to all tasks. See IO Redirection below for more options.  For OS X, the
              poll() function does not support stdin, so input from a terminal is not possible.

       -J, --job-name=<jobname>
              Specify  a  name  for the job. The specified name will appear along with the job id
              number when querying running jobs on  the  system.  The  default  is  the  supplied
              executable   program's   name.  NOTE:  This  information  may  be  written  to  the
              slurm_jobacct.log file. This file is space delimited so if a space is used  in  the
              job name it will cause problems in properly displaying the contents of the
              slurm_jobacct.log file when the sacct command is used.

       --jobid=<jobid>
              Initiate a job step under an already allocated job with job id jobid.  Using this
              option  will  cause  srun  to  behave  exactly  as  if the SLURM_JOB_ID environment
              variable was set.

       -K, --kill-on-bad-exit[=0|1]
              Controls whether or not to terminate a job if any task exits with a  non-zero  exit
              code.  If  this  option is not specified, the default action will be based upon the
              SLURM configuration parameter of KillOnBadExit. If this  option  is  specified,  it
              will  take  precedence  over  KillOnBadExit.  An  option  argument of zero will not
              terminate the job. A non-zero argument or  no  argument  will  terminate  the  job.
              Note:  This option takes precedence over the -W, --wait option to terminate the job
              immediately if a task exits  with  a  non-zero  exit  code.   Since  this  option's
              argument  is optional, for proper parsing the single letter option must be followed
              immediately with the value and not include a space between them. For example  "-K1"
              and not "-K 1".

       -k, --no-kill
              Do not automatically terminate a job if one of the nodes it has been allocated
              fails.  This option is only recognized on a job allocation, not for the submission
              of individual job steps.  The job will assume all responsibilities for
              fault-tolerance.  Tasks launched using this option will not be considered terminated
              (e.g.  -K,  --kill-on-bad-exit  and -W, --wait options will have no effect upon the
              job step).  The active job step (MPI job) will likely suffer  a  fatal  error,  but
              subsequent job steps may be run if this option is specified.  The default action is
              to terminate the job upon node failure.

       --launch-cmd
              Print external launch command instead of running job normally through  SLURM.  This
              option is only valid if using something other than the launch/slurm plugin.

       --launcher-opts=<options>
              Options  for  the  external launcher if using something other than the launch/slurm
              plugin.

       -l, --label
              Prepend task number to lines of stdout/err. Normally, stdout and stderr from remote
              tasks are line-buffered directly to the stdout and stderr of  srun.   The  --label
              option will prepend lines of output with the remote task id.

       -L, --licenses=<license>
              Specification of licenses (or  other  resources  available  on  all  nodes  of  the
              cluster)  which  must be allocated to this job.  License names can be followed by a
              colon and count (the default count is one).  Multiple license names should be comma
              separated (e.g.  "--licenses=foo:4,bar").
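
              As an illustration of the syntax above, the following sketch  splits  a  license
              specification into names and counts.  The function name  is  hypothetical  and  the
              code is not part of SLURM; it only mirrors the rules stated here  (comma  separated
              names, optional colon and count, default count of one).

```python
def parse_licenses(spec):
    """Parse a string such as "foo:4,bar" into {"foo": 4, "bar": 1}."""
    licenses = {}
    for item in spec.split(","):
        # partition() leaves count empty when no colon is present.
        name, _, count = item.strip().partition(":")
        if not name:
            raise ValueError("empty license name in %r" % spec)
        licenses[name] = int(count) if count else 1
    return licenses


print(parse_licenses("foo:4,bar"))
```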

       -m, --distribution=
              <block|cyclic|arbitrary|plane=<options>[:block|cyclic]>

              Specify  alternate distribution methods for remote processes.  This option controls
              the assignment of tasks to the nodes on which resources have  been  allocated,  and
              the distribution of those resources to tasks for binding (task affinity). The first
              distribution method (before the ":") controls the distribution of resources  across
              nodes.  The  optional  second  distribution  method  (after  the  ":") controls the
              distribution  of  resources  across  sockets  within  a  node.   Note   that   with
              select/cons_res,  the  number  of  cpus  allocated  on  each socket and node may be
              different. Refer to http://slurm.schedmd.com/mc_support.html for  more  information
              on resource allocation, assignment of tasks to nodes, and binding of tasks to CPUs.
              First distribution method:

              block  The  block  distribution  method  will  distribute tasks to a node such that
                     consecutive tasks share a node. For example, consider an allocation of three
                     nodes  each  with  two  cpus.  A  four-task  block distribution request will
                     distribute those tasks to the nodes with tasks one  and  two  on  the  first
                     node, task three on the second node, and task four on the third node.  Block
                     distribution is the default behavior if the  number  of  tasks  exceeds  the
                     number of allocated nodes.

              cyclic The  cyclic  distribution  method  will distribute tasks to a node such that
                     consecutive tasks are distributed over consecutive nodes (in  a  round-robin
                     fashion).  For  example, consider an allocation of three nodes each with two
                     cpus. A four-task cyclic distribution request will distribute those tasks to
                     the  nodes with tasks one and four on the first node, task two on the second
                     node, and task three on the  third  node.   Note  that  when  SelectType  is
                     select/cons_res,  the same number of CPUs may not be allocated on each node.
                     Task distribution will be round-robin among all the nodes with CPUs  yet  to
                     be  assigned  to  tasks.  Cyclic distribution is the default behavior if the
                     number of tasks is no larger than the number of allocated nodes.

              plane  The tasks are distributed in  blocks  of  a  specified  size.   The  options
                     include  a number representing the size of the task block.  This is followed
                     by an optional specification of the task distribution scheme within a  block
                     of  tasks  and between the blocks of tasks.  The number of tasks distributed
                     to each node is the  same  as  for  cyclic  distribution,  but  the  taskids
                     assigned  to each node depend on the plane size. For more details (including
                     examples and diagrams), please see
                     http://slurm.schedmd.com/mc_support.html
                     and
                     http://slurm.schedmd.com/dist_plane.html

              arbitrary
                     The arbitrary method of distribution will  allocate  processes  in-order  as
                     listed  in  the file designated by the environment variable SLURM_HOSTFILE.
                     If this variable is set it will override any other  method  specified.   If
                     not  set  the  method will default to block.  The hostfile must contain at
                     minimum the number of hosts requested, listed either one per line or comma
                     separated.   If  specifying a task count (-n, --ntasks=<number>), your tasks
                     will be laid out on the nodes in the order of the file.
                     NOTE: The arbitrary distribution option on a job  allocation  only  controls
                     the nodes to be allocated to the job and not the allocation of CPUs on those
                     nodes. This option is meant primarily to control a job step's task layout in
                     an existing job allocation for the srun command.

              Second distribution method:

              block  The  block  distribution  method  will distribute tasks to sockets such that
                     consecutive tasks share a socket.

              cyclic The cyclic distribution method will distribute tasks to  sockets  such  that
                     consecutive tasks are distributed over consecutive sockets (in a round-robin
                     fashion).
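
              The two worked examples for the first distribution method (three  nodes  with  two
              cpus each, four tasks) can be reproduced with  the  simplified  model  below.   The
              function names are hypothetical and the code is not SLURM's placement logic; it
              assumes at least one task per node for block, no overcommit,  and  one  cpu  per
              task.

```python
def block_layout(cpus_per_node, ntasks):
    """Consecutive tasks share a node; every node receives at least one task."""
    nnodes = len(cpus_per_node)
    if not nnodes <= ntasks <= sum(cpus_per_node):
        raise ValueError("sketch assumes nodes <= tasks <= total cpus")
    counts = [1] * nnodes
    remaining = ntasks - nnodes
    # Hand the extra tasks to the nodes in order, up to each node's cpu count.
    for i, cpus in enumerate(cpus_per_node):
        extra = min(cpus - 1, remaining)
        counts[i] += extra
        remaining -= extra
    layout, next_task = [], 1
    for count in counts:
        layout.append(list(range(next_task, next_task + count)))
        next_task += count
    return layout


def cyclic_layout(cpus_per_node, ntasks):
    """Round-robin over the nodes that still have cpus left to assign."""
    if ntasks > sum(cpus_per_node):
        raise ValueError("sketch assumes tasks <= total cpus")
    free = list(cpus_per_node)
    layout = [[] for _ in free]
    task = 1
    while task <= ntasks:
        for i in range(len(free)):
            if task > ntasks:
                break
            if free[i] > 0:
                layout[i].append(task)
                free[i] -= 1
                task += 1
    return layout


print(block_layout([2, 2, 2], 4))   # tasks 1,2 / 3 / 4
print(cyclic_layout([2, 2, 2], 4))  # tasks 1,4 / 2 / 3
```

              Task numbers in the sketch start at one to match the wording  of  the  examples
              above.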

       --mail-type=<type>
              Notify user by email when certain event types occur.  Valid type values are  BEGIN,
              END,  FAIL,  REQUEUE,  and  ALL  (any  state  change).  The  user to be notified is
              indicated with --mail-user.

       --mail-user=<user>
              User to receive email notification of state changes as defined by --mail-type.  The
              default value is the submitting user.

       --mem=<MB>
              Specify  the  real  memory  required  per  node  in  MegaBytes.   Default  value is
              DefMemPerNode and the maximum value is MaxMemPerNode.  If  configured,  both
              parameters  can  be  seen  using  the scontrol show config command.  This parameter
              would   generally   be   used   if   whole   nodes   are    allocated    to    jobs
              (SelectType=select/linear).   Also  see --mem-per-cpu.  --mem and --mem-per-cpu are
              mutually exclusive.  NOTE: Enforcement of memory limits currently relies  upon  the
              task/cgroup  plugin  or  enabling  of  accounting,  which  samples  memory use on a
              periodic basis (data need not be stored, just collected). In both cases memory  use
              is based upon the job's Resident Set Size (RSS). A task may exceed the memory limit
              until the next periodic accounting sample.

       --mem-per-cpu=<MB>
              Minimum  memory  required  per  allocated  CPU  in  MegaBytes.   Default  value  is
              DefMemPerCPU  and  the  maximum  value  is  MaxMemPerCPU  (see exception below). If
              configured, both parameters can be seen using the  scontrol  show  config  command.
              Note  that  if  the  job's --mem-per-cpu value exceeds the configured MaxMemPerCPU,
              then the user's limit will be treated as a memory  limit  per  task;  --mem-per-cpu
              will be reduced to a value no larger than MaxMemPerCPU; --cpus-per-task will be set
              so that the value of --cpus-per-task multiplied by the new --mem-per-cpu value equals
              the  original  --mem-per-cpu  value  specified  by  the user.  This parameter would
              generally   be   used   if   individual   processors   are   allocated   to    jobs
              (SelectType=select/cons_res).    Also  see  --mem.   --mem  and  --mem-per-cpu  are
              mutually exclusive.
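
              The adjustment applied when --mem-per-cpu  exceeds  MaxMemPerCPU  can  be  sketched
              as follows.  This is an illustration only: the function name is  hypothetical,  the
              user is assumed not to have set --cpus-per-task, and the use of ceiling division is
              an assumption about rounding rather than a statement of SLURM's exact arithmetic.

```python
def adjust_mem_per_cpu(requested_mb, max_mem_per_cpu_mb):
    """Return (mem_per_cpu_mb, cpus_per_task) after applying the limit."""
    if requested_mb <= max_mem_per_cpu_mb:
        # Within the limit: nothing changes (one cpu per task assumed).
        return requested_mb, 1
    # Smallest cpu count that brings the per-cpu value under the limit.
    cpus_per_task = -(-requested_mb // max_mem_per_cpu_mb)  # ceiling division
    # Spread the original request over those cpus (rounded up, an assumption).
    mem_per_cpu = -(-requested_mb // cpus_per_task)
    return mem_per_cpu, cpus_per_task


print(adjust_mem_per_cpu(6000, 2000))
```

              For a request of 6000 MB with MaxMemPerCPU of 2000 MB the sketch yields 2000 MB per
              cpu and three cpus per task, so the product matches the original request.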

       --mem_bind=[{quiet,verbose},]type
              Bind tasks to memory. Used only when the task/affinity plugin is  enabled  and  the
              NUMA  memory  functions  are available.  Note that the resolution of CPU and memory
              binding may differ on some architectures. For example, CPU binding may be performed
              at the level of the cores within a processor while memory binding will be performed
              at the level of nodes, where the definition of "nodes" may differ  from  system  to
              system.  The  use  of any type other than "none" or "local" is not recommended.  If
              you want greater  control,  try  running  a  simple  test  code  with  the  options
              "--cpu_bind=verbose,none   --mem_bind=verbose,none"   to   determine  the  specific
              configuration.

              NOTE: To have SLURM always report on the selected memory binding for  all  commands
              executed  in  a  shell,  you  can enable verbose mode by setting the SLURM_MEM_BIND
              environment variable value to "verbose".

              The following informational environment variables are set  when  --mem_bind  is  in
              use:

                   SLURM_MEM_BIND_VERBOSE
                   SLURM_MEM_BIND_TYPE
                   SLURM_MEM_BIND_LIST

              See  the  ENVIRONMENT  VARIABLES  section  for  a  more detailed description of the
              individual SLURM_MEM_BIND* variables.

              Supported options include:

              q[uiet]
                     quietly bind before task runs (default)

              v[erbose]
                     verbosely report binding before task runs

              no[ne] don't bind tasks to memory (default)

              rank   bind by task rank (not recommended)

              local  Use memory local to the processor in use

              map_mem:<list>
                     bind by mapping a node's memory  to  tasks  as  specified  where  <list>  is
                     <cpuid1>,<cpuid2>,...<cpuidN>.   CPU  IDs  are interpreted as decimal values
                     unless they are preceded with '0x', in which case they  are  interpreted  as
                     hexadecimal values (not recommended)

              mask_mem:<list>
                     bind  by  setting  memory  masks  on  tasks  as  specified  where  <list> is
                     <mask1>,<mask2>,...<maskN>.   Memory  masks  are   always   interpreted   as
                     hexadecimal  values.   Note  that masks must be preceded with a '0x' if they
                     don't begin with [0-9] so they are seen as numerical values by srun.

              help   show this help message
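
              The different number bases used by map_mem and mask_mem can be illustrated  with  a
              small sketch.  The function names are hypothetical; the code only demonstrates  the
              interpretation rules stated above and does not perform any binding.

```python
def parse_map_mem(list_text):
    """IDs are decimal unless prefixed with 0x, in which case hexadecimal."""
    ids = []
    for item in list_text.split(","):
        item = item.strip()
        base = 16 if item.lower().startswith("0x") else 10
        ids.append(int(item, base))
    return ids


def parse_mask_mem(list_text):
    """Masks are always hexadecimal; int(..., 16) accepts an optional 0x."""
    return [int(item.strip(), 16) for item in list_text.split(",")]


print(parse_map_mem("0,1,0x10"))     # third entry is hexadecimal
print(parse_mask_mem("0x3,0xc,10"))  # "10" is read as hexadecimal here
```

              Note that the same text "10" means ten in a map_mem  list  but  sixteen  in  a
              mask_mem list.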

       --mincpus=<n>
              Specify a minimum number of logical cpus/processors per node.

       --msg-timeout=<seconds>
              Modify the job launch message timeout.  The default value is MessageTimeout in  the
              SLURM   configuration   file   slurm.conf.   Changes  to  this  are  typically  not
              recommended, but could be useful to diagnose problems.

       --mpi=<mpi_type>
              Identify the type of MPI to be used. May result in unique initiation procedures.

              list   Lists available mpi types to choose from.

              lam    Initiates one 'lamd' process per node and establishes necessary  environment
                     variables for LAM/MPI.

              mpich1_shmem
                     Initiates  one  process  per  node  and  establishes  necessary  environment
                     variables for mpich1 shared memory model.  This also works for mvapich built
                     for shared memory.

              mpichgm
                     For use with Myrinet.

              mvapich
                     For use with Infiniband.

              openmpi
                     For use with OpenMPI.

              none   No  special  MPI  processing.  This is the default and works with many other
                     versions of MPI.

       --multi-prog
              Run a job with different programs and different arguments for each  task.  In  this
              case,  the executable program specified is actually a configuration file specifying
              the executable and arguments for each  task.  See  MULTIPLE  PROGRAM  CONFIGURATION
              below for details on the configuration file contents.

       -N, --nodes=<minnodes[-maxnodes]>
              Request  that a minimum of minnodes nodes be allocated to this job.  A maximum node
              count may also be specified with maxnodes.  If only one number is  specified,  this
              is  used  as  both the minimum and maximum node count.  The partition's node limits
              supersede those of the job.  If a job's  node  limits  are  outside  of  the  range
              permitted  for  its  associated partition, the job will be left in a PENDING state.
              This permits possible execution at a  later  time,  when  the  partition  limit  is
              changed.   If  a  job  node  limit  exceeds  the  number of nodes configured in the
              partition,  the  job  will  be  rejected.   Note  that  the  environment   variable
              SLURM_JOB_NUM_NODES  (and  SLURM_NNODES for backwards compatibility) will be set to
              the count of nodes actually allocated to the job.  See  the  ENVIRONMENT  VARIABLES
              section  for  more information.  If -N is not specified, the default behavior is to
              allocate enough nodes to satisfy the requirements of the -n and  -c  options.   The
              job  will  be  allocated  as  many nodes as possible within the range specified and
              without delaying the initiation of the  job.   The  node  count  specification  may
              include  a  numeric  value followed by a suffix of "k" (multiplies numeric value by
              1,024) or "m" (multiplies numeric value by 1,048,576).
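
              The node count syntax described above (an optional range and the  "k"  and  "m"
              suffixes) can be sketched as follows.  The function names are  hypothetical  and
              the code is an illustration of the notation, not srun's parser.

```python
MULTIPLIERS = {"k": 1024, "m": 1048576}


def parse_node_count(text):
    """Parse one count such as "8" or "2k"."""
    text = text.strip()
    suffix = text[-1:]
    if suffix in MULTIPLIERS:
        return int(text[:-1]) * MULTIPLIERS[suffix]
    return int(text)


def parse_nodes(spec):
    """Parse "minnodes[-maxnodes]" into a (minnodes, maxnodes) tuple."""
    low, sep, high = spec.partition("-")
    minnodes = parse_node_count(low)
    # A single number is used as both the minimum and the maximum.
    maxnodes = parse_node_count(high) if sep else minnodes
    if maxnodes < minnodes:
        raise ValueError("maxnodes is smaller than minnodes")
    return minnodes, maxnodes


print(parse_nodes("2-4"))
print(parse_nodes("1k"))
```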

       -n, --ntasks=<number>
              Specify the number of tasks to run. Request that srun allocate resources for ntasks
              tasks.   The default is one task per node, but note that the --cpus-per-task option
              will change this default.

       --network=<type>
              Specify the communication protocol to be  used.   The  interpretation  of  type  is
              system  dependent.  This option is currently supported on systems with IBM's Parallel
              Environment (PE).  See IBM's LoadLeveler job command  keyword  documentation  about
              the  keyword "network" for more information.  Multiple values may be specified in a
              comma separated  list.   All  options  are  case  insensitive.   Supported   values
              include:

              BULK_XFER[=<resources>]
                          Enable  bulk transfer of data using Remote Direct-Memory Access (RDMA).
                          The optional resources specification is a numeric value which can  have
                          a  suffix of "k", "K", "m", "M", "g" or "G" for kilobytes, megabytes or
                          gigabytes.  NOTE: The resources specification is not supported  by  the
                          underlying  IBM  infrastructure  as of Parallel Environment version 2.2
                          and no value should be specified at this time.  The  devices  allocated
                          to  a job must all be of the same type.  The default value depends upon
                          what hardware is available and in order  of  preference  is  IPONLY
                          (which is not considered in User Space mode), HFI, IB, HPCE, and
                          KMUX.

              CAU=<count> Number of Collective Acceleration Units (CAU) required.   Applies  only
                          to  IBM  Power7-IH processors.  Default value is zero.  Independent CAU
                          will be allocated for each programming interface (MPI, LAPI, etc.)

              DEVNAME=<name>
                          Specify the device name to  use  for  communications  (e.g.  "eth0"  or
                          "mlx4_0").

              DEVTYPE=<type>
                          Specify  the  device  type  to  use  for communications.  The supported
                          values  of  type  are:  "IB"  (InfiniBand),  "HFI"  (P7   Host   Fabric
                          Interface),  "IPONLY"  (IP-Only interfaces), "HPCE" (HPC Ethernet), and
                          "KMUX" (Kernel Emulation of HPCE).  The devices allocated to a job must
                          all  be  of the same type.  The default value depends upon what
                          hardware is available and in order of preference is IPONLY (which
                          is not considered in User Space mode), HFI, IB, HPCE, and KMUX.

              IMMED=<count>
                          Number  of  immediate  send slots per window required.  Applies only to
                          IBM Power7-IH processors.  Default value is zero.

              INSTANCES=<count>
                          Specify number of network connections for each  task  on  each  network
                          connection.  The default instance count is 1.

              IPV4        Use Internet Protocol (IP) version 4 communications (default).

              IPV6        Use Internet Protocol (IP) version 6 communications.

              LAPI        Use the LAPI programming interface.

              MPI         Use the MPI programming interface.  MPI is the default interface.

              PAMI        Use the PAMI programming interface.

              SHMEM       Use the OpenSHMEM programming interface.

              SN_ALL      Use all available switch networks (default).

              SN_SINGLE   Use one available switch network.

              UPC         Use the UPC programming interface.

              US          Use User Space communications.

              Some examples of network specifications:

              Instances=2,US,MPI,SN_ALL
                          Create  two  user  space  connections  for  MPI communications on every
                          switch network for each task.

              US,MPI,Instances=3,Devtype=IB
                          Create three user space connections for  MPI  communications  on  every
                          InfiniBand network for each task.

              IPV4,LAPI,SN_Single
                          Create an IP version 4 connection for LAPI communications on one switch
                          network for each task.

              Instances=2,US,LAPI,MPI
                          Create two user space connections each for LAPI and MPI  communications
                          on  every switch network for each task. Note that SN_ALL is the default
                          option so every switch network is  used.  Also  note  that  Instances=2
                          specifies  that two connections are established for each protocol (LAPI
                          and MPI) and each task.  If there are two networks and  four  tasks  on
                          the  node then a total of 32 connections are established (2 instances x
                          2 protocols x 2 networks x 4 tasks).
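
              The arithmetic in the last example can be checked with a few lines of code.   The
              function name is hypothetical; the formula is the one  given  above  (instances  x
              protocols x networks x tasks).

```python
def total_connections(instances, protocols, networks, tasks_on_node):
    """Connections established on one node for the given specification."""
    return instances * len(protocols) * networks * tasks_on_node


# Instances=2,US,LAPI,MPI with two switch networks and four tasks on the node
print(total_connections(2, ["LAPI", "MPI"], 2, 4))  # prints 32
```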

       --nice[=adjustment]
              Run the job with an adjusted scheduling priority within SLURM.  With no  adjustment
              value  the  scheduling  priority  is decreased by 100. The adjustment range is from
              -10000 (highest priority) to 10000 (lowest priority).  Only  privileged  users  can
              specify   a  negative  adjustment.  NOTE:  This  option  is  presently  ignored  if
              SchedulerType=sched/wiki or SchedulerType=sched/wiki2.

       --ntasks-per-core=<ntasks>
              Request the maximum ntasks be invoked on each core.  Meant  to  be  used  with  the
              --ntasks  option.  Related to --ntasks-per-node except at the core level instead of
              the node level.  Masks will  automatically  be  generated  to  bind  the  tasks  to
              specific  core  unless  --cpu_bind=none  is  specified.   NOTE:  This option is not
              supported            unless             SelectTypeParameters=CR_Core             or
              SelectTypeParameters=CR_Core_Memory is configured.

       --ntasks-per-node=<ntasks>
              Request  the  maximum  ntasks  be  invoked on each node.  Meant to be used with the
              --nodes option.  This is related to --cpus-per-task=ncpus,  but  does  not  require
              knowledge  of  the  actual  number of cpus on each node.  In some cases, it is more
              convenient to be able to request that no more than a specific number  of  tasks  be
              invoked  on each node.  Examples of this include submitting a hybrid MPI/OpenMP app
              where only one MPI "task/rank" should be assigned to each node while  allowing  the
              OpenMP portion to utilize all of the parallelism present in the node, or submitting
              a single setup/cleanup/monitoring job to each node of a pre-existing allocation  as
              one step in a larger job script.

       --ntasks-per-socket=<ntasks>
              Request  the  maximum  ntasks be invoked on each socket.  Meant to be used with the
              --ntasks option.  Related to --ntasks-per-node except at the socket  level  instead
              of  the  node  level.   Masks  will automatically be generated to bind the tasks to
              specific sockets unless --cpu_bind=none is specified.  NOTE:  This  option  is  not
              supported            unless            SelectTypeParameters=CR_Socket            or
              SelectTypeParameters=CR_Socket_Memory is configured.

       -O, --overcommit
              Overcommit resources. Normally, srun will not allocate more than  one  process  per
              CPU.  By  specifying --overcommit you are explicitly allowing more than one process
              per CPU. However no more than MAX_TASKS_PER_NODE tasks are permitted to execute per
              node.   NOTE:  MAX_TASKS_PER_NODE  is  defined  in  the  file  slurm.h and is not a
              variable, it is set at SLURM build time.

       -o, --output=<mode>
              Specify the mode for stdout redirection.  By  default  in  interactive  mode,  srun
              collects  stdout  from  all  tasks  and  line  buffers  this output to the attached
              terminal. With --output stdout may be redirected to a file, to one file  per  task,
              or  to  /dev/null.  See section IO Redirection below for the various forms of mode.
              If the specified file already exists, it will be overwritten.

              If --error is not also specified on the command line, both stdout and  stderr  will
              be directed to the file specified by --output.

       --open-mode=<append|truncate>
              Open  the  output  and error files using append or truncate mode as specified.  The
              default value is specified by the system configuration parameter JobFileAppend.

       -p, --partition=<partition_names>
              Request a specific partition for the resource allocation.  If  not  specified,  the
              default  behavior  is to allow the slurm controller to select the default partition
              as designated by the system administrator.  If  the  job  can  use  more  than  one
              partition,  specify  their  names  in  a  comma separated list and the one offering
              earliest initiation will be used.

       --profile=<all|none|[energy[,|task[,|lustre[,|network]]]]>
              Enables detailed data collection by the acct_gather_profile plugin.  Detailed  data
              are typically time-series that are stored in an HDF5 file for the job.

              All       All data types are collected. (Cannot be combined with other values.)

              None      No data types are collected. This is the default.
                         (Cannot be combined with other values.)

              Energy    Energy data is collected.

              Task      Task (I/O, Memory, ...) data is collected.

              Lustre    Lustre data is collected.

              Network   Network (InfiniBand) data is collected.

       --prolog=<executable>
              srun  will  run  executable  just  before launching the job step.  The command line
              arguments for executable will be the command and arguments of  the  job  step.   If
              executable is "none", then no srun prolog will be run. This parameter overrides the
              SrunProlog parameter in slurm.conf. This parameter is completely  independent  from
              the Prolog parameter in slurm.conf.

       --propagate[=rlimits]
              Allows users to specify which of the modifiable (soft) resource limits to propagate
              to the compute nodes and apply to their jobs.  If rlimits is  not  specified,  then
              all  resource  limits will be propagated.  The following rlimit names are supported
              by Slurm (although some options may not be supported on some systems):

              ALL       All limits listed below

              AS        The maximum address space for a process

              CORE      The maximum size of core file

              CPU       The maximum amount of CPU time

              DATA      The maximum size of a process's data segment

              FSIZE     The maximum size of files created. Note that if the user  sets  FSIZE  to
                        less than the current size of the slurmd.log, job launches will fail with
                        a 'File size limit exceeded' error.

              MEMLOCK   The maximum size that may be locked into memory

              NOFILE    The maximum number of open files

              NPROC     The maximum number of processes available

              RSS       The maximum resident set size

              STACK     The maximum stack size

       --pty  Execute  task  zero  in  pseudo  terminal  mode.   Implicitly  sets   --unbuffered.
              Implicitly  sets  --error and --output to /dev/null for all tasks except task zero,
              which may cause those tasks to exit immediately (e.g. shells  will  typically  exit
              immediately in that situation).  Not currently supported on AIX platforms.

       -Q, --quiet
              Suppress informational messages from srun. Errors will still be displayed.

       -q, --quit-on-interrupt
              Quit  immediately on single SIGINT (Ctrl-C). Use of this option disables the status
              feature normally available when srun receives a single Ctrl-C and  causes  srun  to
              instead immediately terminate the running job.

       --qos=<qos>
              Request  a  quality  of  service  for  the job.  QOS values can be defined for each
              user/cluster/account association in the SLURM database.  Users will be  limited  to
              their  association's  defined  set of qos's when the SLURM configuration parameter,
              AccountingStorageEnforce, includes "qos" in its definition.

       -r, --relative=<n>
              Run a job step relative to node n of the current allocation.  This  option  may  be
              used  to  spread several job steps out among the nodes of the current job. If -r is
              used, the current job step will begin at node n of the  allocated  nodelist,  where
              the  first node is considered node 0.  The -r option is not permitted with -w or -x
              option and will result in a fatal error when not running within a prior  allocation
              (i.e.  when  SLURM_JOB_ID  is  not  set).  The  default for n is 0. If the value of
              --nodes exceeds the number of  nodes  identified  with  the  --relative  option,  a
              warning message will be printed and the --relative option will take precedence.

       --resv-ports
              Reserve communication ports for this job.  Used for OpenMPI.

       --reservation=<name>
              Allocate resources for the job from the named reservation.

       --restart-dir=<directory>
              Specifies  the directory from which the job or job step's checkpoint should be read
              (used by the checkpoint/blcrm and checkpoint/xlch plugins only).

       -s, --share
              The job allocation can share nodes with other running jobs.  This is  the  opposite
              of --exclusive, whichever option is seen last on the command line will be used. The
              default  shared/exclusive  behavior  depends  on  system  configuration   and   the
              partition's  Shared option takes precedence over the job's option.  This option may
              result in the allocation being granted sooner than if the --share option was not set
              and allow higher system utilization, but application performance will likely suffer
              due to competition for resources within a node.

       --signal=<sig_num>[@<sig_time>]
              When a job is within sig_time seconds of its end time, send it the signal  sig_num.
              Due  to  the resolution of event handling by SLURM, the signal may be sent up to 60
              seconds earlier than specified.  sig_num may either be  a  signal  number  or  name
              (e.g.  "10"  or  "USR1").  sig_time must have integer value between zero and 65535.
              By default, no signal is sent before the job's end time.  If a sig_num is specified
              without any sig_time, the default time will be 60 seconds.
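
              The syntax and defaults of this option can be sketched  as  follows.   The  function
              name is hypothetical and the signal is returned as text without being resolved; the
              code only mirrors the rules stated above (optional @<sig_time>, default of 60
              seconds, range zero to 65535).

```python
def parse_signal_option(spec):
    """Parse "<sig_num>[@<sig_time>]" into (signal_text, seconds)."""
    sig_num, sep, sig_time = spec.partition("@")
    if not sig_num:
        raise ValueError("missing signal number or name")
    # Without an explicit sig_time the default is 60 seconds.
    seconds = int(sig_time) if sep else 60
    if not 0 <= seconds <= 65535:
        raise ValueError("sig_time must be between 0 and 65535")
    return sig_num, seconds


print(parse_signal_option("USR1@120"))
print(parse_signal_option("10"))
```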

       --slurmd-debug=<level>
              Specify  a  debug  level  for  slurmd(8).  level  may be an integer value between 0
              [quiet, only errors are displayed] and 4 [verbose  operation].   The  slurmd  debug
              information  is  copied  onto  the  stderr  of  the job. By default only errors are
              displayed.

       --sockets-per-node=<sockets>
              Restrict node selection to nodes with at least the  specified  number  of  sockets.
              See  additional  information  under  -B  option  above when task/affinity plugin is
              enabled.

       --switches=<count>[@<max-time>]
              When a tree topology is used, this defines the maximum count  of  switches  desired
              for  the  job allocation and optionally the maximum time to wait for that number of
              switches. If SLURM finds an allocation containing  more  switches  than  the  count
              specified, the job remains pending until it either finds an allocation with desired
              switch count or the time limit expires.  If there is no switch count  limit,  there
              is  no  delay  in  starting  the  job.   Acceptable time formats include "minutes",
              "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes"  and
              "days-hours:minutes:seconds".   The  job's maximum time delay may be limited by the
              system administrator using the SchedulerParameters configuration parameter with the
              max_switch_wait  parameter  option.   The  default  max-time is the max_switch_wait
              SchedulerParameter.
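
              The time formats listed above can be converted to seconds with a  sketch  such  as
              the one below, which also splits a <count>[@<max-time>] value.  The function  names
              are hypothetical and the code is an illustration of the notation, not SLURM's  own
              parser.  A missing max-time is returned as None because the default comes from  the
              max_switch_wait configuration value, which the sketch does not know.

```python
def time_to_seconds(text):
    """Convert one of the accepted time formats to a number of seconds."""
    days = 0
    if "-" in text:
        # days-hours[:minutes[:seconds]]
        day_part, rest = text.split("-", 1)
        days = int(day_part)
        fields = [int(f) for f in rest.split(":")]
        if len(fields) > 3:
            raise ValueError("too many fields in %r" % text)
        hours, minutes, seconds = (fields + [0, 0])[:3]
    else:
        fields = [int(f) for f in text.split(":")]
        if len(fields) == 1:        # minutes
            hours, minutes, seconds = 0, fields[0], 0
        elif len(fields) == 2:      # minutes:seconds
            hours, (minutes, seconds) = 0, fields
        elif len(fields) == 3:      # hours:minutes:seconds
            hours, minutes, seconds = fields
        else:
            raise ValueError("too many fields in %r" % text)
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def parse_switches(spec):
    """Parse "<count>[@<max-time>]" into (count, max_wait_seconds or None)."""
    count, sep, max_time = spec.partition("@")
    return int(count), (time_to_seconds(max_time) if sep else None)


print(time_to_seconds("1-12:30"))
print(parse_switches("2@1:30:00"))
```

              Note that a bare number is read as minutes, so "90" and "1:30:00" describe  the
              same duration.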

       -T, --threads=<nthreads>
              Allows limiting the number of concurrent threads used to send the job request  from
              the  srun process to the slurmd processes on the allocated nodes. Default is to use
              one thread per allocated node up to a maximum of 60 concurrent threads.  Specifying
              this option limits the number of concurrent threads to nthreads (less than or equal
              to 60).  This should only be used to set a low thread count  for  testing  on  very
              small memory computers.

       -t, --time=<time>
              Set  a  limit  on the total run time of the job or job step.  If the requested time
              limit for a job exceeds the partition's time limit, the  job  will  be  left  in  a
              PENDING  state (possibly indefinitely).  If the requested time limit for a job step
              exceeds the partition's time limit, the  job  step  will  not  be  initiated.   The
              default  time  limit is the partition's default time limit.  When the time limit is
              reached, each task in each job step is sent SIGTERM followed by SIGKILL.  If the
              limit is for the job, all job steps are signaled. If the time limit is for a single job
              step within an existing job allocation, only that job step will be affected. A  job
              time  limit  supersedes  all job step time limits. The interval between SIGTERM and
              SIGKILL is specified by the SLURM configuration parameter KillWait.  A  time  limit
              of  zero  requests  that no time limit be imposed.  Acceptable time formats include
              "minutes",      "minutes:seconds",      "hours:minutes:seconds",      "days-hours",
              "days-hours:minutes" and "days-hours:minutes:seconds".

       --task-epilog=<executable>
              The  slurmstepd  daemon  will  run executable just after each task terminates. This
              will be executed before any TaskEpilog parameter in slurm.conf is executed. This is
              meant  to  be  a  very  short-lived  program. If it fails to terminate within a few
              seconds, it will be killed along with any descendant processes.

       --task-prolog=<executable>
              The slurmstepd daemon will run executable just before  launching  each  task.  This
              will be executed after any TaskProlog parameter in slurm.conf is executed.  Besides
              the normal environment variables, this has SLURM_TASK_PID available to identify the
              process  ID  of  the  task being started.  Standard output from this program of the
              form "export NAME=value" will be used to set environment  variables  for  the  task
              being spawned.
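
              As an illustration only, a task prolog could be a small script such as the sketch
              below. It reads SLURM_TASK_PID and prints one "export NAME=value" line; the variable
              name MY_TASK_TAG and the function prolog_lines are hypothetical.

```python
import os


def prolog_lines(environ):
    """Return the lines a task prolog would print on standard output."""
    # SLURM_TASK_PID is provided to the prolog, as described above.
    pid = environ.get("SLURM_TASK_PID", "unknown")
    # MY_TASK_TAG is a made-up variable name used only for this example.
    return ["export MY_TASK_TAG=task-" + pid]


if __name__ == "__main__":
    for line in prolog_lines(os.environ):
        print(line)
```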

       --test-only
              Returns  an  estimate of when a job would be scheduled to run given the current job
              queue and all the other srun arguments specifying  the  job.   This  limits  srun's
              behavior  to  just return information; no job is actually submitted.  EXCEPTION: On
              Bluegene/Q systems, when running within an existing job allocation, this disables
              the  use  of "runjob" to launch tasks. The program will be executed directly by the
              slurmd daemon.

       --threads-per-core=<threads>
              Restrict node selection to nodes with at least the specified number of threads  per
              core.  NOTE: "Threads" refers to the number of processing units on each core rather
              than the number of application tasks to  be  launched  per  core.   See  additional
              information under -B option above when task/affinity plugin is enabled.

       --time-min=<time>
              Set  a  minimum  time  limit on the job allocation.  If specified, the job may have
              its --time limit lowered to a value no lower than --time-min if doing  so  permits
              the  job  to begin execution earlier than otherwise possible.  The job's time limit
              will not be changed after the job is allocated resources.  This is performed  by  a
              backfill  scheduling  algorithm to allocate resources otherwise reserved for higher
              priority jobs.   Acceptable  time  formats  include  "minutes",  "minutes:seconds",
              "hours:minutes:seconds",        "days-hours",        "days-hours:minutes"       and
              "days-hours:minutes:seconds".

       --tmp=<MB>
              Specify a minimum amount of temporary disk space.

       -u, --unbuffered
              Do not line buffer stdout from remote  tasks.  This  option  cannot  be  used  with
              --label.

       --usage
              Display brief help message and exit.

       --uid=<user>
              Attempt  to  submit  and/or  run a job as user instead of the invoking user id. The
              invoking user's credentials will be used to check access permissions for the target
              partition. User root may use this option to run jobs as a normal user in a RootOnly
              partition for example. If run as root, srun will drop its permissions  to  the  uid
              specified  after  node  allocation  is  successful.  user  may  be the user name or
              numerical user ID.

       -V, --version
              Display version information and exit.

       -v, --verbose
              Increase the verbosity  of  srun's  informational  messages.   Multiple  -v's  will
              further increase srun's verbosity.  By default only errors will be displayed.

       -W, --wait=<seconds>
              Specify  how  long  to  wait after the first task terminates before terminating all
              remaining tasks. A value of 0 indicates an unlimited wait (a warning will be issued
              after  60 seconds). The default value is set by the WaitTime parameter in the slurm
              configuration file (see slurm.conf(5)). This option can be useful to ensure that  a
              job is terminated in a timely fashion in the event that one or more tasks terminate
              prematurely.  Note: The -K, --kill-on-bad-exit option  takes  precedence  over  -W,
              --wait to terminate the job immediately if a task exits with a non-zero exit code.

       -w, --nodelist=<host1,host2,... or filename>
              Request  a  specific  list of hosts. The job will contain at least these hosts. The
              list may be specified as  a  comma-separated  list  of  hosts,  a  range  of  hosts
              (host[1-5,7,...]  for example), or a filename.  The host list will be assumed to be
              a filename if it contains a "/" character. If you specify a maximum node count
              (-N1-2, for example) and the file lists more than 2 hosts, only the first 2 nodes
              will be used in the request list.  Rather than repeating a host name multiple times,
              an asterisk and a repetition count may be appended to a host name. For example
              "host1,host1" and "host1*2" are equivalent.
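
              The sketch below shows one way a script might expand such a host list into
              individual names. It is not SLURM's own parser: the function names are hypothetical,
              only a single bracketed range per host name is handled, and applying a repetition
              count to a bracketed range is an assumption of this example.

```python
import re


def split_top_level(text):
    """Split on commas that are not inside square brackets."""
    parts, depth, current = [], 0, ""
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current)
            current = ""
        else:
            current += ch
    if current:
        parts.append(current)
    return parts


def expand_hostlist(text):
    """Expand ranges such as host[1-3,7] and repetitions such as host1*2."""
    hosts = []
    for item in split_top_level(text):
        name, star, count = item.partition("*")
        repeat = int(count) if star else 1
        match = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)", name)
        if match:
            prefix, body, suffix = match.groups()
            names = []
            for piece in body.split(","):
                low, dash, high = piece.partition("-")
                high = high if dash else low
                width = len(low)  # keep zero padding such as 01-03
                for number in range(int(low), int(high) + 1):
                    names.append(f"{prefix}{number:0{width}d}{suffix}")
        else:
            names = [name]
        for host in names:
            hosts.extend([host] * repeat)
    return hosts


print(expand_hostlist("host[1-3,7],node5*2"))
```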

       --wckey=<wckey>
              Specify wckey to be used with job.  If TrackWCKey=no (default)  in  the  slurm.conf
              this value is ignored.

       -X, --disable-status
              Disable  the  display  of  task status when srun receives a single SIGINT (Ctrl-C).
              Instead immediately forward the SIGINT to the running job.  Without this option, a
              second Ctrl-C within one second is required to forcibly terminate the job, and srun
              will then exit immediately.  May also be set via the environment variable
              SLURM_DISABLE_STATUS.

       -x, --exclude=<host1,host2,... or filename>
              Request that a specific list of hosts not be included in the resources allocated to
              this job. The host list will be assumed to be a filename if it contains a
              "/" character.

       -Z, --no-allocate
              Run  the  specified  tasks  on a set of nodes without creating a SLURM "job" in the
              SLURM queue structure, bypassing the normal resource allocation step.  The list  of
              nodes  must  be  specified  with  the  -w, --nodelist option.  This is a privileged
              option only available for the users "SlurmUser" and "root".

       The following options support Blue Gene systems, but may be applicable to other systems as
       well.

       --blrts-image=<path>
              Path  to  blrts  image for bluegene block.  BGL only.  Default from bluegene.conf if
              not set.

       --cnload-image=<path>
              Path  to  compute  node  image  for  bluegene  block.   BGP  only.   Default   from
              bluegene.conf if not set.

       --conn-type=<type>
              Require the block connection type to be of a certain type.  On Blue Gene the
              acceptable values of type are MESH, TORUS and NAV.  If NAV, or if not set, then
              SLURM will try to fit whatever DefaultConnType is set to in bluegene.conf; if that
              isn't set, the default is TORUS.  You should not normally set this option.  If
              running on a BGP system and wanting to run in HTC mode (only for 1 midplane and
              below), you can use HTC_S for SMP, HTC_D for Dual, HTC_V for virtual node mode, and
              HTC_L for Linux mode.  For systems that allow a different connection type per
              dimension, a comma separated list of connection types may be specified, one for
              each dimension (e.g. M,T,T,T will give you a torus connection in all dimensions
              except the first).

       -g, --geometry=<XxYxZ> | <AxXxYxZ>
              Specify the geometry requirements for the job. On BlueGene/L and BlueGene/P systems
              there  are  three  numbers giving dimensions in the X, Y and Z directions, while on
              BlueGene/Q systems there are four numbers giving dimensions in the A, X, Y and Z
              directions, and the option cannot be used to allocate sub-blocks.  For example,
              "--geometry=1x2x3x4" specifies a block of nodes having 1 x 2 x 3 x 4 = 24 nodes
              (actually midplanes on BlueGene).
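
              The arithmetic in the example can be checked with a short sketch such as the one
              below. The function name geometry_size is hypothetical.

```python
from math import prod


def geometry_size(geometry):
    """Multiply the dimensions of a geometry string such as "1x2x3x4"."""
    dims = [int(dim) for dim in geometry.split("x")]
    if len(dims) not in (3, 4):
        raise ValueError("expected XxYxZ or AxXxYxZ, got: " + geometry)
    return prod(dims)


print(geometry_size("1x2x3x4"))
```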

       --ioload-image=<path>
              Path  to  io image for bluegene block.  BGP only.  Default from bluegene.conf if not
              set.

       --linux-image=<path>
              Path to linux image for bluegene block.  BGL only.  Default  from  bluegene.conf  if
              not set.

       --mloader-image=<path>
              Path to mloader image for bluegene block.  Default from bluegene.conf if not set.

       -R, --no-rotate
              Disables  rotation  of  the job's requested geometry in order to fit an appropriate
              block.  By default the specified geometry can rotate in three dimensions.

       --ramdisk-image=<path>
              Path to ramdisk image for bluegene block.  BGL only.  Default from bluegene.conf  if
              not set.

       --reboot
              Force the allocated nodes to reboot before starting the job.

       srun  will submit the job request to the slurm job controller, then initiate all processes
       on the remote nodes. If the request cannot be met immediately, srun will block  until  the
       resources  are  free to run the job. If the -I (--immediate) option is specified srun will
       terminate if resources are not immediately available.

       When initiating remote processes srun will propagate the current working directory, unless
       --chdir=<path>  is specified, in which case path will become the working directory for the
       remote processes.

       The -n, -c, and -N options control how CPUs  and nodes will be allocated to the job.  When
       specifying  only  the number of processes to run with -n, a default of one CPU per process
       is allocated. By specifying the number of CPUs required per task (-c), more than  one  CPU
       may  be  allocated  per  process.  If  the number of nodes is specified with -N, srun will
       attempt to allocate at least the number of nodes specified.

       Combinations of the  above  three  options  may  be  used  to  change  how  processes  are
       distributed  across  nodes  and  cpus.  For  instance,  by  specifying  both the number of
       processes and number of nodes on which to  run,  the  number  of  processes  per  node  is
       implied.  However, if the number of CPUs per process is more important, then the number of
       processes (-n) and the number of CPUs per process (-c) should be specified.

       srun will refuse to  allocate more than one process per CPU unless  --overcommit  (-O)  is
       also specified.

       srun  will  attempt  to meet the above specifications "at a minimum." That is, if 16 nodes
       are requested for 32 processes, and some nodes do not have 2 CPUs, the allocation of nodes
       will  be  increased  in order to meet the demand for CPUs. In other words, a minimum of 16
       nodes are being requested. However, if 16 nodes are requested for 15 processes, srun  will
       consider this an error, as 15 processes cannot run across 16 nodes.
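
       The "at a minimum" rule can be illustrated with a deliberately simplified model. The
       sketch below is not SLURM's scheduling algorithm: it takes nodes in the order given,
       ignores --overcommit, and the function name nodes_needed is hypothetical.

```python
def nodes_needed(min_nodes, ntasks, node_cpus, cpus_per_task=1):
    """Return how many of the listed nodes are needed to place all tasks.

    node_cpus is a list of CPU counts, one per candidate node.
    """
    if ntasks < min_nodes:
        # Mirrors the error case above: 15 processes cannot span 16 nodes.
        raise ValueError("fewer processes than requested nodes")
    capacity = 0
    for used, cpus in enumerate(node_cpus, start=1):
        # Each task needs cpus_per_task CPUs on a single node.
        capacity += cpus // cpus_per_task
        if used >= min_nodes and capacity >= ntasks:
            return used
    raise ValueError("not enough CPUs on the listed nodes")


# 16 nodes requested for 32 processes, but four nodes have only 1 CPU.
print(nodes_needed(16, 32, [2] * 14 + [1] * 4))
```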

       IO Redirection

       By  default,  stdout and stderr will be redirected from all tasks to the stdout and stderr
       of srun, and stdin will be redirected from the standard input of srun to all remote tasks.
       If  stdin  is  only to be read by a subset of the spawned tasks, specifying a file to read
       from rather than forwarding stdin from the srun command may be  preferable  as  it  avoids
       moving and storing data that will never be read.

       For  OS  X,  the  poll()  function does not support stdin, so input from a terminal is not
       possible.

       On BGQ, srun only supports stdin to 1 task running on the system.  By default it is taskid
       0, but this can be changed with -i<taskid> as described below, or with
       --launcher-opts="--stdinrank=<taskid>".

       This behavior may be changed with the --output, --error, and --input (-o, -e, -i) options.
       Valid format specifications for these options are

       all       stdout and stderr are redirected from all tasks to srun.  stdin is broadcast to all
                 remote tasks.  (This is the default behavior)

       none      stdout and stderr are not received from any task.  stdin is not sent to any task
                 (stdin is closed).

       taskid    stdout and/or stderr are redirected from only the task with relative id equal to
                 taskid, where 0 <= taskid < ntasks, where ntasks is the total number of tasks
                 in  the  current  job  step.  stdin is redirected from the stdin of srun to this
                 same task.  This file will be written on the node executing the task.

       filename  srun will redirect stdout and/or stderr to the named file from all tasks.  stdin
                 will  be  redirected  from the named file and broadcast to all tasks in the job.
                 filename refers to a path  on  the  host  that  runs  srun.   Depending  on  the
                 cluster's  file  system  layout,  this  may  result  in  the output appearing in
                 different places depending on whether the job is run in batch mode.

       format string
                 srun allows for a format string to  be  used  to  generate  the  named  IO  file
                 described  above.  The  following  list  of format specifiers may be used in the
                 format string to generate a filename that will  be  unique  to  a  given  jobid,
                 stepid,  node, or task. In each case, the appropriate number of files are opened
                 and associated with  the  corresponding  tasks.  Note  that  any  format  string
                 containing  %t,  %n,  and/or  %N  will be written on the node executing the task
                 rather than the node where srun executes. These format specifiers are not
                 supported on a BGQ system.

                 %A     Job array's master job allocation number.

                 %a     Job array ID (index) number.

                 %J     jobid.stepid of the running job. (e.g. "128.0")

                 %j     jobid of the running job.

                 %s     stepid of the running job.

                 %N     short hostname. This will create a separate IO file per node.

                 %n     Node  identifier  relative  to current job (e.g. "0" is the first node of
                        the running job) This will create a separate IO file per node.

                 %t     task identifier (rank) relative  to  current  job.  This  will  create  a
                        separate IO file per task.

                 %u     User name.

                 A  number  placed between the percent character and format specifier may be used
                 to zero-pad the result in the IO filename. This number is ignored if the  format
                 specifier corresponds to  non-numeric data (%N for example).

                 Some  examples of how the format string may be used for a 4 task job step with a
                 Job ID of 128 and step id of 0 are included below:

                 job%J.out      job128.0.out

                 job%4j.out     job0128.out

                 job%j-%2t.out  job128-00.out, job128-01.out, ...
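
       The examples above can be reproduced with a small sketch like the following. It handles
       only %J, %j, %s and %t, treats %J as non-numeric for padding purposes (an assumption), and
       the function name expand_io_pattern is hypothetical.

```python
import re


def expand_io_pattern(pattern, job_id, step_id, task_id):
    """Expand a subset of the format specifiers described above."""
    values = {
        "J": f"{job_id}.{step_id}",
        "j": job_id,
        "s": step_id,
        "t": task_id,
    }

    def substitute(match):
        width, spec = match.group(1), match.group(2)
        value = values[spec]
        # Zero-pad only numeric values; the width is ignored otherwise.
        if width and isinstance(value, int):
            return str(value).zfill(int(width))
        return str(value)

    return re.sub(r"%(\d*)([Jjst])", substitute, pattern)


for task in range(4):
    print(expand_io_pattern("job%j-%2t.out", 128, 0, task))
```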

INPUT ENVIRONMENT VARIABLES

       Some srun options may be set via  environment  variables.   These  environment  variables,
       along with their corresponding options, are listed below.  Note: Command line options will
       always override these settings.

       PMI_FANOUT            This is used exclusively with PMI (MPICH2 and MVAPICH2) and controls
                             the  fanout  of data communications. The srun command sends messages
                             to application programs (via the PMI library) and those applications
                             may  be  called  upon  to  forward that data to up to this number of
                             additional tasks. Higher values offload work from the  srun  command
                             to  the  applications  and  likely  increase  the  vulnerability  to
                             failures.  The default value is 32.

       PMI_FANOUT_OFF_HOST   This is used exclusively with PMI (MPICH2 and MVAPICH2) and controls
                             the  fanout of data communications.  The srun command sends messages
                             to application programs (via the PMI library) and those applications
                             may  be  called  upon  to  forward that data to additional tasks. By
                             default, srun sends one message per host and one task on  that  host
                             forwards  the data to other tasks on that host up to PMI_FANOUT.  If
                             PMI_FANOUT_OFF_HOST is defined, the user task  may  be  required  to
                             forward    the    data   to   tasks   on   other   hosts.    Setting
                             PMI_FANOUT_OFF_HOST may increase performance.  Since  more  work  is
                             performed  by  the  PMI  library  loaded  by  the  user application,
                             failures also can be more common and more difficult to diagnose.

       PMI_TIME              This is used exclusively with PMI (MPICH2 and MVAPICH2) and controls
                             how  much  the  communications from the tasks to the srun are spread
                             out in time in order to avoid overwhelming  the  srun  command  with
                             work.   The  default  value  is  500  (microseconds)  per  task.  On
                             relatively slow processors or  systems  with  very  large  processor
                             counts (and large PMI data sets), higher values may be required.

       SLURM_CONF            The location of the SLURM configuration file.

       SLURM_ACCOUNT         Same as -A, --account

       SLURM_ACCTG_FREQ      Same as --acctg-freq

       SLURM_BLRTS_IMAGE     Same as --blrts-image

       SLURM_CHECKPOINT      Same as --checkpoint

       SLURM_CHECKPOINT_DIR  Same as --checkpoint-dir

       SLURM_CNLOAD_IMAGE    Same as --cnload-image

       SLURM_CONN_TYPE       Same as --conn-type

       SLURM_CPU_BIND        Same as --cpu_bind

       SLURM_CPU_FREQ_REQ    Same  as --cpu-freq. Can specify a numerical frequency in kilohertz,
                             or the request can specify low, medium, or high for the value. "Low"
                             will  select  the lowest available frequency, "high" will select the
                             highest available  frequency,  while  "medium"  attempts  to  set  a
                             frequency in the middle of the available range. If the numeric value
                             specified does not exactly match a legal available frequency,  SLURM
                             will attempt to pick a legal frequency close to the request.

       SLURM_CPUS_PER_TASK   Same as -c, --cpus-per-task

       SLURM_DEBUG           Same as -v, --verbose

       SLURMD_DEBUG          Same as -d, --slurmd-debug

       SLURM_DEPENDENCY      Same as -P, --dependency=<jobid>

       SLURM_DISABLE_STATUS  Same as -X, --disable-status

       SLURM_DIST_PLANESIZE  Same as -m plane

       SLURM_DISTRIBUTION    Same as -m, --distribution

       SLURM_EPILOG          Same as --epilog

       SLURM_EXCLUSIVE       Same as --exclusive

       SLURM_EXIT_ERROR      Specifies  the  exit  code generated when a SLURM error occurs (e.g.
                             invalid options).  This can be  used  by  a  script  to  distinguish
                             application  exit  codes  from various SLURM error conditions.  Also
                             see SLURM_EXIT_IMMEDIATE.

       SLURM_EXIT_IMMEDIATE  Specifies the exit code generated when  the  --immediate  option  is
                             used and resources are not currently available.  This can be used by
                             a script to distinguish application exit codes  from  various  SLURM
                             error conditions.  Also see SLURM_EXIT_ERROR.

       SLURM_GEOMETRY        Same as -g, --geometry

       SLURM_GRES            Same as --gres. Also see SLURM_STEP_GRES

       SLURM_IMMEDIATE       Same as -I, --immediate

       SLURM_IOLOAD_IMAGE    Same as --ioload-image

       SLURM_JOB_ID (and SLURM_JOBID for backwards compatibility)
                             Same as --jobid

       SLURM_JOB_NAME        Same  as  -J,  --job-name  except  within an existing allocation, in
                             which case it is ignored to avoid using the batch job's name as  the
                             name of each job step.

       SLURM_JOB_NUM_NODES (and SLURM_NNODES for backwards compatibility)
                             Total number of nodes in the job’s resource allocation.

       SLURM_KILL_BAD_EXIT   Same as -K, --kill-on-bad-exit

       SLURM_LABELIO         Same as -l, --label

       SLURM_LINUX_IMAGE     Same as --linux-image

       SLURM_MEM_BIND        Same as --mem_bind

       SLURM_MEM_PER_CPU     Same as --mem-per-cpu

       SLURM_MEM_PER_NODE    Same as --mem

       SLURM_MLOADER_IMAGE   Same as --mloader-image

       SLURM_MPI_TYPE        Same as --mpi

       SLURM_NETWORK         Same as --network

       SLURM_NNODES          Same as -N, --nodes

       SLURM_NODELIST        Same as -w, --nodelist

       SLURM_NO_ROTATE       Same as -R, --no-rotate

       SLURM_NTASKS (and SLURM_NPROCS for backwards compatibility)
                             Same as -n, --ntasks

       SLURM_NTASKS_PER_CORE Same as --ntasks-per-core

       SLURM_NTASKS_PER_NODE Same as --ntasks-per-node

       SLURM_NTASKS_PER_SOCKET
                             Same as --ntasks-per-socket

       SLURM_OPEN_MODE       Same as --open-mode

       SLURM_OVERCOMMIT      Same as -O, --overcommit

       SLURM_PARTITION       Same as -p, --partition

       SLURM_PMI_KVS_NO_DUP_KEYS
                             If  set, then PMI key-pairs will contain no duplicate keys.  This is
                             the case for MPICH2 and reduces overhead in testing  for  duplicates
                             for improved performance.

       SLURM_PROFILE         Same as --profile

       SLURM_PROLOG          Same as --prolog

       SLURM_QOS             Same as --qos

       SLURM_RAMDISK_IMAGE   Same as --ramdisk-image

       SLURM_REMOTE_CWD      Same as -D, --chdir

       SLURM_REQ_SWITCH      When  a  tree  topology  is  used, this defines the maximum count of
                             switches desired for the job allocation and optionally  the  maximum
                             time to wait for that number of switches. See --switches

       SLURM_RESERVATION     Same as --reservation

       SLURM_RESTART_DIR     Same as --restart-dir

       SLURM_RESV_PORTS      Same as --resv-ports

       SLURM_SIGNAL          Same as --signal

       SLURM_STDERRMODE      Same as -e, --error

       SLURM_STDINMODE       Same as -i, --input

       SLURM_SRUN_REDUCE_TASK_EXIT_MSG
                             If set and non-zero, successive task exit messages with the same
                             exit code will be printed only once.

       SLURM_STEP_GRES       Same as --gres (only applies to job steps, not to job  allocations).
                             Also see SLURM_GRES

       SLURM_STEP_KILLED_MSG_NODE_ID=ID
                             If set, only the specified node will log when the job or step is
                             killed by a signal.

       SLURM_STDOUTMODE      Same as -o, --output

       SLURM_TASK_EPILOG     Same as --task-epilog

       SLURM_TASK_PROLOG     Same as --task-prolog

       SLURM_THREADS         Same as -T, --threads

       SLURM_TIMELIMIT       Same as -t, --time

       SLURM_UNBUFFEREDIO    Same as -u, --unbuffered

       SLURM_WAIT            Same as -W, --wait

       SLURM_WAIT4SWITCH     Max time waiting for requested switches. See --switches

       SLURM_WCKEY           Same as --wckey

       SLURM_WORKING_DIR     Same as -D, --chdir
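
       As a sketch of how a wrapper script might use SLURM_EXIT_ERROR and SLURM_EXIT_IMMEDIATE
       from the list above to tell SLURM failures apart from application exit codes, consider
       the following. The function name classify_exit, its labels, and the example codes 200
       and 201 are hypothetical.

```python
def classify_exit(code, environ):
    """Label an srun exit code using the two variables described above."""
    error_code = environ.get("SLURM_EXIT_ERROR")
    immediate_code = environ.get("SLURM_EXIT_IMMEDIATE")
    if error_code is not None and code == int(error_code):
        return "slurm-error"
    if immediate_code is not None and code == int(immediate_code):
        return "resources-unavailable"
    # Anything else is assumed to come from the application itself.
    return "application"


# Example values chosen for this sketch only.
settings = {"SLURM_EXIT_ERROR": "200", "SLURM_EXIT_IMMEDIATE": "201"}
print(classify_exit(201, settings))
```

       Picking values that the application never returns keeps the three cases distinct.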

OUTPUT ENVIRONMENT VARIABLES

       srun will set some environment variables in the environment of the executing tasks on  the
       remote compute nodes.  These environment variables are:

       SLURM_CHECKPOINT_IMAGE_DIR
                             Directory   into  which  checkpoint  images  should  be  written  if
                             specified on the execute line.

       SLURM_CPU_BIND_VERBOSE
                             --cpu_bind verbosity (quiet,verbose).

       SLURM_CPU_BIND_TYPE   --cpu_bind type (none,rank,map_cpu:,mask_cpu:)

       SLURM_CPU_BIND_LIST   --cpu_bind map or mask list (list of SLURM CPU IDs or masks for this
                             node,   CPU_ID   =   Board_ID  x  threads_per_board  +  Socket_ID  x
                             threads_per_socket + Core_ID x threads_per_core + Thread_ID).

       SLURM_CPU_FREQ_REQ    Contains the value requested for cpu frequency on the  srun  command
                             as  a  numerical  frequency  in  kilohertz,  or  a coded value for a
                             request  of  low,  medium,  or  high  for  the  frequency.  See  the
                             description of the --cpu-freq option or the SLURM_CPU_FREQ_REQ input
                             environment variable.

       SLURM_CPUS_ON_NODE    Count of processors available to the job on  this  node.   Note  the
                             select/linear  plugin  allocates  entire nodes to jobs, so the value
                             indicates  the  total  count  of  CPUs  on  the   node.    For   the
                             select/cons_res plugin, this number indicates the number of cores on
                             this node allocated to the job.

       SLURM_DISTRIBUTION    Distribution type for the allocated jobs. Set the distribution  with
                             -m, --distribution.

       SLURM_GTIDS           Global  task  IDs  running  on  this  node.   Zero  origin and comma
                             separated.

       SLURM_JOB_CPUS_PER_NODE
                             Number of CPUS per node.

       SLURM_JOB_DEPENDENCY  Set to value of the --dependency option.

       SLURM_JOB_ID (and SLURM_JOBID for backwards compatibility)
                             Job id of the executing job

       SLURM_JOB_NAME        Set to the value of the --job-name option or the command  name  when
                             srun  is  used  to create a new job allocation. Not set when srun is
                             used only to  create  a  job  step  (i.e.  within  an  existing  job
                             allocation).

       SLURM_LAUNCH_NODE_IPADDR
                             IP  address  of  the  node  from which the task launch was initiated
                             (where the srun command ran from)

       SLURM_LOCALID         Node local task ID for the process within a job

       SLURM_MEM_BIND_VERBOSE
                             --mem_bind verbosity (quiet,verbose).

       SLURM_MEM_BIND_TYPE   --mem_bind type (none,rank,map_mem:,mask_mem:)

       SLURM_MEM_BIND_LIST   --mem_bind map or mask list (<list of IDs or masks for this node>)

       SLURM_NNODES          Total number of nodes in the job's resource allocation

       SLURM_NODE_ALIASES    Sets of node name, communication  address  and  hostname  for  nodes
                             allocated  to  the  job  from  the cloud. Each element in the set is
                             colon separated and each set is comma separated. For example:
                             SLURM_NODE_ALIASES=ec0:1.2.3.4:foo,ec1:1.2.3.5:bar

       SLURM_NODEID          The relative node ID of the current node

       SLURM_NODELIST        List of nodes allocated to the job

       SLURM_NTASKS (and SLURM_NPROCS for backwards compatibility)
                             Total number of processes in the current job

       SLURM_PRIO_PROCESS    The  scheduling priority (nice value) at the time of job submission.
                             This value is propagated to the spawned processes.

       SLURM_PROCID          The MPI rank (or relative process ID) of the current process

       SLURM_SRUN_COMM_HOST  IP address of srun communication host.

       SLURM_SRUN_COMM_PORT  srun communication port.

       SLURM_STEP_LAUNCHER_PORT
                             Step launcher port.

       SLURM_STEP_NODELIST   List of nodes allocated to the step.

       SLURM_STEP_NUM_NODES  Number of nodes allocated to the step.

       SLURM_STEP_NUM_TASKS  Number of processes in the step.

       SLURM_STEP_TASKS_PER_NODE
                             Number of processes per node within the step.

       SLURM_STEP_ID (and SLURM_STEPID for backwards compatibility)
                             The step ID of the current job

       SLURM_SUBMIT_DIR      The directory from which srun was invoked.

       SLURM_SUBMIT_HOST     The hostname of the computer from which srun was invoked.

       SLURM_TASK_PID        The process ID of the task being started.

       SLURM_TASKS_PER_NODE  Number of tasks to be initiated  on  each  node.  Values  are  comma
                             separated  and  in the same order as SLURM_NODELIST.  If two or more
                             consecutive nodes are to have the same task  count,  that  count  is
                             followed by "(x#)" where "#" is the repetition count. For example,
                             "SLURM_TASKS_PER_NODE=2(x3),1" indicates that the first three nodes
                             will each execute two tasks and the fourth node will execute one
                             task.

       SLURM_TOPOLOGY_ADDR   This is  set  only  if  the  system  has  the  topology/tree  plugin
                             configured.  The value will be set to the names of the network switches
                             which may be involved in the job's communications from the  system's
                             top  level switch down to the leaf switch and ending with node name.
                             A period is used to separate each hardware component name.

       SLURM_TOPOLOGY_ADDR_PATTERN
                             This is  set  only  if  the  system  has  the  topology/tree  plugin
                             configured.  The value will be set to the component types listed in
                             SLURM_TOPOLOGY_ADDR.  Each component will be  identified  as  either
                             "switch"  or  "node".   A  period  is used to separate each hardware
                             component type.

       SRUN_DEBUG            Set to the logging level of the srun command.  Default  value  is  3
                             (info  level).   The  value is incremented or decremented based upon
                             the --verbose and --quiet options.

       MPIRUN_NOALLOCATE     Do not allocate a block (Blue Gene systems only).

       MPIRUN_NOFREE         Do not free a block (Blue Gene systems only).

       MPIRUN_PARTITION      The block name (Blue Gene systems only).
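
       The compact encodings used by SLURM_TASKS_PER_NODE and SLURM_TOPOLOGY_ADDR above can be
       unpacked with plain shell parameter expansion. The following is a minimal sketch, not part
       of SLURM: the helper names expand_tasks_per_node and describe_topology, and the sample
       values used as defaults, are assumptions made for illustration only.

```shell
#!/bin/sh
# Hypothetical helpers (not provided by SLURM) for two encoded variables.

# Expand a value such as "2(x3),1" into one task count per node.
expand_tasks_per_node() {
    spec=$1
    out=""
    old_ifs=$IFS
    IFS=,
    for item in $spec; do
        case $item in
            *"(x"*")")
                count=${item%%"("*}   # task count before "(x"
                reps=${item#*"(x"}    # text after "(x"
                reps=${reps%")"}      # drop the trailing ")"
                ;;
            *)
                count=$item
                reps=1
                ;;
        esac
        i=0
        while [ "$i" -lt "$reps" ]; do
            out="$out${out:+ }$count"
            i=$((i + 1))
        done
    done
    IFS=$old_ifs
    printf '%s\n' "$out"
}

# Pair each component of SLURM_TOPOLOGY_ADDR with its type from
# SLURM_TOPOLOGY_ADDR_PATTERN, printing one "type name" line per component.
describe_topology() {
    addr=$1
    pattern=$2
    while [ -n "$addr" ]; do
        name=${addr%%.*}
        kind=${pattern%%.*}
        printf '%s %s\n' "$kind" "$name"
        case $addr in
            *.*) addr=${addr#*.}; pattern=${pattern#*.} ;;
            *)   addr="" ;;
        esac
    done
}

# The defaults below are invented sample values, used when run outside a job.
expand_tasks_per_node "${SLURM_TASKS_PER_NODE:-2(x3),1}"
describe_topology "${SLURM_TOPOLOGY_ADDR:-s0.s1.dev7}" \
                  "${SLURM_TOPOLOGY_ADDR_PATTERN:-switch.switch.node}"
```

       Run outside of a job, the sketch falls back to the sample values and prints "2 2 2 1"
       followed by one line for each topology component. It assumes that component names do not
       themselves contain a period.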

SIGNALS AND ESCAPE SEQUENCES

       Signals sent to  the  srun  command  are  automatically  forwarded  to  the  tasks  it  is
       controlling  with  a few exceptions. The escape sequence <control-c> will report the state
       of all tasks associated with the srun command. If <control-c> is entered twice within  one
       second,  then  the  associated  SIGINT  signal will be sent to all tasks and a termination
       sequence will be entered sending SIGCONT, SIGTERM, and SIGKILL to all spawned tasks.  If a
       third  <control-c>  is  received,  the srun program will be terminated without waiting for
       remote tasks to exit or their I/O to complete.

       The escape sequence <control-z> is presently ignored. Our intent is for this to put the
       srun command into a mode where various special actions may be invoked.

MPI SUPPORT

       MPI  use depends upon the type of MPI being used.  There are three fundamentally different
       modes of operation used by these various MPI implementations.

       1. SLURM directly  launches  the  tasks  and  performs  initialization  of  communications
       (Quadrics  MPI,  MPICH2,  MPICH-GM, MVAPICH, MVAPICH2 and some MPICH1 modes). For example:
       "srun -n16 a.out".

       2. SLURM creates a resource allocation for the job and then mpirun  launches  tasks  using
       SLURM's infrastructure (OpenMPI, LAM/MPI, HP-MPI and some MPICH1 modes).

       3.  SLURM  creates  a resource allocation for the job and then mpirun launches tasks using
       some mechanism other than SLURM, such as SSH or RSH (BlueGene MPI and some MPICH1  modes).
       These tasks are initiated outside of SLURM's monitoring or control. SLURM's epilog should be
       configured to purge these tasks when the job's allocation is relinquished.

       See http://slurm.schedmd.com/mpi_guide.html for more information on use of  these  various
       MPI implementations with SLURM.

MULTIPLE PROGRAM CONFIGURATION

       Comments  in the configuration file must have a "#" in column one.  The configuration file
       contains the following fields separated by white space:

       Task rank
              One or more task ranks to use this configuration.  Multiple  values  may  be  comma
              separated.   Ranges may be indicated with two numbers separated with a '-' with the
              smaller number first (e.g. "0-4"  and  not  "4-0").   To  indicate  all  tasks  not
              otherwise  specified,  specify  a  rank of '*' as the last line of the file.  If an
              attempt is made to initiate a task for which no executable program is defined,  the
              following  error message will be produced "No executable program specified for this
              task".

       Executable
              The name of the program to execute.  May be a fully qualified pathname if desired.

       Arguments
              Program arguments.  The expression "%t" will be replaced with  the  task's  number.
              The expression "%o" will be replaced with the task's offset within this range (e.g.
              a configured task rank value of "1-5" would have offset values of  "0-4").   Single
              quotes  may be used to avoid having the enclosed values interpreted.  This field is
              optional.  Any arguments for the program entered on the command line will be  added
              to the arguments specified in the configuration file.

       For example:
       ###################################################################
       # srun multiple program configuration file
       #
       # srun -n8 -l --multi-prog silly.conf
       ###################################################################
       4-6       hostname
       1,7       echo  task:%t
       0,2-3     echo  offset:%o

       > srun -n8 -l --multi-prog silly.conf
       0: offset:0
       1: task:1
       2: offset:1
       3: offset:2
       4: linux15.llnl.gov
       5: linux16.llnl.gov
       6: linux17.llnl.gov
       7: task:7
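
       The relationship between a task rank field and the "%t" and "%o" substitutions can be
       sketched in shell. This is an illustration only, inferred from the sample output above
       rather than taken from srun's implementation. The helper name expand_ranks is an
       assumption, and the sketch does not handle the '*' rank.

```shell
#!/bin/sh
# expand_ranks is a hypothetical helper, not part of srun.
# For a rank field such as "0,2-3" it prints one "task offset" pair per line:
# the task number (%t) and the task's position within the field (%o).
expand_ranks() {
    spec=$1
    offset=0
    old_ifs=$IFS
    IFS=,
    for part in $spec; do
        case $part in
            *-*) first=${part%-*}; last=${part#*-} ;;   # range "a-b"
            *)   first=$part; last=$part ;;             # single rank
        esac
        rank=$first
        while [ "$rank" -le "$last" ]; do
            echo "$rank $offset"
            offset=$((offset + 1))
            rank=$((rank + 1))
        done
    done
    IFS=$old_ifs
}

expand_ranks "0,2-3"
```

       For the field "0,2-3" this prints the pairs "0 0", "2 1" and "3 2", which agrees with the
       offset values shown for tasks 0, 2 and 3 in the sample output above.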

EXAMPLES

       This  simple example demonstrates the execution of the command hostname in eight tasks. At
       least eight processors will be allocated to the job  (the  same  as  the  task  count)  on
       however  many  nodes  are required to satisfy the request. The output of each task will be
       preceded by its task number.  (The machine "dev" in the example below has  a  total  of
       two CPUs per node.)

       > srun -n8 -l hostname
       0: dev0
       1: dev0
       2: dev1
       3: dev1
       4: dev2
       5: dev2
       6: dev3
       7: dev3

       The  srun  -r option is used within a job script to run two job steps on disjoint nodes in
       the following example. The script is run using allocate mode instead of as a batch job  in
       this case.

       > cat test.sh
       #!/bin/sh
       echo $SLURM_NODELIST
       srun -lN2 -r2 hostname
       srun -lN2 hostname

       > salloc -N4 test.sh
       dev[7-10]
       0: dev9
       1: dev10
       0: dev7
       1: dev8
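
       The node selection seen in the output above can be modelled in a few lines of shell. This
       is only a sketch of the observable behaviour, not of how srun is implemented. The helper
       name select_nodes is an assumption, and the node names are taken from the example.

```shell
#!/bin/sh
# select_nodes is a hypothetical helper that mimics the effect of -r:
# skip OFFSET nodes of the allocation, then take the next COUNT nodes.
# Usage: select_nodes OFFSET COUNT node...
select_nodes() {
    offset=$1
    count=$2
    shift 2
    shift "$offset"              # drop the nodes before the relative offset
    out=""
    while [ "$count" -gt 0 ] && [ "$#" -gt 0 ]; do
        out="$out${out:+ }$1"
        shift
        count=$((count - 1))
    done
    printf '%s\n' "$out"
}

select_nodes 2 2 dev7 dev8 dev9 dev10   # models: srun -lN2 -r2 hostname
select_nodes 0 2 dev7 dev8 dev9 dev10   # models: srun -lN2 hostname
```

       The first call prints "dev9 dev10" and the second prints "dev7 dev8", matching the hosts
       reported by the two job steps.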

       The following script runs two job steps in parallel within an allocated set of nodes.

       > cat test.sh
       #!/bin/bash
       srun -lN2 -n4 -r 2 sleep 60 &
       srun -lN2 -r 0 sleep 60 &
       sleep 1
       squeue
       squeue -s
       wait

       > salloc -N4 test.sh
         JOBID PARTITION     NAME     USER  ST      TIME  NODES NODELIST
         65641     batch  test.sh   grondo   R      0:01      4 dev[7-10]

       STEPID     PARTITION     USER      TIME NODELIST
       65641.0        batch   grondo      0:01 dev[7-8]
       65641.1        batch   grondo      0:01 dev[9-10]

       This  example  demonstrates  how  one executes a simple MPICH job.  We use srun to build a
       list of machines (nodes) to be used by mpirun in its required  format.  A  sample  command
       line and the script to be executed follow.

       > cat test.sh
       #!/bin/sh
       MACHINEFILE="nodes.$SLURM_JOB_ID"

       # Generate Machinefile for mpich such that hosts are in the same
       #  order as if run via srun
       #
       srun -l /bin/hostname | sort -n | awk '{print $2}' > $MACHINEFILE

       # Run using generated Machine file:
       mpirun -np $SLURM_NTASKS -machinefile $MACHINEFILE mpi-app

       rm $MACHINEFILE

       > salloc -N2 -n4 test.sh
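
       The effect of the "sort -n | awk" pipeline in the script above can be checked without a
       cluster. In the sketch below, the function names sample_output and make_machinefile and
       the host names are assumptions; sample_output merely imitates the labelled lines that
       "srun -l /bin/hostname" would produce, in arbitrary order.

```shell
#!/bin/sh
# Simulated "srun -l /bin/hostname" output. Lines may arrive in any order.
# The host names are invented for illustration.
sample_output() {
    printf '%s\n' '2: dev1' '0: dev0' '3: dev1' '1: dev0'
}

# Same pipeline as in the script above: order the lines by task number,
# then keep only the host name column.
make_machinefile() {
    sort -n | awk '{print $2}'
}

sample_output | make_machinefile
```

       The result lists one host per task in task order (dev0, dev0, dev1, dev1), which is the
       layout the script writes to the machine file.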

       This simple example demonstrates the execution of different jobs on different nodes in the
       same srun.  You can do this  for  any  number  of  nodes  or  any  number  of  jobs.   The
       executable to run on each node is selected by the SLURM_NODEID environment variable, which
       starts at 0 and goes up to one less than the number of nodes specified on the srun command
       line.

       > cat test.sh
       case $SLURM_NODEID in
           0) echo "I am running on "
              hostname ;;
           1) hostname
              echo "is where I am running" ;;
       esac

       > srun -N2 test.sh
       dev0
       is where I am running
       I am running on
       dev1

       This example demonstrates use of multi-core  options  to  control  layout  of  tasks.   We
       request that four sockets per node and two cores per socket be dedicated to the job.

       > srun -N2 -B 4-4:2-2 a.out

       This  example  shows  a script in which Slurm is used to provide resource management for a
       job by executing the various job steps as processors become available for their  dedicated
       use.

       > cat my.script
       #!/bin/bash
       srun --exclusive -n4 prog1 &
       srun --exclusive -n3 prog2 &
       srun --exclusive -n1 prog3 &
       srun --exclusive -n1 prog4 &
       wait

COPYING

       Copyright (C) 2006-2007 The Regents of the University of California.  Produced at Lawrence
       Livermore National Laboratory (cf, DISCLAIMER).
       Copyright (C) 2008-2010 Lawrence Livermore National Security.
       Copyright (C) 2010-2013 SchedMD LLC.

       This  file  is  part  of  SLURM,  a  resource  management  program.   For   details,   see
       <http://slurm.schedmd.com/>.

       SLURM  is  free  software; you can redistribute it and/or modify it under the terms of the
       GNU General Public License as published by the Free Software Foundation; either version  2
       of the License, or (at your option) any later version.

       SLURM is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without
       even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
       GNU General Public License for more details.

SEE ALSO

       salloc(1),   sattach(1),   sbatch(1),   sbcast(1),   scancel(1),  scontrol(1),  squeue(1),
       slurm.conf(5), sched_setaffinity(2), numa(3), getrlimit(2)