Provided by: slurm-llnl_2.6.5-1_amd64

NAME

       srun - Run parallel jobs

SYNOPSIS

       srun [OPTIONS...]  executable [args...]

DESCRIPTION

       Run a parallel job on a cluster managed by SLURM.  If necessary, srun will first create a resource
       allocation in which to run the parallel job.

       The following document describes the influence of various options on the allocation of CPUs to jobs
       and tasks:
       http://slurm.schedmd.com/cpu_management.html

OPTIONS

       -A, --account=<account>
              Charge resources used by this job to the specified account.  The account is an arbitrary string.  The
              account name may be changed after job submission using the scontrol command.

       --acctg-freq
              Define the job accounting and profiling sampling intervals.  This can  be  used  to  override  the
              JobAcctGatherFrequency parameter in SLURM's configuration file, slurm.conf.  The supported format
              is as follows:

              --acctg-freq=<datatype>=<interval>
                          where  <datatype>=<interval>  specifies   the   task   sampling   interval   for   the
                          jobacct_gather   plugin   or   a  sampling  interval  for  a  profiling  type  by  the
                          acct_gather_profile plugin. Multiple, comma-separated <datatype>=<interval>  intervals
                          may be specified. Supported datatypes are as follows:

                          task=<interval>
                                 where   <interval>   is   the   task  sampling  interval  in  seconds  for  the
                                 jobacct_gather plugins  and  for  task  profiling  by  the  acct_gather_profile
                                 plugin.

                          energy=<interval>
                                 where <interval> is the sampling interval in seconds for energy profiling using
                                 the acct_gather_energy plugin.

                          network=<interval>
                                 where  <interval>  is the sampling interval in seconds for infiniband profiling
                                 using the acct_gather_infiniband plugin.

                          filesystem=<interval>
                                 where <interval> is the sampling interval in seconds for  filesystem  profiling
                                 using the acct_gather_filesystem plugin.

              The default value for the task sampling interval is 30 seconds.  The default value for all other
              intervals is 0.  An interval of 0 disables sampling of the specified type.  If the task sampling
              interval is 0, accounting information is collected only at job termination (reducing SLURM
              interference with the job).  Smaller (non-zero) values have a greater impact upon job
              performance, but a value of 30 seconds is not likely to be noticeable for applications having
              fewer than 10,000 tasks.
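
              The format and defaults described above can be modeled with a short sketch.  This is an
              illustrative model of the documented format only, not SLURM's own parser; the function and
              variable names are hypothetical.

```python
# Illustrative model of the --acctg-freq value format (not SLURM's parser).
# Defaults follow the text above: task=30 seconds, every other type 0.
DEFAULT_INTERVALS = {"task": 30, "energy": 0, "network": 0, "filesystem": 0}


def parse_acctg_freq(spec):
    """Parse '<datatype>=<interval>[,...]' into seconds per datatype."""
    intervals = dict(DEFAULT_INTERVALS)
    for item in spec.split(","):
        if not item:
            continue  # tolerate an empty specification
        datatype, _, value = item.partition("=")
        if datatype not in intervals:
            raise ValueError(f"unsupported datatype: {datatype!r}")
        intervals[datatype] = int(value)
    return intervals


def sampling_enabled(intervals, datatype):
    """An interval of 0 disables sampling of the given type."""
    return intervals[datatype] > 0


freq = parse_acctg_freq("task=60,energy=15")
print(freq)
```

              Here only the task and energy intervals are overridden; network and filesystem keep their
              default of 0, so those profiling types stay disabled.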

       -B, --extra-node-info=<sockets[:cores[:threads]]>
              Request a specific allocation of resources with details as to the number and type of computational
              resources within a cluster: number of sockets (or physical processors) per node, cores per socket,
              and  threads per core.  The total amount of resources being requested is the product of all of the
              terms.  Each value specified is  considered  a  minimum.   An  asterisk  (*)  can  be  used  as  a
              placeholder  indicating  that  all  available  resources of that type are to be utilized.  As with
              nodes, the individual levels can also be specified in separate options if desired:
                  --sockets-per-node=<sockets>
                  --cores-per-socket=<cores>
                  --threads-per-core=<threads>
              If task/affinity plugin is enabled, then specifying an allocation  in  this  manner  also  sets  a
              default  --cpu_bind  option  of  threads  if  the -B option specifies a thread count, otherwise an
              option of cores if a core count is specified, otherwise an option of sockets.   If  SelectType  is
              configured  to select/cons_res, it must have a parameter of CR_Core, CR_Core_Memory, CR_Socket, or
              CR_Socket_Memory for this option to be honored.  This option is not supported on BlueGene  systems
              (select/bluegene plugin is configured).  If not specified, the scontrol show job command will
              display 'ReqS:C:T=*:*:*'.
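
              The "product of all of the terms" rule can be checked with a small calculation.  The sketch
              below is a hypothetical helper, not part of SLURM; it returns None when an asterisk is used,
              because the total then depends on the hardware actually available.

```python
# Hypothetical helper: compute the minimum resource total implied by a
# -B/--extra-node-info value of the form sockets[:cores[:threads]].
def parse_extra_node_info(spec):
    """Return the terms as a list; '*' (all available) becomes None."""
    terms = spec.split(":")
    if not 1 <= len(terms) <= 3:
        raise ValueError("expected sockets[:cores[:threads]]")
    return [None if term == "*" else int(term) for term in terms]


def minimum_total(spec):
    """Product of all terms, or None if any term is the '*' placeholder."""
    terms = parse_extra_node_info(spec)
    if None in terms:
        return None
    total = 1
    for term in terms:
        total *= term
    return total


print(minimum_total("2:4:2"))
```

              A request of 2:4:2 asks for at least 2 sockets, 4 cores per socket and 2 threads per core,
              that is 16 hardware threads per node.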

       --begin=<time>
              Defer initiation of this job until the specified time.  It accepts times of the form  HH:MM:SS  to
              run  a  job  at a specific time of day (seconds are optional).  (If that time is already past, the
              next day is assumed.)  You may also specify midnight, noon, or teatime (4pm) and you  can  have  a
              time-of-day  suffixed  with  AM or PM for running in the morning or the evening.  You can also say
              what day the job will be run, by specifying a date of the form MMDDYY, MM/DD/YY, or YYYY-MM-DD.
              Combine  date and time using the following format YYYY-MM-DD[THH:MM[:SS]]. You can also give times
              like now + count time-units, where the time-units can be seconds (default), minutes, hours,  days,
              or  weeks  and  you  can tell SLURM to run the job today with the keyword today and to run the job
              tomorrow with the keyword tomorrow.  The value may be  changed  after  job  submission  using  the
              scontrol command.  For example:
                 --begin=16:00
                 --begin=now+1hour
                 --begin=now+60           (seconds by default)
                 --begin=2010-01-20T12:34:00

              Notes on date/time specifications:
               - Although the 'seconds' field of the HH:MM:SS time specification is allowed by the code, note
                 that the poll time of the SLURM scheduler is not precise enough to guarantee dispatch of the
                 job on the exact second.  The job will be eligible to start on the next poll following the
                 specified time.  The exact poll interval depends on the SLURM scheduler (e.g., 60 seconds
                 with the default sched/builtin).
               - If no time (HH:MM:SS) is specified, the default is (00:00:00).
               - If a date is specified without a year (e.g., MM/DD) then the current year is assumed, unless
                 the combination of MM/DD and HH:MM:SS has already passed for that year, in which case the
                 next year is used.
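
              Two of the rules above (a time of day that has already passed rolls over to the next day, and
              "now + count" defaults to seconds) can be sketched as follows.  This is a simplified model
              covering only the HH:MM[:SS] and now+count[unit] forms; it is not SLURM's time parser, and
              accepting a singular unit such as "hour" is an assumption based on the example above.

```python
from datetime import datetime, timedelta

# Seconds per time unit; "seconds" is the default unit per the text above.
UNIT_SECONDS = {"seconds": 1, "minutes": 60, "hours": 3600,
                "days": 86400, "weeks": 604800}


def resolve_begin(spec, now):
    """Return the earliest start time implied by a --begin value (sketch)."""
    if spec.startswith("now+"):
        rest = spec[4:]
        digits = rest.rstrip("abcdefghijklmnopqrstuvwxyz")
        unit = rest[len(digits):] or "seconds"
        if not unit.endswith("s"):
            unit += "s"  # assumption: treat 'hour' like 'hours'
        return now + timedelta(seconds=int(digits) * UNIT_SECONDS[unit])
    parts = [int(p) for p in spec.split(":")]
    second = parts[2] if len(parts) > 2 else 0  # seconds are optional
    candidate = now.replace(hour=parts[0], minute=parts[1],
                            second=second, microsecond=0)
    if candidate < now:
        candidate += timedelta(days=1)  # time already past: assume next day
    return candidate


now = datetime(2010, 1, 20, 17, 0, 0)
print(resolve_begin("16:00", now))
print(resolve_begin("now+60", now))
```

              With the current time fixed at 17:00, --begin=16:00 resolves to 16:00 on the following day,
              while --begin=now+60 resolves to one minute later.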

       --checkpoint=<time>
              Specifies  the  interval  between  creating checkpoints of the job step.  By default, the job step
              will have no checkpoints created.  Acceptable time formats include  "minutes",  "minutes:seconds",
              "hours:minutes:seconds", "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds".

       --checkpoint-dir=<directory>
              Specifies the directory into which the job or job step's checkpoint should be written (used by the
              checkpoint/blcr  and  checkpoint/xlch  plugins  only).   The  default value is the current working
              directory.   Checkpoint   files   will   be   of   the   form   "<job_id>.ckpt"   for   jobs   and
              "<job_id>.<step_id>.ckpt" for job steps.

       --comment=<string>
              An arbitrary comment.

       -C, --constraint=<list>
              Nodes  can  have features assigned to them by the SLURM administrator.  Users can specify which of
              these features are required by their job using the constraint option.  Only nodes having  features
              matching  the  job  constraints  will be used to satisfy the request.  Multiple constraints may be
              specified with AND, OR, exclusive OR, resource counts, etc.  Supported constraint options include:

              Single Name
                     Only  nodes  which   have   the   specified   feature   will   be   used.    For   example,
                     --constraint="intel"

              Node Count
                     A request can specify the number of nodes needed with some feature by appending an asterisk
                     and  count  after  the  feature name.  For example "--nodes=16 --constraint=graphics*4 ..."
                     indicates that the job requires 16 nodes and that at least four of those nodes must have the
                     feature "graphics."

              AND    Only nodes with all of the specified features will be used.  The ampersand is used for an
                     AND operator.  For example, --constraint="intel&gpu"

              OR     Only nodes with at least one of the specified features will be used.  The vertical bar is
                     used for an OR operator.  For example, --constraint="intel|amd"

              Exclusive OR
                     If only one of a set of possible options should be used for all allocated nodes,  then  use
                     the   OR   operator   and  enclose  the  options  within  square  brackets.   For  example:
                     "--constraint=[rack1|rack2|rack3|rack4]" might be used to specify that all  nodes  must  be
                     allocated on a single rack of the cluster, but any of those four racks can be used.

              Multiple Counts
                     Specific  counts  of  multiple  resources  may  be  specified by using the AND operator and
                     enclosing     the     options     within      square      brackets.       For      example:
                     "--constraint=[rack1*2&rack2*4]"  might be used to specify that two nodes must be allocated
                     from nodes with the feature of "rack1" and four nodes must be allocated from nodes with the
                     feature "rack2".

              WARNING: When srun is executed from within salloc or sbatch, the constraint value can only
              contain a single feature name.  None of the other operators are currently supported for job
              steps.
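
              When constraint strings are generated by a script, the operator syntax above can be assembled
              with a few helpers.  The helper names below are hypothetical and the feature names are taken
              from the examples above; the sketch only builds the argument list and does not run srun.

```python
# Hypothetical helpers that assemble --constraint values using the
# operators documented above.
def all_of(*features):
    """AND: nodes must have every listed feature."""
    return "&".join(features)


def any_of(*features):
    """OR: nodes must have at least one listed feature."""
    return "|".join(features)


def exactly_one_of(*features):
    """Exclusive OR: all allocated nodes share one of the listed features."""
    return "[" + "|".join(features) + "]"


def node_counts(counts):
    """Multiple counts: mapping of feature name to required node count."""
    return "[" + "&".join(f"{name}*{n}" for name, n in counts.items()) + "]"


# Argument list suitable for subprocess.run(); "./my_app" is a placeholder.
argv = ["srun", "--nodes=6",
        "--constraint=" + node_counts({"rack1": 2, "rack2": 4}), "./my_app"]
print(argv)
```

              Passing the value as a single list element avoids shell quoting problems with the ampersand,
              vertical bar and square brackets.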

       --contiguous
              If  set,  then the allocated nodes must form a contiguous set.  Not honored with the topology/tree
              or topology/3d_torus plugins, both of which can modify the node ordering.  Not honored for  a  job
              step's allocation.

       --cores-per-socket=<cores>
              Restrict  node  selection  to  nodes  with at least the specified number of cores per socket.  See
              additional information under -B option above when task/affinity plugin is enabled.

       --cpu_bind=[{quiet,verbose},]type
              Bind tasks to CPUs.  Used only when the task/affinity  or  task/cgroup  plugin  is  enabled.   The
              configuration   parameter   TaskPluginParam   may   override   these  options.   For  example,  if
              TaskPluginParam is configured to bind to cores, your job  will  not  be  able  to  bind  tasks  to
              sockets.   NOTE: To have SLURM always report on the selected CPU binding for all commands executed
              in a shell, you can enable verbose mode by setting the SLURM_CPU_BIND environment  variable  value
              to "verbose".

              The following informational environment variables are set when --cpu_bind is in use:
                   SLURM_CPU_BIND_VERBOSE
                   SLURM_CPU_BIND_TYPE
                   SLURM_CPU_BIND_LIST

              See  the  ENVIRONMENT  VARIABLES  section  for  a  more  detailed  description  of  the individual
              SLURM_CPU_BIND* variables.

              When using --cpus-per-task to run multithreaded tasks, be aware that CPU binding is inherited from
              the parent of the process.  This means that the multithreaded task should either specify or  clear
              the CPU binding itself to avoid having all threads of the multithreaded task use the same mask/CPU
              as  the parent.  Alternatively, fat masks (masks which specify more than one allowed CPU) could be
              used for the tasks in order to provide multiple CPUs for the multithreaded tasks.

              By default, a job step has access to every CPU allocated to the job.  To ensure that distinct CPUs
              are allocated to each job step, use the --exclusive option.

              If the job step allocation includes an allocation with a number  of  sockets,  cores,  or  threads
              equal  to  the  number  of  tasks  to  be  started  then the tasks will by default be bound to the
              appropriate resources (auto binding).  Disable this mode of operation by explicitly setting
              "--cpu_bind=none".

              Note  that a job step can be allocated different numbers of CPUs on each node or be allocated CPUs
              not starting at location zero. Therefore one of the options which automatically generate the  task
              binding is recommended.  Explicitly specified masks or bindings are only honored when the job step
              has been allocated every available CPU on the node.

              Binding  a task to a NUMA locality domain means to bind the task to the set of CPUs that belong to
              the NUMA locality domain or "NUMA node".  If NUMA locality domain options are used on systems with
              no NUMA support, then each socket is considered a locality domain.

              Supported options include:

              q[uiet]
                     Quietly bind before task runs (default)

              v[erbose]
                     Verbosely report binding before task runs

              no[ne] Do not bind tasks to CPUs (default unless auto binding is applied)

              rank   Automatically bind by task rank.  Task zero is bound to socket (or core  or  thread)  zero,
                     etc.  Not supported unless the entire node is allocated to the job.

              map_cpu:<list>
                     Bind    by    mapping    CPU    IDs    to    tasks    as    specified   where   <list>   is
                     <cpuid1>,<cpuid2>,...<cpuidN>.  CPU IDs are interpreted as decimal values unless  they  are
                     preceded with '0x' in which case they are interpreted as hexadecimal values.  Not supported
                     unless the entire node is allocated to the job.  This option is currently only supported by
                     the task/affinity plugin.

              mask_cpu:<list>
                     Bind by setting CPU masks on tasks as specified where <list> is <mask1>,<mask2>,...<maskN>.
                     CPU masks are always interpreted as hexadecimal values but can be preceded with an optional
                     '0x'.   Not  supported  unless  the  entire  node  is allocated to the job.  This option is
                     currently only supported by the task/affinity plugin.

              rank_ldom
                     Bind to a NUMA locality domain by rank

              map_ldom:<list>
                     Bind  by  mapping  NUMA  locality  domain  IDs  to  tasks  as  specified  where  <list>  is
                     <ldom1>,<ldom2>,...<ldomN>.   The  locality  domain  IDs  are interpreted as decimal values
                     unless they are preceded with '0x' in  which  case  they  are  interpreted  as  hexadecimal
                     values.  Not supported unless the entire node is allocated to the job.

              mask_ldom:<list>
                     Bind  by  setting  NUMA  locality  domain  masks  on  tasks  as  specified  where <list> is
                     <mask1>,<mask2>,...<maskN>.   NUMA  locality  domain  masks  are  always   interpreted   as
                     hexadecimal  values  but  can  be preceded with an optional '0x'.  Not supported unless the
                     entire node is allocated to the job.

              sockets
                     Automatically generate masks binding tasks to sockets.  Only the CPUs on the  socket  which
                     have  been  allocated  to  the  job  will be used.  If the number of tasks differs from the
                     number of allocated sockets this can result in sub-optimal binding.

              cores  Automatically generate masks binding tasks to cores.  If the number of tasks  differs  from
                     the number of allocated cores this can result in sub-optimal binding.

              threads
                     Automatically generate masks binding tasks to threads.  If the number of tasks differs from
                     the number of allocated threads this can result in sub-optimal binding.

              ldoms  Automatically  generate  masks  binding  tasks  to NUMA locality domains.  If the number of
                     tasks differs from the number of allocated locality domains this can result in  sub-optimal
                     binding.

              help   Show help message for cpu_bind
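
              The hexadecimal masks expected by mask_cpu can be derived from lists of CPU IDs.  The sketch
              below assumes the usual affinity-mask convention in which bit N of the mask corresponds to CPU
              ID N; the helper names are hypothetical and nothing here is executed by SLURM.

```python
# Hypothetical helpers for building --cpu_bind values.
# Assumption: bit N of a mask stands for CPU ID N.
def cpu_mask(cpu_ids):
    """Return the hexadecimal mask that allows exactly the given CPU IDs."""
    mask = 0
    for cpu in cpu_ids:
        mask |= 1 << cpu
    return hex(mask)


def mask_cpu_option(cpus_per_task):
    """One mask per task; a mask with several bits set is a 'fat mask'."""
    return "--cpu_bind=mask_cpu:" + ",".join(cpu_mask(c) for c in cpus_per_task)


def map_cpu_option(cpu_ids):
    """One CPU ID per task, written as decimal values."""
    return "--cpu_bind=map_cpu:" + ",".join(str(c) for c in cpu_ids)


# Two multithreaded tasks, each allowed to run on two CPUs.
print(mask_cpu_option([[0, 1], [2, 3]]))
print(map_cpu_option([0, 2]))
```

              The first call gives each task a fat mask covering two CPUs, which is one way to provide
              multiple CPUs to a multithreaded task as discussed above.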

       --cpu-freq=<requested frequency in kilohertz>

              Request that the job step initiated by this srun be run at the requested frequency if possible, on
              the  cpus  selected  for  the  step on the compute node(s).  In addition to specifying a numerical
              frequency in kilohertz, the request can specify low, medium, or high for  the  value.  "Low"  will
              select  the  lowest available frequency, "high" will select the highest available frequency, while
              "medium" attempts to set a frequency in the middle of the available range. If  the  numeric  value
              specified  does  not exactly match a legal available frequency, SLURM will attempt to pick a legal
              frequency close to the request.

              The following informational environment variable is set in the job step when --cpu-freq option  is
              requested.
                      SLURM_CPU_FREQ_REQ

              This environment variable can also be used to supply the value for the cpu frequency request if it
              is  set  when  the 'srun' command is issued.  The --cpu-freq on the command line will override the
              environment variable value.  See the ENVIRONMENT  VARIABLES  section  for  a  description  of  the
              SLURM_CPU_FREQ_REQ variable.

              NOTE:  This parameter is treated as a request, not a requirement.  If the job step's node does not
              support setting the cpu frequency, or the requested value is  outside  the  bounds  of  the  legal
              frequencies, an error is logged, but the job step is allowed to continue.

              NOTE:  Setting the frequency for just the cpus of the job step implies that the tasks are confined
              to those cpus.  If task confinement (i.e., TaskPlugin=task/affinity or TaskPlugin=task/cgroup with
              the "ConstrainCores" option) is not configured, this parameter is ignored.
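
              The selection behavior described above can be pictured with the following sketch.  It is only
              a guess at the behavior for illustration: choosing the nearest frequency by absolute
              difference and taking the middle list entry for "medium" are assumptions, and the frequency
              list is made up.

```python
# Illustrative model of --cpu-freq resolution (assumptions noted inline).
def pick_frequency(request, available_khz):
    """Map a request ('low', 'medium', 'high' or kilohertz) to a legal value."""
    freqs = sorted(available_khz)
    if request == "low":
        return freqs[0]
    if request == "high":
        return freqs[-1]
    if request == "medium":
        # Assumption: "middle of the available range" is the middle entry.
        return freqs[(len(freqs) - 1) // 2]
    # Assumption: "close to the request" means smallest absolute difference.
    return min(freqs, key=lambda f: abs(f - int(request)))


# Made-up list of legal frequencies in kilohertz.
available = [1200000, 1800000, 2400000, 3000000]
print(pick_frequency("2000000", available))
```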

       -c, --cpus-per-task=<ncpus>
              Request that ncpus be allocated per process. This may be useful if the job  is  multithreaded  and
              requires  more  than one CPU per task for optimal performance. The default is one CPU per process.
              If -c is specified without -n, as many  tasks  will  be  allocated  per  node  as  possible  while
              satisfying the -c restriction. For instance on a cluster with 8 CPUs per node, a job request for 4
              nodes  and 3 CPUs per task may be allocated 3 or 6 CPUs per node (1 or 2 tasks per node) depending
              upon resource consumption by other jobs. Such a job may be unable to execute more than a total  of
              4  tasks.   This  option may also be useful to spawn tasks without allocating resources to the job
              step from the job's allocation when running multiple job steps with the --exclusive option.
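
              The arithmetic behind the 8-CPU example above is a simple integer division.  The sketch below
              uses hypothetical helper names and assumes that tasks are never split across nodes.

```python
# Worked arithmetic for the -c example above (hypothetical helpers).
def tasks_on_node(free_cpus, cpus_per_task):
    """Whole tasks that fit on a node with the given number of free CPUs."""
    return free_cpus // cpus_per_task


def cpus_allocated(free_cpus, cpus_per_task):
    """CPUs actually consumed by those tasks on the node."""
    return tasks_on_node(free_cpus, cpus_per_task) * cpus_per_task


# A fully idle 8-CPU node fits two 3-CPU tasks (6 CPUs used).
print(tasks_on_node(8, 3), cpus_allocated(8, 3))

# If other jobs leave only 5 free CPUs on each of the 4 nodes, every node
# fits a single task, so the job cannot run more than 4 tasks in total.
print(sum(tasks_on_node(free, 3) for free in [5, 5, 5, 5]))
```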

              WARNING: There are configurations and options interpreted differently by job and job step requests
              which can result in inconsistencies for this option.  For example  srun  -c2  --threads-per-core=1
              prog  may allocate two cores for the job, but if each of those cores contains two threads, the job
              allocation will include four CPUs. The job step allocation will then launch two  threads  per  CPU
              for a total of two tasks.

              WARNING:  When srun is executed from within salloc or sbatch, there are configurations and options
              which can result in inconsistent allocations when -c has a value greater  than  -c  on  salloc  or
              sbatch.

       -d, --dependency=<dependency_list>
              Defer the start of this job until the specified dependencies have been satisfied.
              <dependency_list> is of the form  <type:job_id[:job_id][,type:job_id[:job_id]]>.   Many  jobs  can
              share  the  same  dependency and these jobs may even belong to different  users. The  value may be
              changed after job submission using the scontrol command.

              after:job_id[:jobid...]
                     This job can begin execution after the specified jobs have begun execution.

              afterany:job_id[:jobid...]
                     This job can begin execution after the specified jobs have terminated.

              afternotok:job_id[:jobid...]
                     This job can begin execution after the specified jobs have terminated in some failed  state
                     (non-zero exit code, node failure, timed out, etc).

              afterok:job_id[:jobid...]
                     This  job  can  begin execution after the specified jobs have successfully executed (ran to
                     completion with an exit code of zero).

              expand:job_id
                     Resources allocated to this job should be used to expand the specified  job.   The  job  to
                     expand  must  share  the  same  QOS (Quality of Service) and partition.  Gang scheduling of
                     resources in the partition is also not supported.

              singleton
                     This job can begin execution after any previously launched jobs sharing the same  job  name
                     and user have terminated.
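
              A dependency list in the form shown above can be assembled from job IDs collected by a
              script.  The helper below is hypothetical and the job IDs are made up; it only formats the
              option value.

```python
# Hypothetical helper that formats a --dependency value of the form
# type:job_id[:job_id][,type:job_id[:job_id]].
def dependency_list(dependencies):
    """dependencies: list of (type, [job_ids]); singleton takes no job IDs."""
    parts = []
    for dep_type, job_ids in dependencies:
        parts.append(":".join([dep_type] + [str(job_id) for job_id in job_ids]))
    return ",".join(parts)


# Made-up job IDs: start after 1001 and 1002 succeed and 1003 terminates.
value = dependency_list([("afterok", [1001, 1002]), ("afterany", [1003])])
print("--dependency=" + value)
```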

       -D, --chdir=<path>
              Have the remote processes do a chdir to path before beginning execution.  The default is to chdir
              to the current working directory of the srun process.

       -e, --error=<mode>
              Specify how stderr is to be redirected. By default in interactive mode, srun redirects  stderr  to
              the  same  file as stdout, if one is specified. The --error option is provided to allow stdout and
              stderr to be redirected to different locations.  See IO Redirection below for  more  options.   If
              the specified file already exists, it will be overwritten.

       -E, --preserve-env
              Pass  the  current  values  of  environment variables SLURM_NNODES and SLURM_NTASKS through to the
              executable, rather than computing them from command-line parameters.

       --epilog=<executable>
              srun will run executable just after the job  step  completes.   The  command  line  arguments  for
              executable  will  be  the command and arguments of the job step.  If executable is "none", then no
              srun epilog will be run. This parameter overrides the SrunEpilog  parameter  in  slurm.conf.  This
              parameter is completely independent from the Epilog parameter in slurm.conf.

       --exclusive
              This  option  has  two slightly different meanings for job and job step allocations.  When used to
              initiate a job, the job allocation cannot share nodes  with  other  running  jobs.   This  is  the
              opposite  of  --share,  whichever  option  is  seen last on the command line will win. The default
              shared/exclusive behavior depends on system configuration and the partition's Shared option  takes
              precedence over the job's option.

              This  option  can  also be used when initiating more than one job step within an existing resource
              allocation, where you want separate processors to be dedicated to each  job  step.  If  sufficient
              processors are not available to initiate the job step, it will be deferred. This can be thought of
              as providing resource management for the job within its allocation.  Note that all CPUs allocated
              to a job are available to each job step unless the --exclusive option is used plus task affinity
              is configured.  Since resource management is provided by processor, the --ntasks option must be
              specified, but the following options should NOT be specified: --relative, --distribution=arbitrary.
              See EXAMPLE below.

       --gid=<group>
              If srun is run as root, and the --gid option is used, submit the job  with  group's  group  access
              permissions.  group may be the group name or the numerical group ID.

       --gres=<list>
              Specifies a comma delimited list of generic consumable resources.  The format of each entry on the
              list  is "name[:count]".  The name is that of the consumable resource.  The count is the number of
              those resources with a default value of 1.  The specified resources will be allocated to  the  job
              on each node.  The available generic consumable resources are configurable by the system
              administrator.  A list of available generic consumable resources will be printed and the command
              will exit if the option argument is "help".  Examples of use include "--gres=gpu:2,mic:1" and
              "--gres=help".  NOTE: By default, a job step is allocated all of the generic resources that have
              been allocated to the job.  To change the behavior so that each job step is allocated no generic
              resources, explicitly set the value of --gres to specify zero counts for each generic resource  OR
              set "--gres=none" OR set the SLURM_STEP_GRES environment variable to "none".
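
              The "name[:count]" entry format with its default count of 1 can be modeled as follows.  This
              is an illustrative parser with a hypothetical name, not SLURM's implementation, and it does
              not handle the special values "help" and "none".

```python
# Illustrative model of the --gres list format: comma-delimited entries
# of the form name[:count], where count defaults to 1.
def parse_gres(spec):
    """Return a mapping of resource name to requested count per node."""
    resources = {}
    for entry in spec.split(","):
        name, _, count = entry.partition(":")
        resources[name] = int(count) if count else 1
    return resources


print(parse_gres("gpu:2,mic"))
```

              In this model, setting every count to zero (for example "gpu:0,mic:0") corresponds to the
              documented way of giving a job step no generic resources.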

       -H, --hold
              Specify  the  job  is  to  be submitted in a held state (priority of zero).  A held job can now be
              released using scontrol to reset its priority (e.g. "scontrol release <job_id>").

       -h, --help
              Display help information and exit.

       --hint=<type>
              Bind tasks according to application hints

              compute_bound
                     Select settings for compute bound applications: use all cores in each  socket,  one  thread
                     per core

              memory_bound
                     Select settings for memory bound applications: use only one core in each socket, one thread
                     per core

              [no]multithread
                     [don't]  use  extra  threads  with  in-core multi-threading which can benefit communication
                     intensive applications

              help   show this help message

       -I, --immediate[=<seconds>]
              Exit if resources are not available within the time period specified.  If no argument is given,
              resources  must  be  available immediately for the request to succeed.  By default, --immediate is
              off, and the command will block until resources become available. Since this option's argument  is
              optional,  for proper parsing the single letter option must be followed immediately with the value
              and not include a space between them. For example "-I60" and not "-I 60".

       -i, --input=<mode>
              Specify how stdin is to be redirected.  By default, srun redirects stdin from the terminal to all tasks.
              See  IO Redirection below for more options.  For OS X, the poll() function does not support stdin,
              so input from a terminal is not possible.

       -J, --job-name=<jobname>
              Specify a name for the job. The specified name will appear along  with  the  job  id  number  when
              querying  running jobs on the system. The default is the supplied executable program's name. NOTE:
              This information may be written to the slurm_jobacct.log file. This file is space delimited so  if
              a space is used in the job name it will cause problems in properly displaying the contents of
              the slurm_jobacct.log file when the sacct command is used.

       --jobid=<jobid>
              Initiate a job step under an already allocated job with job id jobid.  Using this option will cause
              srun to behave exactly as if the SLURM_JOB_ID environment variable was set.

       -K, --kill-on-bad-exit[=0|1]
              Controls  whether  or  not to terminate a job if any task exits with a non-zero exit code. If this
              option is not specified, the default action will be based upon the SLURM  configuration  parameter
              of  KillOnBadExit.  If  this  option  is specified, it will take precedence over KillOnBadExit. An
              option argument of zero will not terminate the job.  A  non-zero  argument  or  no  argument  will
              terminate the job.  Note: This option takes precedence over the -W, --wait option to terminate the
              job  immediately  if  a  task  exits  with  a non-zero exit code.  Since this option's argument is
              optional, for proper parsing the single letter option must be followed immediately with the  value
              and not include a space between them. For example "-K1" and not "-K 1".

       -k, --no-kill
              Do not automatically terminate a job if one of the nodes it has been allocated fails.  This option
              is only recognized on a job allocation, not for the submission of individual job steps.  The job
              will assume all responsibilities for fault-tolerance.  Tasks launched using this option will not be
              considered terminated (e.g. -K, --kill-on-bad-exit and -W, --wait options will have no effect upon
              the job step).  The active job step (MPI job) will likely suffer a fatal error, but subsequent job
              steps  may  be  run  if this option is specified.  The default action is to terminate the job upon
              node failure.

       --launch-cmd
              Print external launch command instead of running job normally through SLURM. This option  is  only
              valid if using something other than the launch/slurm plugin.

       --launcher-opts=<options>
              Options for the external launcher if using something other than the launch/slurm plugin.

       -l, --label
              Prepend task number to lines of stdout/err.  Normally, stdout and stderr from remote tasks are
              line-buffered directly to the stdout and stderr of srun.  The --label option will prepend lines of
              output with the remote task id.

       -L, --licenses=<license>
              Specification of licenses (or other resources available on all nodes of the cluster) which must be
              allocated to this job.  License names can be followed by a colon and count (the default  count  is
              one).  Multiple license names should be comma separated (e.g.  "--licenses=foo:4,bar").

       -m, --distribution=
              <block|cyclic|arbitrary|plane=<options>[:block|cyclic]>

              Specify  alternate distribution methods for remote processes.  This option controls the assignment
              of tasks to the nodes on which resources have  been  allocated,  and  the  distribution  of  those
              resources  to  tasks  for  binding (task affinity). The first distribution method (before the ":")
              controls the distribution of resources across  nodes.  The  optional  second  distribution  method
              (after  the  ":")  controls the distribution of resources across sockets within a node.  Note that
              with select/cons_res, the number of cpus allocated on each socket and node may be different. Refer
              to  http://slurm.schedmd.com/mc_support.html  for  more  information   on   resource   allocation,
              assignment of tasks to nodes, and binding of tasks to CPUs.
              First distribution method:

              block  The  block  distribution method will distribute tasks to a node such that consecutive tasks
                     share a node. For example, consider an allocation of three nodes  each  with  two  cpus.  A
                     four-task  block  distribution  request will distribute those tasks to the nodes with tasks
                     one and two on the first node, task three on the second node, and task four  on  the  third
                     node.  Block distribution is the default behavior if the number of tasks exceeds the number
                     of allocated nodes.

              cyclic The  cyclic distribution method will distribute tasks to a node such that consecutive tasks
                     are distributed over consecutive nodes (in a round-robin fashion). For example, consider an
                     allocation of three nodes each with two cpus. A four-task cyclic distribution request  will
                     distribute  those tasks to the nodes with tasks one and four on the first node, task two on
                     the second node, and  task  three  on  the  third  node.   Note  that  when  SelectType  is
                     select/cons_res,  the  same  number  of  CPUs  may  not  be  allocated  on  each node. Task
                     distribution will be round-robin among all the nodes with CPUs yet to be assigned to tasks.
                     Cyclic distribution is the default behavior if the number of tasks is no  larger  than  the
                     number of allocated nodes.

              plane  The  tasks  are  distributed  in  blocks of a specified size.  The options include a number
                     representing the size of the task block.  This is followed by an optional specification  of
                     the  task distribution scheme within a block of tasks and between the blocks of tasks.  The
                     number of tasks distributed to each node is the same as for cyclic  distribution,  but  the
                     taskids  assigned  to  each  node  depend  on  the  plane size. For more details (including
                     examples and diagrams), please see
                     http://slurm.schedmd.com/mc_support.html
                     and
                     http://slurm.schedmd.com/dist_plane.html

              arbitrary
                     The arbitrary method of distribution will allocate processes in order as listed in the file
                     designated by the environment variable SLURM_HOSTFILE.  If this variable is set, it will
                     override any other method specified.  If it is not set, the method will default to block.
                     The hostfile must contain at minimum the number of hosts requested, listed one per line or
                     comma separated.  If specifying a task count (-n, --ntasks=<number>), your tasks will be
                     laid out on the nodes in the order of the file.
                     NOTE: The arbitrary distribution option on a job allocation only controls the nodes  to  be
                     allocated  to  the  job and not the allocation of CPUs on those nodes. This option is meant
                     primarily to control a job step's task layout in an existing job allocation  for  the  srun
                     command.

              Second distribution method:

              block  The  block distribution method will distribute tasks to sockets such that consecutive tasks
                     share a socket.

              cyclic The cyclic distribution method will distribute tasks to sockets such that consecutive tasks
                     are distributed over consecutive sockets (in a round-robin fashion).
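
              The block and cyclic examples given for the first distribution method (three nodes with two
              cpus each, four tasks) can be reproduced with the Python sketch below.  This is a simplified
              model, not SLURM's actual algorithm: the function name distribute and the way per-node task
              counts are derived (round-robin over nodes that still have a free cpu) are assumptions chosen
              to match those two examples.

```python
def distribute(ntasks, cpus_per_node, method="block"):
    """Return one list of task ids per node (task ids start at 1)."""
    nnodes = len(cpus_per_node)

    # Assumption: per-node task counts are found by handing out slots
    # round-robin over the nodes that still have a free cpu.
    counts = [0] * nnodes
    remaining = ntasks
    while remaining > 0:
        progressed = False
        for node in range(nnodes):
            if remaining > 0 and counts[node] < cpus_per_node[node]:
                counts[node] += 1
                remaining -= 1
                progressed = True
        if not progressed:
            raise ValueError("more tasks than cpus in the allocation")

    layout = [[] for _ in range(nnodes)]
    if method == "block":
        # Consecutive task ids share a node.
        next_id = 1
        for node, count in enumerate(counts):
            layout[node] = list(range(next_id, next_id + count))
            next_id += count
    elif method == "cyclic":
        # Consecutive task ids go to consecutive nodes, round-robin.
        task = 1
        while task <= ntasks:
            for node in range(nnodes):
                if task <= ntasks and len(layout[node]) < counts[node]:
                    layout[node].append(task)
                    task += 1
    else:
        raise ValueError("unsupported method: " + method)
    return layout


if __name__ == "__main__":
    print(distribute(4, [2, 2, 2], "block"))
    print(distribute(4, [2, 2, 2], "cyclic"))
```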

       --mail-type=<type>
              Notify user by email when certain event types occur.  Valid type  values  are  BEGIN,  END,  FAIL,
              REQUEUE, and ALL (any state change). The user to be notified is indicated with --mail-user.

       --mail-user=<user>
              User  to receive email notification of state changes as defined by --mail-type.  The default value
              is the submitting user.

       --mem=<MB>
              Specify the real memory required per node in MegaBytes.  Default value is  DefMemPerNode  and  the
              maximum value is MaxMemPerNode. If configured, both parameters can be seen using the scontrol
              show config command.  This parameter would generally be used if whole nodes are allocated to  jobs
              (SelectType=select/linear).   Also  see  --mem-per-cpu.   --mem  and  --mem-per-cpu  are  mutually
              exclusive.  NOTE: Enforcement of memory limits currently relies upon  the  task/cgroup  plugin  or
              enabling  of  accounting,  which  samples memory use on a periodic basis (data need not be stored,
              just collected). In both cases memory use is based upon the job's Resident Set Size (RSS). A  task
              may exceed the memory limit until the next periodic accounting sample.

       --mem-per-cpu=<MB>
              Minimum  memory  required  per  allocated CPU in MegaBytes.  Default value is DefMemPerCPU and the
              maximum value is MaxMemPerCPU (see exception below). If configured, both parameters can be seen
              using the scontrol show config command.  Note that if the job's --mem-per-cpu  value  exceeds  the
              configured  MaxMemPerCPU,  then  the  user's  limit  will  be  treated as a memory limit per task;
              --mem-per-cpu will be reduced to a value no larger than MaxMemPerCPU; --cpus-per-task will be set,
              and the value of --cpus-per-task multiplied by the new --mem-per-cpu value will equal the original
              --mem-per-cpu value specified by the user.  This parameter would generally be used  if  individual
              processors  are  allocated  to  jobs  (SelectType=select/cons_res).   Also  see  --mem.  --mem and
              --mem-per-cpu are mutually exclusive.
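
              The MaxMemPerCPU adjustment described above can be sketched in Python as follows.  The function
              name adjust_mem_per_cpu is hypothetical, and the rounding used when the request is not an exact
              multiple of the limit is an assumption; only the evenly divisible case follows directly from
              the description.

```python
def adjust_mem_per_cpu(requested_mb, max_mem_per_cpu_mb):
    """Return (cpus_per_task, mem_per_cpu_mb) after applying the limit."""
    if requested_mb <= max_mem_per_cpu_mb:
        # No adjustment needed; one cpu per task is assumed here.
        return 1, requested_mb
    # Smallest cpu count whose per-cpu share fits under the limit
    # (integer ceiling division).
    cpus_per_task = -(-requested_mb // max_mem_per_cpu_mb)
    # Assumption: round the per-cpu share up so that the product still
    # covers the original request when it does not divide evenly.
    mem_per_cpu_mb = -(-requested_mb // cpus_per_task)
    return cpus_per_task, mem_per_cpu_mb


if __name__ == "__main__":
    # 8192 MB requested with a 2048 MB limit: 4 cpus x 2048 MB = 8192 MB.
    print(adjust_mem_per_cpu(8192, 2048))
```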

       --mem_bind=[{quiet,verbose},]type
              Bind tasks to memory. Used only when the task/affinity plugin  is  enabled  and  the  NUMA  memory
              functions  are  available.   Note that the resolution of CPU and memory binding may differ on some
              architectures. For example, CPU binding may be performed at  the  level  of  the  cores  within  a
              processor  while  memory  binding will be performed at the level of nodes, where the definition of
              "nodes" may differ from system to system. The use of any type other than "none" or "local" is  not
              recommended.   If  you  want  greater  control,  try  running  a simple test code with the options
              "--cpu_bind=verbose,none --mem_bind=verbose,none" to determine the specific configuration.

              NOTE: To have SLURM always report on the selected memory binding for all commands  executed  in  a
              shell,  you  can  enable  verbose mode by setting the SLURM_MEM_BIND environment variable value to
              "verbose".

              The following informational environment variables are set when --mem_bind is in use:

                   SLURM_MEM_BIND_VERBOSE
                   SLURM_MEM_BIND_TYPE
                   SLURM_MEM_BIND_LIST

              See the  ENVIRONMENT  VARIABLES  section  for  a  more  detailed  description  of  the  individual
              SLURM_MEM_BIND* variables.

              Supported options include:

              q[uiet]
                     quietly bind before task runs (default)

              v[erbose]
                     verbosely report binding before task runs

              no[ne] don't bind tasks to memory (default)

              rank   bind by task rank (not recommended)

              local  Use memory local to the processor in use

              map_mem:<list>
                     bind   by   mapping   a   node's   memory   to   tasks   as   specified   where  <list>  is
                     <cpuid1>,<cpuid2>,...<cpuidN>.  CPU IDs are interpreted as decimal values unless  they  are
                     preceded with '0x' in which case they are interpreted as hexadecimal values (not recommended)

              mask_mem:<list>
                     bind    by    setting    memory   masks   on   tasks   as   specified   where   <list>   is
                     <mask1>,<mask2>,...<maskN>.  Memory masks are always  interpreted  as  hexadecimal  values.
                     Note  that  masks  must  be preceded with a '0x' if they don't begin with [0-9] so they are
                     seen as numerical values by srun.

              help   show this help message
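
              The number bases used by map_mem and mask_mem can be illustrated with a short Python sketch.
              The function names are hypothetical and this is not srun's own parser; in particular, srun
              requires a leading '0x' on masks that do not begin with a digit, which this sketch does not
              enforce.

```python
def parse_map_mem(value):
    """map_mem list: decimal unless an entry is prefixed with 0x."""
    ids = []
    for item in value.split(","):
        item = item.strip()
        base = 16 if item.lower().startswith("0x") else 10
        ids.append(int(item, base))
    return ids


def parse_mask_mem(value):
    """mask_mem list: always hexadecimal, with or without a 0x prefix."""
    return [int(item.strip(), 16) for item in value.split(",")]


if __name__ == "__main__":
    print(parse_map_mem("0,1,0x10"))
    print(parse_mask_mem("0x3,10,0xf0"))
```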

       --mincpus=<n>
              Specify a minimum number of logical cpus/processors per node.

       --msg-timeout=<seconds>
              Modify the job launch  message  timeout.   The  default  value  is  MessageTimeout  in  the  SLURM
              configuration file slurm.conf.  Changes to this are typically not recommended, but could be useful
              to diagnose problems.

       --mpi=<mpi_type>
              Identify the type of MPI to be used. May result in unique initiation procedures.

              list   Lists available mpi types to choose from.

              lam    Initiates  one  'lamd' process per node and establishes necessary environment variables for
                     LAM/MPI.

              mpich1_shmem
                     Initiates one process per node and establishes necessary environment variables  for  mpich1
                     shared memory model.  This also works for mvapich built for shared memory.

              mpichgm
                     For use with Myrinet.

              mvapich
                     For use with InfiniBand.

              openmpi
                     For use with OpenMPI.

              none   No special MPI processing. This is the default and works with many other versions of MPI.

       --multi-prog
              Run  a  job  with  different  programs  and  different  arguments for each task. In this case, the
              executable program specified is actually  a  configuration  file  specifying  the  executable  and
              arguments for each task. See MULTIPLE PROGRAM CONFIGURATION below for details on the configuration
              file contents.

       -N, --nodes=<minnodes[-maxnodes]>
              Request  that a minimum of minnodes nodes be allocated to this job.  A maximum node count may also
              be specified with maxnodes.  If only one number is specified, this is used as both the minimum and
              maximum node count.  The partition's node limits supersede those of the  job.   If  a  job's  node
              limits  are outside of the range permitted for its associated partition, the job will be left in a
              PENDING state.  This permits possible execution at a later  time,  when  the  partition  limit  is
              changed.   If  a  job  node limit exceeds the number of nodes configured in the partition, the job
              will be rejected.  Note that the environment variable SLURM_JOB_NUM_NODES  (and  SLURM_NNODES  for
              backwards  compatibility) will be set to the count of nodes actually allocated to the job. See the
              ENVIRONMENT VARIABLES section for more information.  If -N is not specified, the default  behavior
              is to allocate enough nodes to satisfy the requirements of the -n and -c options.  The job will be
              allocated as many nodes as possible within the range specified and without delaying the initiation
              of  the job.  The node count specification may include a numeric value followed by a suffix of "k"
              (multiplies numeric value by 1,024) or "m" (multiplies numeric value by 1,048,576).
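
              The node count syntax, including the "k" and "m" suffixes, can be sketched in Python as
              follows.  The function names parse_count and parse_nodes are hypothetical, and rejecting a
              maximum smaller than the minimum is an assumption of this sketch.

```python
SUFFIX_MULTIPLIER = {"k": 1024, "m": 1048576}


def parse_count(text):
    """Parse a numeric value with an optional "k" or "m" suffix."""
    text = text.strip()
    if text and text[-1] in SUFFIX_MULTIPLIER:
        return int(text[:-1]) * SUFFIX_MULTIPLIER[text[-1]]
    return int(text)


def parse_nodes(spec):
    """Parse "minnodes[-maxnodes]" into a (minnodes, maxnodes) tuple."""
    low, sep, high = spec.partition("-")
    minnodes = parse_count(low)
    # A single number is used as both the minimum and the maximum.
    maxnodes = parse_count(high) if sep else minnodes
    if maxnodes < minnodes:
        raise ValueError("maxnodes is smaller than minnodes")
    return minnodes, maxnodes


if __name__ == "__main__":
    print(parse_nodes("2-8"))
    print(parse_nodes("1k"))
```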

       -n, --ntasks=<number>
              Specify the number of tasks to run. Request that srun allocate resources for  ntasks  tasks.   The
              default is one task per node, but note that the --cpus-per-task option will change this default.

       --network=<type>
              Specify  the  communication  protocol to be used.  The interpretation of type is system dependent.
              This option is currently supported on systems with IBM's Parallel Environment (PE).  See IBM's
              LoadLeveler  job  command  keyword documentation about the keyword "network" for more information.
              Multiple values may be specified in a comma separated list.  All options are case insensitive.
              Supported values include:

              BULK_XFER[=<resources>]
                          Enable  bulk  transfer of data using Remote Direct-Memory Access (RDMA).  The optional
                          resources specification is a numeric value which can have a suffix of "k",  "K",  "m",
                          "M",  "g"  or  "G"  for  kilobytes,  megabytes  or  gigabytes.   NOTE:  The  resources
                          specification is not supported by the underlying IBM  infrastructure  as  of  Parallel
                          Environment  version  2.2  and no value should be specified at this time.  The devices
                          allocated to a job must all be of the same type.  The default value depends upon what
                          hardware is available and in order of preference is IPONLY (which is not considered
                          in User Space mode), HFI, IB, HPCE, and KMUX.

              CAU=<count> Number of Collective Acceleration Units (CAU) required.  Applies only to IBM Power7-IH
                          processors.  Default value is zero.   Independent  CAU  will  be  allocated  for  each
                          programming interface (MPI, LAPI, etc.)

              DEVNAME=<name>
                          Specify the device name to use for communications (e.g. "eth0" or "mlx4_0").

              DEVTYPE=<type>
                          Specify  the device type to use for communications.  The supported values of type are:
                          "IB" (InfiniBand), "HFI" (P7 Host Fabric Interface),  "IPONLY"  (IP-Only  interfaces),
                          "HPCE"  (HPC  Ethernet), and "KMUX" (Kernel Emulation of HPCE).  The devices allocated
                          to a job must all be of the same type.  The default value depends upon what hardware
                          is available and in order of preference is IPONLY (which is not considered in User
                          Space mode), HFI, IB, HPCE, and KMUX.

              IMMED=<count>
                          Number of immediate send slots per window required.  Applies  only  to  IBM  Power7-IH
                          processors.  Default value is zero.

              INSTANCES=<count>
                          Specify  number  of network connections for each task on each network connection.  The
                          default instance count is 1.

              IPV4        Use Internet Protocol (IP) version 4 communications (default).

              IPV6        Use Internet Protocol (IP) version 6 communications.

              LAPI        Use the LAPI programming interface.

              MPI         Use the MPI programming interface.  MPI is the default interface.

              PAMI        Use the PAMI programming interface.

              SHMEM       Use the OpenSHMEM programming interface.

              SN_ALL      Use all available switch networks (default).

              SN_SINGLE   Use one available switch network.

              UPC         Use the UPC programming interface.

              US          Use User Space communications.

              Some examples of network specifications:

              Instances=2,US,MPI,SN_ALL
                          Create two user space connections for MPI communications on every switch  network  for
                          each task.

              US,MPI,Instances=3,Devtype=IB
                          Create three user space connections for MPI communications on every InfiniBand network
                          for each task.

              IPV4,LAPI,SN_Single
                          Create an IP version 4 connection for LAPI communications on one switch network for
                          each task.

              Instances=2,US,LAPI,MPI
                          Create two user space connections each for LAPI and MPI communications on every switch
                          network for each task. Note that SN_ALL is the default option so every switch  network
                          is used. Also note that Instances=2 specifies that two connections are established for
                          each  protocol (LAPI and MPI) and each task.  If there are two networks and four tasks
                          on the node then a total of 32 connections are established (2 instances x 2  protocols
                          x 2 networks x 4 tasks).
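
              The connection arithmetic in the last example can be checked with the Python sketch below.  The
              function name connection_count and its parameters are hypothetical; the formula (instances x
              protocols x networks x tasks) is generalized from the worked example above together with the
              documented defaults (one instance, MPI, SN_ALL), and is not taken from SLURM's source.

```python
PROTOCOLS = {"LAPI", "MPI", "PAMI", "SHMEM", "UPC"}


def connection_count(spec, networks_available, tasks_on_node):
    """Estimate connections per node for a --network specification."""
    instances = 1          # documented default instance count
    protocols = set()
    use_all_networks = True  # SN_ALL is the documented default
    for token in spec.split(","):
        token = token.strip().upper()  # options are case insensitive
        if token.startswith("INSTANCES="):
            instances = int(token.split("=", 1)[1])
        elif token in PROTOCOLS:
            protocols.add(token)
        elif token == "SN_SINGLE":
            use_all_networks = False
        elif token == "SN_ALL":
            use_all_networks = True
    if not protocols:
        protocols.add("MPI")  # MPI is the default interface
    networks = networks_available if use_all_networks else 1
    return instances * len(protocols) * networks * tasks_on_node


if __name__ == "__main__":
    # Two networks and four tasks on the node, as in the example above.
    print(connection_count("Instances=2,US,LAPI,MPI", 2, 4))
```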

       --nice[=adjustment]
              Run  the  job  with  an  adjusted  scheduling priority within SLURM.  With no adjustment value the
              scheduling priority is decreased by 100. The adjustment range is from -10000 (highest priority) to
              10000 (lowest priority). Only privileged users can  specify  a  negative  adjustment.  NOTE:  This
              option is presently ignored if SchedulerType=sched/wiki or SchedulerType=sched/wiki2.

       --ntasks-per-core=<ntasks>
              Request that a maximum of ntasks be invoked on each core.  Meant to be used with the --ntasks option.
              Related to --ntasks-per-node except at the core level instead  of  the  node  level.   Masks  will
              automatically be generated to bind the tasks to specific cores unless --cpu_bind=none is specified.
              NOTE:     This    option    is    not    supported    unless    SelectTypeParameters=CR_Core    or
              SelectTypeParameters=CR_Core_Memory is configured.

       --ntasks-per-node=<ntasks>
              Request that a maximum of ntasks be invoked on each node.  Meant to be used with the --nodes option.
              This  is  related to --cpus-per-task=ncpus, but does not require knowledge of the actual number of
              cpus on each node.  In some cases, it is more convenient to be able to request that no more than a
              specific number of tasks be invoked on each node.  Examples of this include  submitting  a  hybrid
              MPI/OpenMP  app  where only one MPI "task/rank" should be assigned to each node while allowing the
              OpenMP portion to utilize all of the parallelism present in  the  node,  or  submitting  a  single
              setup/cleanup/monitoring job to each node of a pre-existing allocation as one step in a larger job
              script.

       --ntasks-per-socket=<ntasks>
              Request that a maximum of ntasks be invoked on each socket.  Meant to be used with the --ntasks option.
              Related to --ntasks-per-node except at the socket level instead of the  node  level.   Masks  will
              automatically  be  generated  to  bind  the  tasks  to  specific sockets unless --cpu_bind=none is
              specified.   NOTE:  This  option  is  not  supported  unless   SelectTypeParameters=CR_Socket   or
              SelectTypeParameters=CR_Socket_Memory is configured.

       -O, --overcommit
              Overcommit  resources.  Normally,  srun  will  not  allocate  more  than  one  process per CPU. By
              specifying --overcommit you are explicitly allowing more than one process per CPU. However, no more
              than MAX_TASKS_PER_NODE tasks are permitted to execute  per  node.   NOTE:  MAX_TASKS_PER_NODE  is
              defined in the file slurm.h and is not a variable; it is set at SLURM build time.

       -o, --output=<mode>
              Specify the mode for stdout redirection. By default in interactive mode, srun collects stdout from
              all  tasks  and  line  buffers  this  output to the attached terminal. With --output stdout may be
              redirected to a file, to one file per task, or to /dev/null. See section IO Redirection below  for
              the various forms of mode.  If the specified file already exists, it will be overwritten.

              If --error is not also specified on the command line, both stdout and stderr will be directed to the
              file specified by --output.

       --open-mode=<append|truncate>
              Open the output and error files using append or truncate mode as specified.  The default value  is
              specified by the system configuration parameter JobFileAppend.

       -p, --partition=<partition_names>
              Request  a specific partition for the resource allocation.  If not specified, the default behavior
              is to allow the slurm controller to select the default  partition  as  designated  by  the  system
              administrator. If the job can use more than one partition, specify their names in a comma separated
              list and the one offering earliest initiation will be used.

       --profile=<all|none|[energy[,|task[,|lustre[,|network]]]]>
              Enables detailed data collection by the acct_gather_profile plugin.  Detailed data are typically
              time-series that are stored in an HDF5 file for the job.

              All       All data types are collected. (Cannot be combined with other values.)

              None      No data types are collected. This is the default.
                         (Cannot be combined with other values.)

              Energy    Energy data is collected.

              Task      Task (I/O, Memory, ...) data is collected.

              Lustre    Lustre data is collected.

              Network   Network (InfiniBand) data is collected.

       --prolog=<executable>
              srun will run executable just before launching the job  step.   The  command  line  arguments  for
              executable  will  be  the command and arguments of the job step.  If executable is "none", then no
              srun prolog will be run. This parameter overrides the SrunProlog  parameter  in  slurm.conf.  This
              parameter is completely independent from the Prolog parameter in slurm.conf.

       --propagate[=rlimits]
              Allows users to specify which of the modifiable (soft) resource limits to propagate to the compute
              nodes  and  apply  to  their  jobs.  If rlimits is not specified, then all resource limits will be
              propagated.  The following rlimit names are supported by Slurm (although some options may  not  be
              supported on some systems):

              ALL       All limits listed below

              AS        The maximum address space for a process

              CORE      The maximum size of core file

              CPU       The maximum amount of CPU time

              DATA      The maximum size of a process's data segment

              FSIZE     The  maximum  size  of  files created. Note that if the user sets FSIZE to less than the
                        current size of the slurmd.log, job launches will fail with a 'File size limit exceeded'
                        error.

              MEMLOCK   The maximum size that may be locked into memory

              NOFILE    The maximum number of open files

              NPROC     The maximum number of processes available

              RSS       The maximum resident set size

              STACK     The maximum stack size

       --pty  Execute task zero in pseudo terminal mode.  Implicitly sets --unbuffered.  Implicitly sets --error
              and --output to /dev/null for all tasks except task zero, which may  cause  those  tasks  to  exit
              immediately  (e.g.  shells  will  typically  exit  immediately  in that situation).  Not currently
              supported on AIX platforms.

       -Q, --quiet
              Suppress informational messages from srun. Errors will still be displayed.

       -q, --quit-on-interrupt
              Quit immediately on single SIGINT (Ctrl-C).  Use  of  this  option  disables  the  status  feature
              normally  available  when  srun  receives  a  single Ctrl-C and causes srun to instead immediately
              terminate the running job.

       --qos=<qos>
              Request a quality of service for the job.  QOS values can be defined for each user/cluster/account
              association in the SLURM database.  Users will be limited to their association's  defined  set  of
              qos's  when  the  SLURM  configuration parameter, AccountingStorageEnforce, includes "qos" in its
              definition.

       -r, --relative=<n>
              Run a job step relative to node n of the current allocation.  This option may be  used  to  spread
              several job steps out among the nodes of the current job. If -r is used, the current job step will
              begin  at  node  n  of  the allocated nodelist, where the first node is considered node 0.  The -r
              option is not permitted with -w or -x option and will result in a fatal  error  when  not  running
              within  a  prior  allocation  (i.e.  when SLURM_JOB_ID is not set). The default for n is 0. If the
              value of --nodes exceeds the number of nodes identified with  the  --relative  option,  a  warning
              message will be printed and the --relative option will take precedence.

       --resv-ports
              Reserve communication ports for this job.  Used for OpenMPI.

       --reservation=<name>
              Allocate resources for the job from the named reservation.

       --restart-dir=<directory>
              Specifies  the  directory  from which the job or job step's checkpoint should be read (used by the
              checkpoint/blcr and checkpoint/xlch plugins only).

       -s, --share
              The job allocation can share nodes with other running jobs.  This is the opposite of  --exclusive;
              whichever  option  is  seen  last  on  the command line will be used. The default shared/exclusive
              behavior depends on system configuration and the partition's Shared option takes  precedence  over
              the job's option.  This option may result in the allocation being granted sooner than if the --share
              option was not set and allow higher system utilization, but application  performance  will  likely
              suffer due to competition for resources within a node.

       --signal=<sig_num>[@<sig_time>]
              When  a  job  is  within sig_time seconds of its end time, send it the signal sig_num.  Due to the
              resolution of event handling by SLURM, the signal may be  sent  up  to  60  seconds  earlier  than
              specified.  sig_num may either be a signal number or name (e.g. "10" or "USR1").  sig_time must
              have an integer value between zero and 65535.  By default, no signal is sent before the job's end
              time.  If a sig_num is specified without any sig_time, the default time will be 60 seconds.
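
              A minimal Python sketch of the --signal syntax follows.  The function name parse_signal is
              hypothetical; the sketch applies the documented default of 60 seconds and the documented range
              for sig_time, and leaves sig_num as a string because it may be a number or a name.

```python
def parse_signal(spec):
    """Parse "<sig_num>[@<sig_time>]" into (sig_num, sig_time_seconds)."""
    sig_num, sep, sig_time_text = spec.partition("@")
    if not sig_num:
        raise ValueError("missing signal number or name")
    # Without "@<sig_time>", the documented default is 60 seconds.
    sig_time = int(sig_time_text) if sep else 60
    if not 0 <= sig_time <= 65535:
        raise ValueError("sig_time must be between 0 and 65535")
    return sig_num, sig_time


if __name__ == "__main__":
    print(parse_signal("USR1"))
    print(parse_signal("10@120"))
```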

       --slurmd-debug=<level>
              Specify  a  debug level for slurmd(8). level may be an integer value between 0 [quiet, only errors
              are displayed] and 4 [verbose operation].  The slurmd debug information is copied onto the  stderr
              of the job. By default only errors are displayed.

       --sockets-per-node=<sockets>
              Restrict  node  selection  to nodes with at least the specified number of sockets.  See additional
              information under -B option above when task/affinity plugin is enabled.

       --switches=<count>[@<max-time>]
              When a tree topology is used, this defines the maximum count  of  switches  desired  for  the  job
              allocation  and optionally the maximum time to wait for that number of switches. If SLURM finds an
              allocation containing more switches than the count specified, the job  remains  pending  until  it
              either  finds  an  allocation with desired switch count or the time limit expires.  If there is no
              switch count limit, there is no delay in  starting  the  job.   Acceptable  time  formats  include
              "minutes",  "minutes:seconds",  "hours:minutes:seconds",  "days-hours",  "days-hours:minutes"  and
              "days-hours:minutes:seconds".  The  job's  maximum  time  delay  may  be  limited  by  the  system
              administrator  using  the  SchedulerParameters  configuration  parameter  with the max_switch_wait
              parameter option.  The default max-time is the max_switch_wait SchedulerParameter.
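
              The acceptable time formats listed above can be converted to seconds with the Python sketch
              below.  The function name time_to_seconds is hypothetical and this is not SLURM's own parser;
              it only mirrors the six documented formats.

```python
def time_to_seconds(text):
    """Convert a SLURM-style time string to a number of seconds."""
    days = hours = minutes = seconds = 0
    if "-" in text:
        # days-hours, days-hours:minutes or days-hours:minutes:seconds
        day_part, rest = text.split("-", 1)
        days = int(day_part)
        fields = [int(f) for f in rest.split(":")]
        if len(fields) > 3:
            raise ValueError("too many fields: " + text)
        fields += [0] * (3 - len(fields))
        hours, minutes, seconds = fields
    else:
        fields = [int(f) for f in text.split(":")]
        if len(fields) == 1:
            minutes = fields[0]                 # minutes
        elif len(fields) == 2:
            minutes, seconds = fields           # minutes:seconds
        elif len(fields) == 3:
            hours, minutes, seconds = fields    # hours:minutes:seconds
        else:
            raise ValueError("too many fields: " + text)
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


if __name__ == "__main__":
    for example in ("90", "1:30", "1:30:00", "1-0", "1-2:30", "1-2:30:15"):
        print(example, time_to_seconds(example))
```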

       -T, --threads=<nthreads>
              Allows limiting the number of concurrent threads used to  send  the  job  request  from  the  srun
              process to the slurmd processes on the allocated nodes. Default is to use one thread per allocated
              node  up  to  a  maximum  of  60  concurrent  threads. Specifying this option limits the number of
              concurrent threads to nthreads (less than or equal to 60).  This should only be used to set a  low
              thread count for testing on very small memory computers.

       -t, --time=<time>
              Set  a  limit on the total run time of the job or job step.  If the requested time limit for a job
              exceeds the  partition's  time  limit,  the  job  will  be  left  in  a  PENDING  state  (possibly
              indefinitely).  If the requested time limit for a job step exceeds the partition's time limit, the
              job  step  will  not  be initiated.  The default time limit is the partition's default time limit.
              When the time limit is reached, each task in each job step is sent SIGTERM  followed  by  SIGKILL.
              If the limit is for the job, all job steps are signaled. If the time limit is for a single job step
              within an existing job allocation, only  that  job  step  will  be  affected.  A  job  time  limit
              supersedes  all job step time limits. The interval between SIGTERM and SIGKILL is specified by the
              SLURM configuration parameter KillWait.  A time limit of zero  requests  that  no  time  limit  be
              imposed.   Acceptable  time formats include "minutes", "minutes:seconds", "hours:minutes:seconds",
              "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds".

       --task-epilog=<executable>
              The slurmstepd daemon will run executable just after each task terminates. This will  be  executed
              before  any TaskEpilog parameter in slurm.conf is executed. This is meant to be a very short-lived
              program. If it fails to terminate within  a  few  seconds,  it  will  be  killed  along  with  any
              descendant processes.

       --task-prolog=<executable>
              The  slurmstepd  daemon will run executable just before launching each task. This will be executed
              after any TaskProlog  parameter  in  slurm.conf  is  executed.   Besides  the  normal  environment
              variables, this has SLURM_TASK_PID available to identify the process ID of the task being started.
              Standard  output from this program of the form "export NAME=value" will be used to set environment
              variables for the task being spawned.
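
              As an illustration, a task prolog could be a short script along the following lines.
              This is a sketch under assumptions: the variable name MY_TASK_TAG is hypothetical and
              chosen only for the example; SLURM_TASK_PID is the variable described above.

```python
import os


def prolog_lines(environ):
    """Build the "export NAME=value" lines a task prolog would print."""
    pid = environ.get("SLURM_TASK_PID", "unknown")
    # MY_TASK_TAG is a made-up variable name used only for illustration.
    return ["export MY_TASK_TAG=task-%s" % pid]


if __name__ == "__main__":
    # Each line written to standard output in this form sets one
    # environment variable for the task being spawned.
    for line in prolog_lines(os.environ):
        print(line)
```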

       --test-only
              Returns an estimate of when a job would be scheduled to run given the current job  queue  and  all
              the  other  srun  arguments  specifying  the  job.   This  limits  srun's  behavior to just return
              information; no job is actually submitted.  EXCEPTION: On BlueGene/Q systems, when running
              within  an existing job allocation, this disables the use of "runjob" to launch tasks. The program
              will be executed directly by the slurmd daemon.

       --threads-per-core=<threads>
              Restrict node selection to nodes with at least the specified number of threads  per  core.   NOTE:
              "Threads"  refers  to  the  number  of  processing  units  on  each core rather than the number of
              application tasks to be launched per core.  See additional information under -B option above  when
              task/affinity plugin is enabled.

       --time-min=<time>
              Set a minimum time limit on the job allocation.  If specified, the job may have its --time limit
              lowered to a value no lower than --time-min if doing so permits the job to begin execution earlier
              than otherwise possible.  The job's time limit will not be changed  after  the  job  is  allocated
              resources.   This  is performed by a backfill scheduling algorithm to allocate resources otherwise
              reserved for higher priority jobs.  Acceptable time formats include "minutes",  "minutes:seconds",
              "hours:minutes:seconds", "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds".

       --tmp=<MB>
              Specify a minimum amount of temporary disk space.

       -u, --unbuffered
              Do not line buffer stdout from remote tasks. This option cannot be used with --label.

       --usage
              Display brief help message and exit.

       --uid=<user>
              Attempt  to  submit  and/or run a job as user instead of the invoking user id. The invoking user's
              credentials will be used to check access permissions for the target partition. User root  may  use
              this option to run jobs as a normal user in a RootOnly partition for example. If run as root, srun
              will  drop  its  permissions to the uid specified after node allocation is successful. user may be
              the user name or numerical user ID.

       -V, --version
              Display version information and exit.

       -v, --verbose
              Increase the verbosity of srun's informational messages.   Multiple  -v's  will  further  increase
              srun's verbosity.  By default only errors will be displayed.

       -W, --wait=<seconds>
              Specify how long to wait after the first task terminates before terminating all remaining tasks. A
              value  of  0  indicates an unlimited wait (a warning will be issued after 60 seconds). The default
              value is set by the WaitTime parameter in the slurm configuration file (see  slurm.conf(5)).  This
              option  can be useful to ensure that a job is terminated in a timely fashion in the event that one
              or more tasks terminate prematurely.  Note: The -K,  --kill-on-bad-exit  option  takes  precedence
              over -W, --wait to terminate the job immediately if a task exits with a non-zero exit code.

       -w, --nodelist=<host1,host2,... or filename>
              Request  a  specific  list  of  hosts.  The job will contain at least these hosts. The list may be
              specified as a comma-separated list of hosts, a range of hosts (host[1-5,7,...] for example), or a
              filename.  The host list will be assumed to be a filename if it contains a "/" character.  If you
              specify a maximum node count (for example -N1-2) and the file lists more than 2 hosts, only the
              first 2 nodes will be used in the request list.  Rather than repeating a host name multiple times,
              an asterisk and a repetition count may be appended to a host name. For example "host1,host1" and
              "host1*2" are equivalent.
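
              The list syntax can be illustrated with a small expander. This is a simplified sketch,
              not SLURM's hostlist implementation: expand_hostlist is a hypothetical helper that handles
              a single bracketed range per name and the "*count" suffix, and does not read host files.
              How a repetition count combines with a bracketed range is an assumption of this sketch.

```python
import re


def split_top_level(text):
    """Split on commas that are not inside square brackets."""
    parts, depth, current = [], 0, ""
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current)
            current = ""
        else:
            current += ch
    if current:
        parts.append(current)
    return parts


def expand_hostlist(text):
    """Expand e.g. "host[1-3,7],node5*2" into individual host names."""
    hosts = []
    for item in split_top_level(text):
        count = 1
        if "*" in item:
            item, count_text = item.rsplit("*", 1)
            count = int(count_text)
        match = re.fullmatch(r"([^\[\]]+)\[([^\[\]]+)\]", item)
        if match:
            prefix, ranges = match.groups()
            names = []
            for piece in ranges.split(","):
                low, _, high = piece.partition("-")
                high = high or low
                width = len(low)  # keep any zero padding, e.g. "01-03"
                for n in range(int(low), int(high) + 1):
                    names.append("%s%0*d" % (prefix, width, n))
        else:
            names = [item]
        hosts.extend(names * count)
    return hosts
```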

       --wckey=<wckey>
              Specify wckey to be used with job.  If TrackWCKey=no (default) in the  slurm.conf  this  value  is
              ignored.

       -X, --disable-status
              Disable  the  display  of  task  status  when  srun  receives  a  single  SIGINT (Ctrl-C). Instead
              immediately forward the SIGINT to the running job.  Without this option a  second  Ctrl-C  in  one
              second  is  required to forcibly terminate the job and srun will immediately exit. May also be set
              via the environment variable SLURM_DISABLE_STATUS.

       -x, --exclude=<host1,host2,... or filename>
              Request that a specific list of hosts not be included in the resources allocated to this job.  The
              host list will be assumed to be a filename if it contains a "/" character.

       -Z, --no-allocate
              Run  the  specified  tasks  on  a  set  of nodes without creating a SLURM "job" in the SLURM queue
              structure, bypassing the normal resource allocation step.  The list of  nodes  must  be  specified
              with  the  -w,  --nodelist  option.   This  is  a  privileged  option only available for the users
              "SlurmUser" and "root".

       The following options support Blue Gene systems, but may be applicable to other systems as well.

       --blrts-image=<path>
              Path to blrts image for bluegene block.  BGL only.  Default from bluegene.conf if not set.

       --cnload-image=<path>
              Path to compute node image for bluegene block.  BGP only.  Default from bluegene.conf if not set.

       --conn-type=<type>
              Require the block connection type to be of a certain type.  On Blue Gene the acceptable types
              are MESH, TORUS and NAV.  If NAV, or if not set, then SLURM will try to fit whatever the
              DefaultConnType is set to in bluegene.conf; if that isn't set, the default is TORUS.  You should
              not normally set this option.  If running on a BGP system and wanting to run in HTC mode (only for
              1 midplane and below), you can use HTC_S for SMP, HTC_D for Dual, HTC_V for virtual node mode,
              and HTC_L for Linux mode.  For systems that allow a different connection type per dimension, a
              comma-separated list of connection types may be specified, one for each dimension (e.g.
              M,T,T,T will give you a torus connection in all dimensions except the first).

       -g, --geometry=<XxYxZ> | <AxXxYxZ>
              Specify  the  geometry  requirements  for  the job. On BlueGene/L and BlueGene/P systems there are
              three numbers giving dimensions in the X, Y and Z directions, while on  BlueGene/Q  systems  there
              are four numbers giving dimensions in the A, X, Y and Z directions and can not be used to allocate
              sub-blocks.   For  example "--geometry=1x2x3x4", specifies a block of nodes having 1 x 2 x 3 x 4 =
              24 nodes (actually midplanes on BlueGene).

       --ioload-image=<path>
              Path to io image for bluegene block.  BGP only.  Default from bluegene.conf if not set.

       --linux-image=<path>
              Path to linux image for bluegene block.  BGL only.  Default from bluegene.conf if not set.

       --mloader-image=<path>
              Path to mloader image for bluegene block.  Default from bluegene.conf if not set.

       -R, --no-rotate
              Disables rotation of the job's requested geometry in  order  to  fit  an  appropriate  block.   By
              default the specified geometry can rotate in three dimensions.

       --ramdisk-image=<path>
              Path to ramdisk image for bluegene block.  BGL only.  Default from bluegene.conf if not set.

       --reboot
              Force the allocated nodes to reboot before starting the job.

       srun  will  submit the job request to the slurm job controller, then initiate all processes on the remote
       nodes. If the request cannot be met immediately, srun will block until the resources are free to run  the
       job.  If  the  -I  (--immediate) option is specified srun will terminate if resources are not immediately
       available.

       When initiating remote processes srun will propagate the current working directory, unless --chdir=<path>
       is specified, in which case path will become the working directory for the remote processes.

       The -n, -c, and -N options control how CPUs  and nodes will be allocated to the job. When specifying only
       the number of processes to run with -n, a default of one CPU per process is allocated. By specifying  the
       number  of  CPUs required per task (-c), more than one CPU may be allocated per process. If the number of
       nodes is specified with -N, srun will attempt to allocate at least the number of nodes specified.

       Combinations of the above three options may be used to change how processes are distributed across  nodes
       and  cpus.  For instance, by specifying both the number of processes and number of nodes on which to run,
       the number of processes per node is implied. However, if the number of CPUs per process is more important,
       then the number of processes (-n) and the number of CPUs per process (-c) should be specified.

       srun will refuse to  allocate more than one process per CPU unless --overcommit (-O) is also specified.

       srun will attempt to meet the above specifications "at a minimum." That is, if 16 nodes are requested for
       32 processes, and some nodes do not have 2 CPUs, the allocation of nodes will be increased  in  order  to
       meet the demand for CPUs. In other words, a minimum of 16 nodes are being requested. However, if 16 nodes
       are  requested  for  15 processes, srun will consider this an error, as 15 processes cannot run across 16
       nodes.
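
       The "at a minimum" rule can be pictured with a toy model. The sketch below is an assumption-laden
       illustration, not SLURM's selection algorithm: nodes_needed is a hypothetical function that walks a
       list of per-node CPU counts in order and reports how many nodes must be taken so that at least
       min_nodes are used and every task has its CPUs.

```python
def nodes_needed(cpus_per_node, min_nodes, ntasks, cpus_per_task=1):
    """Toy model: how many nodes (taken in list order) satisfy the request."""
    if ntasks < min_nodes:
        # For example, 15 processes cannot run across 16 nodes.
        raise ValueError("%d tasks cannot run across %d nodes" % (ntasks, min_nodes))
    capacity = 0  # number of tasks that fit on the nodes taken so far
    for count, cpus in enumerate(cpus_per_node, start=1):
        capacity += cpus // cpus_per_task
        if count >= min_nodes and capacity >= ntasks:
            return count
    raise ValueError("not enough CPUs for the request")
```

       With sixteen 2-CPU nodes, a request for 16 nodes and 32 processes fits on exactly 16 nodes; if some
       of the nodes have only one CPU, the count returned by this model grows beyond 16.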

       IO Redirection

       By default, stdout and stderr will be redirected from all tasks to the stdout and  stderr  of  srun,  and
       stdin  will  be  redirected  from the standard input of srun to all remote tasks.  If stdin is only to be
       read by a subset of the spawned tasks, specifying a file to read from rather than forwarding  stdin  from
       the srun command may be preferable as it avoids moving and storing data that will never be read.

       For OS X, the poll() function does not support stdin, so input from a terminal is not possible.

       For  BGQ  srun only supports stdin to 1 task running on the system.  By default it is taskid 0 but can be
       changed with the -i<taskid> as described below, or --launcher-opts="--stdinrank=<taskid>".

       This behavior may be changed with the --output, --error, and --input (-o, -e, -i) options.  Valid  format
       specifications for these options are

       all       stdout and stderr are redirected from all tasks to srun.  stdin is broadcast to all remote tasks.
                 (This is the default behavior)

       none      stdout and stderr are not received from any task.  stdin is not sent to any task (stdin is
                 closed).

       taskid    stdout and/or stderr are redirected from only the task with relative id equal to taskid, where
                 0 <= taskid < ntasks, where ntasks is the total number of tasks in the current job step.
                 stdin is redirected from the stdin of srun to this same task.  This file will be written on the
                 node executing the task.

       filename  srun  will  redirect  stdout  and/or  stderr  to  the named file from all tasks.  stdin will be
                 redirected from the named file and broadcast to all tasks in the job.   filename  refers  to  a
                 path  on  the  host  that  runs  srun.  Depending on the cluster's file system layout, this may
                 result in the output appearing in different places depending on whether the job is run in batch
                 mode.

       format string
                 srun allows for a format string to be used to generate the named IO file described  above.  The
                 following  list  of  format  specifiers may be used in the format string to generate a filename
                 that will be unique to a given jobid, stepid, node, or task.  In  each  case,  the  appropriate
                 number  of  files  are opened and associated with the corresponding tasks. Note that any format
                 string containing %t, %n, and/or %N will be written on the node executing the task rather than
                 the node where srun executes. These format specifiers are not supported on a BGQ system.

                 %A     Job array's master job allocation number.

                 %a     Job array ID (index) number.

                 %J     jobid.stepid of the running job. (e.g. "128.0")

                 %j     jobid of the running job.

                 %s     stepid of the running job.

                 %N     short hostname. This will create a separate IO file per node.

                 %n     Node identifier relative to current job (e.g. "0" is the first node of the running job).
                        This will create a separate IO file per node.

                 %t     task identifier (rank) relative to current job. This will create a separate IO file  per
                        task.

                 %u     User name.

                 A  number placed between the percent character and format specifier may be used to zero-pad the
                 result in the IO filename. This number is  ignored  if  the  format  specifier  corresponds  to
                 non-numeric data (%N for example).

                 Some  examples  of how the format string may be used for a 4 task job step with a Job ID of 128
                 and step id of 0 are included below:

                 job%J.out      job128.0.out

                 job%4j.out     job0128.out

                 job%j-%2t.out  job128-00.out, job128-01.out, ...
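
                 The substitution rules above can be mimicked in a few lines. This is a minimal sketch,
                 not srun's implementation: expand_io_pattern is a hypothetical helper, and the values
                 dictionary (keyed by specifier letter) is an assumption of the example. Numeric values
                 are passed as integers so that the zero-padding width applies only to them.

```python
import re


def expand_io_pattern(pattern, values):
    """Replace %A %a %J %j %s %N %n %t %u in pattern using the values dict."""
    def substitute(match):
        width, spec = match.group(1), match.group(2)
        if spec not in values:
            return match.group(0)  # leave unknown specifiers untouched
        value = values[spec]
        if width and isinstance(value, int):
            return str(value).zfill(int(width))
        return str(value)  # the width is ignored for non-numeric data

    return re.sub(r"%(\d*)([AaJjsNntu])", substitute, pattern)


# Job ID 128, step ID 0, task rank 1 (illustrative values).
example = {"J": "128.0", "j": 128, "s": 0, "t": 1}
names = [expand_io_pattern(p, example)
         for p in ("job%J.out", "job%4j.out", "job%j-%2t.out")]
```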

INPUT ENVIRONMENT VARIABLES

       Some srun options may be set via environment variables.  These environment variables,  along  with  their
       corresponding options, are listed below.  Note: Command line options will always override these settings.

       PMI_FANOUT            This  is used exclusively with PMI (MPICH2 and MVAPICH2) and controls the fanout of
                             data communications. The srun command sends messages to application  programs  (via
                             the  PMI library) and those applications may be called upon to forward that data to
                             up to this number of additional tasks. Higher values offload  work  from  the  srun
                             command to the applications and likely increase the vulnerability to failures.  The
                             default value is 32.

       PMI_FANOUT_OFF_HOST   This  is used exclusively with PMI (MPICH2 and MVAPICH2) and controls the fanout of
                             data communications.  The srun command sends messages to application programs  (via
                             the  PMI library) and those applications may be called upon to forward that data to
                             additional tasks. By default, srun sends one message per host and one task on  that
                             host  forwards  the  data  to  other  tasks  on  that  host  up  to PMI_FANOUT.  If
                             PMI_FANOUT_OFF_HOST is defined, the user task may be required to forward  the  data
                             to  tasks  on  other  hosts.  Setting PMI_FANOUT_OFF_HOST may increase performance.
                             Since more work is performed by the PMI library loaded  by  the  user  application,
                             failures also can be more common and more difficult to diagnose.

       PMI_TIME              This  is  used exclusively with PMI (MPICH2 and MVAPICH2) and controls how much the
                             communications from the tasks to the srun are spread out in time in order to  avoid
                             overwhelming  the  srun  command with work. The default value is 500 (microseconds)
                             per task. On relatively slow processors or systems with very large processor counts
                             (and large PMI data sets), higher values may be required.

       SLURM_CONF            The location of the SLURM configuration file.

       SLURM_ACCOUNT         Same as -A, --account

       SLURM_ACCTG_FREQ      Same as --acctg-freq

       SLURM_BLRTS_IMAGE     Same as --blrts-image

       SLURM_CHECKPOINT      Same as --checkpoint

       SLURM_CHECKPOINT_DIR  Same as --checkpoint-dir

       SLURM_CNLOAD_IMAGE    Same as --cnload-image

       SLURM_CONN_TYPE       Same as --conn-type

       SLURM_CPU_BIND        Same as --cpu_bind

       SLURM_CPU_FREQ_REQ    Same as --cpu-freq. Can specify a numerical frequency in kilohertz, or the  request
                             can  specify  low,  medium,  or  high  for  the value. "Low" will select the lowest
                             available frequency, "high" will select  the  highest  available  frequency,  while
                             "medium"  attempts  to set a frequency in the middle of the available range. If the
                             numeric value specified does not exactly match a legal available  frequency,  SLURM
                             will attempt to pick a legal frequency close to the request.

       SLURM_CPUS_PER_TASK   Same as -c, --cpus-per-task

       SLURM_DEBUG           Same as -v, --verbose

       SLURMD_DEBUG          Same as -d, --slurmd-debug

       SLURM_DEPENDENCY      Same as -P, --dependency=<jobid>

       SLURM_DISABLE_STATUS  Same as -X, --disable-status

       SLURM_DIST_PLANESIZE  Same as -m plane

       SLURM_DISTRIBUTION    Same as -m, --distribution

       SLURM_EPILOG          Same as --epilog

       SLURM_EXCLUSIVE       Same as --exclusive

       SLURM_EXIT_ERROR      Specifies the exit code generated when a SLURM error occurs (e.g. invalid options).
                             This  can  be  used  by a script to distinguish application exit codes from various
                             SLURM error conditions.  Also see SLURM_EXIT_IMMEDIATE.

       SLURM_EXIT_IMMEDIATE  Specifies the exit code generated when the --immediate option is used and resources
                             are not currently  available.   This  can  be  used  by  a  script  to  distinguish
                             application   exit   codes   from   various   SLURM  error  conditions.   Also  see
                             SLURM_EXIT_ERROR.

       SLURM_GEOMETRY        Same as -g, --geometry

       SLURM_GRES            Same as --gres. Also see SLURM_STEP_GRES

       SLURM_IMMEDIATE       Same as -I, --immediate

       SLURM_IOLOAD_IMAGE    Same as --ioload-image

       SLURM_JOB_ID (and SLURM_JOBID for backwards compatibility)
                             Same as --jobid

       SLURM_JOB_NAME        Same as -J, --job-name except within an existing allocation, in which  case  it  is
                             ignored to avoid using the batch job's name as the name of each job step.

       SLURM_JOB_NUM_NODES (and SLURM_NNODES for backwards compatibility)
                             Total number of nodes in the job’s resource allocation.

       SLURM_KILL_BAD_EXIT   Same as -K, --kill-on-bad-exit

       SLURM_LABELIO         Same as -l, --label

       SLURM_LINUX_IMAGE     Same as --linux-image

       SLURM_MEM_BIND        Same as --mem_bind

       SLURM_MEM_PER_CPU     Same as --mem-per-cpu

       SLURM_MEM_PER_NODE    Same as --mem

       SLURM_MLOADER_IMAGE   Same as --mloader-image

       SLURM_MPI_TYPE        Same as --mpi

       SLURM_NETWORK         Same as --network

       SLURM_NNODES          Same as -N, --nodes

       SLURM_NODELIST        Same as -w, --nodelist

       SLURM_NO_ROTATE       Same as -R, --no-rotate

       SLURM_NTASKS (and SLURM_NPROCS for backwards compatibility)
                             Same as -n, --ntasks

       SLURM_NTASKS_PER_CORE Same as --ntasks-per-core

       SLURM_NTASKS_PER_NODE Same as --ntasks-per-node

       SLURM_NTASKS_PER_SOCKET
                             Same as --ntasks-per-socket

       SLURM_OPEN_MODE       Same as --open-mode

       SLURM_OVERCOMMIT      Same as -O, --overcommit

       SLURM_PARTITION       Same as -p, --partition

       SLURM_PMI_KVS_NO_DUP_KEYS
                             If  set,  then  PMI key-pairs will contain no duplicate keys.  This is the case for
                             MPICH2 and reduces overhead in testing for duplicates for improved performance.

       SLURM_PROFILE         Same as --profile

       SLURM_PROLOG          Same as --prolog

       SLURM_QOS             Same as --qos

       SLURM_RAMDISK_IMAGE   Same as --ramdisk-image

       SLURM_REMOTE_CWD      Same as -D, --chdir=

       SLURM_REQ_SWITCH      When a tree topology is used, this defines the maximum count  of  switches  desired
                             for  the  job allocation and optionally the maximum time to wait for that number of
                             switches. See --switches

       SLURM_RESERVATION     Same as --reservation

       SLURM_RESTART_DIR     Same as --restart-dir

       SLURM_RESV_PORTS      Same as --resv-ports

       SLURM_SIGNAL          Same as --signal

       SLURM_STDERRMODE      Same as -e, --error

       SLURM_STDINMODE       Same as -i, --input

       SLURM_SRUN_REDUCE_TASK_EXIT_MSG
                             If set and non-zero, successive task exit messages with the same exit code will  be
                             printed only once.

       SLURM_STEP_GRES       Same  as  --gres  (only  applies  to  job steps, not to job allocations).  Also see
                             SLURM_GRES

       SLURM_STEP_KILLED_MSG_NODE_ID=ID
                             If set, only the specified node will log when the job or step is killed by a
                             signal.

       SLURM_STDOUTMODE      Same as -o, --output

       SLURM_TASK_EPILOG     Same as --task-epilog

       SLURM_TASK_PROLOG     Same as --task-prolog

       SLURM_THREADS         Same as -T, --threads

       SLURM_TIMELIMIT       Same as -t, --time

       SLURM_UNBUFFEREDIO    Same as -u, --unbuffered

       SLURM_WAIT            Same as -W, --wait

       SLURM_WAIT4SWITCH     Max time waiting for requested switches. See --switches

       SLURM_WCKEY           Same as --wckey

       SLURM_WORKING_DIR     Same as -D, --chdir

OUTPUT ENVIRONMENT VARIABLES

       srun  will set some environment variables in the environment of the executing tasks on the remote compute
       nodes.  These environment variables are:

       SLURM_CHECKPOINT_IMAGE_DIR
                             Directory into which checkpoint images  should  be  written  if  specified  on  the
                             execute line.

       SLURM_CPU_BIND_VERBOSE
                             --cpu_bind verbosity (quiet,verbose).

       SLURM_CPU_BIND_TYPE   --cpu_bind type (none,rank,map_cpu:,mask_cpu:)

       SLURM_CPU_BIND_LIST   --cpu_bind map or mask list (list of SLURM CPU IDs or masks for this node, CPU_ID =
                             Board_ID   x  threads_per_board  +  Socket_ID  x  threads_per_socket  +  Core_ID  x
                             threads_per_core + Thread_ID).
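
                             The CPU_ID formula can be worked through in code. This is a sketch under
                             assumptions: slurm_cpu_id is a hypothetical helper, each ID is taken to be
                             relative to its parent (socket within board, core within socket, thread
                             within core), and the per-board and per-socket thread counts are derived
                             from a uniform node layout.

```python
def slurm_cpu_id(board, socket, core, thread,
                 sockets_per_board, cores_per_socket, threads_per_core):
    """Evaluate CPU_ID = Board_ID x threads_per_board + Socket_ID x
    threads_per_socket + Core_ID x threads_per_core + Thread_ID."""
    threads_per_socket = cores_per_socket * threads_per_core
    threads_per_board = sockets_per_board * threads_per_socket
    return (board * threads_per_board
            + socket * threads_per_socket
            + core * threads_per_core
            + thread)
```

                             For a node with one board, two sockets, four cores per socket and two
                             threads per core, the second thread of core 2 on socket 1 evaluates to
                             1 x 8 + 2 x 2 + 1 = 13.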

       SLURM_CPU_FREQ_REQ    Contains the value requested for cpu frequency on the srun command as  a  numerical
                             frequency  in kilohertz, or a coded value for a request of low, medium, or high for
                             the  frequency.  See  the   description   of   the   --cpu-freq   option   or   the
                             SLURM_CPU_FREQ_REQ input environment variable.

       SLURM_CPUS_ON_NODE    Count  of  processors  available  to  the job on this node.  Note the select/linear
                             plugin allocates entire nodes to jobs, so the value indicates the  total  count  of
                             CPUs on the node.  For the select/cons_res plugin, this number indicates the number
                             of cores on this node allocated to the job.

       SLURM_DISTRIBUTION    Distribution   type   for  the  allocated  jobs.  Set  the  distribution  with  -m,
                             --distribution.

       SLURM_GTIDS           Global task IDs running on this node.  Zero origin and comma separated.

       SLURM_JOB_CPUS_PER_NODE
                             Number of CPUs per node.

       SLURM_JOB_DEPENDENCY  Set to value of the --dependency option.

       SLURM_JOB_ID (and SLURM_JOBID for backwards compatibility)
                             Job id of the executing job

       SLURM_JOB_NAME        Set to the value of the --job-name option or the command name when srun is used  to
                             create  a  new  job allocation. Not set when srun is used only to create a job step
                             (i.e. within an existing job allocation).

       SLURM_LAUNCH_NODE_IPADDR
                             IP address of the node from which the task launch was  initiated  (where  the  srun
                             command ran from)

       SLURM_LOCALID         Node local task ID for the process within a job

       SLURM_MEM_BIND_VERBOSE
                             --mem_bind verbosity (quiet,verbose).

       SLURM_MEM_BIND_TYPE   --mem_bind type (none,rank,map_mem:,mask_mem:)

       SLURM_MEM_BIND_LIST   --mem_bind map or mask list (<list of IDs or masks for this node>)

       SLURM_NNODES          Total number of nodes in the job's resource allocation

       SLURM_NODE_ALIASES    Sets  of  node  name, communication address and hostname for nodes allocated to the
                             job from the cloud. Each element in the set is colon separated and each set is
                             comma separated. For example: SLURM_NODE_ALIASES=ec0:1.2.3.4:foo,ec1:1.2.3.5:bar
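
                             A task could split this value as follows. This is a minimal sketch:
                             parse_node_aliases is a hypothetical helper, and it assumes that the
                             communication address itself contains no colon.

```python
def parse_node_aliases(value):
    """Split SLURM_NODE_ALIASES into (node name, address, hostname) tuples."""
    aliases = []
    for entry in value.split(","):
        # Assumes exactly three colon-separated elements per set.
        name, address, hostname = entry.split(":")
        aliases.append((name, address, hostname))
    return aliases


aliases = parse_node_aliases("ec0:1.2.3.4:foo,ec1:1.2.3.5:bar")
```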

       SLURM_NODEID          The relative node ID of the current node

       SLURM_NODELIST        List of nodes allocated to the job

       SLURM_NTASKS (and SLURM_NPROCS for backwards compatibility)
                             Total number of processes in the current job

       SLURM_PRIO_PROCESS    The  scheduling priority (nice value) at the time of job submission.  This value is
                             propagated to the spawned processes.

       SLURM_PROCID          The MPI rank (or relative process ID) of the current process

       SLURM_SRUN_COMM_HOST  IP address of srun communication host.

       SLURM_SRUN_COMM_PORT  srun communication port.

       SLURM_STEP_LAUNCHER_PORT
                             Step launcher port.

       SLURM_STEP_NODELIST   List of nodes allocated to the step.

       SLURM_STEP_NUM_NODES  Number of nodes allocated to the step.

       SLURM_STEP_NUM_TASKS  Number of processes in the step.

       SLURM_STEP_TASKS_PER_NODE
                             Number of processes per node within the step.

       SLURM_STEP_ID (and SLURM_STEPID for backwards compatibility)
                             The step ID of the current job

       SLURM_SUBMIT_DIR      The directory from which srun was invoked.

       SLURM_SUBMIT_HOST     The hostname of the computer from which srun was invoked.

       SLURM_TASK_PID        The process ID of the task being started.

       SLURM_TASKS_PER_NODE  Number of tasks to be initiated on each node. Values are comma separated and in the
                             same order as SLURM_NODELIST.  If two or more consecutive nodes  are  to  have  the
                             same  task  count,  that  count  is  followed by "(x#)" where "#" is the repetition
                             count. For example, "SLURM_TASKS_PER_NODE=2(x3),1" indicates that the  first  three
                             nodes will each execute two tasks and the fourth node will execute one task.
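
                             The compressed notation can be expanded into one count per node. This is
                             a minimal sketch; expand_tasks_per_node is a hypothetical helper, not a
                             SLURM function.

```python
import re


def expand_tasks_per_node(value):
    """Expand e.g. "2(x3),1" into a per-node list of task counts."""
    counts = []
    for item in value.split(","):
        match = re.fullmatch(r"(\d+)(?:\(x(\d+)\))?", item)
        if match is None:
            raise ValueError("unexpected element %r" % item)
        tasks = int(match.group(1))
        repeat = int(match.group(2) or 1)  # no "(x#)" suffix means one node
        counts.extend([tasks] * repeat)
    return counts
```

                             The resulting list is in the same order as SLURM_NODELIST, so its length
                             is the node count and its sum is the total number of tasks.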

       SLURM_TOPOLOGY_ADDR   This  is set only if the system has the topology/tree plugin configured.  The value
                             will be set to the names of the network switches which may be involved in the job's
                             communications  from  the  system's  top  level  switch down to the leaf switch and
                             ending with node name. A period is used to separate each hardware component name.

       SLURM_TOPOLOGY_ADDR_PATTERN
                             This is set only if the system has the topology/tree plugin configured.  The  value
                             will be set to the component types listed in SLURM_TOPOLOGY_ADDR.  Each component will be
                             identified as either "switch" or  "node".   A  period  is  used  to  separate  each
                             hardware component type.
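
                             The two variables are meant to be read together. The sketch below pairs
                             them up; label_topology is a hypothetical helper, the switch and node
                             names are invented for the example, and component names are assumed not
                             to contain a period themselves.

```python
def label_topology(addr, pattern):
    """Pair each component of SLURM_TOPOLOGY_ADDR with its type."""
    names = addr.split(".")
    kinds = pattern.split(".")
    if len(names) != len(kinds):
        raise ValueError("address and pattern have different lengths")
    return list(zip(kinds, names))


# Invented values: top-level switch s0, leaf switch s4, node tux12.
path = label_topology("s0.s4.tux12", "switch.switch.node")
```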

       SRUN_DEBUG            Set  to  the  logging  level of the srun command.  Default value is 3 (info level).
                             The value is incremented or  decremented  based  upon  the  --verbose  and  --quiet
                             options.

       MPIRUN_NOALLOCATE     Do not allocate a block (Blue Gene systems only).

       MPIRUN_NOFREE         Do not free a block (Blue Gene systems only).

       MPIRUN_PARTITION      The block name on Blue Gene systems only.

SIGNALS AND ESCAPE SEQUENCES

       Signals  sent  to  the srun command are automatically forwarded to the tasks it is controlling with a few
       exceptions. The escape sequence <control-c> will report the state of all tasks associated with  the  srun
       command.  If  <control-c>  is  entered twice within one second, then the associated SIGINT signal will be
       sent to all tasks and a termination sequence will be entered sending SIGCONT, SIGTERM, and SIGKILL to all
       spawned tasks.  If a third <control-c> is received, the srun program will be terminated  without  waiting
       for remote tasks to exit or their I/O to complete.

       The escape sequence <control-z> is presently ignored. Our intent is for this to put the srun command into a
       mode where various special actions may be invoked.

MPI SUPPORT

       MPI use depends upon the type of MPI being used.   There  are  three  fundamentally  different  modes  of
       operation used by these various MPI implementations.

       1. SLURM directly launches the tasks and performs initialization of communications (Quadrics MPI, MPICH2,
       MPICH-GM, MVAPICH, MVAPICH2 and some MPICH1 modes). For example: "srun -n16 a.out".

       2.  SLURM  creates  a  resource  allocation  for  the  job  and  then mpirun launches tasks using SLURM's
       infrastructure (OpenMPI, LAM/MPI, HP-MPI and some MPICH1 modes).

       3. SLURM creates a resource allocation for the job and then mpirun launches tasks  using  some  mechanism
       other than SLURM, such as SSH or RSH (BlueGene MPI and some MPICH1 modes).  These tasks are initiated outside
       of SLURM's monitoring or control. SLURM's epilog should be configured to purge these tasks when the job's
       allocation is relinquished.

       See http://slurm.schedmd.com/mpi_guide.html for more information on use of these various MPI
       implementations with SLURM.

MULTIPLE PROGRAM CONFIGURATION

       Comments in the configuration file must have a "#" in column one.  The configuration  file  contains  the
       following fields separated by white space:

       Task rank
              One or more task ranks to use this configuration.  Multiple values may be comma separated.  Ranges
              may  be  indicated with two numbers separated with a '-' with the smaller number first (e.g. "0-4"
              and not "4-0").  To indicate all tasks not otherwise specified, specify a rank of '*' as the  last
              line of the file. If an attempt is made to initiate a task for which no executable program is
              defined, the following error message will be produced: "No executable program specified for this
              task".

       Executable
              The name of the program to execute.  May be a fully qualified pathname if desired.

       Arguments
              Program  arguments.   The expression "%t" will be replaced with the task's number.  The expression
              "%o" will be replaced with the task's offset within this range (e.g. a configured task rank  value
              of  "1-5"  would  have  offset  values  of  "0-4").  Single quotes may be used to avoid having the
              enclosed values interpreted.  This field is optional.  Any arguments for the  program  entered  on
              the command line will be added to the arguments specified in the configuration file.

       For example:
       ###################################################################
       # srun multiple program configuration file
       #
       # srun -n8 -l --multi-prog silly.conf
       ###################################################################
       4-6       hostname
       1,7       echo  task:%t
       0,2-3     echo  offset:%o

       > srun -n8 -l --multi-prog silly.conf
       0: offset:0
       1: task:1
       2: offset:1
       3: offset:2
       4: linux15.llnl.gov
       5: linux16.llnl.gov
       6: linux17.llnl.gov
       7: task:7
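
       The numbering of "%o" can be sketched in shell. The function below is a hypothetical helper, not part
       of srun. It assumes, based on the example output above, that the offset counts positions across the
       whole comma-separated task rank value, and it does not handle the '*' rank.

```shell
#!/bin/sh
# expand_ranks SPEC: print "rank offset" for every rank named by SPEC.
# Hypothetical helper for illustration only; '*' is not supported.
# The body runs in a subshell so IFS and variables do not leak out.
expand_ranks() (
    set -f            # no pathname expansion while splitting SPEC
    IFS=,
    offset=0
    for part in $1; do
        case $part in
            *-*) first=${part%-*}; last=${part#*-} ;;
            *)   first=$part; last=$part ;;
        esac
        rank=$first
        while [ "$rank" -le "$last" ]; do
            echo "$rank $offset"
            rank=$((rank + 1))
            offset=$((offset + 1))
        done
    done
)

expand_ranks "0,2-3"
# prints:
# 0 0
# 2 1
# 3 2
```

       This matches the "offset:" values printed for tasks 0, 2 and 3 in the example above.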

EXAMPLES

       This simple example demonstrates the execution of the command hostname in eight tasks. At least eight
       processors will be allocated to the job (the same as the task count) on however many nodes are required
       to satisfy the request. The output of each task will be preceded by its task number. (The machine
       "dev" in the example below has a total of two CPUs per node.)

       > srun -n8 -l hostname
       0: dev0
       1: dev0
       2: dev1
       3: dev1
       4: dev2
       5: dev2
       6: dev3
       7: dev3

       The srun -r option is used within a job script to run two job steps on disjoint nodes  in  the  following
       example. The script is run using allocate mode instead of as a batch job in this case.

       > cat test.sh
       #!/bin/sh
       echo $SLURM_NODELIST
       srun -lN2 -r2 hostname
       srun -lN2 hostname

       > salloc -N4 test.sh
       dev[7-10]
       0: dev9
       1: dev10
       0: dev7
       1: dev8

       The following script runs two job steps in parallel within an allocated set of nodes.

       > cat test.sh
       #!/bin/bash
       srun -lN2 -n4 -r 2 sleep 60 &
       srun -lN2 -r 0 sleep 60 &
       sleep 1
       squeue
       squeue -s
       wait

       > salloc -N4 test.sh
         JOBID PARTITION     NAME     USER  ST      TIME  NODES NODELIST
         65641     batch  test.sh   grondo   R      0:01      4 dev[7-10]

       STEPID     PARTITION     USER      TIME NODELIST
       65641.0        batch   grondo      0:01 dev[7-8]
       65641.1        batch   grondo      0:01 dev[9-10]

       This  example  demonstrates how one executes a simple MPICH job.  We use srun to build a list of machines
       (nodes) to be used by mpirun in its required format. A sample command line and the script to be  executed
       follow.

       > cat test.sh
       #!/bin/sh
       MACHINEFILE="nodes.$SLURM_JOB_ID"

       # Generate Machinefile for mpich such that hosts are in the same
       #  order as if run via srun
       #
       srun -l /bin/hostname | sort -n | awk '{print $2}' > $MACHINEFILE

       # Run using generated Machine file:
       mpirun -np $SLURM_NTASKS -machinefile $MACHINEFILE mpi-app

       rm $MACHINEFILE

       > salloc -N2 -n4 test.sh
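
       The sort and awk stages of the script above can be tried without a cluster. This is a sketch only: the
       function names and host names are made up, and the sample lines stand in for the labeled output of
       "srun -l /bin/hostname", deliberately listed out of task order.

```shell
#!/bin/sh
# Stand-in for "srun -l /bin/hostname"; host names are illustrative.
labeled_output() {
    printf '%s\n' '2: dev1' '0: dev0' '3: dev1' '1: dev0'
}

# Same pipeline as in test.sh: order lines by task number, then keep
# only the host name column.
make_machinefile() {
    sort -n | awk '{print $2}'
}

labeled_output | make_machinefile
# prints:
# dev0
# dev0
# dev1
# dev1
```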

       This simple example demonstrates the execution of different jobs on different nodes in the same srun.
       You can do this for any number of nodes or any number of jobs. Each executable runs on the node
       identified by the SLURM_NODEID environment variable, which starts at 0 and goes up to one less than the
       node count specified on the srun command line.

       > cat test.sh
       #!/bin/sh
       case $SLURM_NODEID in
           0) echo "I am running on "
              hostname ;;
           1) hostname
              echo "is where I am running" ;;
       esac

       > srun -N2 test.sh
       dev0
       is where I am running
       I am running on
       dev1
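
       When there are more nodes than case branches, a default branch keeps every node covered. The following
       is a minimal sketch: the function name and the role names "head" and "worker" are hypothetical, and
       SLURM_NODEID falls back to 0 here only so that the sketch can be run outside of a job.

```shell
#!/bin/sh
# node_role ID: choose what a node should do from its SLURM_NODEID.
# Hypothetical helper; role names are illustrative only.
node_role() {
    case $1 in
        0) echo "head" ;;
        *) echo "worker" ;;    # every other node ID
    esac
}

# Outside of a job SLURM_NODEID is unset, so default to 0 for the demo.
node_role "${SLURM_NODEID:-0}"
```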

       This example demonstrates use of multi-core options to control layout of tasks.   We  request  that  four
       sockets per node and two cores per socket be dedicated to the job.

       > srun -N2 -B 4-4:2-2 a.out

       This  example shows a script in which Slurm is used to provide resource management for a job by executing
       the various job steps as processors become available for their dedicated use.

       > cat my.script
       #!/bin/bash
       srun --exclusive -n4 prog1 &
       srun --exclusive -n3 prog2 &
       srun --exclusive -n1 prog3 &
       srun --exclusive -n1 prog4 &
       wait

COPYING

       Copyright (C) 2006-2007 The Regents of the University of  California.   Produced  at  Lawrence  Livermore
       National Laboratory (cf, DISCLAIMER).
       Copyright (C) 2008-2010 Lawrence Livermore National Security.
       Copyright (C) 2010-2013 SchedMD LLC.

       This file is part of SLURM, a resource management program.  For details, see <http://slurm.schedmd.com/>.

       SLURM  is  free  software;  you  can  redistribute it and/or modify it under the terms of the GNU General
       Public License as published by the Free Software Foundation; either version 2 of the License, or (at your
       option) any later version.

       SLURM is distributed in the hope that it will be useful, but  WITHOUT  ANY  WARRANTY;  without  even  the
       implied  warranty  of  MERCHANTABILITY  or  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
       License for more details.

SEE ALSO

       salloc(1),  sattach(1),  sbatch(1),  sbcast(1),  scancel(1),   scontrol(1),   squeue(1),   slurm.conf(5),
       sched_setaffinity(2), numa(3), getrlimit(2)

January 2013                                        SLURM 2.6                                            srun(1)