Provided by: slurm-client_19.05.5-1_amd64

NAME

       srun - Run parallel jobs

SYNOPSIS

       srun [OPTIONS(0)...] [ : [OPTIONS(N)...]] executable(0) [args(0)...]

       Option(s)  define  multiple  jobs  in  a co-scheduled heterogeneous job.  For more details
       about heterogeneous jobs see the document
       https://slurm.schedmd.com/heterogeneous_jobs.html

DESCRIPTION

       Run a parallel job on a cluster managed by Slurm.  If necessary, srun will first create a
       resource allocation in which to run the parallel job.

       The  following  document  describes  the influence of various options on the allocation of
       cpus to jobs and tasks.
       https://slurm.schedmd.com/cpu_management.html

RETURN VALUE

       srun will return the highest exit code of all tasks run or the highest  signal  (with  the
       high-order  bit set in an 8-bit integer -- e.g. 128 + signal) of any task that exited with
       a signal.
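
       The 128 + signal encoding described above can be illustrated with a short Python sketch.
       The helper names are hypothetical and are not part of Slurm.  Note that under this scheme
       a task that itself exits with a code of 128 or more cannot be told apart from a task
       killed by a signal.

```python
def signal_to_status(signum):
    """Encode a terminating signal as 128 + signal (high-order bit of 8 bits set)."""
    return 128 + signum


def describe_status(status):
    """Interpret a return value under the convention described above."""
    if status >= 128:
        return f"signal {status - 128}"
    return f"exit code {status}"
```

       For example, a task killed by signal 9 would be reported as 137, which describe_status()
       reads back as "signal 9".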

EXECUTABLE PATH RESOLUTION

       The executable is resolved in the following order:

       1. If executable starts with ".", then path is constructed as: current working directory /
          executable

       2. If executable starts with a "/", then path is considered absolute.

       3. If executable can be resolved through PATH. See path_resolution(7).

       4. If executable is in current working directory.

       Current  working  directory  is  the  calling process working directory unless the --chdir
       argument is passed, which will override the current working directory.
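
       The four rules above can be sketched in Python as follows.  This is only an illustration
       of the documented order, not the code srun uses; the function name and its parameters
       are assumptions, and POSIX-style paths are assumed.

```python
import os


def resolve_executable(executable, cwd, path_dirs):
    """Return the path chosen by the four rules above, or None if nothing matches."""
    if executable.startswith("."):
        # Rule 1: relative to the current working directory.
        return os.path.normpath(os.path.join(cwd, executable))
    if executable.startswith("/"):
        # Rule 2: the path is already absolute.
        return executable
    # Rule 3: search each PATH directory in order.
    for directory in path_dirs:
        candidate = os.path.join(directory, executable)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    # Rule 4: fall back to the current working directory.
    candidate = os.path.join(cwd, executable)
    if os.path.isfile(candidate):
        return candidate
    return None
```

       Passing the directory given to --chdir as cwd models the override described above.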

OPTIONS

       --accel-bind=<options>
              Control how tasks are bound  to  generic  resources  of  type  gpu,  mic  and  nic.
              Multiple options may be specified. Supported options include:

              g      Bind each task to GPUs which are closest to the allocated CPUs.

              m      Bind each task to MICs which are closest to the allocated CPUs.

              n      Bind each task to NICs which are closest to the allocated CPUs.

              v      Verbose mode. Log how tasks are bound to GPU and NIC devices.

              This option applies to job allocations.

       -A, --account=<account>
              Charge  resources  used  by  this  job  to  specified  account.   The account is an
              arbitrary string. The account name may be changed after job  submission  using  the
              scontrol command. This option applies to job allocations.

       --acctg-freq
              Define  the  job  accounting and profiling sampling intervals.  This can be used to
              override  the  JobAcctGatherFrequency  parameter  in  Slurm's  configuration  file,
              slurm.conf.  The supported format is as follows:

              --acctg-freq=<datatype>=<interval>
                          where  <datatype>=<interval>  specifies  the task sampling interval for
                          the jobacct_gather plugin or a sampling interval for a  profiling  type
                          by    the   acct_gather_profile   plugin.   Multiple,   comma-separated
                          <datatype>=<interval> intervals may be specified.  Supported  datatypes
                          are as follows:

                          task=<interval>
                                 where  <interval>  is  the task sampling interval in seconds for
                                 the  jobacct_gather  plugins  and  for  task  profiling  by  the
                                 acct_gather_profile  plugin.   NOTE:  This  frequency is used to
                                 monitor memory usage. If memory limits are enforced, the highest
                                 frequency a user can request is the one configured in the
                                 slurm.conf file, and the user cannot turn sampling off (=0).

                          energy=<interval>
                                 where <interval> is the sampling interval in seconds for  energy
                                 profiling using the acct_gather_energy plugin.

                          network=<interval>
                                 where  <interval>  is  the  sampling  interval  in  seconds  for
                                 infiniband profiling using the acct_gather_infiniband plugin.

                          filesystem=<interval>
                                 where  <interval>  is  the  sampling  interval  in  seconds  for
                                 filesystem profiling using the acct_gather_filesystem plugin.

              The default value for the task sampling interval is 30 seconds.  The default value
              for all other intervals is 0.  An interval of 0 disables sampling of the specified
              type.  If the task sampling interval is 0, accounting information is collected only
              at job termination (reducing Slurm interference with the job).  Smaller (non-zero)
              values have a greater impact upon job performance, but a value of 30 seconds is not
              likely to be noticeable for applications having fewer than 10,000 tasks.  This
              option applies to job allocations.
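
              As an illustration of the comma-separated <datatype>=<interval> format and the
              defaults described above, the following Python sketch parses an --acctg-freq
              value.  The function name is hypothetical; Slurm's own parser may accept or
              reject inputs differently.

```python
# Defaults as stated above: 30 seconds for task sampling, 0 (disabled) otherwise.
DEFAULT_INTERVALS = {"task": 30, "energy": 0, "network": 0, "filesystem": 0}


def parse_acctg_freq(value):
    """Parse 'task=10,energy=5' into a dict of sampling intervals in seconds."""
    intervals = dict(DEFAULT_INTERVALS)
    for item in value.split(","):
        datatype, sep, interval = item.partition("=")
        if not sep or datatype not in intervals:
            raise ValueError(f"unsupported entry: {item!r}")
        intervals[datatype] = int(interval)
    return intervals
```

              Under these assumptions, "task=10,energy=5" yields a task interval of 10, an
              energy interval of 5, and leaves network and filesystem sampling disabled.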

       -B, --extra-node-info=<sockets[:cores[:threads]]>
              Restrict node selection to nodes with at least the  specified  number  of  sockets,
              cores  per  socket and/or threads per core.  NOTE: These options do not specify the
              resource allocation size.  Each  value  specified  is  considered  a  minimum.   An
              asterisk  (*)  can be used as a placeholder indicating that all available resources
              of that type are to be utilized. Values can  also  be  specified  as  min-max.  The
              individual levels can also be specified in separate options if desired:
                  --sockets-per-node=<sockets>
                  --cores-per-socket=<cores>
                  --threads-per-core=<threads>
              If  task/affinity  plugin  is enabled, then specifying an allocation in this manner
              also sets a default --cpu-bind option of threads  if  the  -B  option  specifies  a
              thread  count, otherwise an option of cores if a core count is specified, otherwise
              an option of sockets.  If SelectType is configured to select/cons_res, it must have
              a  parameter  of  CR_Core,  CR_Core_Memory, CR_Socket, or CR_Socket_Memory for this
              option to be honored.  If not specified, scontrol show job will display
              'ReqS:C:T=*:*:*'. This option applies to job allocations.

       --bb=<spec>
              Burst  buffer  specification.  The  form  of the specification is system dependent.
              Also see --bbf. This option applies to job allocations.

       --bbf=<file_name>
              Path of file containing burst buffer specification.  The form of the  specification
              is system dependent.  Also see --bb. This option applies to job allocations.

       --bcast[=<dest_path>]
              Copy executable file to allocated compute nodes.  If a file name is specified, copy
              the executable to the specified destination file path. If  no  path  is  specified,
              copy  the  file  to  a  file  named "slurm_bcast_<job_id>.<step_id>" in the current
              working directory.  For example, "srun --bcast=/tmp/mine -N3 a.out" will copy the file
              "a.out"  from  your  current directory to the file "/tmp/mine" on each of the three
              allocated compute nodes  and  execute  that  file.  This  option  applies  to  step
              allocations.

       -b, --begin=<time>
              Defer  initiation  of  this  job until the specified time.  It accepts times of the
              form HH:MM:SS to run a job at a specific time of day (seconds are  optional).   (If
              that  time  is  already  past,  the  next  day  is  assumed.)  You may also specify
              midnight, noon, fika (3 PM) or teatime (4  PM)  and  you  can  have  a  time-of-day
              suffixed with AM or PM for running in the morning or the evening.  You can also say
              what day the job will be run, by specifying a date of the form MMDDYY, MM/DD/YY or
              YYYY-MM-DD.  Combine date and time using the following format:
              YYYY-MM-DD[THH:MM[:SS]]. You can also give times like now + count time-units, where
              the time-units can be seconds (default), minutes, hours, days, or weeks and you can
              tell Slurm to run the job today with the keyword today and to run the job  tomorrow
              with the keyword tomorrow.  The value may be changed after job submission using the
              scontrol command.  For example:
                 --begin=16:00
                 --begin=now+1hour
                 --begin=now+60           (seconds by default)
                 --begin=2010-01-20T12:34:00

              Notes on date/time specifications:
               - Although the 'seconds' field of the HH:MM:SS time specification  is  allowed  by
              the  code,  note that the poll time of the Slurm scheduler is not precise enough to
              guarantee dispatch of the job on the exact second.  The job  will  be  eligible  to
              start  on  the  next  poll  following  the  specified time. The exact poll interval
              depends on the Slurm scheduler (e.g., 60 seconds with the default sched/builtin).
               - If no time (HH:MM:SS) is specified, the default is (00:00:00).
               - If a date is specified without a year (e.g., MM/DD) then  the  current  year  is
              assumed,  unless  the combination of MM/DD and HH:MM:SS has already passed for that
              year, in which case the next year is used.
              This option applies to job allocations.
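
              The "now + count time-units" form can be illustrated with the Python sketch
              below.  It is not Slurm's parser: the function name is hypothetical, and the
              acceptance of both singular and plural unit names is an assumption based on the
              --begin=now+1hour example above.

```python
import re
from datetime import datetime, timedelta

# Seconds per unit; seconds are the default when no unit is given.
UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400, "week": 604800}


def parse_relative_begin(spec, now):
    """Resolve 'now+<count>[unit]' against the supplied 'now' datetime."""
    match = re.fullmatch(r"now\s*\+\s*(\d+)\s*([a-z]*)", spec.strip().lower())
    if match is None:
        raise ValueError(f"not a relative time: {spec!r}")
    count, unit = int(match.group(1)), match.group(2)
    if unit.endswith("s"):
        unit = unit[:-1]  # assumption: 'hours' and 'hour' are equivalent
    return now + timedelta(seconds=count * UNIT_SECONDS[unit or "second"])
```

              With now set to 2010-01-20T12:00:00, "now+1hour" resolves to 13:00:00 and
              "now+60" resolves to 12:01:00 on the same day.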

       --checkpoint=<time>
              Specifies the interval between creating checkpoints of the job step.   By  default,
              the  job  step  will  have no checkpoints created.  Acceptable time formats include
              "minutes",      "minutes:seconds",      "hours:minutes:seconds",      "days-hours",
              "days-hours:minutes"  and  "days-hours:minutes:seconds". This option applies to job
              and step allocations.

       --cluster-constraint=<list>
              Specifies features that a federated cluster must have for a sibling job to be
              submitted to it. Slurm will attempt to submit a sibling job to a cluster if it has
              at least one of the specified features.

       --comment=<string>
              An arbitrary comment. This option applies to job allocations.

       --compress[=type]
              Compress file before sending it to compute hosts.  The optional argument  specifies
              the  data compression library to be used.  Supported values are "lz4" (default) and
              "zlib".  Some compression libraries may be unavailable on some  systems.   For  use
              with the --bcast option. This option applies to step allocations.

       -C, --constraint=<list>
              Nodes  can  have  features  assigned to them by the Slurm administrator.  Users can
              specify which of these features are required by  their  job  using  the  constraint
              option.   Only  nodes  having features matching the job constraints will be used to
              satisfy the request.  Multiple constraints may be specified with AND, OR,  matching
              OR,  resource  counts, etc. (some operators are not supported on all system types).
              Supported constraint options include:

              Single Name
                     Only nodes which have the specified feature  will  be  used.   For  example,
                     --constraint="intel"

              Node Count
                     A  request  can  specify  the  number  of  nodes needed with some feature by
                     appending an asterisk  and  count  after  the  feature  name.   For  example
                     "--nodes=16 --constraint=graphics*4 ..."  indicates that the job requires 16
                     nodes and  that  at  least  four  of  those  nodes  must  have  the  feature
                     "graphics."

              AND    Only nodes with all of the specified features will be used.  The ampersand
                     is used for an AND operator.  For example, --constraint="intel&gpu"

              OR     Only nodes with at least one of the specified features will be used.  The
                     vertical bar is used for an OR operator.  For example,
                     --constraint="intel|amd"

              Matching OR
                     If only one of a set of possible options should be used  for  all  allocated
                     nodes,  then  use  the  OR  operator  and  enclose the options within square
                     brackets.  For example:  "--constraint=[rack1|rack2|rack3|rack4]"  might  be
                     used  to  specify  that  all nodes must be allocated on a single rack of the
                     cluster, but any of those four racks can be used.

              Multiple Counts
                     Specific counts of multiple resources may be  specified  by  using  the  AND
                     operator  and  enclosing  the  options within square brackets.  For example:
                     "--constraint=[rack1*2&rack2*4]" might be used to  specify  that  two  nodes
                     must be allocated from nodes with the feature of "rack1" and four nodes must
                     be allocated from nodes with the feature "rack2".

                     NOTE: This construct does not support multiple  Intel  KNL  NUMA  or  MCDRAM
                     modes.  For example, while "--constraint=[(knl&quad)*2&(knl&hemi)*4]" is not
                     supported,     "--constraint=[haswell*2&(knl&hemi)*4]"     is     supported.
                     Specification of multiple KNL modes requires the use of a heterogeneous job.

              Parentheses
                     Parentheses can be used to group like node features together. For example,
                     "--constraint=[(knl&snc4&flat)*4&haswell*1]" might be used to  specify  that
                     four nodes with the features "knl", "snc4" and "flat" plus one node with the
                     feature "haswell" are required. All options within parentheses should be
                     combined with the AND operator ("&").

              WARNING: When srun is executed from within salloc or sbatch, the constraint value
              can only contain a single feature name. None of the other operators are currently
              supported for job steps.

              This option applies to job and step allocations.
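
              The AND and OR forms can be illustrated with a small Python sketch that tests
              one node's feature set against a flat constraint string.  It deliberately covers
              only a single operator kind; node counts, square brackets, and parentheses are
              outside its scope.  The function name is hypothetical and this is not how Slurm
              evaluates constraints internally.

```python
def node_matches(features, constraint):
    """Check a node's feature set against a flat AND ('&') or OR ('|') constraint."""
    if "&" in constraint and "|" in constraint:
        raise ValueError("mixed operators are outside the scope of this sketch")
    if "|" in constraint:
        # OR: at least one of the named features must be present.
        return any(name in features for name in constraint.split("|"))
    # AND (or a single name): every named feature must be present.
    return all(name in features for name in constraint.split("&"))
```

              Under this sketch a node with the features "intel" and "gpu" satisfies
              "intel&gpu", while a node with only "amd" satisfies "intel|amd" but not
              "intel&gpu".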

       --contiguous
              If  set, then the allocated nodes must form a contiguous set.  Not honored with the
              topology/tree or topology/3d_torus plugins, both  of  which  can  modify  the  node
              ordering. This option applies to job allocations.

       --cores-per-socket=<cores>
              Restrict  node  selection  to nodes with at least the specified number of cores per
              socket.  See additional information under -B option above when task/affinity plugin
              is enabled. This option applies to job allocations.

       --cpu-bind=[{quiet,verbose},]type
              Bind  tasks  to  CPUs.   Used  only when the task/affinity or task/cgroup plugin is
              enabled.  NOTE: To have Slurm always report on the selected  CPU  binding  for  all
              commands  executed  in  a  shell,  you  can  enable  verbose  mode  by  setting the
              SLURM_CPU_BIND environment variable value to "verbose".

              The following informational environment variables are set  when  --cpu-bind  is  in
              use:
                   SLURM_CPU_BIND_VERBOSE
                   SLURM_CPU_BIND_TYPE
                   SLURM_CPU_BIND_LIST

              See  the  ENVIRONMENT  VARIABLES  section  for  a  more detailed description of the
              individual SLURM_CPU_BIND variables. These variables are available only if the
              task/affinity plugin is configured.

              When using --cpus-per-task to run multithreaded tasks, be aware that CPU binding is
              inherited from the parent of the process.  This means that the  multithreaded  task
              should  either  specify or clear the CPU binding itself to avoid having all threads
              of the multithreaded task use the same mask/CPU as the parent.  Alternatively,  fat
              masks  (masks  which specify more than one allowed CPU) could be used for the tasks
              in order to provide multiple CPUs for the multithreaded tasks.

              By default, a job step has access to every CPU allocated to  the  job.   To  ensure
              that distinct CPUs are allocated to each job step, use the --exclusive option.

              Note  that a job step can be allocated different numbers of CPUs on each node or be
              allocated CPUs not starting at location zero. Therefore one of  the  options  which
              automatically generate the task binding is recommended.  Explicitly specified masks
              or bindings are only honored when the job step has been allocated  every  available
              CPU on the node.

              Binding  a task to a NUMA locality domain means to bind the task to the set of CPUs
              that belong to the NUMA locality domain or "NUMA node".  If  NUMA  locality  domain
              options  are used on systems with no NUMA support, then each socket is considered a
              locality domain.

              If the --cpu-bind option is not used, the default binding  mode  will  depend  upon
              Slurm's  configuration  and the step's resource allocation.  If all allocated nodes
              have the same configured CpuBind mode, that will be used.  Otherwise if  the  job's
              Partition has a configured CpuBind mode, that will be used.  Otherwise if Slurm has
              a configured TaskPluginParam value, that mode will be  used.   Otherwise  automatic
              binding will be performed as described below.

              Auto Binding
                     Applies  only  when  task/affinity  is  enabled.  If the job step allocation
                     includes an allocation with a number of sockets, cores, or threads equal  to
                     the  number  of tasks times cpus-per-task, then the tasks will by default be
                     bound to the appropriate resources (auto  binding).  Disable  this  mode  of
                     operation      by      explicitly     setting     "--cpu-bind=none".     Use
                     TaskPluginParam=autobind=[threads|cores|sockets]  to  set  a   default   cpu
                     binding in case "auto binding" doesn't find a match.

              Supported options include:

                     q[uiet]
                            Quietly bind before task runs (default)

                     v[erbose]
                            Verbosely report binding before task runs

                     no[ne] Do not bind tasks to CPUs (default unless auto binding is applied)

                     rank   Automatically  bind  by  task rank.  The lowest numbered task on each
                            node is bound to socket (or core or thread) zero, etc.  Not supported
                            unless the entire node is allocated to the job.

                     map_cpu:<list>
                            Bind  by  setting  CPU  masks  on tasks (or ranks) as specified where
                            <list> is <cpu_id_for_task_0>,<cpu_id_for_task_1>,...   CPU  IDs  are
                            interpreted  as  decimal values unless they are preceded with '0x' in
                            which case they are interpreted as hexadecimal values.  If the number
                            of tasks (or ranks) exceeds the number of elements in this list,
                            elements in the list will be  reused  as  needed  starting  from  the
                            beginning  of  the  list.  To simplify support for large task counts,
                            a list entry may be followed by an asterisk and a repetition count.
                            For example, "map_cpu:0x0f*4,0xf0*4".  Not supported unless the entire
                            node is allocated to the job.

                     mask_cpu:<list>
                            Bind by setting CPU masks on tasks  (or  ranks)  as  specified  where
                            <list>    is   <cpu_mask_for_task_0>,<cpu_mask_for_task_1>,...    The
                            mapping is specified for a node and identical mapping is  applied  to
                            the  tasks  on  every  node  (i.e. the lowest task ID on each node is
                            mapped to the first mask specified in the list, etc.).  CPU masks are
                            always  interpreted as hexadecimal values but can be preceded with an
                            optional '0x'.  To simplify support for large task counts, a list
                            entry may be followed by an asterisk and a repetition count.  For
                            example, "mask_cpu:0x0f*4,0xf0*4".  Not supported unless the entire
                            node is allocated to the job.

                     rank_ldom
                            Bind to a NUMA locality domain by  rank.  Not  supported  unless  the
                            entire node is allocated to the job.

                     map_ldom:<list>
                            Bind  by mapping NUMA locality domain IDs to tasks as specified where
                            <list> is <ldom1>,<ldom2>,...<ldomN>.  The locality  domain  IDs  are
                            interpreted  as  decimal values unless they are preceded with '0x' in
                            which case they are interpreted as hexadecimal values.  Not supported
                            unless the entire node is allocated to the job.

                     mask_ldom:<list>
                            Bind  by  setting  NUMA  locality  domain masks on tasks as specified
                            where <list> is  <mask1>,<mask2>,...<maskN>.   NUMA  locality  domain
                            masks  are  always  interpreted  as  hexadecimal  values  but  can be
                            preceded with an optional '0x'.  Not supported unless the entire node
                            is allocated to the job.

                     sockets
                            Automatically generate masks binding tasks to sockets.  Only the CPUs
                            on the socket which have been allocated to the job will be used.   If
                            the number of tasks differs from the number of allocated sockets this
                            can result in sub-optimal binding.

                     cores  Automatically generate masks binding tasks to cores.  If  the  number
                            of  tasks  differs from the number of allocated cores this can result
                            in sub-optimal binding.

                     threads
                            Automatically generate masks binding tasks to threads.  If the number
                            of tasks differs from the number of allocated threads this can result
                            in sub-optimal binding.

                     ldoms  Automatically generate masks binding tasks to NUMA locality  domains.
                            If  the number of tasks differs from the number of allocated locality
                            domains this can result in sub-optimal binding.

                     boards Automatically generate masks binding tasks to boards.  If the  number
                            of  tasks differs from the number of allocated boards this can result
                            in sub-optimal binding. This option is supported by  the  task/cgroup
                            plugin only.

                     help   Show help message for cpu-bind

              This option applies to job and step allocations.
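
              The list syntax used by map_cpu and mask_cpu (repetition counts, reuse of
              entries from the beginning of the list, and hexadecimal masks) can be sketched in
              Python as shown below.  All function names are hypothetical and the code only
              mirrors the textual description above; it is not taken from the task plugins.

```python
def parse_value(text, always_hex):
    """Masks are always hexadecimal; IDs are decimal unless prefixed with '0x'."""
    if always_hex or text.lower().startswith("0x"):
        return int(text, 16)
    return int(text, 10)


def expand_bind_list(spec, always_hex):
    """Expand 'a*2,b' into [a, a, b], honouring '*<count>' repetition."""
    values = []
    for entry in spec.split(","):
        item, _, count = entry.partition("*")
        values.extend([parse_value(item, always_hex)] * (int(count) if count else 1))
    return values


def value_for_task(values, local_task_id):
    """Reuse entries from the beginning of the list when tasks outnumber entries."""
    return values[local_task_id % len(values)]


def cpus_in_mask(mask):
    """List the CPU IDs whose bits are set in a mask."""
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]
```

              For instance, the mask list "0x0f*4,0xf0*4" expands to eight entries: four
              copies of a mask covering CPUs 0-3 followed by four copies of a mask covering
              CPUs 4-7.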

       --cpu-freq=<p1[-p2[:p3]]>

              Request  that  the job step initiated by this srun command be run at some requested
              frequency if possible, on the CPUs selected for the step on the compute node(s).

              p1 can be  [#### | low | medium | high |  highm1]  which  will  set  the  frequency
              scaling_speed to the corresponding value, and set the frequency scaling_governor to
              UserSpace. See below for definition of the values.

              p1 can be [Conservative | OnDemand | Performance | PowerSave] which  will  set  the
              scaling_governor to the corresponding value. The governor has to be in the list set
              by the slurm.conf option CpuFreqGovernors.

              When p2 is present, p1 will be the minimum scaling frequency and  p2  will  be  the
              maximum scaling frequency.

              p2 can be [#### | medium | high | highm1].  p2 must be greater than p1.

              p3  can  be  [Conservative  | OnDemand | Performance | PowerSave | UserSpace] which
              will set the governor to the corresponding value.

              If p3 is UserSpace, the frequency scaling_speed will be set by a  power  or  energy
              aware scheduling strategy to a value between p1 and p2 that lets the job run within
              the site's power goal. The job may be delayed if p1 is higher than a frequency that
              allows the job to run within the goal.

              If  the current frequency is < min, it will be set to min. Likewise, if the current
              frequency is > max, it will be set to max.

              Acceptable values at present include:

              ####          frequency in kilohertz

              Low           the lowest available frequency

              High          the highest available frequency

              HighM1        (high minus one) will select the next highest available frequency

              Medium        attempts to set a frequency in the middle of the available range

              Conservative  attempts to use the Conservative CPU governor

              OnDemand      attempts to use the OnDemand CPU governor (the default value)

              Performance   attempts to use the Performance CPU governor

              PowerSave     attempts to use the PowerSave CPU governor

              UserSpace     attempts to use the UserSpace CPU governor

              The following informational environment variable is set in the job
              step when --cpu-freq option is requested.
                      SLURM_CPU_FREQ_REQ

              This environment variable can also  be  used  to  supply  the  value  for  the  CPU
              frequency  request  if it is set when the 'srun' command is issued.  The --cpu-freq
              on the command line will override the environment variable value.  The format of the
              environment variable value is the same as on the command line.  See the ENVIRONMENT
              VARIABLES section for a description of the SLURM_CPU_FREQ_REQ variable.

              NOTE: This parameter is treated as a request, not a requirement.  If the job step's
              node  does not support setting the CPU frequency, or the requested value is outside
              the bounds of the legal frequencies, an error  is  logged,  but  the  job  step  is
              allowed to continue.

              NOTE:  Setting  the  frequency  for  just the CPUs of the job step implies that the
              tasks   are   confined   to   those   CPUs.     If    task    confinement    (i.e.,
              TaskPlugin=task/affinity   or   TaskPlugin=task/cgroup  with  the  "ConstrainCores"
              option) is not configured, this parameter is ignored.

              NOTE: When the step completes, the frequency and governor of each selected CPU are
              reset to the previous values.

              NOTE: Submitting jobs with the --cpu-freq option while linuxproc is the
              ProctrackType can cause jobs to run too quickly, before accounting is able to poll
              for job information. As a result, not all accounting information will be present.

              This option applies to job and step allocations.
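
              The <p1[-p2[:p3]]> syntax and the min/max rule described above can be sketched
              in Python as follows.  The function names are hypothetical, no validation of
              the keywords is attempted, and the sketch does not touch any real CPU frequency
              interface.

```python
def parse_cpu_freq(value):
    """Split 'p1[-p2[:p3]]' into (p1, p2, p3); missing parts are None."""
    head, _, p3 = value.partition(":")
    p1, _, p2 = head.partition("-")
    return p1, p2 or None, p3 or None


def clamp_frequency(current_khz, min_khz, max_khz):
    """Raise a frequency below min to min and lower one above max to max."""
    return max(min_khz, min(current_khz, max_khz))
```

              Under these assumptions, "1200000-2400000:UserSpace" splits into a minimum of
              1200000 kHz, a maximum of 2400000 kHz and the UserSpace governor, while a bare
              "Performance" yields only p1.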

       --cpus-per-gpu=<ncpus>
              Advise  Slurm  that  ensuing  job steps will require ncpus processors per allocated
              GPU.  Requires the --gpus option.  Not compatible with the --cpus-per-task option.

       -c, --cpus-per-task=<ncpus>
              Request that ncpus be allocated per process. This may  be  useful  if  the  job  is
              multithreaded  and requires more than one CPU per task for optimal performance. The
              default is one CPU per process.  If -c is specified without -n, as many tasks  will
              be allocated per node as possible while satisfying the -c restriction. For instance
              on a cluster with 8 CPUs per node, a job request for 4 nodes and 3  CPUs  per  task
              may  be  allocated  3  or  6  CPUs  per node (1 or 2 tasks per node) depending upon
              resource consumption by other jobs. Such a job may be unable to execute more than a
              total of 4 tasks.  This option may also be useful to spawn tasks without allocating
              resources to the job step from the job's allocation when running multiple job steps
              with the --exclusive option.

              WARNING:  There  are  configurations and options interpreted differently by job and
              job step requests which can result in inconsistencies for this option.  For example
              srun  -c2 --threads-per-core=1 prog may allocate two cores for the job, but if each
              of those cores contains two threads, the job allocation will include four CPUs. The
              job step allocation will then launch two threads per CPU for a total of two tasks.

              WARNING:   When   srun  is  executed  from  within  salloc  or  sbatch,  there  are
              configurations and options which can result in inconsistent allocations when -c has
              a value greater than -c on salloc or sbatch.

              This option applies to job allocations.
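
              The arithmetic behind the 8-CPU example above can be sketched in Python.  The
              function name is hypothetical, and the idea that the number of CPUs left free by
              other jobs determines how many tasks fit is an interpretation of the text, not a
              description of the scheduler.

```python
def tasks_that_fit(free_cpus_on_node, cpus_per_task):
    """Number of whole tasks a node can host given its currently free CPUs."""
    return free_cpus_on_node // cpus_per_task
```

              With 3 CPUs per task, a node with all 8 CPUs free hosts 2 tasks (6 CPUs), while
              a node with only 5 CPUs free hosts 1 task (3 CPUs).  Four such partly used nodes
              therefore run 4 tasks in total.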

       --deadline=<OPT>
              Remove the job if no ending is possible before this deadline (start > (deadline -
              time[-min])).  Default is no deadline.  Valid time formats are:
              HH:MM[:SS] [AM|PM]
              MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
              MM/DD[/YY]-HH:MM[:SS]
              YYYY-MM-DD[THH:MM[:SS]]

              This option applies only to job allocations.
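
              The removal condition start > (deadline - time[-min]) can be written as a
              one-line Python check.  The function and parameter names are hypothetical;
              time_min stands for the job's time limit or minimum time limit.

```python
from datetime import datetime, timedelta


def misses_deadline(start, deadline, time_min):
    """True when start > deadline - time_min, i.e. the job cannot end in time."""
    return start > deadline - time_min
```

              For a noon deadline and a one-hour time limit, a job that could only start at
              11:30 misses the deadline, while one that can start at 10:00 does not.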

       --delay-boot=<minutes>
              Do not reboot nodes in order to satisfy this job's feature specification if the
              job has been eligible to run for less than this time period.  If the job has waited
              for less than the specified period, it will use only nodes which already  have  the
              specified  features.   The argument is in units of minutes.  A default value may be
              set  by   a   system   administrator   using   the   delay_boot   option   of   the
              SchedulerParameters  configuration  parameter in the slurm.conf file, otherwise the
              default value is zero (no delay).

              This option applies only to job allocations.

       -d, --dependency=<dependency_list>
              Defer the start of this job until the specified dependencies have  been  satisfied.
              This  option  does not apply to job steps (executions of srun within an
              existing salloc or sbatch allocation), only to job allocations.   <dependency_list>
              is     of     the     form     <type:job_id[:job_id][,type:job_id[:job_id]]>     or
              <type:job_id[:job_id][?type:job_id[:job_id]]>.  All dependencies must be  satisfied
              if the "," separator is used.  Any dependency may be satisfied if the "?" separator
              is used.  Many jobs can share the same dependency and these jobs may even belong to
              different  users. The  value may be changed after job submission using the scontrol
              command.  Once a job dependency fails due to the termination state of  a  preceding
              job, the dependent job will never be run, even if the preceding job is requeued and
              has a different termination state in a subsequent execution. This option applies to
              job allocations.

              after:job_id[:jobid...]
                     This job can begin execution after the specified jobs have begun execution.

              afterany:job_id[:jobid...]
                     This job can begin execution after the specified jobs have terminated.

              afterburstbuffer:job_id[:jobid...]
                     This  job  can  begin execution after the specified jobs have terminated and
                     any associated burst buffer stage out operations have completed.

              aftercorr:job_id[:jobid...]
                     A task of this job array can begin execution after the corresponding task ID
                     in  the  specified job has completed successfully (ran to completion with an
                     exit code of zero).

              afternotok:job_id[:jobid...]
                     This job can begin execution after the specified  jobs  have  terminated  in
                     some failed state (non-zero exit code, node failure, timed out, etc).

              afterok:job_id[:jobid...]
                     This  job  can  begin  execution  after the specified jobs have successfully
                     executed (ran to completion with an exit code of zero).

              expand:job_id
                     Resources allocated to this job should be used to expand the specified  job.
                     The  job  to  expand  must  share  the  same  QOS  (Quality  of Service) and
                     partition.  Gang scheduling of  resources  in  the  partition  is  also  not
                     supported.

              singleton
                     This  job can begin execution after any previously launched jobs sharing the
                     same job name and user have terminated.  In other words,  only  one  job  by
                     that name and owned by that user can be running or suspended at any point in
                     time.
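
              The <dependency_list> syntax described above can be illustrated with a short
              Python sketch.  It is not part of Slurm; the function name parse_dependency_list
              is hypothetical, and rejecting a list that mixes "," and "?" is an assumption
              made to keep the sketch unambiguous.

```python
from typing import List, Tuple


def parse_dependency_list(spec: str) -> Tuple[str, List[Tuple[str, List[str]]]]:
    """Split a dependency list into (mode, [(type, [job_id, ...]), ...]).

    mode is "all" for the "," separator and "any" for the "?" separator.
    """
    # Assumption of this sketch: the two separators are never mixed.
    if "," in spec and "?" in spec:
        raise ValueError("use either ',' or '?' as the separator, not both")
    mode, sep = ("any", "?") if "?" in spec else ("all", ",")
    parsed = []
    for item in spec.split(sep):
        dep_type, _, ids = item.partition(":")
        # "singleton" carries no job IDs; the other types list one or more.
        job_ids = ids.split(":") if ids else []
        parsed.append((dep_type, job_ids))
    return mode, parsed
```

              With this sketch, "afterok:100:101,afterany:102" yields mode "all" with two
              entries, while "afterok:100?singleton" yields mode "any".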

       -D, --chdir=<path>
              Have the remote processes do a  chdir  to  path  before  beginning  execution.  The
              default  is to chdir to the current working directory of the srun process. The path
              can be specified as full path or relative path to the directory where  the  command
              is executed. This option applies to job allocations.

       -e, --error=<filename pattern>
              Specify  how  stderr  is  to  be  redirected.  By default in interactive mode, srun
              redirects stderr to the same file as stdout,  if  one  is  specified.  The  --error
              option  is  provided  to  allow  stdout  and  stderr  to be redirected to different
              locations.  See IO Redirection below for  more  options.   If  the  specified  file
              already  exists,  it  will  be  overwritten.  This  option  applies to job and step
              allocations.

       -E, --preserve-env
              Pass the current values of environment variables SLURM_JOB_NODES  and  SLURM_NTASKS
              through  to the executable, rather than computing them from commandline parameters.
              This option applies to job allocations.

       --epilog=<executable>
              srun will run executable just after the  job  step  completes.   The  command  line
              arguments  for  executable  will  be the command and arguments of the job step.  If
              executable is "none", then no srun epilog will be run. This parameter overrides the
              SrunEpilog  parameter  in slurm.conf. This parameter is completely independent from
              the Epilog parameter in slurm.conf. This option applies to job allocations.

       --exclusive[=user|mcs]
              This option applies to job and job step allocations, and has two slightly different
              meanings  for  each  one.   When  used to initiate a job, the job allocation cannot
              share nodes with other running jobs  (or just other users with the  "=user"  option
              or  "=mcs"  option).   The  default  shared/exclusive  behavior  depends  on system
              configuration and the partition's OverSubscribe option takes  precedence  over  the
              job's option.

              This  option  can  also  be  used  when initiating more than one job step within an
              existing resource allocation, where you want separate processors to be dedicated to
              each job step. If sufficient processors are not available to initiate the job step,
              it will be deferred. This can be thought of as providing a mechanism  for  resource
              management to the job within its allocation.

              The  exclusive allocation of CPUs only applies to job steps explicitly invoked with
              the --exclusive option.  For example, a job might be allocated one node  with  four
              CPUs and a remote shell invoked on the allocated node. If that shell is not invoked
              with the --exclusive option, then it may create a job step with  four  tasks  using
              the   --exclusive  option  and  not  conflict  with  the  remote  shell's  resource
              allocation.  Use the --exclusive option to invoke every job step to ensure distinct
              resources for each step.

              Note  that  all  CPUs  allocated to a job are available to each job step unless the
              --exclusive option is  used  plus  task  affinity  is  configured.  Since  resource
              management is provided by processor, the --ntasks option must be specified, but the
              following options should NOT  be  specified  --relative,  --distribution=arbitrary.
              See EXAMPLE below.

       --export=<environment variables [ALL] | NONE>
              Identify  which  environment  variables are propagated to the launched application.
              By default, all are propagated.  Multiple  environment  variable  names  should  be
              comma  separated.   Environment  variable  names  may be specified to propagate the
              current value (e.g. "--export=EDITOR") or specific values  may  be  exported  (e.g.
              "--export=EDITOR=/bin/emacs").  In  these  two examples, the propagated environment
              will only contain the variable EDITOR.  If one desires to add  to  the  environment
              instead    of    replacing    it,    have    the   argument   include   ALL   (e.g.
              "--export=ALL,EDITOR=/bin/emacs").  This  will  propagate  EDITOR  along  with  the
              current  environment.  Unlike sbatch, if ALL is specified, any additional specified
              environment variables are ignored.  If one  desires  no  environment  variables  be
              propagated,  use  the  argument  NONE.  Regardless of this setting, the appropriate
              SLURM_* task environment variables are always exported to  the  environment.   srun
              may  deviate from the above behavior if the default launch plugin, launch/slurm, is
              not used.

       -F, --nodefile=<node file>
              Much like --nodelist, but the list is contained in a file of name node  file.   The
              node  names of the list may also span multiple lines in the file.    Duplicate node
              names in the file will be ignored.  The order of the node names in the list is  not
              important; the node names will be sorted by Slurm.

       --gid=<group>
              If  srun  is run as root, and the --gid option is used, submit the job with group's
              group access permissions.  group may be the group name or the numerical  group  ID.
              This option applies to job allocations.

       -G, --gpus=[<type>:]<number>
              Specify  the  total  number  of  GPUs  required  for the job.  An optional GPU type
              specification can be supplied.  For example "--gpus=volta:3".  Multiple options can
              be  requested  in  a  comma separated list, for example: "--gpus=volta:3,kepler:1".
              See also the --gpus-per-node, --gpus-per-socket and --gpus-per-task options.

       --gpu-bind=<type>
              Bind tasks to specific GPUs.  By default every spawned task can  access  every  GPU
              allocated to the job.

              Supported type options:

              closest   Bind  each  task to the GPU(s) which are closest.  In a NUMA environment,
                        each task may be bound to more than one GPU (i.e.  all GPUs in that  NUMA
                        environment).

              map_gpu:<list>
                        Bind  by  setting GPU masks on tasks (or ranks) as specified where <list>
                        is <gpu_id_for_task_0>,<gpu_id_for_task_1>,... GPU IDs are interpreted as
                        decimal values unless they are preceded with '0x', in which case they are
                        interpreted as hexadecimal values. If the  number  of  tasks  (or  ranks)
                        exceeds the number of elements in this list, elements in the list will be
                        reused as needed starting from the beginning of  the  list.  To  simplify
                        support for large task counts, an entry in the list may be followed by an
                        asterisk and a repetition count.  For  example  "map_gpu:0*4,1*4".   Not
                        supported unless the entire node is allocated to the job.

              mask_gpu:<list>
                        Bind  by  setting GPU masks on tasks (or ranks) as specified where <list>
                        is   <gpu_mask_for_task_0>,<gpu_mask_for_task_1>,...   The   mapping   is
                        specified  for  a  node  and identical mapping is applied to the tasks on
                        every node (i.e. the lowest task ID on each node is mapped to  the  first
                        mask  specified  in  the list, etc.). GPU masks are always interpreted as
                        hexadecimal values but can be preceded with an optional '0x'. To simplify
                        support for large task counts, an entry in the list may be followed by an
                        asterisk and a repetition count.  For  example  "mask_gpu:0x0f*4,0xf0*4".
                        Not supported unless the entire node is allocated to the job.
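
              The expansion rules for map_gpu (repetition counts, the '0x' prefix, and reuse of
              the list when there are more tasks than entries) can be sketched in Python.  This
              is an illustration of the rules as described above, not Slurm's implementation;
              the function name expand_map_gpu is hypothetical.

```python
from typing import List


def expand_map_gpu(spec: str, ntasks: int) -> List[int]:
    """Return the GPU ID assigned to each task for a "map_gpu:<list>" value."""
    prefix = "map_gpu:"
    if not spec.startswith(prefix):
        raise ValueError("expected a map_gpu:<list> value")
    gpu_ids: List[int] = []
    for item in spec[len(prefix):].split(","):
        value, _, repeat = item.partition("*")
        # IDs are decimal unless preceded with "0x", then hexadecimal.
        gpu_id = int(value, 16) if value.lower().startswith("0x") else int(value)
        gpu_ids.extend([gpu_id] * (int(repeat) if repeat else 1))
    # Reuse the list from the beginning when tasks outnumber entries.
    return [gpu_ids[task % len(gpu_ids)] for task in range(ntasks)]
```

              For "map_gpu:0*4,1*4" and eight tasks, the first four tasks map to GPU 0 and the
              next four to GPU 1.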

       --gpu-freq=[<type>=]<value>[,<type>=<value>][,verbose]
              Request that GPUs allocated to the  job  are  configured  with  specific  frequency
              values.   This option can be used to independently configure the GPU and its memory
              frequencies.  After the job is completed, the frequencies of all affected GPUs will
              be  reset  to  the  highest  possible values.  In some cases, system power caps may
              override the requested values.  The field type can be "memory".   If  type  is  not
              specified,  the  GPU  frequency  is  implied.  The value field can either be "low",
              "medium", "high", "highm1" or a numeric value in megahertz (MHz).  If the specified
              numeric value is not possible, a value as close as possible will be used. See below
              for definition of the values.  The verbose  option  causes  current  GPU  frequency
              information  to be logged.  Examples of use include "--gpu-freq=medium,memory=high"
              and "--gpu-freq=450".

              Supported value definitions:

              low       the lowest available frequency.

              medium    attempts to set a frequency in the middle of the available range.

              high      the highest available frequency.

              highm1    (high minus one) will select the next highest available frequency.
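
              As an illustration of the argument syntax, the following Python sketch splits a
              --gpu-freq value into its parts.  It is not Slurm code: the function name
              parse_gpu_freq and the result key "gpu" (used when no type is given) are
              hypothetical.

```python
from typing import Dict, Union

KEYWORDS = {"low", "medium", "high", "highm1"}


def parse_gpu_freq(spec: str) -> Dict[str, Union[str, int, bool]]:
    """Parse e.g. "medium,memory=high" into a dictionary of settings."""
    result: Dict[str, Union[str, int, bool]] = {"verbose": False}
    for item in spec.split(","):
        if item == "verbose":
            result["verbose"] = True
            continue
        freq_type, sep, value = item.rpartition("=")
        # Without a "type=" prefix the GPU frequency is implied.
        key = freq_type if sep else "gpu"
        # Keywords are kept as text; anything else must be a number in MHz.
        result[key] = value if value in KEYWORDS else int(value)
    return result
```

              Here "--gpu-freq=medium,memory=high" gives a "medium" GPU frequency and a "high"
              memory frequency, and "--gpu-freq=450" gives a GPU frequency of 450 MHz.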

       --gpus-per-node=[<type>:]<number>
              Specify the number of GPUs required for the job on each node included in the  job's
              resource  allocation.   An  optional  GPU  type specification can be supplied.  For
              example "--gpus-per-node=volta:3".  Multiple options can be requested  in  a  comma
              separated  list,  for  example:  "--gpus-per-node=volta:3,kepler:1".   See also the
              --gpus, --gpus-per-socket and --gpus-per-task options.

       --gpus-per-socket=[<type>:]<number>
              Specify the number of GPUs required for the job on  each  socket  included  in  the
              job's  resource  allocation.   An  optional GPU type specification can be supplied.
              For example "--gpus-per-socket=volta:3".  Multiple options can be  requested  in  a
              comma  separated list, for example: "--gpus-per-socket=volta:3,kepler:1".  Requires
              job to specify a sockets per node  count  (--sockets-per-node).   See  also  the
              --gpus,  --gpus-per-node  and  --gpus-per-task options.  This option applies to job
              allocations.

       --gpus-per-task=[<type>:]<number>
              Specify the number of GPUs required for the job on each task to be spawned  in  the
              job's  resource  allocation.   An  optional GPU type specification can be supplied.
              This  option  requires  the  specification  of   a   task   count.    For   example
              "--gpus-per-task=volta:1".   Multiple options can be requested in a comma separated
              list, for example: "--gpus-per-task=volta:3,kepler:1".  Requires job to  specify  a
              task  count (--ntasks).  See also the --gpus, --gpus-per-socket and --gpus-per-node
              options.

       --gres=<list>
              Specifies a comma delimited list of generic consumable resources.   The  format  of
              each  entry  on  the  list  is  "name[[:type]:count]".   The  name  is  that of the
              consumable resource.  The count is the number of those  resources  with  a  default
              value  of  1.  The count can have a suffix of "k" or "K" (multiple of 1024), "m" or
              "M" (multiple of 1024 x 1024), "g" or "G" (multiple of 1024 x 1024 x 1024), "t"  or
              "T"  (multiple of 1024 x 1024 x 1024 x 1024), "p" or "P" (multiple of 1024 x 1024 x
              1024 x 1024 x 1024).  The specified resources will be allocated to the job on  each
              node.   The  available generic consumable resources are configurable by the system
              administrator.  A list of available generic consumable resources  will  be  printed
              and  the  command  will  exit  if  the  option argument is "help".  Examples of use
              include "--gres=gpu:2,mic:1", "--gres=gpu:kepler:2", and "--gres=help".  NOTE: This
              option applies to job and step allocations. By default, a job step is allocated all
              of the generic resources that have been allocated to the job.  To change the behavior so
              that  each  job step is allocated no generic resources, explicitly set the value of
              --gres to specify zero counts for each generic resource OR set "--gres=none" OR set
              the SLURM_STEP_GRES environment variable to "none".
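
              The entry format "name[[:type]:count]" and the count suffixes can be sketched in
              Python as follows.  This is an illustration only: the function names parse_count
              and parse_gres are hypothetical, the special values "help" and "none" are not
              handled, and a two-field entry is read as name:count, as the format above states.

```python
from typing import List, Optional, Tuple

# Binary multipliers for the count suffixes described above.
MULTIPLIERS = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4, "p": 1024**5}


def parse_count(text: str) -> int:
    """Convert a count such as "2" or "4k" to an integer."""
    suffix = text[-1].lower()
    if suffix in MULTIPLIERS:
        return int(text[:-1]) * MULTIPLIERS[suffix]
    return int(text)


def parse_gres(spec: str) -> List[Tuple[str, Optional[str], int]]:
    """Parse a comma delimited list into (name, type, count) tuples."""
    entries = []
    for item in spec.split(","):
        parts = item.split(":")
        if len(parts) == 1:
            entries.append((parts[0], None, 1))  # the count defaults to 1
        elif len(parts) == 2:
            entries.append((parts[0], None, parse_count(parts[1])))
        elif len(parts) == 3:
            entries.append((parts[0], parts[1], parse_count(parts[2])))
        else:
            raise ValueError("expected name[[:type]:count], got " + item)
    return entries
```

              For example, "gpu:kepler:2" becomes one entry with name "gpu", type "kepler" and
              count 2.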

       --gres-flags=<type>
              Specify  generic  resource  task  binding  options.   This  option  applies  to job
              allocations.

              disable-binding
                     Disable filtering of CPUs with respect to generic resource  locality.   This
                     option is currently required to use more CPUs than are bound to a GRES (i.e.
                     if a GPU is bound to the CPUs on one socket, but resources on more than  one
                     socket  are  required  to  run the job).  This option may permit a job to be
                     allocated resources sooner than otherwise possible, but may result in  lower
                     job performance.

              enforce-binding
                     The  only CPUs available to the job will be those bound to the selected GRES
                     (i.e. the CPUs identified in the gres.conf file will be strictly  enforced).
                     This  option  may  result in delayed initiation of a job.  For example a job
                     requiring two GPUs and one CPU will be delayed until both GPUs on  a  single
                     socket  are  available  rather  than  using  GPUs bound to separate sockets,
                     however  the  application  performance  may  be  improved  due  to  improved
                     communication  speed.  Requires the node to be configured with more than one
                     socket and resource filtering will be performed on a per-socket basis.

       -H, --hold
              Specify the job is to be submitted in a held state (priority of zero).  A held  job
              can  now  be  released using scontrol to reset its priority (e.g. "scontrol release
              <job_id>"). This option applies to job allocations.

       -h, --help
              Display help information and exit.

       --hint=<type>
              Bind tasks according to application hints.

              compute_bound
                     Select settings for compute  bound  applications:  use  all  cores  in  each
                     socket, one thread per core.

              memory_bound
                     Select  settings  for  memory  bound applications: use only one core in each
                     socket, one thread per core.

              [no]multithread
                     [don't] use extra threads with in-core  multi-threading  which  can  benefit
                     communication intensive applications.  Only supported with the task/affinity
                     plugin.

              help   show this help message

              This option applies to job allocations.

       -I, --immediate[=<seconds>]
              Exit if resources are not available  within  the  time  period  specified.   If  no
              argument  is given (seconds defaults to 1), resources must be available immediately
              for the request to succeed. If  defer  is  configured  in  SchedulerParameters  and
              seconds=1  the  allocation request will fail immediately; defer conflicts and takes
              precedence over this option.  By default, --immediate is off, and the command  will
              block  until  resources become available. Since this option's argument is optional,
              for proper parsing the single letter option must be followed immediately  with  the
              value  and  not  include  a space between them. For example "-I60" and not "-I 60".
              This option applies to job and step allocations.

       -i, --input=<mode>
              Specify how stdin is to be redirected. By default, srun redirects stdin  from  the
              terminal  to  all  tasks.  See IO Redirection below for more options.  For OS X, the
              poll() function does not support stdin, so input from a terminal is  not  possible.
              This option applies to job and step allocations.

       -J, --job-name=<jobname>
              Specify  a  name  for the job. The specified name will appear along with the job id
              number when querying running jobs on  the  system.  The  default  is  the  supplied
              executable   program's   name.  NOTE:  This  information  may  be  written  to  the
              slurm_jobacct.log file. This file is space delimited so if a space is used  in  the
              job name, it will cause problems  in  properly  displaying  the  contents  of  the
              slurm_jobacct.log file when the sacct command is used. This option applies  to  job
              and step allocations.

       --jobid=<jobid>
              Initiate  a  job  step  under  an already allocated job with job id jobid.  Using this
              option will cause srun  to  behave  exactly  as  if  the  SLURM_JOB_ID  environment
              variable was set. This option applies to step allocations.

       -K, --kill-on-bad-exit[=0|1]
              Controls  whether or not to terminate a step if any task exits with a non-zero exit
              code. If this option is not specified, the default action will be  based  upon  the
              Slurm  configuration  parameter  of  KillOnBadExit. If this option is specified, it
              will take precedence over KillOnBadExit.  An  option  argument  of  zero  will  not
              terminate  the  job.  A  non-zero  argument  or no argument will terminate the job.
              Note: This option takes precedence over the -W, --wait option to terminate the  job
              immediately  if  a  task  exits  with  a  non-zero  exit code.  Since this option's
              argument is optional, for proper parsing the single letter option must be  followed
              immediately  with the value and not include a space between them. For example "-K1"
              and not "-K 1".
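
              The precedence between this option and the KillOnBadExit configuration parameter
              can be summarized with a small Python sketch.  It only restates the rules above;
              the function name kills_on_bad_exit and its parameters are hypothetical.

```python
from typing import Optional


def kills_on_bad_exit(option_given: bool, option_arg: Optional[int],
                      kill_on_bad_exit_config: bool) -> bool:
    """Decide whether a non-zero task exit code terminates the step."""
    if not option_given:
        # Without -K, the KillOnBadExit configuration parameter decides.
        return kill_on_bad_exit_config
    if option_arg is None:
        # -K with no argument terminates.
        return True
    # -K0 does not terminate; any non-zero argument does.
    return option_arg != 0
```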

       -k, --no-kill [=off]
              Do not automatically terminate a job if one of the  nodes  it  has  been  allocated
              fails.  This  option  applies to job and step allocations.  The job will assume all
              responsibilities for fault-tolerance.  Tasks launched using this option will not be
              considered terminated (e.g. -K, --kill-on-bad-exit and -W, --wait options will have
              no effect upon the job step).  The active job step (MPI job) will likely  suffer  a
              fatal error, but subsequent job steps may be run if this option is specified.

              Specify an optional argument of "off" to disable the effect  of  the  SLURM_NO_KILL
              environment variable.

              The default action is to terminate the job upon node failure.

       -l, --label
              Prepend task number to lines of stdout/err.  The --label option will prepend  lines
              of output with the remote task id. This option applies to step allocations.

       -L, --licenses=<license>
              Specification  of  licenses  (or  other  resources  available  on  all nodes of the
              cluster) which must be allocated to this job.  License names can be followed  by  a
              colon and count (the default count is one).  Multiple license names should be comma
              separated (e.g.  "--licenses=foo:4,bar"). This option applies to job allocations.
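
              A minimal Python sketch of the license list syntax follows.  It is an
              illustration, not Slurm code, and the function name parse_licenses is
              hypothetical.

```python
from typing import Dict


def parse_licenses(spec: str) -> Dict[str, int]:
    """Parse e.g. "foo:4,bar" into a mapping of license name to count."""
    licenses: Dict[str, int] = {}
    for item in spec.split(","):
        name, sep, count = item.partition(":")
        # The default count is one when no ":count" is given.
        licenses[name] = int(count) if sep else 1
    return licenses
```

              With this sketch, "foo:4,bar" requests four "foo" licenses and one "bar" license.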

       -M, --clusters=<string>
              Clusters to issue commands to.  Multiple cluster names may be comma separated.  The
              job  will  be  submitted  to  the  one  cluster providing the earliest expected job
              initiation time. The default value is the current cluster. A value  of  'all'  will
              query  to  run  on  all  clusters.  Note the --export option to control environment
              variables exported between clusters.  This option applies only to job  allocations.
              Note that the SlurmDBD must be up for this option to work properly.

       -m, --distribution=
              *|block|cyclic|arbitrary|plane=<options>         [:*|block|cyclic|fcyclic[:*|block|
              cyclic|fcyclic]][,Pack|NoPack]

              Specify alternate distribution methods for remote processes.  This option  controls
              the  distribution of tasks to the nodes on which resources have been allocated, and
              the distribution of those resources to tasks for binding (task affinity). The first
              distribution  method  (before  the first ":") controls the distribution of tasks to
              nodes.   The  second  distribution  method  (after  the  first  ":")  controls  the
              distribution  of  allocated  CPUs  across  sockets  for binding to tasks. The third
              distribution method (after the second ":") controls the distribution  of  allocated
              CPUs  across  cores for binding to tasks.  The second and third distributions apply
              only if task affinity is enabled.  The third distribution is supported only if  the
              task/cgroup  plugin  is configured. The default value for each distribution type is
              specified by *.

              Note that with select/cons_res, the number of CPUs allocated  on  each  socket  and
              node  may be different. Refer to https://slurm.schedmd.com/mc_support.html for more
              information on resource allocation, distribution of tasks to nodes, and binding  of
              tasks to CPUs.
              First distribution method (distribution of tasks across nodes):

              *      Use the default method for distributing tasks to nodes (block).

              block  The  block  distribution  method  will  distribute tasks to a node such that
                     consecutive tasks share a node. For example, consider an allocation of three
                     nodes  each  with  two  cpus.  A  four-task  block distribution request will
                     distribute those tasks to the nodes with tasks one  and  two  on  the  first
                     node, task three on the second node, and task four on the third node.  Block
                     distribution is the default behavior if the  number  of  tasks  exceeds  the
                     number of allocated nodes.

              cyclic The  cyclic  distribution  method  will distribute tasks to a node such that
                     consecutive tasks are distributed over consecutive nodes (in  a  round-robin
                     fashion).  For  example, consider an allocation of three nodes each with two
                     cpus. A four-task cyclic distribution request will distribute those tasks to
                     the  nodes with tasks one and four on the first node, task two on the second
                     node, and task three on the  third  node.   Note  that  when  SelectType  is
                     select/cons_res,  the same number of CPUs may not be allocated on each node.
                     Task distribution will be round-robin among all the nodes with CPUs  yet  to
                     be  assigned  to  tasks.  Cyclic distribution is the default behavior if the
                     number of tasks is no larger than the number of allocated nodes.

              plane  The tasks are distributed in  blocks  of  a  specified  size.   The  options
                     include  a number representing the size of the task block.  This is followed
                     by an optional specification of the task distribution scheme within a  block
                     of  tasks  and between the blocks of tasks.  The number of tasks distributed
                     to each node is the  same  as  for  cyclic  distribution,  but  the  taskids
                     assigned  to each node depend on the plane size. For more details (including
                     examples and diagrams), please see
                     https://slurm.schedmd.com/mc_support.html
                     and
                     https://slurm.schedmd.com/dist_plane.html

              arbitrary
                     The arbitrary method of distribution will  allocate  processes  in-order  as
                     listed  in  file  designated by the environment variable SLURM_HOSTFILE.  If
                     this variable is set, it will override any  other  method  specified.   If
                     not  set, the method will default to block.  The hostfile must contain
                     at minimum the number of hosts requested, either one  per  line  or  comma
                     separated.   If  specifying a task count (-n, --ntasks=<number>), your tasks
                     will be laid out on the nodes in the order of the file.
                     NOTE: The arbitrary distribution option on a job  allocation  only  controls
                     the nodes to be allocated to the job and not the allocation of CPUs on those
                     nodes. This option is meant primarily to control a job step's task layout in
                     an existing job allocation for the srun command.
                     NOTE:  If  number  of  tasks  is given and a list of requested nodes is also
                     given the number of nodes used from that list will be reduced to match  that
                     of  the  number  of tasks if the number of nodes in the list is greater than
                     the number of tasks.
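
              The block and cyclic examples given for the first distribution method can be
              reproduced with a simplified Python sketch.  It is not Slurm's algorithm: it
              ignores the number of CPUs on each node, numbers tasks from one to match the
              wording of the examples, and uses the hypothetical function names
              block_distribution and cyclic_distribution.

```python
from typing import List


def block_distribution(ntasks: int, nnodes: int) -> List[List[int]]:
    """Place consecutive tasks on the same node, spreading counts evenly."""
    base, extra = divmod(ntasks, nnodes)
    layout: List[List[int]] = []
    next_task = 1
    for node in range(nnodes):
        # The first `extra` nodes receive one additional task.
        count = base + (1 if node < extra else 0)
        layout.append(list(range(next_task, next_task + count)))
        next_task += count
    return layout


def cyclic_distribution(ntasks: int, nnodes: int) -> List[List[int]]:
    """Place consecutive tasks on consecutive nodes, round-robin."""
    layout: List[List[int]] = [[] for _ in range(nnodes)]
    for task in range(1, ntasks + 1):
        layout[(task - 1) % nnodes].append(task)
    return layout
```

              For four tasks on three nodes, the block sketch places tasks one and two on the
              first node, and the cyclic sketch places tasks one and four on the first node, as
              in the examples.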

              Second distribution method (distribution of CPUs across sockets for binding):

              *      Use the default method for distributing CPUs across sockets (cyclic).

              block  The block distribution method will distribute allocated  CPUs  consecutively
                     from the same socket for binding to tasks, before using the next consecutive
                     socket.

              cyclic The cyclic distribution method will distribute allocated CPUs for binding to
                     a  given  task  consecutively  from  the  same  socket,  and  from  the next
                     consecutive socket for the  next  task,  in  a  round-robin  fashion  across
                     sockets.

              fcyclic
                     The  fcyclic  distribution method will distribute allocated CPUs for binding
                     to tasks from consecutive  sockets  in  a  round-robin  fashion  across  the
                     sockets.

              Third distribution method (distribution of CPUs across cores for binding):

              *      Use  the  default  method for distributing CPUs across cores (inherited from
                     second distribution method).

              block  The block distribution method will distribute allocated  CPUs  consecutively
                     from  the  same core for binding to tasks, before using the next consecutive
                     core.

              cyclic The cyclic distribution method will distribute allocated CPUs for binding to
                     a given task consecutively from the same core, and from the next consecutive
                     core for the next task, in a round-robin fashion across cores.

              fcyclic
                     The fcyclic distribution method will distribute allocated CPUs  for  binding
                     to tasks from consecutive cores in a round-robin fashion across the cores.

              Optional control for task distribution over nodes:

              Pack   Rather  than  distributing  a  job  step's  tasks  evenly  across  its
                     allocated nodes, pack them as tightly as possible on the nodes.

              NoPack Rather than packing a job step's tasks as tightly as possible on the  nodes,
                     distribute   them   evenly.    This   user   option   will   supersede   the
                     SelectTypeParameters CR_Pack_Nodes configuration parameter.

              This option applies to job and step allocations.
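
              The socket-level methods can be compared with a small shell sketch.  It is a
              simplified model, not Slurm's implementation: the function name socket_for and its
              arguments are hypothetical, every socket is assumed to have the same number of
              CPUs, and every task is assumed to bind the same number of CPUs.

```shell
# socket_for METHOD TASK CPU CPUS_PER_TASK NSOCKETS CPUS_PER_SOCKET
# Prints the (0-based) socket that supplies CPU number CPU of task TASK.
# Hypothetical helper for illustration only; Slurm does not provide it.
socket_for() {
    method=$1 task=$2 cpu=$3 cpus_per_task=$4 nsockets=$5 cpus_per_socket=$6
    slot=$(( task * cpus_per_task + cpu ))      # position in the overall CPU sequence
    case "$method" in
        block)   echo $(( slot / cpus_per_socket )) ;;  # fill one socket, then the next
        cyclic)  echo $(( task % nsockets )) ;;         # whole task on one socket, tasks rotate
        fcyclic) echo $(( slot % nsockets )) ;;         # each CPU rotates across sockets
        *)       echo "unknown method: $method" >&2; return 1 ;;
    esac
}

# Two sockets of four CPUs, two CPUs per task: which sockets serve task 1?
for method in block cyclic fcyclic; do
    echo "$method: $(socket_for "$method" 1 0 2 2 4) $(socket_for "$method" 1 1 2 2 4)"
done
# prints:
#   block: 0 0
#   cyclic: 1 1
#   fcyclic: 0 1
```

              The same model applies to the core-level (third) method if sockets are read as
              cores.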

       --mail-type=<type>
              Notify user by email when certain event types occur.  Valid type values  are  NONE,
              BEGIN,  END,  FAIL,  REQUEUE,  ALL  (equivalent  to  BEGIN, END, FAIL, REQUEUE, and
              STAGE_OUT), STAGE_OUT (burst buffer stage out and teardown completed),  TIME_LIMIT,
              TIME_LIMIT_90 (reached 90 percent of time limit), TIME_LIMIT_80 (reached 80 percent
              of time limit), and TIME_LIMIT_50 (reached 50 percent  of  time  limit).   Multiple
              type values may be specified in a comma separated list.  The user to be notified is
              indicated with --mail-user. This option applies to job allocations.

       --mail-user=<user>
              User to receive email notification of state changes as defined by --mail-type.  The
              default value is the submitting user. This option applies to job allocations.

       --mcs-label=<mcs>
              Used  only  when  the mcs/group plugin is enabled.  This parameter is a group among
              the groups of the user.  Default value is calculated by  the  Plugin  mcs  if  it's
              enabled. This option applies to job allocations.

       --mem=<size[units]>
              Specify  the real memory required per node.  Default units are megabytes unless the
              SchedulerParameters configuration parameter includes  the  "default_gbytes"  option
              for  gigabytes.   Different  units  can  be  specified  using the suffix [K|M|G|T].
              Default value is DefMemPerNode and the maximum value is MaxMemPerNode.  If
              configured, both parameters can be seen using the scontrol show config command.
              This parameter would generally be  used  if  whole  nodes  are  allocated  to  jobs
              (SelectType=select/linear).   Specifying a memory limit of zero for a job step will
              restrict the job step to the amount of memory allocated to the job, but not  remove
              any  of  the job's memory allocation from being available to other job steps.  Also
              see --mem-per-cpu and --mem-per-gpu.  The --mem,  --mem-per-cpu  and  --mem-per-gpu
              options  are  mutually  exclusive.  If  --mem,  --mem-per-cpu  or --mem-per-gpu are
              specified as command line arguments,  then  they  will  take  precedence  over  the
              environment (potentially inherited from salloc or sbatch).

              NOTE: A memory size specification of zero is treated as a special case and grants
              the job access to all of the memory on each node for newly submitted jobs and all
              available job memory to new job steps.

              New memory limits specified for job steps are only advisory.

              If the job is allocated multiple nodes in a heterogeneous cluster, the memory limit
              on each node will be that of the node in the allocation with  the  smallest  memory
              size (same limit will apply to every node in the job's allocation).

              NOTE:  Enforcement of memory limits currently relies upon the task/cgroup plugin or
              enabling of accounting, which samples memory use on a periodic basis (data need not
              be  stored,  just  collected).  In  both  cases  memory use is based upon the job's
              Resident Set Size (RSS). A task may exceed the memory limit until the next periodic
              accounting sample.

              This option applies to job and step allocations.
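
              The unit suffixes can be illustrated with a short shell sketch that converts a size
              to megabytes.  The helper name to_megabytes is hypothetical, and two details are
              assumptions of the sketch rather than statements from this page: multiples are
              binary (1024) and K values are rounded up.  The "default_gbytes" setting is not
              modelled.

```shell
# Convert a size such as 512, 4G or 2048K to megabytes.
# Hypothetical helper; assumes binary multiples and rounds K values up.
to_megabytes() {
    size=$1
    case "$size" in
        *K) n=${size%K}; echo $(( (n + 1023) / 1024 )) ;;
        *M) echo "${size%M}" ;;
        *G) n=${size%G}; echo $(( n * 1024 )) ;;
        *T) n=${size%T}; echo $(( n * 1024 * 1024 )) ;;
        *)  echo "$size" ;;          # no suffix: megabytes by default
    esac
}

to_megabytes 4G       # prints 4096
to_megabytes 2048K    # prints 2
to_megabytes 512      # prints 512
```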

       --mem-per-cpu=<size[units]>
              Minimum  memory required per allocated CPU.  Default units are megabytes unless the
              SchedulerParameters configuration parameter includes  the  "default_gbytes"  option
              for  gigabytes.   Different  units  can  be  specified  using the suffix [K|M|G|T].
              Default value is DefMemPerCPU and the maximum value is MaxMemPerCPU (see exception
              below).  If configured, both parameters can be seen using the scontrol show
              config command.  Note that if the job's --mem-per-cpu value exceeds the configured
              MaxMemPerCPU,  then  the  user's  limit will be treated as a memory limit per task;
              --mem-per-cpu  will  be  reduced  to  a  value   no   larger   than   MaxMemPerCPU;
              --cpus-per-task  will be set and the value of --cpus-per-task multiplied by the new
              --mem-per-cpu value will equal the original --mem-per-cpu value  specified  by  the
              user.   This  parameter  would  generally  be  used  if  individual  processors are
              allocated to jobs (SelectType=select/cons_res).  If resources are allocated by the
              core, socket or whole node, the number of CPUs allocated to a job may be higher
              than the task count and the value of --mem-per-cpu should be adjusted  accordingly.
              Specifying  a memory limit of zero for a job step will restrict the job step to the
              amount of memory allocated to the job, but not  remove  any  of  the  job's  memory
              allocation   from  being  available  to  other  job  steps.   Also  see  --mem  and
              --mem-per-gpu.  The --mem, --mem-per-cpu and  --mem-per-gpu  options  are  mutually
              exclusive.

              NOTE: If the final amount of memory requested by a job (e.g. when --mem-per-cpu is
              used with the --exclusive option) can't be satisfied by any of the nodes configured
              in the partition, the job will be rejected.
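
              The adjustment made when --mem-per-cpu exceeds MaxMemPerCPU can be sketched in
              shell arithmetic.  This is an illustration under assumptions, not Slurm's code: the
              helper name adjust_mem_per_cpu is hypothetical, values are in megabytes, and the
              CPU count is obtained by ceiling division, which this page does not specify.

```shell
# adjust_mem_per_cpu REQUESTED_MB MAX_MB
# Prints the --cpus-per-task and --mem-per-cpu values after the adjustment.
# Hypothetical helper; the rounding rule is an assumption of this sketch.
adjust_mem_per_cpu() {
    requested=$1
    max=$2
    if [ "$requested" -le "$max" ]; then
        echo "cpus-per-task=unchanged mem-per-cpu=$requested"
        return
    fi
    cpus=$(( (requested + max - 1) / max ))   # smallest CPU count that fits under the cap
    per_cpu=$(( requested / cpus ))           # reduced per-CPU value, no larger than max
    echo "cpus-per-task=$cpus mem-per-cpu=$per_cpu"
}

adjust_mem_per_cpu 8192 2048   # prints cpus-per-task=4 mem-per-cpu=2048
adjust_mem_per_cpu 6000 4096   # prints cpus-per-task=2 mem-per-cpu=3000
```

              In both examples the product of the two printed values equals the original request,
              as the description above requires.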

       --mem-per-gpu=<size[units]>
              Minimum  memory required per allocated GPU.  Default units are megabytes unless the
              SchedulerParameters configuration parameter includes  the  "default_gbytes"  option
              for  gigabytes.   Different  units  can  be  specified  using the suffix [K|M|G|T].
              Default value is DefMemPerGPU and is available on both a global and  per  partition
              basis.   If  configured,  the parameters can be seen using the scontrol show config
              and scontrol show partition commands.  Also see --mem.   The  --mem,  --mem-per-cpu
              and --mem-per-gpu options are mutually exclusive.

       --mem-bind=[{quiet,verbose},]type
              Bind  tasks  to  memory. Used only when the task/affinity plugin is enabled and the
              NUMA memory functions are available.  Note that the resolution of  CPU  and  memory
              binding may differ on some architectures. For example, CPU binding may be performed
              at the level of the cores within a processor while memory binding will be performed
              at  the  level  of nodes, where the definition of "nodes" may differ from system to
              system.  By default no memory binding is performed; any task using any CPU can  use
              any  memory. This option is typically used to ensure that each task is bound to the
              memory closest to its assigned CPU. The use of  any  type  other  than  "none"  or
              "local" is not recommended.  If you want greater control, try running a simple test
              code  with  the  options   "--cpu-bind=verbose,none   --mem-bind=verbose,none"   to
              determine the specific configuration.

              NOTE:  To  have Slurm always report on the selected memory binding for all commands
              executed in a shell, you can enable verbose  mode  by  setting  the  SLURM_MEM_BIND
              environment variable value to "verbose".

              The  following  informational  environment  variables are set when --mem-bind is in
              use:

                   SLURM_MEM_BIND_LIST
                   SLURM_MEM_BIND_PREFER
                   SLURM_MEM_BIND_SORT
                   SLURM_MEM_BIND_TYPE
                   SLURM_MEM_BIND_VERBOSE

              See the ENVIRONMENT VARIABLES section  for  a  more  detailed  description  of  the
              individual SLURM_MEM_BIND* variables.

              Supported options include:

              help   show this help message

              local  Use memory local to the processor in use

              map_mem:<list>
                     Bind  by  setting memory masks on tasks (or ranks) as specified where <list>
                     is <numa_id_for_task_0>,<numa_id_for_task_1>,...  The mapping  is  specified
                     for a node and identical mapping is applied to the tasks on every node (i.e.
                     the lowest task ID on each node is mapped to the first ID specified  in  the
                     list, etc.).  NUMA IDs are interpreted as decimal values unless they are
                     preceded with '0x', in which case they are interpreted as hexadecimal values.
                     If the number of tasks (or ranks) exceeds the number of elements in this list,
                     elements in the list will be reused as needed starting from the beginning of
                     the list.  To simplify support for large task counts, a map entry may be
                     followed by an asterisk and a repetition count, for example
                     "map_mem:0x0f*4,0xf0*4".  Not supported unless the entire node is allocated
                     to the job.

              mask_mem:<list>
                     Bind by setting memory masks on tasks (or ranks) as specified  where  <list>
                     is   <numa_mask_for_task_0>,<numa_mask_for_task_1>,...    The   mapping   is
                     specified for a node and identical mapping is applied to the tasks on  every
                     node  (i.e.  the  lowest  task  ID  on each node is mapped to the first mask
                     specified in  the  list,  etc.).   NUMA  masks  are  always  interpreted  as
                     hexadecimal  values.   Note  that masks must be preceded with a '0x' if they
                     don't begin with [0-9] so they are seen as numerical values.  If the  number
                     of tasks (or ranks) exceeds the number of elements in this list, elements in
                     the list will be reused as needed starting from the beginning of  the  list.
                     To simplify support for large task counts, a mask may be followed by
                     an asterisk and a repetition count, for example "mask_mem:0*4,1*4".  Not
                     supported unless the entire node is allocated to the job.

              no[ne] don't bind tasks to memory (default)

              nosort avoid  sorting  free  cache  pages  (default, LaunchParameters configuration
                     parameter can override this default)

              p[refer]
                     Prefer use of first specified NUMA node, but permit use of other available
                     NUMA nodes.

              q[uiet]
                     quietly bind before task runs (default)

              rank   bind by task rank (not recommended)

              sort   sort free cache pages (run zonesort on Intel KNL nodes)

              v[erbose]
                     verbosely report binding before task runs

              This option applies to job and step allocations.
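
              The repetition-count and list-reuse rules described for map_mem and mask_mem can
              be sketched as follows.  The helper names expand_bind_list and bind_for_task are
              hypothetical; the sketch only manipulates the list text and performs no binding.

```shell
# Expand "value*count" entries of a comma separated list into a flat,
# space separated list.  Hypothetical helper for illustration only.
expand_bind_list() {
    rest=$1
    out=""
    while [ -n "$rest" ]; do
        item=${rest%%,*}                                   # first entry
        if [ "$item" = "$rest" ]; then rest=""; else rest=${rest#*,}; fi
        case "$item" in
            *\**) value=${item%%\**}; count=${item##*\*} ;;  # "0x0f*4" -> 0x0f, 4
            *)    value=$item; count=1 ;;
        esac
        i=0
        while [ "$i" -lt "$count" ]; do
            out="$out$value "
            i=$(( i + 1 ))
        done
    done
    echo "${out% }"
}

# Pick the entry for the Nth task on a node (0-based), wrapping around
# to the start of the list when there are more tasks than entries.
bind_for_task() {
    entries=$(expand_bind_list "$1")
    total=0
    for e in $entries; do total=$(( total + 1 )); done
    want=$(( $2 % total ))
    i=0
    for e in $entries; do
        if [ "$i" -eq "$want" ]; then echo "$e"; return; fi
        i=$(( i + 1 ))
    done
}

expand_bind_list "0x0f*4,0xf0*4"   # prints 0x0f 0x0f 0x0f 0x0f 0xf0 0xf0 0xf0 0xf0
bind_for_task "0x0f*4,0xf0*4" 5    # prints 0xf0
bind_for_task "0x0f*4,0xf0*4" 9    # prints 0x0f (list reused from the beginning)
```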

       --mincpus=<n>
              Specify a minimum number of logical cpus/processors per node. This  option  applies
              to job allocations.

       --msg-timeout=<seconds>
              Modify  the job launch message timeout.  The default value is MessageTimeout in the
              Slurm  configuration  file  slurm.conf.   Changes  to  this   are   typically   not
              recommended,  but could be useful to diagnose problems.  This option applies to job
              allocations.

       --mpi=<mpi_type>
              Identify the type of MPI to be used. May result in unique initiation procedures.

              list   Lists available mpi types to choose from.

              openmpi
                     For use with OpenMPI.

              pmi2   To enable PMI2 support. The PMI2 support in Slurm  works  only  if  the  MPI
                     implementation supports it, in other words if the MPI has the PMI2 interface
                     implemented. The --mpi=pmi2  will  load  the  library  lib/slurm/mpi_pmi2.so
                     which  provides  the  server  side  functionality  but  the client side must
                     implement PMI2_Init() and the other interface calls.

              pmix   To enable PMIx support (http://pmix.github.io/master). The PMIx  support  in
                     Slurm can be used to launch parallel applications (e.g. MPI) if they support
                     PMIx, PMI2 or PMI1. Slurm must be configured with pmix support by passing the
                     "--with-pmix=<PMIx installation path>" option to its "./configure" script.

                     At  the  time of writing PMIx is supported in Open MPI starting from version
                     2.0.  PMIx also supports backward compatibility with PMI1 and PMI2  and  can
                     be  used  if  MPI was configured with PMI2/PMI1 support pointing to the PMIx
                     library ("libpmix").  If MPI supports PMI1/PMI2 but doesn't provide a way
                     to point to a specific implementation, a hackish solution leveraging
                     LD_PRELOAD can be used to force "libpmix" usage.

              none   No special MPI processing. This is the default and  works  with  many  other
                     versions of MPI.

              This option applies to step allocations.

       --multi-prog
              Run  a  job  with different programs and different arguments for each task. In this
              case, the executable program specified is actually a configuration file  specifying
              the  executable  and  arguments  for  each task. See MULTIPLE PROGRAM CONFIGURATION
              below for details on the configuration file contents. This option applies  to  step
              allocations.

       -N, --nodes=<minnodes[-maxnodes]>
              Request  that a minimum of minnodes nodes be allocated to this job.  A maximum node
              count may also be specified with maxnodes.  If only one number is  specified,  this
              is  used  as  both the minimum and maximum node count.  The partition's node limits
              supersede those of the job.  If a job's  node  limits  are  outside  of  the  range
              permitted  for  its  associated partition, the job will be left in a PENDING state.
              This permits possible execution at a  later  time,  when  the  partition  limit  is
              changed.   If  a  job  node  limit  exceeds  the  number of nodes configured in the
              partition,  the  job  will  be  rejected.   Note  that  the  environment   variable
              SLURM_JOB_NUM_NODES  (and  SLURM_NNODES for backwards compatibility) will be set to
              the count of nodes actually allocated to the job.  See  the  ENVIRONMENT  VARIABLES
              section  for  more information.  If -N is not specified, the default behavior is to
              allocate enough nodes to satisfy the requirements of the -n and  -c  options.   The
              job  will  be  allocated  as  many nodes as possible within the range specified and
              without delaying the initiation of the job.  If both a number of tasks and a
              number of nodes are requested and the number of nodes exceeds the number of
              tasks, the number of nodes used from that request will be reduced to match the
              number of tasks.  The node count specification may
              include a numeric value followed by a suffix of "k" (multiplies  numeric  value  by
              1,024)  or  "m" (multiplies numeric value by 1,048,576). This option applies to job
              and step allocations.
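
              Two of the rules above, the "k"/"m" suffixes and the reduction of the node count to
              the task count, can be sketched in shell.  The helper names node_count and
              nodes_used are hypothetical and exist only for this illustration.

```shell
# Expand a node count with an optional "k" or "m" suffix.
# Hypothetical helper for illustration only.
node_count() {
    case "$1" in
        *k) echo $(( ${1%k} * 1024 )) ;;
        *m) echo $(( ${1%m} * 1048576 )) ;;
        *)  echo "$1" ;;
    esac
}

# nodes_used REQUESTED_NODES NTASKS
# The node count is reduced to the task count when it exceeds it.
nodes_used() {
    if [ "$1" -gt "$2" ]; then echo "$2"; else echo "$1"; fi
}

node_count 2k     # prints 2048
node_count 16     # prints 16
nodes_used 8 4    # prints 4
nodes_used 2 4    # prints 2
```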

       -n, --ntasks=<number>
              Specify the number of tasks to run. Request that srun allocate resources for ntasks
              tasks.   The default is one task per node, but note that the --cpus-per-task option
              will change this default. This option applies to job and step allocations.

       --network=<type>
              Specify information pertaining to the switch or  network.   The  interpretation  of
              type  is  system  dependent.  This option is supported when running Slurm on a Cray
              natively.  It is used to request using  Network  Performance  Counters.   Only  one
              value per request is valid.  All options are case insensitive.  In this
              configuration supported values include:

              system
                    Use the system-wide network performance counters. Only nodes  requested  will
                    be marked in use for the job allocation.  If the job does not fill up the
                    entire system, the rest of the nodes cannot be used by other jobs using NPC;
                    if idle, their state will appear as PerfCnts.  These nodes are still
                    available for other jobs not using NPC.

              blade Use the blade network performance counters.  Only  nodes  requested  will  be
                    marked in use for the job allocation.  If the job does not fill up the entire
                    blade(s) allocated to the job, those blade(s) cannot be used by other jobs
                    using NPC; if idle, their state will appear as PerfCnts.  These nodes are
                    still available for other jobs not using NPC.

              In all cases the job or step allocation request must specify the
              --exclusive option.  Otherwise the request will be denied.

              Also with any of these options steps are not allowed to share blades, so  resources
              would remain idle inside an allocation if the step running on a blade does not take
              up all the nodes on the blade.

              The network option is also supported on systems  with  IBM's  Parallel  Environment
              (PE).   See  IBM's  LoadLeveler job command keyword documentation about the keyword
              "network" for more information.  Multiple  values  may  be  specified  in  a  comma
              separated list.  All options are case insensitive.  Supported values include:

              BULK_XFER[=<resources>]
                          Enable  bulk transfer of data using Remote Direct-Memory Access (RDMA).
                          The optional resources specification is a numeric value which can  have
                          a  suffix of "k", "K", "m", "M", "g" or "G" for kilobytes, megabytes or
                          gigabytes.  NOTE: The resources specification is not supported  by  the
                          underlying  IBM  infrastructure  as of Parallel Environment version 2.2
                          and no value should be specified at this time.  The  devices  allocated
                          to a job must all be of the same type.  The default value depends upon
                          what hardware is available and in order of preference is
                          IPONLY (which is not considered in User Space mode), HFI, IB, HPCE, and
                          KMUX.

              CAU=<count> Number of Collective Acceleration Units (CAU) required.   Applies  only
                          to  IBM  Power7-IH processors.  Default value is zero.  Independent CAU
                          will be allocated for each programming interface (MPI, LAPI, etc.)

              DEVNAME=<name>
                          Specify the device name to  use  for  communications  (e.g.  "eth0"  or
                          "mlx4_0").

              DEVTYPE=<type>
                          Specify  the  device  type  to  use  for communications.  The supported
                          values  of  type  are:  "IB"  (InfiniBand),  "HFI"  (P7   Host   Fabric
                          Interface),  "IPONLY"  (IP-Only interfaces), "HPCE" (HPC Ethernet), and
                          "KMUX" (Kernel Emulation of HPCE).  The devices allocated to a job must
                          all be of the same type.  The default value depends upon
                          what hardware is available and in order of preference is IPONLY (which
                          is not considered in User Space mode), HFI, IB, HPCE, and KMUX.

              IMMED=<count>
                          Number  of  immediate  send slots per window required.  Applies only to
                          IBM Power7-IH processors.  Default value is zero.

              INSTANCES=<count>
                          Specify number of network connections for each  task  on  each  network
                          connection.  The default instance count is 1.

              IPV4        Use Internet Protocol (IP) version 4 communications (default).

              IPV6        Use Internet Protocol (IP) version 6 communications.

              LAPI        Use the LAPI programming interface.

              MPI         Use the MPI programming interface.  MPI is the default interface.

              PAMI        Use the PAMI programming interface.

              SHMEM       Use the OpenSHMEM programming interface.

              SN_ALL      Use all available switch networks (default).

              SN_SINGLE   Use one available switch network.

              UPC         Use the UPC programming interface.

              US          Use User Space communications.

              Some examples of network specifications:

              Instances=2,US,MPI,SN_ALL
                          Create  two  user  space  connections  for  MPI communications on every
                          switch network for each task.

              US,MPI,Instances=3,Devtype=IB
                          Create three user space connections for  MPI  communications  on  every
                          InfiniBand network for each task.

              IPV4,LAPI,SN_Single
                          Create an IP version 4 connection for LAPI communications on one switch
                          network for each task.

              Instances=2,US,LAPI,MPI
                          Create two user space connections each for LAPI and MPI  communications
                          on  every switch network for each task. Note that SN_ALL is the default
                          option so every switch network is  used.  Also  note  that  Instances=2
                          specifies  that two connections are established for each protocol (LAPI
                          and MPI) and each task.  If there are two networks and  four  tasks  on
                          the  node then a total of 32 connections are established (2 instances x
                          2 protocols x 2 networks x 4 tasks).

              This option applies to job and step allocations.
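
              The connection count in the last example is a plain product, which can be checked
              with shell arithmetic.  The helper name connection_count is hypothetical.

```shell
# connection_count INSTANCES PROTOCOLS NETWORKS TASKS
# Hypothetical helper reproducing the arithmetic from the example above.
connection_count() {
    echo $(( $1 * $2 * $3 * $4 ))
}

# Instances=2,US,LAPI,MPI with two networks and four tasks on the node
connection_count 2 2 2 4   # prints 32
```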

       --nice[=adjustment]
              Run the job with an adjusted scheduling priority within Slurm. With  no  adjustment
              value  the scheduling priority is decreased by 100. A negative nice value increases
              the priority, otherwise decreases it. The adjustment range is +/- 2147483645.  Only
              privileged users can specify a negative adjustment.

       --ntasks-per-core=<ntasks>
              Request the maximum ntasks be invoked on each core.  This option applies to the job
              allocation, but not to step allocations.   Meant  to  be  used  with  the  --ntasks
              option.   Related to --ntasks-per-node except at the core level instead of the node
              level.  Masks will automatically be generated to bind the tasks to specific cores
              unless  --cpu-bind=none  is  specified.   NOTE: This option is not supported unless
              SelectType=cons_res is configured (either directly or indirectly on  Cray  systems)
              along with the node's core count.

       --ntasks-per-node=<ntasks>
              Request that ntasks be invoked on each node.  If used with the --ntasks option, the
              --ntasks option will take precedence and the --ntasks-per-node will be treated as a
              maximum  count  of tasks per node.  Meant to be used with the --nodes option.  This
              is related to --cpus-per-task=ncpus, but does not require knowledge of  the  actual
              number  of  cpus  on each node.  In some cases, it is more convenient to be able to
              request that no more than a specific number of  tasks  be  invoked  on  each  node.
              Examples  of  this  include  submitting  a hybrid MPI/OpenMP app where only one MPI
              "task/rank" should be assigned to each node while allowing the  OpenMP  portion  to
              utilize  all  of  the  parallelism  present  in  the  node,  or submitting a single
              setup/cleanup/monitoring job to each node of a pre-existing allocation as one  step
              in a larger job script. This option applies to job allocations.
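
              When --ntasks-per-node acts as a per-node maximum alongside --ntasks, the smallest
              node count that can hold all tasks follows from a ceiling division.  This is an
              inference from the description above rather than a documented formula, and the
              helper name min_nodes_for is hypothetical.

```shell
# min_nodes_for NTASKS NTASKS_PER_NODE
# Smallest number of nodes that holds NTASKS with at most NTASKS_PER_NODE each.
# Hypothetical helper for illustration only.
min_nodes_for() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

min_nodes_for 10 4   # prints 3
min_nodes_for 8 4    # prints 2
```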

       --ntasks-per-socket=<ntasks>
              Request  the  maximum ntasks be invoked on each socket.  This option applies to the
              job allocation, but not to step allocations.  Meant to be used  with  the  --ntasks
              option.   Related  to  --ntasks-per-node  except at the socket level instead of the
              node level.  Masks will automatically be generated to bind the  tasks  to  specific
              sockets  unless  --cpu-bind=none  is specified.  NOTE: This option is not supported
              unless SelectType=cons_res is configured (either directly  or  indirectly  on  Cray
              systems) along with the node's socket count.

       -O, --overcommit
              Overcommit  resources.  This  option  applies  to  job  and step allocations.  When
              applied to job allocation, only one CPU is  allocated  to  the  job  per  node  and
              options  used  to  specify  the  number  of tasks per node, socket, core, etc.  are
              ignored.  When applied to job step allocations  (the  srun  command  when  executed
              within an existing job allocation), this option can be used to launch more than one
              task per CPU.  Normally, srun will not allocate more than one process per CPU.   By
              specifying  --overcommit you are explicitly allowing more than one process per CPU.
              However no more than MAX_TASKS_PER_NODE tasks are permitted to  execute  per  node.
              NOTE: MAX_TASKS_PER_NODE is defined in the file slurm.h and is not a variable; it
              is set at Slurm build time.

       -o, --output=<filename pattern>
              Specify the "filename pattern" for stdout redirection. By  default  in  interactive
              mode,  srun  collects stdout from all tasks and sends this output via TCP/IP to the
              attached terminal. With --output stdout may be redirected to a file,  to  one  file
              per  task,  or to /dev/null. See section IO Redirection below for the various forms
              of filename pattern.  If the specified file already exists, it will be overwritten.

              If --error is not also specified on the command line, both stdout and stderr will
              be directed to the file specified by --output. This option applies to job and step
              allocations.

       --open-mode=<append|truncate>
              Open the output and error files using append or truncate mode  as  specified.   For
              heterogeneous job steps the default value is "append".  Otherwise the default value
              is specified by the  system  configuration  parameter  JobFileAppend.  This  option
              applies to job and step allocations.

       --pack-group=<expr>
              Identify  each  job  in  a  heterogeneous  job allocation for which a step is to be
              created. Applies only to srun commands issued inside a salloc allocation or  sbatch
              script.  <expr> is a set of integers corresponding to one or more option indexes
              on   the   salloc   or   sbatch   command   line.    Examples:    "--pack-group=2",
              "--pack-group=0,4", "--pack-group=1,3-5".  The default value is --pack-group=0.
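
              The <expr> syntax in the examples above (single indexes, comma separated lists and
              ranges) can be expanded with a short shell sketch.  The helper name
              expand_pack_group is hypothetical, and the sketch performs no validation.

```shell
# Expand an expression such as "1,3-5" into the indexes it names.
# Hypothetical helper for illustration only.
expand_pack_group() {
    rest=$1
    out=""
    while [ -n "$rest" ]; do
        item=${rest%%,*}                                   # first entry
        if [ "$item" = "$rest" ]; then rest=""; else rest=${rest#*,}; fi
        case "$item" in
            *-*) first=${item%-*}; last=${item#*-} ;;      # range "3-5"
            *)   first=$item; last=$item ;;                # single index
        esac
        i=$first
        while [ "$i" -le "$last" ]; do
            out="$out$i "
            i=$(( i + 1 ))
        done
    done
    echo "${out% }"
}

expand_pack_group "2"       # prints 2
expand_pack_group "0,4"     # prints 0 4
expand_pack_group "1,3-5"   # prints 1 3 4 5
```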

       -p, --partition=<partition_names>
              Request  a  specific  partition for the resource allocation.  If not specified, the
              default behavior is to allow the slurm controller to select the  default  partition
              as  designated  by  the  system  administrator.  If  the  job can use more than one
              partition, specify their names in a comma separated list and the one offering
              earliest  initiation  will  be  used  with  no  regard  given to the partition name
              ordering (although higher priority partitions will be considered first).  When  the
              job  is  initiated,  the name of the partition used will be placed first in the job
              record partition string. This option applies to job allocations.

       --power=<flags>
              Comma separated list of power management plugin options.  Currently available flags
              include:  level  (all  nodes allocated to the job should have identical power caps,
              may be disabled by the Slurm  configuration  option  PowerParameters=job_no_level).
              This option applies to job allocations.

       --priority=<value>
              Request  a  specific  job  priority.   May  be  subject  to  configuration specific
              constraints.  value should either be a numeric value or "TOP" (for highest possible
              value).   Only  Slurm  operators  and administrators can set the priority of a job.
              This option applies to job allocations only.

       --profile=<all|none|[energy[,|task[,|filesystem[,|network]]]]>
              Enables detailed data collection by the acct_gather_profile plugin.  Detailed data
              are  typically  time-series  that  are  stored  in  an  HDF5 file for the job or an
              InfluxDB database depending on the configured plugin.

              All       All data types are collected. (Cannot be combined with other values.)

              None      No data types are collected. This is the default.
                         (Cannot be combined with other values.)

              Energy    Energy data is collected.

              Task      Task (I/O, Memory, ...) data is collected.

              Filesystem
                        Filesystem data is collected.

              Network   Network (InfiniBand) data is collected.

              This option applies to job and step allocations.

       --prolog=<executable>
              srun will run executable just before launching the  job  step.   The  command  line
              arguments  for  executable  will  be the command and arguments of the job step.  If
              executable is "none", then no srun prolog will be run. This parameter overrides the
              SrunProlog  parameter  in slurm.conf. This parameter is completely independent from
              the Prolog parameter in slurm.conf. This option applies to job allocations.

       --propagate[=rlimit[,rlimit...]]
              Allows users to specify which of the modifiable (soft) resource limits to propagate
              to  the  compute nodes and apply to their jobs. If no rlimit is specified, then all
              resource limits will be propagated.  The following rlimit names  are  supported  by
              Slurm (although some options may not be supported on some systems):

              ALL       All limits listed below (default)

              NONE      No limits listed below

              AS        The maximum address space for a process

              CORE      The maximum size of core file

              CPU       The maximum amount of CPU time

              DATA      The maximum size of a process's data segment

              FSIZE     The  maximum  size  of files created. Note that if the user sets FSIZE to
                        less than the current size of the slurmd.log, job launches will fail with
                        a 'File size limit exceeded' error.

              MEMLOCK   The maximum size that may be locked into memory

              NOFILE    The maximum number of open files

              NPROC     The maximum number of processes available

              RSS       The maximum resident set size

              STACK     The maximum stack size

              This option applies to job allocations.

       --pty  Execute   task  zero  in  pseudo  terminal  mode.   Implicitly  sets  --unbuffered.
              Implicitly sets --error and --output to /dev/null for all tasks except  task  zero,
              which  may  cause  those tasks to exit immediately (e.g. shells will typically exit
              immediately in that situation).  This option applies to step allocations.

       -q, --qos=<qos>
              Request a quality of service for the job.  QOS  values  can  be  defined  for  each
              user/cluster/account  association  in the Slurm database.  Users will be limited to
              their association's defined set of QOS values when the Slurm configuration parameter
              AccountingStorageEnforce includes "qos" in its definition. This option applies to
              job allocations.

       -Q, --quiet
              Suppress informational messages from srun. Errors will  still  be  displayed.  This
              option applies to job and step allocations.

       --quit-on-interrupt
              Quit  immediately on single SIGINT (Ctrl-C). Use of this option disables the status
              feature normally available when srun receives a single Ctrl-C and  causes  srun  to
              instead  immediately  terminate  the  running  job.  This  option  applies  to step
              allocations.

       -r, --relative=<n>
              Run a job step relative to node n of the current allocation.  This  option  may  be
              used  to  spread several job steps out among the nodes of the current job. If -r is
              used, the current job step will begin at node n of the  allocated  nodelist,  where
              the first node is considered node 0.  The -r option is not permitted with the -w or -x
              option and will result in a fatal error when not running within a prior  allocation
              (i.e.  when  SLURM_JOB_ID  is  not  set).  The  default for n is 0. If the value of
              --nodes exceeds the number of  nodes  identified  with  the  --relative  option,  a
              warning  message  will  be  printed and the --relative option will take precedence.
              This option applies to step allocations.

       --reboot
              Force the allocated nodes  to  reboot  before  starting  the  job.   This  is  only
              supported  with  some system configurations and will otherwise be silently ignored.
              This option applies to job allocations.

       --resv-ports[=count]
              Reserve communication ports for this job. Users can specify the number of ports they
              want to reserve. The parameter MpiParams=ports=12000-12999 must be specified in
              slurm.conf. If count is not specified and Slurm's OpenMPI plugin is used, then by
              default the number of reserved ports is equal to the highest number of tasks on any
              node in the job step allocation. If the number of reserved ports is zero, then no
              ports are reserved. Used for OpenMPI. This option applies to job and step allocations.

       --reservation=<name>
              Allocate  resources  for the job from the named reservation. This option applies to
              job allocations.

       -s, --oversubscribe
              The job allocation can over-subscribe  resources  with  other  running  jobs.   The
              resources  to  be over-subscribed can be nodes, sockets, cores, and/or hyperthreads
              depending upon configuration.   The  default  over-subscribe  behavior  depends  on
              system configuration and the partition's OverSubscribe option takes precedence over
              the job's option.  This option may result in the allocation  being  granted  sooner
              than if the --oversubscribe option was not set and allow higher system utilization,
              but application performance will likely suffer due to  competition  for  resources.
              Also see the --exclusive option. This option applies to step allocations.

       -S, --core-spec=<num>
              Count  of  specialized cores per node reserved by the job for system operations and
              not used by the application. The application will not use these cores, but will  be
              charged  for  their  allocation.   Default  value  is  dependent  upon  the  node's
              configured CoreSpecCount value.  If a value of zero is  designated  and  the  Slurm
              configuration option AllowSpecResourcesUsage is enabled, the job will be allowed to
              override CoreSpecCount and use the specialized resources on nodes it is  allocated.
              This  option  can not be used with the --thread-spec option. This option applies to
              job allocations.

       --signal=<sig_num>[@<sig_time>]
              When a job is within sig_time seconds of its end time, send it the signal  sig_num.
              Due  to  the resolution of event handling by Slurm, the signal may be sent up to 60
              seconds earlier than specified.  sig_num may either be  a  signal  number  or  name
              (e.g.  "10"  or  "USR1").  sig_time must have an integer value between 0 and 65535.
              By default, no signal is sent before the job's end time.  If a sig_num is specified
              without  any  sig_time, the default time will be 60 seconds. This option applies to
              job  allocations.   To  have  the  signal  sent  at   preemption   time   see   the
              preempt_send_user_signal SlurmctldParameter.
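
              An application that wants to react to this warning needs a handler for the chosen
              signal. The following is a minimal Python sketch, assuming the step was started
              with something like "srun --signal=USR1@120"; the state dictionary and the
              function names are illustrative and are not part of Slurm.

```python
import signal

# Shared flag; the names used here are hypothetical, not Slurm APIs.
state = {"warned": False}


def on_warning(signum, frame):
    """Remember that the end-of-time warning signal has arrived."""
    state["warned"] = True


# Assumes a launch such as: srun --signal=USR1@120 python3 app.py
signal.signal(signal.SIGUSR1, on_warning)


def run(work_units):
    """Process units in order, stopping early once the warning was seen."""
    done = []
    for unit in work_units:
        if state["warned"]:
            break  # a real application would save a checkpoint here
        done.append(unit)
    return done
```

              Because the signal may arrive up to 60 seconds early, the handler should only set
              a flag and leave the actual checkpoint work to the main loop.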

       --slurmd-debug=<level>
              Specify a debug level for slurmd(8). The level may be specified either as an integer
              value between 0 [quiet, only errors are displayed] and 4 [verbose operation] or as
              one of the SlurmdDebug tags.

              quiet     Log nothing

              fatal     Log only fatal errors

              error     Log only errors

              info      Log errors and general informational messages

              verbose   Log errors and verbose informational messages

              The slurmd debug information is copied onto the stderr of the job. By default only
              errors are displayed. This option applies to job and step allocations.

       --sockets-per-node=<sockets>
              Restrict node selection to nodes with at least the  specified  number  of  sockets.
              See  additional  information  under  -B  option  above when task/affinity plugin is
              enabled. This option applies to job allocations.

       --spread-job
              Spread the job allocation over as many nodes as  possible  and  attempt  to  evenly
              distribute   tasks   across   the   allocated  nodes.   This  option  disables  the
              topology/tree plugin.  This option applies to job allocations.

       --switches=<count>[@<max-time>]
              When a tree topology is used, this defines the maximum count  of  switches  desired
              for  the  job allocation and optionally the maximum time to wait for that number of
              switches. If Slurm finds an allocation containing  more  switches  than  the  count
              specified, the job remains pending until it either finds an allocation with desired
              switch count or the time limit expires.  If there is no switch count  limit,  there
              is  no  delay  in  starting  the  job.   Acceptable time formats include "minutes",
              "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes"  and
              "days-hours:minutes:seconds".   The  job's maximum time delay may be limited by the
              system administrator using the SchedulerParameters configuration parameter with the
              max_switch_wait  parameter  option.   On  a dragonfly network the only switch count
              supported is 1 since communication performance  will  be  highest  when  a  job  is
              allocated resources on one leaf switch or more than 2 leaf switches.  The default
              max-time is the max_switch_wait SchedulerParameters. This  option  applies  to  job
              allocations.

       -T, --threads=<nthreads>
              Allows  limiting the number of concurrent threads used to send the job request from
              the srun process to the slurmd processes on the allocated nodes. Default is to  use
              one  thread per allocated node up to a maximum of 60 concurrent threads. Specifying
              this option limits the number of concurrent threads to nthreads (less than or equal
              to  60).   This  should  only be used to set a low thread count for testing on very
              small memory computers. This option applies to job allocations.

       -t, --time=<time>
              Set a limit on the total run time of the job allocation.   If  the  requested  time
              limit  exceeds  the partition's time limit, the job will be left in a PENDING state
              (possibly indefinitely).  The default time limit is the  partition's  default  time
              limit.   When the time limit is reached, each task in each job step is sent SIGTERM
              followed by SIGKILL.  The interval  between  signals  is  specified  by  the  Slurm
              configuration  parameter  KillWait.   The OverTimeLimit configuration parameter may
              permit the job to run longer than scheduled.  Time resolution  is  one  minute  and
              second values are rounded up to the next minute.

              A  time  limit  of  zero  requests  that no time limit be imposed.  Acceptable time
              formats    include    "minutes",    "minutes:seconds",     "hours:minutes:seconds",
              "days-hours",  "days-hours:minutes"  and  "days-hours:minutes:seconds". This option
              applies to job and step allocations.
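
              The accepted time formats and the round-up rule can be illustrated with a short
              Python sketch. This is only an illustration of the formats listed above, not
              Slurm's own parser; the function name parse_time_limit is hypothetical.

```python
def parse_time_limit(text):
    """Convert a Slurm-style time string to whole minutes.

    Handles "minutes", "minutes:seconds", "hours:minutes:seconds",
    "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds".
    Any leftover seconds are rounded up to the next minute.
    """
    days = hours = minutes = seconds = 0
    if "-" in text:
        day_part, rest = text.split("-", 1)
        days = int(day_part)
        fields = [int(field) for field in rest.split(":")]
        if len(fields) > 3:
            raise ValueError("too many fields: %r" % text)
        # After a day count the fields are hours[:minutes[:seconds]].
        fields += [0] * (3 - len(fields))
        hours, minutes, seconds = fields
    else:
        fields = [int(field) for field in text.split(":")]
        if len(fields) == 1:
            minutes = fields[0]
        elif len(fields) == 2:
            minutes, seconds = fields
        elif len(fields) == 3:
            hours, minutes, seconds = fields
        else:
            raise ValueError("too many fields: %r" % text)
    total_seconds = ((days * 24 + hours) * 60 + minutes) * 60 + seconds
    return (total_seconds + 59) // 60  # round up to a whole minute
```

              For example, "1:30" (one minute and thirty seconds) becomes 2 minutes, and "1-12"
              (one day and twelve hours) becomes 2160 minutes.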

       --task-epilog=<executable>
              The slurmstepd daemon will run executable just after  each  task  terminates.  This
              will be executed before any TaskEpilog parameter in slurm.conf is executed. This is
              meant to be a very short-lived program. If it  fails  to  terminate  within  a  few
              seconds, it will be killed along with any descendant processes. This option applies
              to step allocations.

       --task-prolog=<executable>
              The slurmstepd daemon will run executable just before  launching  each  task.  This
              will be executed after any TaskProlog parameter in slurm.conf is executed.  Besides
              the normal environment variables, this has SLURM_TASK_PID available to identify the
              process  ID  of  the  task being started.  Standard output from this program of the
              form "export NAME=value" will be used to set environment  variables  for  the  task
              being spawned. This option applies to step allocations.
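
              A task prolog can be any executable. As a minimal sketch, assuming a Python
              interpreter is available on the compute nodes, the script below prints one
              "export NAME=value" line derived from SLURM_TASK_PID. The variable name
              MY_TASK_TAG is made up for illustration.

```python
import os


def prolog_lines(environ):
    """Build the "export NAME=value" lines this prolog would print."""
    task_pid = environ.get("SLURM_TASK_PID", "unknown")
    # MY_TASK_TAG is a hypothetical variable name used only as an example.
    return ["export MY_TASK_TAG=task-pid-" + task_pid]


if __name__ == "__main__":
    # Lines of this form on standard output set variables for the task.
    for line in prolog_lines(os.environ):
        print(line)
```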

       --test-only
              Returns  an  estimate of when a job would be scheduled to run given the current job
              queue and all the other srun arguments specifying  the  job.   This  limits  srun's
              behavior  to  just  return  information; no job is actually submitted.  The program
              will be executed directly  by  the  slurmd  daemon.  This  option  applies  to  job
              allocations.

       --thread-spec=<num>
              Count of specialized threads per node reserved by the job for system operations and
              not used by the application. The application will not use these threads,  but  will
              be  charged for their allocation.  This option can not be used with the --core-spec
              option. This option applies to job allocations.

       --threads-per-core=<threads>
              Restrict node selection to nodes with at least the specified number of threads  per
              core.  NOTE: "Threads" refers to the number of processing units on each core rather
              than the number of application tasks to  be  launched  per  core.   See  additional
              information under -B option above when task/affinity plugin is enabled. This option
              applies to job allocations.

       --time-min=<time>
              Set a minimum time limit on the job allocation.  If specified,  the  job  may  have
              its --time limit lowered to a value no lower than --time-min if doing so permits
              the job to begin execution earlier than otherwise possible.  The job's  time  limit
              will  not  be changed after the job is allocated resources.  This is performed by a
              backfill scheduling algorithm to allocate resources otherwise reserved  for  higher
              priority  jobs.   Acceptable  time  formats  include  "minutes", "minutes:seconds",
              "hours:minutes:seconds",       "days-hours",        "days-hours:minutes"        and
              "days-hours:minutes:seconds". This option applies to job allocations.

       --tmp=<size[units]>
              Specify  a  minimum  amount  of  temporary  disk space per node.  Default units are
              megabytes unless  the  SchedulerParameters  configuration  parameter  includes  the
              "default_gbytes"  option for gigabytes.  Different units can be specified using the
              suffix [K|M|G|T].  This option applies to job allocations.

       -u, --unbuffered
              By default the connection between slurmstepd and the user launched  application  is
              over a pipe. The stdio output written by the application is buffered by glibc
              until it is flushed or the output is set as unbuffered.   See  setbuf(3).  If  this
              option  is  specified  the  tasks  are  executed with a pseudo terminal so that the
              application output is unbuffered. This option applies to step allocations.

       --usage
              Display brief help message and exit.

       --uid=<user>
              Attempt to submit and/or run a job as user instead of the  invoking  user  id.  The
              invoking user's credentials will be used to check access permissions for the target
              partition. User root may use this option to run jobs as a normal user in a RootOnly
              partition  for  example.  If run as root, srun will drop its permissions to the uid
              specified after node allocation is  successful.  user  may  be  the  user  name  or
              numerical user ID. This option applies to job and step allocations.

       --use-min-nodes
              If a range of node counts is given, prefer the smaller count.

       -V, --version
              Display version information and exit.

       -v, --verbose
              Increase  the  verbosity  of  srun's  informational  messages.   Multiple -v's will
              further increase srun's verbosity.  By default only errors will be displayed.  This
              option applies to job and step allocations.

       -W, --wait=<seconds>
              Specify  how  long  to  wait after the first task terminates before terminating all
              remaining tasks. A value of 0 indicates an unlimited wait (a warning will be issued
              after  60 seconds). The default value is set by the WaitTime parameter in the slurm
              configuration file (see slurm.conf(5)). This option can be useful to ensure that  a
              job is terminated in a timely fashion in the event that one or more tasks terminate
              prematurely.  Note: The -K, --kill-on-bad-exit option  takes  precedence  over  -W,
              --wait  to terminate the job immediately if a task exits with a non-zero exit code.
              This option applies to job allocations.

       -w, --nodelist=<host1,host2,... or filename>
              Request a specific list of hosts.  The job will contain  all  of  these  hosts  and
              possibly additional hosts as needed to satisfy resource requirements.  The list may
              be specified as a comma-separated list of hosts, a range of hosts  (host[1-5,7,...]
              for  example), or a filename.  The host list will be assumed to be a filename if it
              contains a "/" character.  If you specify a minimum node or processor count  larger
              than  can  be  satisfied  by  the  supplied host list, additional resources will be
              allocated on other nodes as needed.  Rather than repeating  a  host  name  multiple
              times,  an  asterisk  and  a  repetition  count may be appended to a host name. For
              example "host1,host1" and "host1*2" are equivalent. If number of tasks is given and
              a  list  of  requested  nodes is also given the number of nodes used from that list
              will be reduced to match that of the number of tasks if the number of nodes in  the
              list  is  greater  than  the  number  of tasks. This option applies to job and step
              allocations.
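
              The host list notation can be illustrated with a small Python sketch that expands
              bracketed ranges and repetition counts. It is a simplified model of the syntax
              described above, not Slurm's hostlist implementation; the function names are
              hypothetical, and the way a range combined with a repetition count is handled is
              an assumption.

```python
import re


def split_top_level(hostlist):
    """Split on commas that are not inside square brackets."""
    parts, depth, current = [], 0, ""
    for ch in hostlist:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current)
            current = ""
        else:
            current += ch
    if current:
        parts.append(current)
    return parts


def expand_hostlist(hostlist):
    """Expand entries such as "host[1-5,7]" and "host1*2" into host names."""
    hosts = []
    for item in split_top_level(hostlist):
        name, _, count = item.partition("*")
        repeat = int(count) if count else 1
        match = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]", name)
        if match:
            prefix, ranges = match.groups()
            expanded = []
            for piece in ranges.split(","):
                low, _, high = piece.partition("-")
                high = high or low
                width = len(low)  # keep zero padding such as node[01-03]
                for number in range(int(low), int(high) + 1):
                    expanded.append(prefix + str(number).zfill(width))
        else:
            expanded = [name]
        # Assumption: a repetition count repeats every expanded name.
        for host in expanded:
            hosts.extend([host] * repeat)
    return hosts
```

              With this sketch, "host1,host1" and "host1*2" expand to the same two-entry list,
              matching the equivalence stated above.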

       --wckey=<wckey>
              Specify wckey to be used with job.  If TrackWCKey=no (default)  in  the  slurm.conf
              this value is ignored. This option applies to job allocations.

       -X, --disable-status
              Disable  the  display  of  task status when srun receives a single SIGINT (Ctrl-C).
              Instead immediately forward the SIGINT to the running job.  Without this  option  a
              second Ctrl-C in one second is required to forcibly terminate the job and srun will
              immediately   exit.   May   also   be   set   via    the    environment    variable
              SLURM_DISABLE_STATUS. This option applies to job allocations.

       -x, --exclude=<host1,host2,... or filename>
              Request that a specific list of hosts not be included in the resources allocated to
              this job. The host list will be assumed to be a filename if it contains a "/"
              character. This option applies to job allocations.

       --x11[=<all|first|last>]
              Sets up X11 forwarding on all, first or last node(s) of the allocation. This option
              is only enabled if Slurm was compiled  with  X11  support  and  PrologFlags=x11  is
              defined in the slurm.conf. Default is all.

       -Z, --no-allocate
              Run  the  specified  tasks  on a set of nodes without creating a Slurm "job" in the
              Slurm queue structure, bypassing the normal resource allocation step.  The list  of
              nodes  must  be  specified  with  the  -w, --nodelist option.  This is a privileged
              option only available for the users "SlurmUser" and "root". This option applies  to
              job allocations.

       srun  will submit the job request to the slurm job controller, then initiate all processes
       on the remote nodes. If the request cannot be met immediately, srun will block  until  the
       resources  are  free to run the job. If the -I (--immediate) option is specified srun will
       terminate if resources are not immediately available.

       When initiating remote processes srun will propagate the current working directory, unless
       --chdir=<path>  is specified, in which case path will become the working directory for the
       remote processes.

       The -n, -c, and -N options control how CPUs  and nodes will be allocated to the job.  When
       specifying  only  the number of processes to run with -n, a default of one CPU per process
       is allocated. By specifying the number of CPUs required per task (-c), more than  one  CPU
       may  be  allocated  per  process.  If  the number of nodes is specified with -N, srun will
       attempt to allocate at least the number of nodes specified.

       Combinations of the  above  three  options  may  be  used  to  change  how  processes  are
       distributed  across  nodes  and  cpus.  For  instance,  by  specifying  both the number of
       processes and number of nodes on which to  run,  the  number  of  processes  per  node  is
       implied. However, if the number of CPUs per process is more important, then the number of
       processes (-n) and the number of CPUs per process (-c) should be specified.

       srun will refuse to  allocate more than one process per CPU unless  --overcommit  (-O)  is
       also specified.

       srun  will  attempt  to meet the above specifications "at a minimum." That is, if 16 nodes
       are requested for 32 processes, and some nodes do not have 2 CPUs, the allocation of nodes
       will  be  increased  in order to meet the demand for CPUs. In other words, a minimum of 16
       nodes are being requested. However, if 16 nodes are requested for 15 processes, srun  will
       consider this an error, as 15 processes cannot run across 16 nodes.
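
       The "at a minimum" rule can be modelled with a toy calculation. The Python sketch below
       is a deliberately simplified model, not Slurm's node selection algorithm: it walks a
       hypothetical list of per-node CPU counts in order and stops once both the requested node
       count and the CPU demand are met. The function name nodes_needed is illustrative.

```python
def nodes_needed(cpus_per_node, ntasks, min_nodes, cpus_per_task=1):
    """Return how many nodes a simplified allocator would hand out.

    cpus_per_node is a list of CPU counts, one entry per candidate node.
    """
    if ntasks < min_nodes:
        # Mirrors the error case above: 15 processes cannot span 16 nodes.
        raise ValueError("cannot run %d tasks across %d nodes" % (ntasks, min_nodes))
    task_slots = 0
    for count, cpus in enumerate(cpus_per_node, start=1):
        # A task cannot span nodes, so count whole task slots per node.
        task_slots += cpus // cpus_per_task
        if count >= min_nodes and task_slots >= ntasks:
            return count
    raise ValueError("not enough CPUs on the candidate nodes")
```

       With sixteen 2-CPU nodes, 32 processes fit on exactly 16 nodes. If only twelve of the
       candidate nodes have 2 CPUs and the rest have 1, the same request grows to 20 nodes in
       this model.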

       IO Redirection

       By  default,  stdout and stderr will be redirected from all tasks to the stdout and stderr
       of srun, and stdin will be redirected from the standard input of srun to all remote tasks.
       If  stdin  is  only to be read by a subset of the spawned tasks, specifying a file to read
       from rather than forwarding stdin from the srun command may be  preferable  as  it  avoids
       moving and storing data that will never be read.

       For  OS  X,  the  poll()  function does not support stdin, so input from a terminal is not
       possible.

       This behavior may be changed with the --output, --error, and --input (-o, -e, -i) options.
       Valid format specifications for these options are

       all       stdout and stderr are redirected from all tasks to srun.  stdin is broadcast to
                 all remote tasks.  (This is the default behavior.)

       none      stdout and stderr are not received from any task.  stdin is not sent to any task
                 (stdin is closed).

       taskid    stdout and/or stderr are redirected from only the task with relative id equal to
                 taskid, where 0 <= taskid < ntasks and ntasks is the total number of tasks
                 in  the  current  job  step.  stdin is redirected from the stdin of srun to this
                 same task.  This file will be written on the node executing the task.

       filename  srun will redirect stdout and/or stderr to the named file from all tasks.  stdin
                 will  be  redirected  from the named file and broadcast to all tasks in the job.
                 filename refers to a path  on  the  host  that  runs  srun.   Depending  on  the
                 cluster's  file  system  layout,  this  may  result  in  the output appearing in
                 different places depending on whether the job is run in batch mode.

       filename pattern
                 srun allows for a filename pattern to be used to  generate  the  named  IO  file
                 described  above.  The  following  list  of format specifiers may be used in the
                 format string to generate a filename that will  be  unique  to  a  given  jobid,
                 stepid,  node, or task. In each case, the appropriate number of files are opened
                 and associated with  the  corresponding  tasks.  Note  that  any  format  string
                 containing  %t,  %n,  and/or  %N  will be written on the node executing the task
                 rather than the node where srun executes. These format specifiers are not
                 supported on a BGQ system.

                 \\     Do not process any of the replacement symbols.

                 %%     The character "%".

                 %A     Job array's master job allocation number.

                 %a     Job array ID (index) number.

                 %J     jobid.stepid of the running job. (e.g. "128.0")

                 %j     jobid of the running job.

                 %s     stepid of the running job.

                 %N     short hostname. This will create a separate IO file per node.

                 %n     Node identifier relative to current job (e.g. "0" is the first node of
                        the running job). This will create a separate IO file per node.

                 %t     task identifier (rank) relative  to  current  job.  This  will  create  a
                        separate IO file per task.

                 %u     User name.

                 %x     Job name.

                 A  number  placed between the percent character and format specifier may be used
                 to zero-pad the result in the IO filename. This number is ignored if the  format
                 specifier corresponds to  non-numeric data (%N for example).

                 Some  examples of how the format string may be used for a 4 task job step with a
                 Job ID of 128 and step id of 0 are included below:

                 job%J.out      job128.0.out

                 job%4j.out     job0128.out

                 job%j-%2t.out  job128-00.out, job128-01.out, ...
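
                 The substitution rules can be reproduced with a short Python sketch. It is an
                 illustration of the specifiers listed above, not srun's implementation; the
                 function name expand_pattern and the values dictionary are hypothetical, the
                 "\\" escape is not handled, and treating %J as unpadded is an assumption.

```python
import re


def expand_pattern(pattern, values):
    """Expand %-specifiers using values such as {"j": 128, "s": 0, "t": 1}."""
    def replace(match):
        width, spec = match.group(1), match.group(2)
        if spec == "%":
            return "%"
        if spec == "J":
            # jobid.stepid; left unpadded in this sketch (assumption).
            return "%s.%s" % (values["j"], values["s"])
        value = values[spec]
        if width and isinstance(value, int):
            return str(value).zfill(int(width))
        # The pad width is ignored for non-numeric data such as %N.
        return str(value)

    return re.sub(r"%(\d*)([%AaJjsNntux])", replace, pattern)
```

                 Called with a job ID of 128 and a step ID of 0, this yields the same file names
                 as the examples above.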

INPUT ENVIRONMENT VARIABLES

       Some srun options may be set via  environment  variables.   These  environment  variables,
       along with their corresponding options, are listed below.  Note: Command line options will
       always override these settings.
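
       The precedence rule can be stated as a small Python sketch. The helper name
       effective_value and the partition names are hypothetical; only the ordering (command
       line first, then environment) comes from the text above.

```python
def effective_value(cli_value, env_name, environ, default=None):
    """Pick the command-line value if given, else the environment value."""
    if cli_value is not None:
        return cli_value  # command line options always win
    return environ.get(env_name, default)
```

       For example, with SLURM_PARTITION=batch in the environment, passing "-p debug" on the
       command line selects "debug", while omitting -p selects "batch".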

       PMI_FANOUT            This is used exclusively with PMI (MPICH2 and MVAPICH2) and controls
                             the  fanout  of data communications. The srun command sends messages
                             to application programs (via the PMI library) and those applications
                             may  be  called  upon  to  forward that data to up to this number of
                             additional tasks. Higher values offload work from the  srun  command
                             to  the  applications  and  likely  increase  the  vulnerability  to
                             failures.  The default value is 32.

       PMI_FANOUT_OFF_HOST   This is used exclusively with PMI (MPICH2 and MVAPICH2) and controls
                             the  fanout of data communications.  The srun command sends messages
                             to application programs (via the PMI library) and those applications
                             may  be  called  upon  to  forward that data to additional tasks. By
                             default, srun sends one message per host and one task on  that  host
                             forwards  the data to other tasks on that host up to PMI_FANOUT.  If
                             PMI_FANOUT_OFF_HOST is defined, the user task  may  be  required  to
                             forward    the    data   to   tasks   on   other   hosts.    Setting
                             PMI_FANOUT_OFF_HOST may increase performance.  Since  more  work  is
                             performed  by  the  PMI  library  loaded  by  the  user application,
                             failures also can be more common and more difficult to diagnose.

       PMI_TIME              This is used exclusively with PMI (MPICH2 and MVAPICH2) and controls
                             how  much  the  communications from the tasks to the srun are spread
                             out in time in order to avoid overwhelming  the  srun  command  with
                             work.   The  default  value  is  500  (microseconds)  per  task.  On
                             relatively slow processors or  systems  with  very  large  processor
                             counts (and large PMI data sets), higher values may be required.

       SLURM_CONF            The location of the Slurm configuration file.

       SLURM_ACCOUNT         Same as -A, --account

       SLURM_ACCTG_FREQ      Same as --acctg-freq

       SLURM_BCAST           Same as --bcast

       SLURM_BURST_BUFFER    Same as --bb

       SLURM_CHECKPOINT      Same as --checkpoint

       SLURM_COMPRESS        Same as --compress

       SLURM_CONSTRAINT      Same as -C, --constraint

       SLURM_CORE_SPEC       Same as --core-spec

       SLURM_CPU_BIND        Same as --cpu-bind

       SLURM_CPU_FREQ_REQ    Same as --cpu-freq.

       SLURM_CPUS_PER_GPU    Same as --cpus-per-gpu

       SLURM_CPUS_PER_TASK   Same as -c, --cpus-per-task

       SLURM_DEBUG           Same as -v, --verbose

       SLURM_DELAY_BOOT      Same as --delay-boot

       SLURMD_DEBUG          Same as --slurmd-debug

       SLURM_DEPENDENCY      Same as -P, --dependency=<jobid>

       SLURM_DISABLE_STATUS  Same as -X, --disable-status

       SLURM_DIST_PLANESIZE  Same as -m plane

       SLURM_DISTRIBUTION    Same as -m, --distribution

       SLURM_EPILOG          Same as --epilog

       SLURM_EXCLUSIVE       Same as --exclusive

       SLURM_EXIT_ERROR      Specifies  the  exit  code generated when a Slurm error occurs (e.g.
                             invalid options).  This can be  used  by  a  script  to  distinguish
                             application  exit  codes  from various Slurm error conditions.  Also
                             see SLURM_EXIT_IMMEDIATE.

       SLURM_EXIT_IMMEDIATE  Specifies the exit code generated when  the  --immediate  option  is
                             used and resources are not currently available.  This can be used by
                             a script to distinguish application exit codes  from  various  Slurm
                             error conditions.  Also see SLURM_EXIT_ERROR.
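
                             A wrapper script can use these two variables to tell Slurm
                             failures apart from application failures. The Python sketch below
                             assumes the script has exported SLURM_EXIT_ERROR=200 and
                             SLURM_EXIT_IMMEDIATE=201 before calling srun (both values are
                             arbitrary choices) and relies on the 128 + signal convention from
                             RETURN VALUE. The function name classify_exit is hypothetical.

```python
def classify_exit(returncode, exit_error, exit_immediate):
    """Describe an srun exit status given the two configured codes."""
    if returncode == exit_immediate:
        return "resources not immediately available"
    if returncode == exit_error:
        return "slurm error"
    if returncode == 0:
        return "success"
    if returncode > 128:
        # srun reports a task killed by a signal as 128 + signal number.
        return "task killed by signal %d" % (returncode - 128)
    return "application exit code %d" % returncode
```

                             Choosing codes that the application never returns, and that do
                             not equal 128 plus a real signal number, keeps the cases distinct.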

       SLURM_EXPORT_ENV      Same as --export

       SLURM_GPUS            Same as -G, --gpus

       SLURM_GPU_BIND        Same as --gpu-bind

       SLURM_GPU_FREQ        Same as --gpu-freq

       SLURM_GPUS_PER_NODE   Same as --gpus-per-node

       SLURM_GPUS_PER_TASK   Same as --gpus-per-task

       SLURM_GRES_FLAGS      Same as --gres-flags

       SLURM_HINT            Same as --hint

       SLURM_GRES            Same as --gres. Also see SLURM_STEP_GRES

       SLURM_IMMEDIATE       Same as -I, --immediate

       SLURM_JOB_ID          Same as --jobid

       SLURM_JOB_NAME        Same  as  -J,  --job-name  except  within an existing allocation, in
                             which case it is ignored to avoid using the batch job's name as  the
                             name of each job step.

       SLURM_JOB_NUM_NODES (and SLURM_NNODES for backwards compatibility)
                             Same as -N, --nodes. Total number of nodes in the job’s resource
                             allocation.

       SLURM_KILL_BAD_EXIT   Same as -K, --kill-on-bad-exit

       SLURM_LABELIO         Same as -l, --label

       SLURM_MEM_BIND        Same as --mem-bind

       SLURM_MEM_PER_CPU     Same as --mem-per-cpu

       SLURM_MEM_PER_GPU     Same as --mem-per-gpu

       SLURM_MEM_PER_NODE    Same as --mem

       SLURM_MPI_TYPE        Same as --mpi

       SLURM_NETWORK         Same as --network

       SLURM_NO_KILL         Same as -k, --no-kill

       SLURM_NTASKS (and SLURM_NPROCS for backwards compatibility)
                             Same as -n, --ntasks

       SLURM_NTASKS_PER_CORE Same as --ntasks-per-core

       SLURM_NTASKS_PER_NODE Same as --ntasks-per-node

       SLURM_NTASKS_PER_SOCKET
                             Same as --ntasks-per-socket

       SLURM_OPEN_MODE       Same as --open-mode

       SLURM_OVERCOMMIT      Same as -O, --overcommit

       SLURM_PARTITION       Same as -p, --partition

       SLURM_PMI_KVS_NO_DUP_KEYS
                             If set, then PMI key-pairs will contain no duplicate keys.  MPI  can
                             use  this  variable  to  inform the PMI library that it will not use
                             duplicate keys so PMI can skip the check for duplicate  keys.   This
                             is  the  case  for  MPICH2  and  reduces  overhead  in  testing  for
                             duplicates for improved performance.

       SLURM_POWER           Same as --power

       SLURM_PROFILE         Same as --profile

       SLURM_PROLOG          Same as --prolog

       SLURM_QOS             Same as --qos

       SLURM_REMOTE_CWD      Same as -D, --chdir

       SLURM_REQ_SWITCH      When a tree topology is used, this  defines  the  maximum  count  of
                             switches  desired  for the job allocation and optionally the maximum
                             time to wait for that number of switches. See --switches

       SLURM_RESERVATION     Same as --reservation

       SLURM_RESV_PORTS      Same as --resv-ports

       SLURM_SIGNAL          Same as --signal

       SLURM_STDERRMODE      Same as -e, --error

       SLURM_STDINMODE       Same as -i, --input

       SLURM_SPREAD_JOB      Same as --spread-job

       SLURM_SRUN_REDUCE_TASK_EXIT_MSG
                             If set and non-zero, successive task exit  messages  with  the  same
                             exit code will be printed only once.

       SLURM_STEP_GRES       Same  as --gres (only applies to job steps, not to job allocations).
                             Also see SLURM_GRES

       SLURM_STEP_KILLED_MSG_NODE_ID=ID
                             If set, only the specified node will log when the job or step is
                             killed by a signal.

       SLURM_STDOUTMODE      Same as -o, --output

       SLURM_TASK_EPILOG     Same as --task-epilog

       SLURM_TASK_PROLOG     Same as --task-prolog

       SLURM_TEST_EXEC       If  defined,  srun  will  verify existence of the executable program
                             along with user execute permission on the node where srun was called
                             before attempting to launch it on nodes in the step.

       SLURM_THREAD_SPEC     Same as --thread-spec

       SLURM_THREADS         Same as -T, --threads

       SLURM_TIMELIMIT       Same as -t, --time

       SLURM_UNBUFFEREDIO    Same as -u, --unbuffered

       SLURM_USE_MIN_NODES   Same as --use-min-nodes

       SLURM_WAIT            Same as -W, --wait

       SLURM_WAIT4SWITCH     Max time waiting for requested switches. See --switches

       SLURM_WCKEY           Same as --wckey

       SLURM_WORKING_DIR     Same as -D, --chdir

       SRUN_EXPORT_ENV       Same as --export, and will override any setting for SLURM_EXPORT_ENV.

OUTPUT ENVIRONMENT VARIABLES

       srun  will set some environment variables in the environment of the executing tasks on the
       remote compute nodes.  These environment variables are:

       SLURM_*_PACK_GROUP_#  For a heterogeneous job allocation, the  environment  variables  are
                             set separately for each component.

       SLURM_CLUSTER_NAME    Name of the cluster on which the job is executing.

       SLURM_CPU_BIND_VERBOSE
                             --cpu-bind verbosity (quiet,verbose).

       SLURM_CPU_BIND_TYPE   --cpu-bind type (none,rank,map_cpu:,mask_cpu:).

       SLURM_CPU_BIND_LIST   --cpu-bind map or mask list (list of Slurm CPU IDs or masks for this
                             node,  CPU_ID  =  Board_ID  x  threads_per_board   +   Socket_ID   x
                             threads_per_socket + Core_ID x threads_per_core + Thread_ID).
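
                             The formula above can be sketched as a small shell function.  The
                             function name slurm_cpu_id and the node geometry are hypothetical,
                             and the sketch assumes that threads_per_socket is cores per socket
                             times threads per core, and that threads_per_board is sockets per
                             board times threads_per_socket.

```shell
# Compute a Slurm CPU ID from hardware coordinates.
# Arguments: board socket core thread sockets_per_board cores_per_socket threads_per_core
slurm_cpu_id() {
    threads_per_core=$7
    threads_per_socket=$(( $6 * threads_per_core ))
    threads_per_board=$(( $5 * threads_per_socket ))
    echo $(( $1 * threads_per_board + $2 * threads_per_socket + $3 * threads_per_core + $4 ))
}

# Hypothetical node: 2 sockets per board, 4 cores per socket, 2 threads per core.
# Board 0, socket 1, core 2, thread 1: 0*16 + 1*8 + 2*2 + 1
slurm_cpu_id 0 1 2 1 2 4 2   # prints 13
```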

       SLURM_CPU_FREQ_REQ    Contains  the  value requested for cpu frequency on the srun command
                             as a numerical frequency in  kilohertz,  or  a  coded  value  for  a
                             request of low, medium, highm1 or high for the frequency.  See the
                             description of the --cpu-freq option or the SLURM_CPU_FREQ_REQ input
                             environment variable.

       SLURM_CPUS_ON_NODE    Count  of  processors  available  to the job on this node.  Note the
                             select/linear plugin allocates entire nodes to jobs,  so  the  value
                             indicates   the   total   count  of  CPUs  on  the  node.   For  the
                             select/cons_res plugin, this number indicates the number of cores on
                             this node allocated to the job.

       SLURM_CPUS_PER_GPU    Number  of  CPUs  requested  per  allocated  GPU.   Only  set if the
                             --cpus-per-gpu option is specified.

       SLURM_CPUS_PER_TASK   Number of cpus requested per task.  Only set if the  --cpus-per-task
                             option is specified.

       SLURM_DISTRIBUTION    Distribution  type for the allocated jobs. Set the distribution with
                             -m, --distribution.

       SLURM_GPUS            Number of GPUs requested.  Only set if  the  -G,  --gpus  option  is
                             specified.

       SLURM_GPU_BIND        Requested  binding  of  tasks  to  GPU.   Only set if the --gpu-bind
                             option is specified.

       SLURM_GPU_FREQ        Requested GPU frequency.  Only  set  if  the  --gpu-freq  option  is
                             specified.

       SLURM_GPUS_PER_NODE   Requested   GPU   count   per  allocated  node.   Only  set  if  the
                             --gpus-per-node option is specified.

       SLURM_GPUS_PER_SOCKET Requested  GPU  count  per  allocated  socket.   Only  set  if   the
                             --gpus-per-socket option is specified.

       SLURM_GPUS_PER_TASK   Requested   GPU   count   per  allocated  task.   Only  set  if  the
                             --gpus-per-task option is specified.

       SLURM_GTIDS           Global task IDs  running  on  this  node.   Zero  origin  and  comma
                             separated.

       SLURM_JOB_ACCOUNT     Account name associated with the job allocation.

       SLURM_JOB_CPUS_PER_NODE
                             Number of CPUs per node.

       SLURM_JOB_DEPENDENCY  Set to value of the --dependency option.

       SLURM_JOB_ID (and SLURM_JOBID for backwards compatibility)
                             Job id of the executing job.

       SLURM_JOB_NAME        Set  to  the value of the --job-name option or the command name when
                             srun is used to create a new job allocation. Not set  when  srun  is
                             used  only  to  create  a  job  step  (i.e.  within  an existing job
                             allocation).

       SLURM_JOB_PARTITION   Name of the partition in which the job is running.

       SLURM_JOB_QOS         Quality Of Service (QOS) of the job allocation.

       SLURM_JOB_RESERVATION Advanced reservation containing the job allocation, if any.

       SLURM_LAUNCH_NODE_IPADDR
                             IP address of the node from which  the  task  launch  was  initiated
                             (where the srun command ran from).

       SLURM_LOCALID         Node local task ID for the process within a job.

       SLURM_MEM_BIND_LIST   --mem-bind map or mask list (<list of IDs or masks for this node>).

       SLURM_MEM_BIND_PREFER --mem-bind prefer (prefer).

       SLURM_MEM_BIND_SORT   Sort free cache pages (run zonesort on Intel KNL nodes).

       SLURM_MEM_BIND_TYPE   --mem-bind type (none,rank,map_mem:,mask_mem:).

       SLURM_MEM_BIND_VERBOSE
                             --mem-bind verbosity (quiet,verbose).

       SLURM_MEM_PER_GPU     Requested  memory  per allocated GPU.  Only set if the --mem-per-gpu
                             option is specified.

       SLURM_JOB_NODES       Total number of nodes in the job's resource allocation.

       SLURM_NODE_ALIASES    Sets of node name, communication  address  and  hostname  for  nodes
                             allocated to the job from the cloud. Each element in the set is
                             colon separated and each set is comma separated. For example:
                             SLURM_NODE_ALIASES=ec0:1.2.3.4:foo,ec1:1.2.3.5:bar
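
                             A minimal sketch of splitting this value in a shell script
                             follows.  The function name parse_node_aliases is hypothetical,
                             and the sketch assumes that neither the node name nor the
                             communication address contains a colon.

```shell
# Print one "name address hostname" line per node alias set.
parse_node_aliases() {
    old_ifs=$IFS
    IFS=,                      # sets are comma separated
    for entry in $1; do
        name=${entry%%:*}      # text before the first colon
        rest=${entry#*:}       # text after the first colon
        addr=${rest%%:*}
        host=${rest#*:}
        printf '%s %s %s\n' "$name" "$addr" "$host"
    done
    IFS=$old_ifs
}

parse_node_aliases "ec0:1.2.3.4:foo,ec1:1.2.3.5:bar"
# prints:
# ec0 1.2.3.4 foo
# ec1 1.2.3.5 bar
```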

       SLURM_NODEID          The relative node ID of the current node.

       SLURM_JOB_NODELIST    List of nodes allocated to the job.

       SLURM_NTASKS (and SLURM_NPROCS for backwards compatibility)
                             Total number of processes in the current job or job step.

       SLURM_PACK_SIZE       Set to count of components in heterogeneous job.

       SLURM_PRIO_PROCESS    The  scheduling priority (nice value) at the time of job submission.
                             This value is propagated to the spawned processes.

       SLURM_PROCID          The MPI rank (or relative process ID) of the current process.

       SLURM_SRUN_COMM_HOST  IP address of srun communication host.

       SLURM_SRUN_COMM_PORT  srun communication port.

       SLURM_STEP_LAUNCHER_PORT
                             Step launcher port.

       SLURM_STEP_NODELIST   List of nodes allocated to the step.

       SLURM_STEP_NUM_NODES  Number of nodes allocated to the step.

       SLURM_STEP_NUM_TASKS  Number of processes in the step.

       SLURM_STEP_TASKS_PER_NODE
                             Number of processes per node within the step.

       SLURM_STEP_ID (and SLURM_STEPID for backwards compatibility)
                             The step ID of the current job.

       SLURM_SUBMIT_DIR      The directory from which srun was invoked  or,  if  applicable,  the
                             directory specified by the -D, --chdir option.

       SLURM_SUBMIT_HOST     The hostname of the computer from which srun was invoked.

       SLURM_TASK_PID        The process ID of the task being started.

       SLURM_TASKS_PER_NODE  Number  of  tasks  to  be  initiated  on each node. Values are comma
                             separated and in the same order as SLURM_JOB_NODELIST.   If  two  or
                             more  consecutive  nodes are to have the same task count, that count
                             is followed by  "(x#)"  where  "#"  is  the  repetition  count.  For
                             example, "SLURM_TASKS_PER_NODE=2(x3),1" indicates that the first
                             three nodes will each execute two tasks and the fourth node will
                             execute one task.
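
                             The compressed format can be expanded into one count per node
                             with a short shell function.  This is a sketch only; the function
                             name expand_tasks_per_node is hypothetical.

```shell
# Expand a SLURM_TASKS_PER_NODE style value into one task count per node.
expand_tasks_per_node() {
    out=""
    old_ifs=$IFS
    IFS=,
    for item in $1; do
        case "$item" in
            *"(x"*)
                count=${item%%"("*}     # task count before "("
                reps=${item#*"(x"}      # repetition count and trailing ")"
                reps=${reps%")"}
                ;;
            *)
                count=$item
                reps=1
                ;;
        esac
        i=0
        while [ "$i" -lt "$reps" ]; do
            out="${out:+$out }$count"
            i=$((i + 1))
        done
    done
    IFS=$old_ifs
    echo "$out"
}

expand_tasks_per_node "2(x3),1"   # prints: 2 2 2 1
```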

       SLURM_TOPOLOGY_ADDR   This  is  set  only  if  the  system  has  the  topology/tree plugin
                             configured.  The value will be set to the names of network switches
                             which  may be involved in the job's communications from the system's
                             top level switch down to the leaf switch and ending with node  name.
                             A period is used to separate each hardware component name.

       SLURM_TOPOLOGY_ADDR_PATTERN
                             This  is  set  only  if  the  system  has  the  topology/tree plugin
                             configured.  The value will be set to the component types listed in
                             SLURM_TOPOLOGY_ADDR.   Each  component  will be identified as either
                             "switch" or "node".  A period is  used  to  separate  each  hardware
                             component type.
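
                             Because both variables use a period as the separator, they can be
                             read in step.  The sketch below pairs each component name with
                             its type.  The function name pair_topology and the sample values
                             are hypothetical, and the sketch assumes that component names
                             themselves contain no period.

```shell
# Print "type=name" for each component, from the top level switch down to the node.
# $1 is a SLURM_TOPOLOGY_ADDR value, $2 the matching SLURM_TOPOLOGY_ADDR_PATTERN.
pair_topology() {
    names=$1
    types=$2
    while [ -n "$names" ]; do
        printf '%s=%s\n' "${types%%.*}" "${names%%.*}"
        case "$names" in
            *.*) names=${names#*.}; types=${types#*.} ;;
            *)   names="" ;;
        esac
    done
}

# Hypothetical values for a node "dev7" below switches "s0" and "s1".
pair_topology "s0.s1.dev7" "switch.switch.node"
# prints:
# switch=s0
# switch=s1
# node=dev7
```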

       SLURM_UMASK           The umask in effect when the job was submitted.

       SLURMD_NODENAME       Name  of  the  node  running the task. In the case of a parallel job
                             executing on multiple compute nodes, the  various  tasks  will  have
                             this  environment  variable  set to different values on each compute
                             node.

       SRUN_DEBUG            Set to the logging level of the srun command.  Default  value  is  3
                             (info  level).   The  value is incremented or decremented based upon
                             the --verbose and --quiet options.

SIGNALS AND ESCAPE SEQUENCES

       Signals sent to  the  srun  command  are  automatically  forwarded  to  the  tasks  it  is
       controlling  with  a few exceptions. The escape sequence <control-c> will report the state
       of all tasks associated with the srun command. If <control-c> is entered twice within  one
       second,  then  the  associated  SIGINT  signal will be sent to all tasks and a termination
       sequence will be entered sending SIGCONT, SIGTERM, and SIGKILL to all spawned tasks.  If a
       third  <control-c>  is  received,  the srun program will be terminated without waiting for
       remote tasks to exit or their I/O to complete.

       The escape sequence <control-z> is presently ignored. Our intent is for this to put the
       srun command into a mode where various special actions may be invoked.

MPI SUPPORT

       MPI  use depends upon the type of MPI being used.  There are three fundamentally different
       modes of operation used by these various MPI implementations.

       1. Slurm directly launches the tasks and performs initialization of communications through
       the PMI2 or PMIx APIs.  For example: "srun -n16 a.out".

       2.  Slurm  creates  a resource allocation for the job and then mpirun launches tasks using
       Slurm's infrastructure (OpenMPI).

       3. Slurm creates a resource allocation for the job and then mpirun  launches  tasks  using
       some mechanism other than Slurm, such as SSH or RSH.  These tasks are initiated outside of
       Slurm's monitoring or control. Slurm's epilog should be configured to purge these tasks
       when the job's allocation is relinquished; alternatively, the use of pam_slurm_adopt is
       highly recommended.

       See https://slurm.schedmd.com/mpi_guide.html for more information on use of these  various
       MPI implementations with Slurm.

MULTIPLE PROGRAM CONFIGURATION

       Comments  in the configuration file must have a "#" in column one.  The configuration file
       contains the following fields separated by white space:

       Task rank
              One or more task ranks to use this configuration.  Multiple  values  may  be  comma
              separated.   Ranges may be indicated with two numbers separated with a '-' with the
              smaller number first (e.g. "0-4"  and  not  "4-0").   To  indicate  all  tasks  not
              otherwise  specified,  specify  a  rank of '*' as the last line of the file.  If an
              attempt is made to initiate a task for which no executable program is defined,  the
              following error message will be produced: "No executable program specified for this
              task".

       Executable
              The name of the program to execute.  May be a fully qualified pathname if desired.

       Arguments
              Program arguments.  The expression "%t" will be replaced with  the  task's  number.
              The expression "%o" will be replaced with the task's offset within this range (e.g.
              a configured task rank value of "1-5" would have offset values of  "0-4").   Single
              quotes  may be used to avoid having the enclosed values interpreted.  This field is
              optional.  Any arguments for the program entered on the command line will be  added
              to the arguments specified in the configuration file.

       For example:
       ###################################################################
       # srun multiple program configuration file
       #
       # srun -n8 -l --multi-prog silly.conf
       ###################################################################
       4-6       hostname
       1,7       echo  task:%t
       0,2-3     echo  offset:%o

       > srun -n8 -l --multi-prog silly.conf
       0: offset:0
       1: task:1
       2: offset:1
       3: offset:2
       4: linux15.llnl.gov
       5: linux16.llnl.gov
       6: linux17.llnl.gov
       7: task:7
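
       The relationship between "%t" and "%o" in the output above can be reproduced with a short
       shell function.  This is a sketch only: the function name rank_offsets is hypothetical,
       the rule that the offset counts positions across the whole comma separated rank field is
       inferred from the example output, and a rank of '*' is not handled.

```shell
# Print "rank:offset" pairs for a task rank field such as "0,2-3".
rank_offsets() {
    offset=0
    out=""
    old_ifs=$IFS
    IFS=,
    for part in $1; do
        first=${part%%-*}      # a single rank yields first == last
        last=${part##*-}
        rank=$first
        while [ "$rank" -le "$last" ]; do
            out="${out:+$out }$rank:$offset"
            rank=$((rank + 1))
            offset=$((offset + 1))
        done
    done
    IFS=$old_ifs
    echo "$out"
}

rank_offsets "0,2-3"   # prints: 0:0 2:1 3:2
rank_offsets "1-5"     # prints: 1:0 2:1 3:2 4:3 5:4
```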

EXAMPLES

       This  simple example demonstrates the execution of the command hostname in eight tasks. At
       least eight processors will be allocated to the job  (the  same  as  the  task  count)  on
       however  many  nodes  are required to satisfy the request. The output of each task will be
       preceded by its task number.  (The machine "dev" in the example below has a total of two
       CPUs per node.)

       > srun -n8 -l hostname
       0: dev0
       1: dev0
       2: dev1
       3: dev1
       4: dev2
       5: dev2
       6: dev3
       7: dev3

       The  srun  -r option is used within a job script to run two job steps on disjoint nodes in
       the following example. The script is run using allocate mode instead of as a batch job  in
       this case.

       > cat test.sh
       #!/bin/sh
       echo $SLURM_JOB_NODELIST
       srun -lN2 -r2 hostname
       srun -lN2 hostname

       > salloc -N4 test.sh
       dev[7-10]
       0: dev9
       1: dev10
       0: dev7
       1: dev8

       The following script runs two job steps in parallel within an allocated set of nodes.

       > cat test.sh
       #!/bin/bash
       srun -lN2 -n4 -r 2 sleep 60 &
       srun -lN2 -r 0 sleep 60 &
       sleep 1
       squeue
       squeue -s
       wait

       > salloc -N4 test.sh
         JOBID PARTITION     NAME     USER  ST      TIME  NODES NODELIST
         65641     batch  test.sh   grondo   R      0:01      4 dev[7-10]

       STEPID     PARTITION     USER      TIME NODELIST
       65641.0        batch   grondo      0:01 dev[7-8]
       65641.1        batch   grondo      0:01 dev[9-10]

       This  example demonstrates how one executes a simple MPI job.  We use srun to build a list
       of machines (nodes) to be used by mpirun in its required format. A sample command line and
       the script to be executed follow.

       > cat test.sh
       #!/bin/sh
       MACHINEFILE="nodes.$SLURM_JOB_ID"

       # Generate Machinefile for mpi such that hosts are in the same
       #  order as if run via srun
       #
       srun -l /bin/hostname | sort -n | awk '{print $2}' > $MACHINEFILE

       # Run using generated Machine file:
       mpirun -np $SLURM_NTASKS -machinefile $MACHINEFILE mpi-app

       rm $MACHINEFILE

       > salloc -N2 -n4 test.sh

       This simple example demonstrates the execution of different jobs on different nodes in the
       same srun.  You can do this for any number of nodes or any number of jobs.  The executable
       run on each node is selected by the SLURM_NODEID environment variable, which starts at 0
       and goes up to one less than the number of nodes specified on the srun command line.

       > cat test.sh
       #!/bin/sh
       case $SLURM_NODEID in
           0) echo "I am running on "
              hostname ;;
           1) hostname
              echo "is where I am running" ;;
       esac

       > srun -N2 test.sh
       dev1
       is where I am running
       I am running on
       dev0

       This example demonstrates use of multi-core  options  to  control  layout  of  tasks.   We
       request that four sockets per node and two cores per socket be dedicated to the job.

       > srun -N2 -B 4-4:2-2 a.out

       This  example  shows  a script in which Slurm is used to provide resource management for a
       job by executing the various job steps as processors become available for their  dedicated
       use.

       > cat my.script
       #!/bin/bash
       srun --exclusive -n4 prog1 &
       srun --exclusive -n3 prog2 &
       srun --exclusive -n1 prog3 &
       srun --exclusive -n1 prog4 &
       wait

       This example shows how to launch an application called "master" with one task, 16 CPUs
       and 16 GB of memory (1 GB per CPU) plus another application called "slave" with 16 tasks,
       1 CPU per task (the default) and 1 GB of memory per task.

       > srun -n1 -c16 --mem-per-cpu=1gb master : -n16 --mem-per-cpu=1gb slave

COPYING

       Copyright (C) 2006-2007 The Regents of the University of California.  Produced at Lawrence
       Livermore National Laboratory (cf, DISCLAIMER).
       Copyright (C) 2008-2010 Lawrence Livermore National Security.
       Copyright (C) 2010-2015 SchedMD LLC.

       This  file  is  part  of  Slurm,  a  resource  management  program.   For   details,   see
       <https://slurm.schedmd.com/>.

       Slurm  is  free  software; you can redistribute it and/or modify it under the terms of the
       GNU General Public License as published by the Free Software Foundation; either version  2
       of the License, or (at your option) any later version.

       Slurm is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without
       even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
       GNU General Public License for more details.

SEE ALSO

       salloc(1),   sattach(1),   sbatch(1),   sbcast(1),   scancel(1),  scontrol(1),  squeue(1),
       slurm.conf(5), sched_setaffinity(2), numa(3), getrlimit(2)