
NAME

       slurm.conf - Slurm configuration file

DESCRIPTION

       /etc/slurm.conf   is  an  ASCII  file  which  describes  general  SLURM
       configuration information, the nodes to be managed,  information  about
       how  those  nodes  are  grouped into partitions, and various scheduling
       parameters associated with those partitions.

       The file location can be  modified  at  system  build  time  using  the
       DEFAULT_SLURM_CONF  parameter.  In addition, you can use the SLURM_CONF
       environment variable to override the built-in location  of  this  file.
       The  SLURM  daemons  also  allow  you to override both the built-in and
       environment-provided location using the  "-f"  option  on  the  command
       line.

       The  contents  of the file are case insensitive except for the names of
       nodes and partitions. Any text following a  "#"  in  the  configuration
       file is treated as a comment through the end of that line.  The size of
       each line in the file is limited to 1024 characters.   Changes  to  the
       configuration  file  take  effect upon restart of SLURM daemons, daemon
       receipt of the SIGHUP signal, or execution  of  the  command  "scontrol
       reconfigure" unless otherwise noted.

       If  a  line  begins  with the word "Include" followed by whitespace and
       then a file name, that file will be included inline  with  the  current
       configuration file.
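
        For example, frequently changing node definitions can be kept in a
        separate file and pulled in with a line such as the following (the
        file name here is only illustrative):

               Include /etc/slurm/nodes.conf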

       The overall configuration parameters available include:

       AuthType
              Define  the  authentication  method  for  communications between
              SLURM  components.   Acceptable  values   at   present   include
              "auth/none",  "auth/authd", and "auth/munge".  The default value
              is "auth/none", which means the UID  included  in  communication
              messages  is  not  verified.   This  may  be  fine  for  testing
               purposes, but do not use "auth/none" if you desire any security.
               "auth/authd" indicates that Brent Chun’s authd is to be used
               (see "http://www.theether.org/authd/" for more information).
              "auth/munge"  indicates  that Chris Dunlap’s munge is to be used
              (this is the best supported authentication mechanism for  SLURM,
              see  "http://www.llnl.gov/linux/munge/"  for  more information).
              All SLURM daemons and  commands  must  be  terminated  prior  to
              changing  the  value of AuthType and later restarted (SLURM jobs
              can be preserved).

       BackupAddr
               Name by which BackupController should be referred to when
               establishing a communications path. This name will be used as
               an argument to the gethostbyname() function for identification.
               For example,
              "elx0000"  might  be  used to designate the ethernet address for
              node "lx0000".  By default the BackupAddr will be  identical  in
              value to BackupController.

       BackupController
              The  name of the machine where SLURM control functions are to be
              executed in the event that ControlMachine fails. This  node  may
              also  be  used  as  a compute server if so desired. It will come
              into  service  as  a  controller  only  upon  the   failure   of
              ControlMachine  and  will  revert  to  a "standby" mode when the
              ControlMachine becomes available once again.  This should  be  a
              node  name  without the full domain name (e.g. "lx0002").  While
              not essential, it is  recommended  that  you  specify  a  backup
              controller.   See   the  RELOCATING  CONTROLLERS  section if you
              change this.

       CacheGroups
               If set to 1, the slurmd daemon will cache /etc/group entries.
              This  can  improve  performance  for highly parallel jobs if NIS
              servers are used  and  unable  to  respond  very  quickly.   The
              default value is 0 to disable caching group data.

       CheckpointType
              Define  the  system-initiated  checkpoint  method to be used for
              user jobs.  The slurmctld daemon must be restarted for a  change
               in CheckpointType to take effect.  Acceptable values at present
               include "checkpoint/aix" (only on AIX systems),
               "checkpoint/ompi" (requires OpenMPI version 1.3 or higher), and
               "checkpoint/none".  The default value is "checkpoint/none".

       ControlAddr
               Name by which ControlMachine should be referred to when
               establishing a communications path. This name will be used as
               an argument to the gethostbyname() function for identification.
               For example,
              "elx0000" might be used to designate the  ethernet  address  for
              node  "lx0000".  By default the ControlAddr will be identical in
              value to ControlMachine.

       ControlMachine
              The name of  the  machine  where  SLURM  control  functions  are
              executed.   This  should  be a node name without the full domain
              name (e.g. "lx0001").  This value must be specified.   See   the
              RELOCATING CONTROLLERS section if you change this.

       Epilog Fully  qualified pathname of a script to execute as user root on
              every   node    when    a    user’s    job    completes    (e.g.
              "/usr/local/slurm/epilog").  This  may  be  used to purge files,
              disable user login, etc. By default there is no epilog.

       FastSchedule
               Controls how a node’s configuration specifications in
               slurm.conf are used.  If the number of node configuration
               entries in the configuration file is significantly lower than
               the number of nodes, setting FastSchedule to 1 will permit much
               faster scheduling decisions to be made.  (The scheduler can
               just check the values in a few configuration records instead of
               possibly thousands of node records.  If a job can’t be
               initiated immediately, the scheduler may execute these tests
               repeatedly.)  Note that on systems with hyper-threading, the
               processor count reported by the node will be twice the actual
               processor count.  Consider which value you want to be used for
               scheduling purposes (see the example following the value list
               below).

               1 (default)
                    Consider the configuration of each node to be that
                    specified in the configuration file; any node with less
                    than the configured resources will be set DOWN.

              0    Base  scheduling decisions upon the actual configuration of
                   each individual node.

               2    Consider the configuration of each node to be that
                    specified in the slurm.conf configuration file; any node
                    with fewer resources than configured will not be set DOWN.
                   This can be useful for testing purposes.
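
               For example, with FastSchedule=1 a single configuration record
               can describe a large group of identical nodes (the node names
               and sizes below are illustrative):

                    FastSchedule=1
                    NodeName=lx[0-1023] Procs=2 RealMemory=2000 TmpDisk=64000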

       FirstJobId
               The job id to be used for the first job submitted to SLURM
               without a specific requested value.  Job id values generated
               will be incremented by 1 for each subsequent job. This may be
               used to provide a meta-scheduler with a job id space which is
               disjoint from that used by interactive jobs.  The default value
               is 1.

       HeartbeatInterval
               Defunct parameter.  The heartbeat interval for the slurmd
               daemon is half of SlurmdTimeout.  The heartbeat interval for
               the slurmctld daemon is half of SlurmctldTimeout.

       InactiveLimit
              The  interval,  in seconds, a job or job step is permitted to be
              inactive  before  it  is  terminated.  A  job  or  job  step  is
              considered  inactive  if  the  associated  srun  command  is not
               responding to slurm daemons.  This could be due to the
               termination of the srun command or the program being in a
               stopped state. A batch job is considered inactive if it has no
               active job steps (e.g. periods of pre- and post-processing).
               This limit permits defunct jobs to be purged in a timely fashion
               without waiting for their time limit to be reached.  This value
               should reflect the possibility that the srun command may be
               stopped by a debugger or that considerable time could be
               required for batch
              job pre- and post-processing.  This limit is  ignored  for  jobs
              running  in partitions with the RootOnly flag set (the scheduler
              running as root will be responsible for the job).   The  default
              value is unlimited (zero).  May not exceed 65533.

       JobAcctType
              Define  the job accounting mechanism type.  Acceptable values at
              present  include  "jobacct/aix"  (for  AIX  operating   system),
              "jobacct/linux"  (for Linux operating system) and "jobacct/none"
              (no  accounting  data  collected).    The   default   value   is
              "jobacct/none".   In  order to use the sacct tool, "jobacct/aix"
              or "jobacct/linux" must be configured.

       JobAcctLogFile
              Define the location where job accounting logs are to be written.
              For  jobacct/none  this parameter is ignored.  For jobacct/linux
              this is the fully-qualified file name for the data file.

       JobAcctFrequency
               Define the polling frequency to pass to the job accounting
               plugin.  For jobacct/none this parameter is ignored.  For
               jobacct/linux the parameter is a number of seconds between
              polls.
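
               For example, a Linux accounting setup might combine the three
               job accounting parameters as follows (the log file location and
               polling interval are only illustrative):

                    JobAcctType=jobacct/linux
                    JobAcctLogFile=/var/log/slurm/job_accounting.log
                    JobAcctFrequency=30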

       JobCompLoc
              The  interpretation  of  this  value  depends  upon  the logging
              mechanism specified by the JobCompType parameter.

       JobCompType
              Define the job completion logging  mechanism  type.   Acceptable
              values at present include "jobcomp/none", "jobcomp/filetxt", and
              "jobcomp/script".  The default value  is  "jobcomp/none",  which
              means  that  upon job completion the record of the job is purged
              from the system.  The value "jobcomp/filetxt" indicates  that  a
              record  of the job should be written to a text file specified by
              the JobCompLoc parameter.  The value "jobcomp/script"  indicates
              that  a  script  specified  by the JobCompLoc parameter is to be
              executed  with  environment   variables   indicating   the   job
              information.
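
               For example, to record each completed job in a text file (the
               file location is illustrative):

                    JobCompType=jobcomp/filetxt
                    JobCompLoc=/var/log/slurm/job_completions.log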

       JobCredentialPrivateKey
              Fully qualified pathname of a file containing a private key used
              for authentication by Slurm daemons.

       JobCredentialPublicCertificate
              Fully qualified pathname of a file containing a public key  used
              for authentication by Slurm daemons.

       JobFileAppend
               This option controls what to do if a job’s output or error file
               exists when the job is started.  If JobFileAppend is set to a
               value of 1, then append to the existing file.  By default, any
              existing file is truncated.  NOTE: This variable does not appear
              in  the output of the command "scontrol show config" in versions
              of SLURM less than version 1.3.

       KillTree
              This option is  mapped  to  "ProctrackType=proctrack/linuxproc".
              It will be removed from a future release.

       KillWait
              The interval, in seconds, given to a job’s processes between the
              SIGTERM and SIGKILL signals upon reaching its  time  limit.   If
              the job fails to terminate gracefully in the interval specified,
               it will be forcibly terminated.  The default value is 30
              seconds.  May not exceed 65533.

       MailProg
              Fully  qualified  pathname to the program used to send email per
              user request.  The default value is "/bin/mail".

       MaxJobCount
              The maximum number of jobs SLURM can have in its active database
              at  one  time.  Set  the  values of MaxJobCount and MinJobAge to
              insure the slurmctld daemon does not exhaust its memory or other
              resources.  Once  this  limit  is  reached,  requests  to submit
              additional jobs will fail. The default value is 2000 jobs.  This
              value  may  not  be reset via "scontrol reconfig". It only takes
              effect upon restart of the slurmctld  daemon.   May  not  exceed
              65533.

       MessageTimeout
              Time  permitted  for  a  round-trip communication to complete in
              seconds. Default value is 10 seconds. For  systems  with  shared
              nodes,  the  slurmd  daemon  could  be paged out and necessitate
              higher values.

       MinJobAge
              The minimum age of a completed job before its record  is  purged
              from  SLURM’s active database. Set the values of MaxJobCount and
              MinJobAge to insure the slurmctld daemon does  not  exhaust  its
              memory  or other resources. The default value is 300 seconds.  A
              value of zero prevents any job record purging.  May  not  exceed
              65533.
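
               For example, a site expecting large numbers of short jobs might
               raise MaxJobCount while purging completed job records promptly
               (the values are illustrative):

                    MaxJobCount=10000
                    MinJobAge=300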

       MpiDefault
              Identifies  the  default  type  of  MPI  to  be  used.  Srun may
              override this configuration parameter in  any  case.   Currently
              supported  versions  include:  mpichgm,  mvapich, none (default,
              which works for many other versions of MPI including LAM MPI and
              Open MPI).

       PluginDir
              Identifies  the places in which to look for SLURM plugins.  This
              is  a  colon-separated  list  of  directories,  like  the   PATH
              environment      variable.      The     default     value     is
              "/usr/local/lib/slurm".

       PlugStackConfig
               Location of the config file for SLURM stackable plugins that use
               the Slurm Plug-in Architecture for Node and job (K)ontrol
               (SPANK).  This provides support for a highly configurable set of
              plugins  to be called before and/or after execution of each task
              spawned as part of a  user’s  job  step.   Default  location  is
              "plugstack.conf" in the same directory as the system slurm.conf.
              For more information on SPANK plugins, see the spank(8)  manual.

       ProctrackType
              Identifies  the  plugin  to  be  used for process tracking.  The
              slurmd daemon uses this  mechanism  to  identify  all  processes
              which  are  children of processes it spawns for a user job.  The
              slurmd daemon must be restarted for a change in ProctrackType to
              take  effect.   NOTE: "proctrack/linuxproc" and "proctrack/pgid"
              can fail to identify all processes associated with a  job  since
              processes  can  become  a  child  of  the init process (when the
              parent process terminates) or change their  process  group.   To
              reliably  track  all  processes,  one  of  the  other mechanisms
              utilizing   kernel   modifications   is    preferable.     NOTE:
              "proctrack/linuxproc"  is  not  compatible  with  "switch/elan."
              Acceptable values at present include:

               proctrack/aix which uses an AIX kernel extension and is
                     the default for AIX systems

               proctrack/linuxproc which uses the Linux process tree and
                      parent process IDs

              proctrack/rms which uses Quadrics kernel patch and is the
                     default if "SwitchType=switch/elan"

              proctrack/sgi_job which uses SGI’s Process Aggregates (PAGG)
                     kernel module, see http://oss.sgi.com/projects/pagg/  for
                     more information

              proctrack/pgid which uses process group IDs and is the
                     default for all other systems

        Prolog Fully qualified pathname of a script for the slurmd to execute
               whenever it is asked to run a job step from a new job
               allocation (e.g. "/usr/local/slurm/prolog").  The slurmd
              executes the script before starting the job step.  This  may  be
              used  to  purge files, enable user login, etc.  By default there
              is no prolog. Any configured  script  is  expected  to  complete
              execution quickly (in less time than MessageTimeout).

              NOTE:  The Prolog script is ONLY run on any individual node when
              it first sees a job step from a new allocation; it does not  run
              the Prolog immediately when an allocation is granted.  If no job
              steps from an allocation are run on a node, it  will  never  run
              the  Prolog for that allocation.  The Epilog, on the other hand,
              always runs on every node of an allocation when  the  allocation
              is released.

       PropagatePrioProcess
               Setting PropagatePrioProcess to "1" will cause a user’s job to
               run with the same priority (aka nice value) as the user’s
               process which launched the job on the submit node.  If set to
               "0", or left unset, the user’s job will inherit the scheduling
               priority from the slurm daemon.

       PropagateResourceLimits
               A list of comma separated resource limit names.  The slurmd
               daemon uses these names to obtain the associated (soft) limit
               values from the user’s process environment on the submit node.
               These limits are then propagated and applied to the jobs that
               will run on the compute nodes.  This parameter can be useful
               when system limits vary among nodes.  Any resource limits that
               do not appear in the list are not propagated.  However, the user
               can override this by specifying which resource limits to
               propagate with the srun command’s "--propagate" option.  If
               neither of the ’propagate resource limit’ parameters is
               specified, then the default action is to propagate all limits.
               Only one of the parameters, either PropagateResourceLimits or
               PropagateResourceLimitsExcept, may be specified.  The following
               limit names are supported by Slurm, although some options may
               not be supported on some systems (an example follows the list):

              ALL       All limits listed below

               AS        The maximum address space for a process

              CORE      The maximum size of core file

              CPU       The maximum amount of CPU time

              DATA      The maximum size of a process’s data segment

              FSIZE     The maximum size of files created

              MEMLOCK   The maximum size that may be locked into memory

              NOFILE    The maximum number of open files

              NPROC     The maximum number of processes available

              RSS       The maximum resident set size

              STACK     The maximum stack size
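
               For example, to propagate only the CPU time, open file count
               and stack size soft limits from the submit host (an
               illustrative selection):

                    PropagateResourceLimits=CPU,NOFILE,STACK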

       PropagateResourceLimitsExcept
              A list of comma separated resource limit names.  By default, all
              resource  limits  will  be  propagated,  (as  described  by  the
              PropagateResourceLimits   parameter),   except  for  the  limits
               appearing in this list.  The user can override this by
               specifying which resource limits to propagate with the srun
               command’s "--propagate" option.  See PropagateResourceLimits
              above for a list of valid limit names.

       ReturnToService
              If  set  to  1,  then  a  non-responding (DOWN) node will become
               available for use upon registration. Note that a DOWN node’s
               state will be changed only if it was set DOWN due to being
              non-responsive. If the node was set DOWN for  any  other  reason
              (low  memory,  prolog  failure, epilog failure, etc.), its state
              will not automatically be changed.   The  default  value  is  0,
              which  means  that  a node will remain in the DOWN state until a
              system administrator explicitly changes its state (even  if  the
              slurmd daemon registers and resumes communications).

       SchedulerPort
              The  port number on which slurmctld should listen for connection
              requests.  This value is only used by the  Maui  Scheduler  (see
              SchedulerType).  The default value is 7321.

       SchedulerRootFilter
              Identifies whether or not RootOnly partitions should be filtered
              from any external scheduling  activities.  If  set  to  0,  then
              RootOnly partitions are treated like any other partition. If set
              to 1, then RootOnly partitions  are  exempt  from  any  external
              scheduling  activities.  The  default value is 1. Currently only
              used by the built-in backfill scheduling module "sched/backfill"
              (see SchedulerType).

       SchedulerType
              Identifies  the  type of scheduler to be used. Acceptable values
              include  "sched/builtin"  for  the  built-in   FIFO   scheduler,
              "sched/backfill" for a backfill scheduling module to augment the
              default FIFO scheduling, "sched/hold" to hold all newly arriving
              jobs  if  a  file  "/etc/slurm.hold"  exists  otherwise  use the
              built-in FIFO scheduler, and "sched/wiki" for the Wiki interface
              to  the  Maui  Scheduler.  The default value is "sched/builtin".
              Backfill scheduling will initiate lower-priority jobs  if  doing
              so  does  not  delay  the expected initiation time of any higher
              priority job.  Note that this backfill scheduler  implementation
               is relatively simple. It does not support partitions configured
               to share resources (run multiple jobs on the same nodes) or
               support jobs requesting specific nodes.  When initially setting
               the value to "sched/wiki", any pending jobs must have their
               priority set to zero (held).  When changing the value from
               "sched/wiki", all pending jobs should have their priority
               changed from zero to some large number.  The scontrol command
               can be
              used to change job priorities.  The  slurmctld  daemon  must  be
              restarted for a change in scheduler type to become effective.
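
               For example, to augment the default FIFO ordering with backfill
               scheduling while leaving RootOnly partitions to an external
               scheduler (an illustrative combination):

                    SchedulerType=sched/backfill
                    SchedulerRootFilter=1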

       SelectType
              Identifies  the type of resource selection algorithm to be used.
              Acceptable values include

              select/linear
                      for allocation of entire nodes assuming a one-dimensional
                      array of nodes in which sequentially ordered nodes are
                     preferable.  This is the default value  for  non-BlueGene
                     systems.

              select/cons_res
                     The resources within a node are individually allocated as
                     consumable resources.   Note  that  whole  nodes  can  be
                     allocated  to  jobs  for selected partitions by using the
                     Shared=EXCLUSIVE  option.   See  the   partition   Shared
                     parameter for more information.

              select/bluegene
                      for a three-dimensional BlueGene system.  The default
                     value is "select/bluegene" for BlueGene systems.

       SelectTypeParameters
               This only applies to SelectType=select/cons_res.  An example
               follows the value list below.

               CR_CPU CPUs are consumable resources.  There is no notion of
                      sockets, cores or threads.  On a multi-core system, each
                      core will be considered a CPU.  On a multi-core and
                      hyperthreaded system, each thread will be considered a
                      CPU.  On single-core systems, each CPU will be
                      considered a CPU.

              CR_CPU_Memory
                     CPUs and memory are consumable resources.

              CR_Core
                     Cores are consumable resources.

              CR_Core_Memory
                     Cores and memory are consumable resources.

              CR_Socket
                     Sockets are consumable resources.

              CR_Socket_Memory
                      Sockets and memory are consumable resources.

              CR_Memory
                     Memory  is  a  consumable  resource.   NOTE: This implies
                     Shared=Yes for all partitions.
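
               For example, to allocate individual cores and treat memory as a
               consumable resource (an illustrative combination):

                    SelectType=select/cons_res
                    SelectTypeParameters=CR_Core_Memory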

       SlurmUser
              The name of the user that the slurmctld daemon executes as.  For
              security purposes, a user other than "root" is recommended.  The
              default value is "root".

       SlurmctldDebug
               The level of detail to provide in the slurmctld daemon’s logs.
               Values from 0 to 7 are legal, with ‘0’ being "quiet" operation
               and ‘7’ being insanely verbose.  The default value is 3.

       SlurmctldLogFile
              Fully qualified pathname of a  file  into  which  the  slurmctld
              daemon’s  logs are written.  The default value is none (performs
              logging via syslog).

       SlurmctldPidFile
              Fully qualified pathname of a file  into  which  the   slurmctld
              daemon  may write its process id. This may be used for automated
              signal     processing.       The      default      value      is
              "/var/run/slurmctld.pid".

       SlurmctldPort
              The port number that the SLURM controller, slurmctld, listens to
              for work. The default value is SLURMCTLD_PORT as established  at
              system  build  time. If none is explicitly specified, it will be
               set to 6817.  NOTE: Either the slurmctld and slurmd daemons
               must not execute on the same nodes, or the values of
               SlurmctldPort and SlurmdPort must be different.

       SlurmctldTimeout
              The interval, in seconds, that the backup controller  waits  for
              the  primary controller to respond before assuming control.  The
              default value is 120 seconds.  May not exceed 65533.

       SlurmdDebug
               The level of detail to provide in the slurmd daemon’s logs.
               Values from 0 to 7 are legal, with ‘0’ being "quiet" operation
               and ‘7’ being insanely verbose.  The default value is 3.

       SlurmdLogFile
              Fully qualified pathname  of  a  file  into  which  the   slurmd
              daemon’s  logs are written.  The default value is none (performs
              logging via syslog).  Any "%h" within the name is replaced  with
              the hostname on which the slurmd is running.

       SlurmdPidFile
              Fully qualified pathname of a file into which the  slurmd daemon
              may write its process id. This may be used for automated  signal
              processing.  The default value is "/var/run/slurmd.pid".

       SlurmdPort
              The  port  number  that  the  SLURM compute node daemon, slurmd,
              listens to  for  work.  The  default  value  is  SLURMD_PORT  as
              established   at  system  build  time.  If  none  is  explicitly
               specified, its value will be 6818.  NOTE: Either the slurmctld
               and slurmd daemons must not execute on the same nodes, or the
               values of SlurmctldPort and SlurmdPort must be different.

       SlurmdSpoolDir
              Fully qualified pathname of a directory into  which  the  slurmd
              daemon’s  state information and batch job script information are
              written. This must be a  common  pathname  for  all  nodes,  but
              should  represent  a  directory  which  is  local  to  each node
              (reference  a  local  file  system).  The   default   value   is
              "/var/spool/slurmd."  NOTE: This directory is also used to store
              slurmd’s shared memory  lockfile,  and  should  not  be  changed
              unless the system is being cleanly restarted. If the location of
              SlurmdSpoolDir is changed  and  slurmd  is  restarted,  the  new
              daemon  will attach to a different shared memory region and lose
              track of any running jobs.

       SlurmdTimeout
              The interval, in seconds, that the SLURM  controller  waits  for
              slurmd  to respond before configuring that node’s state to DOWN.
              The default value is 300 seconds.  A value of zero indicates the
              node  will  not  be  tested by slurmctld to confirm the state of
              slurmd, the node will not be automatically set to a  DOWN  state
              indicating  a  non-responsive  slurmd,  and some other tool will
              take responsibility for monitoring the  state  of  each  compute
              node and its slurmd daemon.  The value may not exceed 65533.

       StateSaveLocation
              Fully  qualified  pathname  of  a directory into which the SLURM
               controller, slurmctld, saves its state (e.g.
               "/usr/local/slurm/checkpoint").  SLURM state will be saved here
               to recover from system failures.  SlurmUser must be able to
               create
              files  in  this  directory.   If  you  have  a  BackupController
              configured, this location should be  readable  and  writable  by
              both  systems.   The  default  value  is  "/tmp".   If any slurm
              daemons terminate abnormally, their  core  files  will  also  be
              written into this directory.

       SrunEpilog
              Fully  qualified  pathname  of  an  executable to be run by srun
              following the completion  of  a  job  step.   The  command  line
              arguments  for  the executable will be the command and arguments
              of the job step.  This configuration parameter may be overridden
              by srun’s --epilog parameter.

       SrunProlog
              Fully  qualified  pathname  of  an  executable to be run by srun
              prior to the launch of a job step.  The command  line  arguments
              for  the executable will be the command and arguments of the job
              step.  This configuration parameter may be overridden by  srun’s
              --prolog parameter.

       SwitchType
              Identifies   the   type  of  switch  or  interconnect  used  for
              application   communications.    Acceptable    values    include
              "switch/none"  for switches not requiring special processing for
              job launch or termination (Myrinet, Ethernet,  and  InfiniBand),
              "switch/elan"  for  Quadrics Elan 3 or Elan 4 interconnect.  The
              default value is "switch/none".  All SLURM daemons, commands and
              running  jobs  must  be  restarted for a change in SwitchType to
              take effect.  If running jobs exist at  the  time  slurmctld  is
              restarted with a new value of SwitchType, records of all jobs in
              any state may be lost.

       TaskEpilog
               Fully qualified pathname of a program to be executed as the slurm
              job’s  owner after termination of each task.  See TaskPlugin for
              execution order details.

       TaskPlugin
              Identifies the type of task launch  plugin,  typically  used  to
              provide resource management within a node (e.g. pinning tasks to
              specific processors).  Acceptable values include "task/none" for
              systems  requiring  no  special  handling and "task/affinity" to
              enable the  --cpu_bind  and/or  --mem_bind  srun  options.   The
               default value is "task/none".  If you use "task/affinity" and
              encounter problems, it may be due to the variety of system calls
              used  to implement task affinity on different operating systems.
              If that is the case, you may want to use Portable Linux  Process
              Affinity   (PLPA,   see  http://www.open-mpi.org/software/plpa),
               which is supported by SLURM.  The order of task prolog/epilog
               execution is as follows (an example configuration follows the
               list):

              1. pre_launch(): function in TaskPlugin

              2.   TaskProlog:   system-wide   per  task  program  defined  in
              slurm.conf

              3. user prolog: job step specific task program defined using
                     srun’s    --task-prolog   option   or   SLURM_TASK_PROLOG
                     environment variable

              4. Execute the job step’s task

              5. user epilog: job step specific task program defined using
                     srun’s   --task-epilog   option   or    SLURM_TASK_EPILOG
                     environment variable

              6.   TaskEpilog:   system-wide   per  task  program  defined  in
              slurm.conf

              7. post_term(): function in TaskPlugin
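
               For example, a site-wide task prolog and epilog might be
               configured as follows (the paths are illustrative):

                    TaskPlugin=task/affinity
                    TaskProlog=/usr/local/slurm/task_prolog
                    TaskEpilog=/usr/local/slurm/task_epilog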

       TaskPluginParam
              Optional parameters for the task plugin.

              Cpusets   Use cpusets to perform task affinity functions

               Sched     Use sched_setaffinity or plpa_sched_setaffinity (if
                         available) to bind tasks to processors.  This is the
                         default mode of operation if no parameters are
                         specified.

       TaskProlog
               Fully qualified pathname of a program to be executed as the slurm
              job’s  owner  prior  to  initiation  of  each task.  Besides the
              normal environment variables, this has SLURM_TASK_PID  available
              to  identify the process ID of the task being started.  Standard
              output from this program of the form "export NAME=value" will be
              used  to  set  environment variables for the task being spawned.
              See TaskPlugin for execution order details.

       TmpFS  Fully qualified pathname of the file system  available  to  user
              jobs   for   temporary   storage.  This  parameter  is  used  in
              establishing a node’s  TmpDisk  space.   The  default  value  is
              "/tmp".

       TreeWidth
              Slurmd  daemons  use  a virtual tree network for communications.
              TreeWidth specifies the width of the  tree  (i.e.  the  fanout).
              The  default  value  is  50,  meaning  each  slurmd  daemon  can
              communicate with up to 50 other slurmd  daemons  and  over  2500
              nodes can be contacted with two message hops.  The default value
               will work well for most clusters.  Optimal system performance
              can typically be achieved if TreeWidth is set to the square root
              of the number of nodes in the cluster for systems having no more
              than 2500 nodes or the cube root for larger systems.
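
               For example, for a cluster of roughly 1000 nodes the square
               root rule suggests a value near 32 (an illustrative setting):

                    TreeWidth=32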

       UnkillableStepProgram
              If  the  processes in a job step are determined to be unkillable
              for a period of  time  specified  by  the  UnkillableStepTimeout
              variable,  the  program  specified  by the UnkillableStepProgram
              string will be executed.  This  program  can  be  used  to  take
              special  actions  to  clean  up  the  unkillable processes.  The
              program will be run as the same  user  as  the  slurmd  (usually
              "root").   NOTE:  This variable does not appear in the output of
              the command "scontrol show config" in  versions  of  SLURM  less
              than version 1.3.

       UnkillableStepTimeout
              The  length  of  time,  in  seconds, that SLURM will wait before
              deciding that processes in a job step are unkillable (after they
              have been signalled with SIGKILL).  The default timeout value is
              60 seconds.  NOTE: This variable does not appear in  the  output
              of  the command "scontrol show config" in versions of SLURM less
              than version 1.3.

       UsePAM If set to 1, PAM (Pluggable Authentication  Modules  for  Linux)
              will  be enabled.  PAM is used to establish the upper bounds for
              resource  limits.  With  PAM  support  enabled,   local   system
              administrators can dynamically configure system resource limits.
              Changing the upper bound of a resource limit will not alter  the
              limits  of  running  jobs,  only jobs started after a change has
              been made will pick up the new limits.  The default value  is  0
              (not to enable PAM support).  Remember that PAM also needs to be
              configured to support SLURM as a service.  For sites using PAM’s
              directory based configuration option, a configuration file named
              slurm should be created.  The  module-type,  control-flags,  and
              module-path names that should be included in the file are:
              auth        required      pam_localuser.so
              auth        required      pam_shells.so
              account     required      pam_unix.so
              account     required      pam_access.so
              session     required      pam_unix.so
              For sites configuring PAM with a general configuration file, the
              appropriate lines (see above), where slurm is the  service-name,
              should be added.

       WaitTime
              Specifies  how  many  seconds the srun command should by default
              wait after the first  task  terminates  before  terminating  all
              remaining  tasks.  The  "--wait" option on the srun command line
              overrides this value.  If set to 0, this  feature  is  disabled.
              May not exceed 65533.

       The configuration of nodes (or machines) to be managed by Slurm is also
       specified in /etc/slurm.conf.  Only the NodeName must  be  supplied  in
       the  configuration  file.   All other node configuration information is
       optional.  It is advisable to establish baseline  node  configurations,
       especially  if  the  cluster is heterogeneous.  Nodes which register to
       the system with less than the configured  resources  (e.g.  too  little
       memory), will be placed in the "DOWN" state to avoid scheduling jobs on
       them.  Establishing baseline configurations  will  also  speed  SLURM’s
       scheduling process by permitting it to compare job requirements against
       these (relatively few)  configuration  parameters  and  possibly  avoid
       having  to  check  job  requirements  against  every  individual node’s
       configuration.  The resources checked at node  registration  time  are:
       Procs, RealMemory and TmpDisk.  While baseline values for each of these
       can be established in the configuration file, the  actual  values  upon
       node  registration are recorded and these actual values may be used for
       scheduling purposes (depending upon the value of  FastSchedule  in  the
        configuration file).

       Default  values  can  be specified with a record in which "NodeName" is
       "DEFAULT".  The default entry values will apply only to lines following
       it  in  the  configuration  file  and  the  default values can be reset
       multiple times in the configuration file with  multiple  entries  where
       "NodeName=DEFAULT".   The  "NodeName="  specification must be placed on
       every line describing the configuration  of  nodes.   In  fact,  it  is
       generally  possible  and  desirable to define the configurations of all
       nodes in  only  a  few  lines.   This  convention  permits  significant
       optimization in the scheduling of larger clusters.  In order to support
       the concept of jobs requiring consecutive nodes on some  architectures,
        node specifications should be placed in this file in consecutive order.
       No single node name may be listed more than once in  the  configuration
       file.   Use  "DownNodes="  to  record  the  state  of  nodes  which are
       temporarily in  a  DOWN  or  DRAIN  state  without  altering  permanent
       configuration  information.   A job step’s tasks are allocated to nodes
        in the order the nodes appear in the configuration file. There is
        presently no capability within SLURM to arbitrarily order a job step’s
        tasks.

       Multiple  node  names  may be comma separated (e.g. "alpha,beta,gamma")
       and/or a simple node range expression may optionally be used to specify
       numeric  ranges  of  nodes  to avoid building a configuration file with
       large numbers of entries.  The node range expression  can  contain  one
       pair  of  square  brackets  with  a sequence of comma separated numbers
       and/or ranges of numbers separated by a "-" (e.g. "linux[0-64,128]", or
       "lx[15,18,32-33]").   Note  that  the numeric ranges can include one or
       more leading zeros to indicate the numeric portion has a  fixed  number
       of digits (e.g. "linux[0000-1023]").

       On  BlueGene  systems only, the square brackets should contain pairs of
        three digit numbers separated by an "x".  These numbers indicate the
       boundaries  of  a rectangular prism (e.g. "bgl[000x144,400x544]").  See
       BlueGene documentation for more details.  Presently the  numeric  range
       must be the last characters in the node name (e.g. "unit[0-31]rack1" is
        invalid).  The node configuration specifies the following information:

       NodeName
              Name that SLURM uses to refer to a node (or base  partition  for
               BlueGene systems).  Typically this would be the string that
               "/bin/hostname -s" returns, however it may be an arbitrary
               string
              if NodeHostname is specified.  If the NodeName is "DEFAULT", the
              values specified with that record will apply to subsequent  node
              specifications  unless  explicitly  set  to other values in that
              node record or replaced with a different set of default  values.
              For  architectures in which the node order is significant, nodes
              will be  considered  consecutive  in  the  order  defined.   For
              example, if the configuration for "NodeName=charlie" immediately
              follows the configuration  for  "NodeName=baker"  they  will  be
              considered adjacent in the computer.

       NodeHostname
              The  string  that  "/bin/hostname  -s"  returns.   A  node range
              expression can be used  to  specify  a  set  of  nodes.   If  an
              expression   is   used,   the  number  of  nodes  identified  by
              NodeHostname on  a  line  in  the  configuration  file  must  be
              identical  to  the  number  of nodes identified by NodeName.  By
              default,  the  NodeHostname  will  be  identical  in  value   to
              NodeName.

       NodeAddr
               Name by which a node should be referred to when establishing a
               communications path.  This name will be used as an argument to
               the gethostbyname() function for identification.  If a node
               range expression is used to designate multiple nodes, they must
               exactly match the entries in the NodeName (e.g.
               "NodeName=lx[0-7] NodeAddr=elx[0-7]").  NodeAddr may also
              contain   IP  addresses.   By  default,  the  NodeAddr  will  be
              identical in value to NodeName.

       Feature
              A comma delimited list of arbitrary strings indicative  of  some
               characteristic associated with the node.  There is no value
               associated with a feature at this time; a node either has a
               feature or it does not.  If desired, a feature may contain a
               numeric component indicating, for example, processor speed.  By
              default a node has no features.

       RealMemory
              Size of real memory on the node in MegaBytes (e.g. "2048").  The
              default value is 1.

       Procs  Number of logical processors on the node (e.g. "2").   If  Procs
              is  omitted,  it  will be inferred from Sockets, CoresPerSocket,
              and ThreadsPerCore.  The default value is 1.

       Sockets
              Number of physical processor sockets/chips  on  the  node  (e.g.
              "2").   If  Sockets  is omitted, it will be inferred from Procs,
              CoresPerSocket,  and  ThreadsPerCore.    NOTE:   If   you   have
              multi-core  processors,  you  will  likely need to specify these
              parameters.  The default value is 1.

       CoresPerSocket
              Number of cores in a  single  physical  processor  socket  (e.g.
              "2").   The  CoresPerSocket  value describes physical cores, not
              the logical number of processors per socket.  NOTE: If you  have
              multi-core  processors,  you  will  likely  need to specify this
              parameter.  The default value is 1.

       ThreadsPerCore
              Number of logical threads in a single physical core (e.g.  "2").
              The default value is 1.
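
        For example, a group of nodes with two quad-core sockets each and no
        hyper-threading might be described as follows (the node names are
        illustrative); Procs is then inferred as 2 x 4 x 1 = 8:

               NodeName=lx[0-31] Sockets=2 CoresPerSocket=4 ThreadsPerCore=1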

       Reason Identifies  the  reason  for  a  node  being  in state "DOWN" or
              "DRAIN".  Use quotes to enclose a reason having  more  than  one
              word.

       State  State  of  the node with respect to the initiation of user jobs.
              Acceptable values are "DOWN",  "DRAIN"  and  "UNKNOWN".   "DOWN"
              indicates  the  node  failed  and is unavailable to be allocated
              work.  "DRAIN" indicates the node is unavailable to be allocated
              work.   "UNKNOWN"  indicates the node’s state is undefined (BUSY
              or IDLE), but will be established when the slurmd daemon on that
              node  registers.   The default value is "UNKNOWN".  Also see the
               DownNodes parameter below.

       TmpDisk
              Total size of temporary disk storage in TmpFS in MegaBytes (e.g.
              "16384").  TmpFS  (for  "Temporary  File System") identifies the
              location which jobs should use for temporary storage.  Note this
              does not indicate the amount of free space available to the user
               on the node, only the total file system size.  The system
               administrator should insure this file system is purged as
               needed so that user jobs have access to most of this space.  The
              Prolog  and/or  Epilog  programs (specified in the configuration
              file) might be used to insure the file  system  is  kept  clean.
              The default value is 1.

       Weight The  priority  of  the node for scheduling purposes.  All things
              being equal, jobs will be allocated the nodes  with  the  lowest
              weight  which  satisfies  their  requirements.   For  example, a
              heterogeneous collection of nodes might be placed into a  single
              partition  for  greater  system  utilization, responsiveness and
              capability. It would be preferable to  allocate  smaller  memory
              nodes  rather  than larger memory nodes if either will satisfy a
              job’s requirements.  The units  of  weight  are  arbitrary,  but
              larger weights should be assigned to nodes with more processors,
              memory, disk space, higher processor speed, etc.  Weight  is  an
              integer value with a default value of 1.

       The  "DownNodes=" configuration permits you to mark certain nodes as in
       a DOWN or DRAIN state  without  altering  the  permanent  configuration
       information listed under a "NodeName=" specification.

       DownNodes
              Any  node  name,  or  list  of  node names, from the "NodeName="
              specifications.

       Reason Identifies the reason for  a  node  being  in  state  "DOWN"  or
              "DRAIN".   Use  quotes  to enclose a reason having more than one
              word.

       State  State of the node with respect to the initiation of  user  jobs.
              Acceptable  values  are  "DOWN",  "DRAIN" and "UNKNOWN".  "DOWN"
              indicates the node failed and is  unavailable  to  be  allocated
              work.  "DRAIN" indicates the node is unavailable to be allocated
              work.  "UNKNOWN" indicates the node’s state is  undefined  (BUSY
              or IDLE), but will be established when the slurmd daemon on that
              node registers.  The default value is "UNKNOWN".

       The partition configuration permits  you  to  establish  different  job
       limits  or access controls for various groups (or partitions) of nodes.
       Nodes may be in more than one partition,  making  partitions  serve  as
       general  purpose queues.  For example one may put the same set of nodes
       into two different partitions, each with  different  constraints  (time
       limit, job sizes, groups allowed to use the partition, etc.).  Jobs are
       allocated resources within a single partition.  Default values  can  be
       specified  with  a  record  in which "PartitionName" is "DEFAULT".  The
       default entry values will apply only  to  lines  following  it  in  the
       configuration  file  and the default values can be reset multiple times
       in   the   configuration   file    with    multiple    entries    where
       "PartitionName=DEFAULT".   The  "PartitionName="  specification must be
       placed on every line describing the configuration of partitions.  NOTE:
       Put  all  parameters for each partition on a single line.  Each line of
       partition  configuration  information  should  represent  a   different
       partition.   The  partition  configuration  file contains the following
       information:

       AllowGroups
              Comma separated list of group IDs which may execute jobs in  the
              partition.   If  at  least  one  group  associated with the user
              attempting to execute the job is  in  AllowGroups,  he  will  be
              permitted to use this partition.  Jobs executed as user root can
              use any partition without regard to the  value  of  AllowGroups.
              If  user  root  attempts  to execute a job as another user (e.g.
              using srun’s --uid option), this other user must be  in  one  of
               groups identified by AllowGroups for the job to successfully
              execute.  The default value is "ALL".
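
               For example, the same set of nodes can appear in two
               partitions, one the default with a short time limit and one
               restricted to a particular group (the partition names, node
               names and values are illustrative):

                    PartitionName=debug Nodes=lx[0-7] MaxTime=30 Default=YES
                    PartitionName=long Nodes=lx[0-7] AllowGroups=admin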

       Default
              If this keyword is  set,  jobs  submitted  without  a  partition
              specification  will utilize this partition.  Possible values are
              "YES" and "NO".  The default value is "NO".

       Hidden Specifies if the partition and its jobs  are  to  be  hidden  by
              default.   Hidden  partitions will by default not be reported by
              the SLURM APIs or commands.  Possible values are "YES" and "NO".
              The default value is "NO".

       RootOnly
              Specifies  if  only  user  ID zero (i.e. user root) may allocate
              resources in this partition. User root  may  allocate  resources
              for  any  other  user, but the request must be initiated by user
              root.  This option can be useful for a partition to  be  managed
              by  some  external  entity (e.g. a higher-level job manager) and
              prevents users from directly using  those  resources.   Possible
              values are "YES" and "NO".  The default value is "NO".

       MaxNodes
              Maximum count of nodes (or base partitions for BlueGene systems)
              which may be allocated to any single job.  The default value  is
              "UNLIMITED",  which is represented internally as -1.  This limit
              does not apply to jobs executed by SlurmUser or user root.

       MaxTime
              Maximum wall-time limit for any  job  in  minutes.  The  default
              value  is  "UNLIMITED",  which  is represented internally as -1.
              This limit does not apply to jobs executed by SlurmUser or  user
              root.

       MinNodes
              Minimum count of nodes (or base partitions for BlueGene systems)
              which may be allocated to any single job.  The default value  is
              1.   This  limit does not apply to jobs executed by SlurmUser or
              user root.

       Nodes  Comma separated list of nodes (or base partitions  for  BlueGene
              systems)  which  are associated with this partition.  Node names
              may  be  specified  using  the  node  range  expression   syntax
              described  above.  A blank list of nodes (i.e. "Nodes= ") can be
              used if one wants a partition to exist, but  have  no  resources
              (possibly on a temporary basis).

       PartitionName
              Name   by   which   the   partition   may  be  referenced  (e.g.
              "Interactive").  This  name  can  be  specified  by  users  when
              submitting  jobs.  If the PartitionName is "DEFAULT", the values
              specified with that record will apply  to  subsequent  partition
              specifications  unless  explicitly  set  to other values in that
              partition record or replaced with a  different  set  of  default
              values.

       Shared Ability  of the partition to execute more than one job at a time
              on each node. Shared nodes will offer unpredictable  performance
              for   application   programs,  but  can  provide  higher  system
              utilization  and   responsiveness   than   otherwise   possible.
              Possible  values  are  "EXCLUSIVE",  "FORCE",  "YES",  and "NO".
              "EXCLUSIVE"  allocates  entire   nodes   to   jobs   even   with
              select/cons_res  configured.  This can be used to allocate whole
              nodes in some partitions  and  individual  processors  in  other
              partitions.   "FORCE" makes all nodes in the partition available
               for sharing, with no means for users to disable it.  "YES" makes
              nodes  in the partition available for sharing if and only if the
              individual jobs permit sharing (see the srun "--share"  option).
              "NO"   makes   nodes   unavailable   for   sharing   under   all
              circumstances.  The default value is "NO".
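
               For example, with SelectType=select/cons_res configured, one
               partition can be made to hand out whole nodes while another
               continues to allocate individual processors (the partition and
               node names are illustrative):

                    PartitionName=whole  Nodes=lx[0-15]  Shared=EXCLUSIVE
                    PartitionName=shared Nodes=lx[16-31] Shared=YES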

       State  State of partition or availability for use.  Possible values are
              "UP" or "DOWN". The default value is "UP".

RELOCATING CONTROLLERS

       If  the  cluster’s  computers used for the primary or backup controller
       will be out of service for an  extended  period  of  time,  it  may  be
       desirable to relocate them.  In order to do so, follow this procedure:

       1. Stop the SLURM daemons
       2. Modify the slurm.conf file appropriately
       3. Distribute the updated slurm.conf file to all nodes
       4. Restart the SLURM daemons

       There  should  be  no loss of any running or pending jobs.  Insure that
       any nodes added  to  the  cluster  have  the  current  slurm.conf  file
       installed.

       CAUTION:  If  two  nodes  are  simultaneously configured as the primary
        controller (two nodes on which ControlMachine specifies the local host
       and the slurmctld daemon is executing on each), system behavior will be
       destructive.  If a compute node  has  an  incorrect  ControlMachine  or
       BackupController  parameter, that node may be rendered unusable, but no
       other harm will result.

EXAMPLE

       #
       # Sample /etc/slurm.conf for dev[0-25].llnl.gov
       # Author: John Doe
       # Date: 11/06/2001
       #
       ControlMachine=dev0
       ControlAddr=edev0
       BackupController=dev1
       BackupAddr=edev1
       #
       AuthType=auth/authd
       Epilog=/usr/local/slurm/epilog
       Prolog=/usr/local/slurm/prolog
       FastSchedule=1
       FirstJobId=65536
       HeartbeatInterval=60
       InactiveLimit=120
       JobCompType=jobcomp/filetxt
       JobCompLoc=/var/log/slurm.job.log
       KillWait=30
       MaxJobCount=10000
       MinJobAge=3600
       PluginDir=/usr/local/lib:/usr/local/slurm/lib
       ReturnToService=0
       SchedulerType=sched/wiki
       SchedulerPort=7004
       SlurmctldLogFile=/var/log/slurmctld.log
       SlurmdLogFile=/var/log/slurmd.log
       SlurmctldPort=7002
       SlurmdPort=7003
       SlurmdSpoolDir=/usr/local/slurm/slurmd.spool
       StateSaveLocation=/usr/local/slurm/slurm.state
       SwitchType=switch/elan
       TmpFS=/tmp
       WaitTime=30
       JobCredentialPrivateKey=/usr/local/slurm/private.key
       JobCredentialPublicCertificate=/usr/local/slurm/public.cert
       JobAcctType=jobacct/linux
       JobAcctLogFile=/var/log/slurm_accounting.log
       JobAcctParameters="Frequency=30,MaxSendRetries=5"
       #
       # Node Configurations
       #
       NodeName=DEFAULT Procs=2 RealMemory=2000 TmpDisk=64000
       NodeName=DEFAULT State=UNKNOWN
       NodeName=dev[0-25] NodeAddr=edev[0-25] Weight=16
       # Update records for specific DOWN nodes
       DownNodes=dev20 State=DOWN Reason="power,ETA=Dec25"
       #
       # Partition Configurations
       #
       PartitionName=DEFAULT MaxTime=30 MaxNodes=10 State=UP
       PartitionName=debug Nodes=dev[0-8,18-25] Default=YES
       PartitionName=batch Nodes=dev[9-17]  MinNodes=4
       PartitionName=long Nodes=dev[9-17] MaxTime=120 AllowGroups=admin

COPYING

       Copyright (C) 2002-2007 The Regents of the  University  of  California.
       Produced  at  Lawrence  Livermore National Laboratory (cf, DISCLAIMER).
       UCRL-CODE-226842.

       This file is  part  of  SLURM,  a  resource  management  program.   For
       details, see <http://www.llnl.gov/linux/slurm/>.

       SLURM  is free software; you can redistribute it and/or modify it under
       the terms of the GNU General Public License as published  by  the  Free
       Software  Foundation;  either  version  2  of  the License, or (at your
       option) any later version.

       SLURM is distributed in the hope that it will be  useful,  but  WITHOUT
       ANY  WARRANTY;  without even the implied warranty of MERCHANTABILITY or
       FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General  Public  License
       for more details.

FILES

       /etc/slurm.conf

SEE ALSO

       bluegene.conf(5),     getrlimit(2),     gethostbyname(3),     group(5),
       hostname(1), scontrol(1), slurmctld(8), slurmd(8), spank(8), syslog(2),
       wiki.conf(5)