Provided by: slurm-llnl_2.1.0-1_i386

NAME

       slurm.conf - Slurm configuration file

DESCRIPTION

       /etc/slurm.conf   is  an  ASCII  file  which  describes  general  SLURM
       configuration information, the nodes to be managed,  information  about
       how  those  nodes  are  grouped into partitions, and various scheduling
       parameters associated  with  those  partitions.  This  file  should  be
       consistent across all nodes in the cluster.

       The  file  location  can  be  modified  at  system build time using the
       DEFAULT_SLURM_CONF parameter. In addition, you can use  the  SLURM_CONF
       environment  variable  to  override the built-in location of this file.
       The SLURM daemons also allow you to  override  both  the  built-in  and
       environment-provided  location  using  the  "-f"  option on the command
       line.

       Note that while SLURM daemons create log files and other files as
       needed, they treat the lack of parent directories as a fatal error.
       This prevents the daemons from running if critical file systems are
       not mounted and minimizes the risk of cold-starting (starting without
       preserving jobs).

       The contents of the file are case insensitive except for the  names  of
       nodes  and  partitions.  Any  text following a "#" in the configuration
       file is treated as a comment through the end of that line.  The size of
       each  line  in  the file is limited to 1024 characters.  Changes to the
       configuration file take effect upon restart of  SLURM  daemons,  daemon
       receipt  of  the  SIGHUP  signal, or execution of the command "scontrol
       reconfigure" unless otherwise noted.

       If a line begins with the word "Include"  followed  by  whitespace  and
       then  a  file  name, that file will be included inline with the current
       configuration file.
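
       For example, assuming node definitions are kept in a separate file at
       a hypothetical path "/etc/slurm.nodes", the directive might read:

           Include /etc/slurm.nodes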

       The overall configuration parameters available include:

       AccountingStorageBackupHost
              The name of the backup machine hosting  the  accounting  storage
              database.   If used with the accounting_storage/slurmdbd plugin,
              this is where the backup slurmdbd would be running.   Only  used
              for database type storage plugins, ignored otherwise.

       AccountingStorageEnforce
              This controls what level of enforcement you want on associations
              when new jobs are submitted.  Valid options are any  combination
              of  associations, limits, and wckeys, or all for all things.  If
              limits is set associations is implied.  If wckeys  is  set  both
              limits  and associations are implied along with TrackWckey being
              set.  By enforcing Associations no new job  is  allowed  to  run
              unless  a  corresponding  association  exists in the system.  If
              limits are enforced users can be limited by association  to  how
              many  nodes  or  how  long  jobs  can run or other limits.  With
              wckeys enforced jobs  will  not  be  scheduled  unless  a  valid
              workload  characterization key is specified.  This value may not
              be reset via "scontrol reconfig".  It  only  takes  effect  upon
              restart of the slurmctld daemon.
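
               For example, to require a valid association for every job
               and enforce the limits attached to it (a minimal sketch),
               one might set:

                   AccountingStorageEnforce=associations,limits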

       AccountingStorageHost
              The name of the machine hosting the accounting storage database.
              Only used for database type storage plugins, ignored  otherwise.
              Also see DefaultStorageHost.

       AccountingStorageLoc
              The  fully  qualified  file  name  where  accounting records are
              written      when       the       AccountingStorageType       is
              "accounting_storage/filetxt"  or  else  the name of the database
              where    accounting    records    are    stored     when     the
              AccountingStorageType     is     a     database.     Also    see
              DefaultStorageLoc.

       AccountingStoragePass
              The password used to gain access to the database  to  store  the
              accounting  data.   Only used for database type storage plugins,
              ignored otherwise.  In the case of SLURM DBD  (Database  Daemon)
              with  Munge authentication this can be configured to use a Munge
              daemon specifically configured to provide authentication between
              clusters  while the default Munge daemon provides authentication
              within a cluster.  In that  case,  AccountingStoragePass  should
              specify  the  named  port to be used for communications with the
              alternate Munge daemon (e.g.  "/var/run/munge/global.socket.2").
              The default value is NULL.  Also see DefaultStoragePass.

       AccountingStoragePort
              The  listening  port  of the accounting storage database server.
              Only used for database type storage plugins, ignored  otherwise.
              Also see DefaultStoragePort.

       AccountingStorageType
              The  accounting  storage  mechanism  type.  Acceptable values at
              present          include           "accounting_storage/filetxt",
              "accounting_storage/mysql",           "accounting_storage/none",
              "accounting_storage/pgsql",  and  "accounting_storage/slurmdbd".
              The "accounting_storage/filetxt" value indicates that accounting
              records  will  be  written  to  the  file   specified   by   the
              AccountingStorageLoc  parameter.  The "accounting_storage/mysql"
              value indicates that accounting records will  be  written  to  a
              MySQL  database specified by the AccountingStorageLoc parameter.
              The "accounting_storage/pgsql" value indicates  that  accounting
              records  will  be  written to a PostgreSQL database specified by
              the         AccountingStorageLoc         parameter.          The
              "accounting_storage/slurmdbd"  value  indicates  that accounting
              records will be written to  the  SLURM  DBD,  which  manages  an
              underlying  MySQL or PostgreSQL database. See "man slurmdbd" for
              more      information.       The      default      value      is
              "accounting_storage/none" and indicates that account records are
              not maintained.  Note: the PostgreSQL plugin is not complete and
               should not be used if you want to use associations.  It will
              however work with basic accounting of jobs and  job  steps.   If
              interested in completing, please email slurm-dev@lists.llnl.gov.
              Also see DefaultStorageType.

       AccountingStorageUser
              The user account for accessing the accounting storage  database.
              Only  used for database type storage plugins, ignored otherwise.
              Also see DefaultStorageUser.
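
               As an illustration, a cluster storing accounting data
               through the SLURM DBD, with slurmdbd daemons on hypothetical
               hosts "dbhost" and "dbhost2", might use:

                   AccountingStorageType=accounting_storage/slurmdbd
                   AccountingStorageHost=dbhost
                   AccountingStorageBackupHost=dbhost2
                   AccountingStorageEnforce=associations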

       AuthType
              The  authentication  method  for  communications  between  SLURM
              components.   Acceptable  values at present include "auth/none",
              "auth/authd",  and   "auth/munge".    The   default   value   is
              "auth/munge".     "auth/none"   includes   the   UID   in   each
              communication, but it is not verified.  This  may  be  fine  for
              testing  purposes,  but do not use "auth/none" if you desire any
              security.  "auth/authd" indicates that Brett Chun’s authd is  to
              be   used   (see   "http://www.theether.org/authd/"   for   more
              information. Note that authd is no longer  actively  supported).
              "auth/munge"  indicates that LLNL’s MUNGE is to be used (this is
              the best  supported  authentication  mechanism  for  SLURM,  see
              "http://home.gna.org/munge/"  for  more information).  All SLURM
              daemons and commands must be terminated prior  to  changing  the
              value  of  AuthType  and  later  restarted  (SLURM  jobs  can be
              preserved).

       BackupAddr
               The name by which BackupController should be referred to in
              establishing a communications path. This name will be used as an
              argument to the gethostbyname() function for identification. For
              example,  "elx0000"  might  be  used  to  designate the Ethernet
              address for node "lx0000".  By default the  BackupAddr  will  be
              identical in value to BackupController.

       BackupController
              The  name of the machine where SLURM control functions are to be
              executed in the event that ControlMachine fails. This  node  may
              also  be  used  as  a compute server if so desired. It will come
              into  service  as  a  controller  only  upon  the   failure   of
              ControlMachine  and  will  revert  to  a "standby" mode when the
              ControlMachine becomes available once again.  This should  be  a
              node  name  without  the  full domain name.   I.e., the hostname
              returned by the gethostname() function  cut  at  the  first  dot
              (e.g.  use  "tux001"  rather  than  "tux001.my.com").  While not
              essential,  it  is  recommended  that  you  specify   a   backup
              controller.   See   the  RELOCATING  CONTROLLERS  section if you
              change this.

       BatchStartTimeout
               The maximum time (in seconds) that a batch job is permitted
               to take launching before being considered missing and its
               allocation released.  The default value is 10 (seconds).
               Larger values may
              be required if more time is required to execute the Prolog, load
              user environment variables (for Moab spawned jobs),  or  if  the
              slurmd daemon gets paged from memory.

       CacheGroups
              If  set to 1, the slurmd daemon will  cache /etc/groups entries.
              This can improve performance for highly  parallel  jobs  if  NIS
              servers  are  used  and  unable  to  respond  very quickly.  The
              default value is 0 to disable caching group data.

       CheckpointType
              The system-initiated checkpoint method to be used for user jobs.
              The   slurmctld  daemon  must  be  restarted  for  a  change  in
              CheckpointType  to  take  effect.   Supported  values  presently
              include:

              checkpoint/aix    for AIX systems only

              checkpoint/blcr   Berkeley Lab Checkpoint Restart (BLCR)

              checkpoint/none   no checkpoint support (default)

              checkpoint/ompi   OpenMPI (version 1.3 or higher)

              checkpoint/xlch   XLCH (requires that SlurmUser be root)

       ClusterName
              The  name  by  which  this SLURM managed cluster is known in the
               accounting database.  This is needed to distinguish accounting
              records when multiple clusters report to the same database.

       CompleteWait
              The  time,  in  seconds, given for a job to remain in COMPLETING
              state before any additional jobs are scheduled.  If set to zero,
              pending  jobs  will  be  started  as  soon as possible.  Since a
              COMPLETING job’s resources are released for use by other jobs as
              soon  as  the Epilog completes on each individual node, this can
              result in very fragmented resource allocations.  To provide jobs
              with  the  minimum response time, a value of zero is recommended
              (no waiting).  To minimize fragmentation of resources,  a  value
              equal  to  KillWait  plus  two  is  recommended.   In that case,
              setting KillWait to  a  small  value  may  be  beneficial.   The
              default  value  of  CompleteWait is zero seconds.  The value may
              not exceed 65533.

       ControlAddr
               The name by which ControlMachine should be referred to in
               establishing a communications path. This name will be used
               as an argument to
              the gethostbyname() function for  identification.  For  example,
              "elx0000"  might  be  used to designate the Ethernet address for
              node "lx0000".  By default the ControlAddr will be identical  in
              value to ControlMachine.

       ControlMachine
              The  short hostname of the machine where SLURM control functions
              are executed (i.e. the name returned by  the  command  "hostname
              -s", use "tux001" rather than "tux001.my.com").  This value must
              be specified.   In  order  to  support  some  high  availability
              architectures,  multiple  hostnames  may  be  listed  with comma
              separators and one  ControlAddr  must  be  specified.  The  high
              availability  system  must  insure  that the slurmctld daemon is
              running on  only  one  of  these  hosts  at  a  time.   See  the
              RELOCATING CONTROLLERS section if you change this.
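
               For example, with a primary controller on a hypothetical
               host "tux-ctl" reached over a management network alias
               "etux-ctl", and a backup controller on "tux-bkup", the
               controller entries might read:

                   ControlMachine=tux-ctl
                   ControlAddr=etux-ctl
                   BackupController=tux-bkup
                   BackupAddr=etux-bkup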

       CryptoType
              The  cryptographic  signature tool to be used in the creation of
              job step credentials.  The slurmctld daemon  must  be  restarted
              for a change in CryptoType to take effect.  Acceptable values at
              present  include  "crypto/munge"  and   "crypto/openssl".    The
              default value is "crypto/munge".

       DebugFlags
              Defines  specific  subsystems which should provide more detailed
              event logging.  Multiple subsystems can be specified with  comma
              separators.   Valid  subsystems  available  today  (with more to
              come) include:

              CPU_Bind       CPU binding details for jobs and steps

              Steps          Slurmctld resource allocation for job steps

              Triggers       Slurmctld triggers

              Wiki           Sched/wiki and wiki2 communications
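
               For example, to log extra detail for both CPU binding and
               job step resource allocations (a minimal sketch):

                   DebugFlags=CPU_Bind,Steps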

       DefMemPerCPU
              Default  real  memory  size  available  per  allocated  CPU   in
              MegaBytes.   Used  to  avoid over-subscribing memory and causing
              paging.  DefMemPerCPU would  generally  be  used  if  individual
              processors  are  allocated to jobs (SelectType=select/cons_res).
              The default value is 0 (unlimited).  Also see DefMemPerNode  and
              MaxMemPerCPU.    DefMemPerCPU  and  DefMemPerNode  are  mutually
              exclusive.   NOTE:  Enforcement  of  memory   limits   currently
              requires  enabling  of accounting, which samples memory use on a
              periodic basis (data need not be stored, just collected).

       DefMemPerNode
              Default  real  memory  size  available  per  allocated  node  in
              MegaBytes.   Used  to  avoid over-subscribing memory and causing
              paging.  DefMemPerNode would generally be used  if  whole  nodes
              are  allocated  to jobs (SelectType=select/linear) and resources
              are shared (Shared=yes or Shared=force).  The default value is 0
              (unlimited).    Also   see   DefMemPerCPU   and   MaxMemPerNode.
              DefMemPerCPU and DefMemPerNode are  mutually  exclusive.   NOTE:
              Enforcement  of  memory  limits  currently  requires enabling of
              accounting, which samples memory use on a periodic  basis  (data
              need not be stored, just collected).
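
               As a sketch, a cluster that allocates individual processors
               to jobs and caps memory per CPU (the values shown are
               illustrative only) might use:

                   SelectType=select/cons_res
                   DefMemPerCPU=1024
                   MaxMemPerCPU=2048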

       DefaultStorageHost
              The  default  name of the machine hosting the accounting storage
              and job completion  databases.   Only  used  for  database  type
              storage   plugins   and   when   the  AccountingStorageHost  and
              JobCompHost have not been defined.

       DefaultStorageLoc
              The fully qualified file name where  accounting  records  and/or
              job  completion  records are written when the DefaultStorageType
              is "filetxt" or  the  name  of  the  database  where  accounting
              records  and/or  job  completion  records  are  stored  when the
              DefaultStorageType is a database.  Also see AccountingStorageLoc
              and JobCompLoc.

       DefaultStoragePass
              The  password  used  to gain access to the database to store the
              accounting and job completion data.  Only used for database type
              storage     plugins,     ignored     otherwise.      Also    see
              AccountingStoragePass and JobCompPass.

       DefaultStoragePort
              The  listening  port  of  the  accounting  storage  and/or   job
              completion database server.  Only used for database type storage
              plugins, ignored otherwise.  Also see AccountingStoragePort  and
              JobCompPort.

       DefaultStorageType
              The  accounting  and  job  completion  storage  mechanism  type.
              Acceptable values at present include "filetxt", "mysql", "none",
              "pgsql",  and  "slurmdbd".   The  value "filetxt" indicates that
              records will be written to a file.  The value "mysql"  indicates
              that  accounting  records  will  be written to a mysql database.
              The default value is "none", which means that  records  are  not
              maintained.   The  value  "pgsql" indicates that records will be
              written  to  a  PostgreSQL  database.   The   value   "slurmdbd"
              indicates  that  records will be written to the SLURM DBD, which
              maintains  its  own  database.  See  "man  slurmdbd"  for   more
              information.  Also see AccountingStorageType and JobCompType.

       DefaultStorageUser
              The user account for accessing the accounting storage and/or job
              completion  database.   Only  used  for  database  type  storage
              plugins,  ignored otherwise.  Also see AccountingStorageUser and
              JobCompUser.

       DisableRootJobs
              If set to "YES" then user root will be  prevented  from  running
              any  jobs.  The default value is "NO", meaning user root will be
              able to execute  jobs.   DisableRootJobs  may  also  be  set  by
              partition.

       EnforcePartLimits
              If set to "YES" then jobs which exceed a partition’s size and/or
              time limits will be rejected at submission time. If set to  "NO"
              then  the  job  will  be  accepted  and  remain queued until the
              partition limits are altered.  The default value is "NO".

       Epilog Fully qualified pathname of a script to execute as user root  on
              every    node    when    a    user’s    job    completes   (e.g.
              "/usr/local/slurm/epilog"). This may be  used  to  purge  files,
              disable  user  login,  etc.  By default there is no epilog.  See
              Prolog and Epilog Scripts for more information.

       EpilogMsgTime
               The number of microseconds that the slurmctld daemon
               requires to process an epilog completion message from the
               slurmd daemons.
              This parameter  can  be  used  to  prevent  a  burst  of  epilog
              completion  messages  from  being  sent  at  the same time which
              should help prevent lost messages  and  improve  throughput  for
              large jobs.  The default value is 2000 microseconds.  For a 1000
              node job, this spreads the epilog completion messages  out  over
              two seconds.

       EpilogSlurmctld
              Fully  qualified  pathname  of  a  program  for the slurmctld to
              execute   upon   termination   of   a   job   allocation   (e.g.
              "/usr/local/slurm/epilog_controller").   The program executes as
              SlurmUser, which gives it permission to drain nodes and  requeue
              the  job  if  a failure occurs or cancel the job if appropriate.
              The program can be used to reboot nodes or perform other work to
              prepare  resources  for  use.  See Prolog and Epilog Scripts for
              more information.

       FastSchedule
              Controls how a node’s configuration specifications in slurm.conf
              are  used.   If  the number of node configuration entries in the
              configuration file is significantly lower  than  the  number  of
              nodes,  setting  FastSchedule  to  1  will  permit  much  faster
              scheduling decisions to be made.  (The scheduler can just  check
              the  values  in  a few configuration records instead of possibly
              thousands  of  node  records.)   Note  that  on   systems   with
              hyper-threading,  the  processor count reported by the node will
              be twice the actual processor count.  Consider which  value  you
              want to be used for scheduling purposes.

              1 (default)
                   Consider   the  configuration  of  each  node  to  be  that
                   specified in the slurm.conf configuration file and any node
                   with less than the configured resources will be set DOWN.

              0    Base  scheduling decisions upon the actual configuration of
                   each individual node except that the node’s processor count
                   in  SLURM’s  configuration  must  match the actual hardware
                   configuration      if      SchedulerType=sched/gang      or
                   SelectType=select/cons_res  are  configured  (both of those
                   plugins  maintain  resource  allocation  information  using
                   bitmaps for the cores in the system and must remain static,
                   while the node’s memory and disk space can  be  established
                   later).

              2    Consider   the  configuration  of  each  node  to  be  that
                   specified in the slurm.conf configuration file and any node
                   with  less  than  the  configured resources will not be set
                   DOWN.  This can be useful for testing purposes.

       FirstJobId
               The job id to be used for the first job submitted to SLURM
               without a specific requested value.  Job id values generated
               will be incremented by 1 for each subsequent job. This may
               be used to
              provide  a  meta-scheduler with a job id space which is disjoint
              from the interactive jobs.  The default value is 1.

       GetEnvTimeout
               Used for Moab scheduled jobs only. Controls how long a job
               should wait, in seconds, for the user’s environment to load
               before
              attempting to load it from a cache file. Applies when  the  srun
              or sbatch --get-user-env option is used. If set to 0 then always
              load the user’s environment from the cache  file.   The  default
              value is 2 seconds.

       HealthCheckInterval
              The     interval    in    seconds    between    executions    of
              HealthCheckProgram.  The default value is zero,  which  disables
              execution.

       HealthCheckProgram
              Fully  qualified  pathname  of  a script to execute as user root
              periodically on all compute nodes  that  are  not  in  the  DOWN
              state.  This may be used to verify the node is fully operational
              and DRAIN the node or send email if a problem is detected.   Any
              action  to  be taken must be explicitly performed by the program
              (e.g.  execute   "scontrol   update   NodeName=foo   State=drain
              Reason=tmp_file_system_full"  to drain a node).  The interval is
              controlled using the HealthCheckInterval parameter.   Note  that
              the  HealthCheckProgram will be executed at the same time on all
              nodes to minimize  its  impact  upon  parallel  programs.   This
               program will be killed if it does not terminate normally
              within 60 seconds.  By default, no program will be executed.
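
               For example, to run a hypothetical site health check script
               every five minutes on all nodes that are not DOWN:

                   HealthCheckInterval=300
                   HealthCheckProgram=/usr/local/slurm/healthcheck.sh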

       InactiveLimit
              The interval, in seconds, a job or job step is permitted  to  be
              inactive  before  it  is  terminated.  A  job  or  job  step  is
              considered inactive  if  the  associated  srun  command  is  not
              responding   to   slurm  daemons.  This  could  be  due  to  the
              termination of the srun  command  or  the  program  being  is  a
              stopped  state.  A batch job is considered inactive if it has no
              active job steps (e.g. periods  of  pre-  and  post-processing).
              This limit permits defunct jobs to be purged in a timely fashion
              without waiting for their time limit to be reached.  This  value
               should reflect the possibility that the srun command may be stopped
              by a debugger or considerable time could be required  for  batch
              job  pre-  and  post-processing.  This limit is ignored for jobs
              running in partitions with the RootOnly flag set (the  scheduler
              running  as  root will be responsible for the job).  The default
              value is unlimited (zero).  May not exceed 65533.

       JobAcctGatherType
              The job accounting mechanism type.  Acceptable values at present
              include   "jobacct_gather/aix"   (for   AIX  operating  system),
              "jobacct_gather/linux"  (for   Linux   operating   system)   and
              "jobacct_gather/none"   (no  accounting  data  collected).   The
              default value is "jobacct_gather/none".  In  order  to  use  the
              sacct  tool, "jobacct_gather/aix" or "jobacct_gather/linux" must
              be configured.

       JobAcctGatherFrequency
              The job accounting sampling interval.   For  jobacct_gather/none
              this   parameter   is   ignored.   For   jobacct_gather/aix  and
               jobacct_gather/linux the parameter is the number of seconds
               between sampling job state.  The default value is 30
               seconds.  A value of zero disables the periodic job
               sampling and
              provides   accounting   information   only  on  job  termination
              (reducing SLURM interference with the job).

       JobCheckpointDir
              Set the default directory used to store  job  checkpoint  files.
              The default value is "/var/slurm/checkpoint".

       JobCompHost
              The  name  of  the  machine hosting the job completion database.
              Only used for database type storage plugins, ignored  otherwise.
              Also see DefaultStorageHost.

       JobCompLoc
              The  fully  qualified file name where job completion records are
              written  when  the  JobCompType  is  "jobcomp/filetxt"  or   the
              database  where  job  completion  records  are  stored  when the
              JobCompType is a database.  Also see DefaultStorageLoc.

       JobCompPass
              The password used to gain access to the database  to  store  the
              job  completion  data.   Only  used  for  database  type storage
              plugins, ignored otherwise.  Also see DefaultStoragePass.

       JobCompPort
              The listening port of the job completion database server.   Only
              used for database type storage plugins, ignored otherwise.  Also
              see DefaultStoragePort.

       JobCompType
              The job completion logging mechanism type.  Acceptable values at
              present      include      "jobcomp/none",     "jobcomp/filetxt",
              "jobcomp/mysql", "jobcomp/pgsql",  and  "jobcomp/script"".   The
              default  value  is  "jobcomp/none",  which  means  that upon job
              completion the record of the job is purged from the system.   If
              using  the  accounting  infrastructure this plugin may not be of
              interest since the information here  is  redundant.   The  value
              "jobcomp/filetxt"  indicates  that a record of the job should be
              written to a text file specified by  the  JobCompLoc  parameter.
              The  value  "jobcomp/mysql"  indicates  that a record of the job
              should  be  written  to  a  mysql  database  specified  by   the
              JobCompLoc  parameter.  The value "jobcomp/pgsql" indicates that
              a record of the job should be written to a  PostgreSQL  database
              specified    by    the    JobCompLoc   parameter.    The   value
              "jobcomp/script"  indicates  that  a  script  specified  by  the
              JobCompLoc   parameter   is  to  be  executed  with  environment
              variables indicating the job information.

       JobCompUser
              The user account for  accessing  the  job  completion  database.
              Only  used for database type storage plugins, ignored otherwise.
              Also see DefaultStorageUser.
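
               For example, to append a plain-text record of each
               completed job to a hypothetical log file:

                   JobCompType=jobcomp/filetxt
                   JobCompLoc=/var/log/slurm/job_completions.log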

       JobCredentialPrivateKey
              Fully qualified pathname of a file containing a private key used
              for  authentication by SLURM daemons.  This parameter is ignored
              if CryptoType=crypto/munge.

       JobCredentialPublicCertificate
              Fully qualified pathname of a file containing a public key  used
              for  authentication by SLURM daemons.  This parameter is ignored
              if CryptoType=crypto/munge.

       JobFileAppend
               This option controls what to do if a job’s output or error
               files exist when the job is started.  If JobFileAppend is
               set to a
              value of 1, then append to the existing file.  By  default,  any
              existing file is truncated.

       JobRequeue
              This option controls what to do by default after a node failure.
              If JobRequeue is set to a value of 1, then any  job  running  on
              the  failed  node  will  be  requeued for execution on different
              nodes.  If JobRequeue is set to a  value  of  0,  then  any  job
              running  on  the failed node will be terminated.  Use the sbatch
              --no-requeue or --requeue option to change the default  behavior
              for individual jobs.  The default value is 1.

       KillOnBadExit
              If  set to 1, the job will be terminated immediately when one of
               the processes crashes or is aborted. With the default value
               of 0, if one of the processes crashes or is aborted, the
               other processes
              will continue to run.

       KillWait
              The interval, in seconds, given to a job’s processes between the
              SIGTERM  and  SIGKILL  signals upon reaching its time limit.  If
              the job fails to terminate gracefully in the interval specified,
              it  will  be  forcibly  terminated.   The  default  value  is 30
              seconds.  The value may not exceed 65533.

       Licenses
              Specification of licenses (or other resources available  on  all
              nodes  of  the cluster) which can be allocated to jobs.  License
              names can optionally be followed by an asterisk and count with a
              default  count  of  one.  Multiple license names should be comma
              separated  (e.g.   "Licenses=foo*4,bar").    Note   that   SLURM
              prevents  jobs  from  being  scheduled if their required license
              specification is not available.  SLURM  does  not  prevent  jobs
              from  using  licenses  that are not explicitly listed in the job
              submission specification.
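
               For example, to make ten "matlab" licenses and one "foo"
               license available cluster-wide (the names and counts are
               illustrative only):

                   Licenses=matlab*10,foo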

       MailProg
              Fully qualified pathname to the program used to send  email  per
              user request.  The default value is "/bin/mail".

       MaxJobCount
              The maximum number of jobs SLURM can have in its active database
              at one time. Set the values  of  MaxJobCount  and  MinJobAge  to
              insure the slurmctld daemon does not exhaust its memory or other
              resources. Once  this  limit  is  reached,  requests  to  submit
              additional  jobs will fail. The default value is 5000 jobs. This
              value may not be reset via "scontrol reconfig".  It  only  takes
              effect  upon  restart  of  the slurmctld daemon.  May not exceed
              65533.

       MaxMemPerCPU
              Maximum  real  memory  size  available  per  allocated  CPU   in
              MegaBytes.   Used  to  avoid over-subscribing memory and causing
              paging.  MaxMemPerCPU would  generally  be  used  if  individual
              processors  are  allocated to jobs (SelectType=select/cons_res).
              The default value is 0 (unlimited).  Also see  DefMemPerCPU  and
              MaxMemPerNode.   MaxMemPerCPU  and  MaxMemPerNode  are  mutually
              exclusive.   NOTE:  Enforcement  of  memory   limits   currently
              requires  enabling  of accounting, which samples memory use on a
              periodic basis (data need not be stored, just collected).

       MaxMemPerNode
              Maximum  real  memory  size  available  per  allocated  node  in
              MegaBytes.   Used  to  avoid over-subscribing memory and causing
              paging.  MaxMemPerNode would generally be used  if  whole  nodes
              are  allocated  to jobs (SelectType=select/linear) and resources
              are shared (Shared=yes or Shared=force).  The default value is 0
              (unlimited).    Also   see   DefMemPerNode   and   MaxMemPerCPU.
              MaxMemPerCPU and MaxMemPerNode are  mutually  exclusive.   NOTE:
              Enforcement  of  memory  limits  currently  requires enabling of
              accounting, which samples memory use on a periodic  basis  (data
              need not be stored, just collected).

       MaxTasksPerNode
              Maximum  number of tasks SLURM will allow a job step to spawn on
              a single node. The default MaxTasksPerNode is 128.

       MessageTimeout
              Time permitted for a round-trip  communication  to  complete  in
              seconds.  Default  value  is 10 seconds. For systems with shared
              nodes, the slurmd daemon could  be  paged  out  and  necessitate
              higher values.

       MinJobAge
              The  minimum  age of a completed job before its record is purged
              from SLURM’s active database. Set the values of MaxJobCount  and
              MinJobAge  to  insure  the slurmctld daemon does not exhaust its
              memory or other resources. The default value is 300 seconds.   A
              value  of  zero prevents any job record purging.  May not exceed
              65533.

       MpiDefault
              Identifies the default  type  of  MPI  to  be  used.   Srun  may
              override  this  configuration  parameter in any case.  Currently
              supported versions include:  mpichgm,  mvapich,  none  (default,
              which works for many other versions of MPI including LAM MPI and
              Open MPI).

       MpiParams
              MPI parameters.  Used to identify ports used by OpenMPI only and
              the  input  format is "ports=12000-12999" to identify a range of
              communication ports to be used.

       OverTimeLimit
              Number of minutes by which a  job  can  exceed  its  time  limit
              before being canceled.  The configured job time limit is treated
              as a  soft  limit.   Adding  OverTimeLimit  to  the  soft  limit
              provides a hard limit, at which point the job is canceled.  This
               is particularly useful for backfill scheduling, which bases
               its decisions upon each job’s soft time limit.  The default
               value is zero.  May not exceed 65533 minutes.  A value of
               "UNLIMITED" is also
              supported.

       PluginDir
              Identifies  the places in which to look for SLURM plugins.  This
              is  a  colon-separated  list  of  directories,  like  the   PATH
              environment      variable.      The     default     value     is
              "/usr/local/lib/slurm".

       PlugStackConfig
              Location of the config file for SLURM stackable plugins that use
              the  Stackable  Plugin  Architecture  for  Node  job  (K)control
              (SPANK).  This provides support for a highly configurable set of
              plugins  to be called before and/or after execution of each task
              spawned as part of a  user’s  job  step.   Default  location  is
              "plugstack.conf" in the same directory as the system slurm.conf.
              For more information on SPANK plugins, see the spank(8)  manual.

       PreemptMode
              Enables  gang  scheduling  and/or controls the mechanism used to
              preempt jobs.  When the PreemptType parameter is set  to  enable
              preemption,  the  PreemptMode  selects  the  mechanism  used  to
              preempt the lower priority jobs.  The GANG  option  is  used  to
              enable  gang  scheduling  independent  of  whether preemption is
              enabled (the PreemptType  setting).   The  GANG  option  can  be
              specified  in  addition  to  a  PreemptMode setting with the two
              options comma separated.  The SUSPEND option requires that  gang
               scheduling be enabled (i.e., "PreemptMode=SUSPEND,GANG").

              OFF         is the default value and disables job preemption and
                          gang scheduling.  This is the only option compatible
                          with           SchedulerType=sched/wiki           or
                          SchedulerType=sched/wiki2 (used  by  Maui  and  Moab
                          respectively, which provide their own job preemption
                          functionality).

              CANCEL      always cancel the job.

              CHECKPOINT  preempts jobs by checkpointing them (if possible) or
                          canceling them.

              GANG        enables  gang  scheduling  (time slicing) of jobs in
                          the same partition.

              REQUEUE     preempts jobs by requeuing  them  (if  possible)  or
                          canceling them.

              SUSPEND     preempts  jobs  by suspending them.  A suspended job
                          will resume execution once  the  high  priority  job
                          preempting  it  completes.   The SUSPEND may only be
                          used with the GANG option (the gang scheduler module
                          performs the job resume operation).

       PreemptType
              This  specifies  the  plugin  used to identify which jobs can be
              preempted in order to start a pending job.

              preempt/none
                     Job preemption is disabled.  This is the default.

              preempt/partition_prio
                     Job preemption is based upon partition priority.  Jobs in
                     higher priority partitions (queues) may preempt jobs from
                     lower priority partitions.

              preempt/qos
                     Job preemption rules are specified by Quality Of  Service
                      (QOS) specifications in the SLURM database.
                     This  is   not   compatible   with   PreemptMode=OFF   or
                     PreemptMode=SUSPEND  (i.e. preempted jobs must be removed
                     from the resources).
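
               As an illustrative sketch, to let jobs in higher priority
               partitions suspend jobs in lower priority partitions under
               gang scheduling, one might configure:

                   PreemptType=preempt/partition_prio
                   PreemptMode=SUSPEND,GANG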

       PriorityDecayHalfLife
              This controls how long  prior  resource  use  is  considered  in
              determining how over- or under-serviced an association is (user,
              bank account and cluster) in determining job priority.   If  set
               to 0 no decay will be applied; this is helpful if you want
               to enforce hard time limits per association, but in that
               case PriorityUsageResetPeriod must be set to some interval.
              Applicable only if PriorityType=priority/multifactor.  The  unit
              is  a  time  string  (i.e.  min,  hr:min:00,  days-hr:min:00, or
              days-hr).  The default value is 7-0 (7 days).

       PriorityCalcPeriod
              The period of time in minutes in which the half-life decay  will
              be        re-calculated.         Applicable        only       if
              PriorityType=priority/multifactor.   The  default  value  is   5
              (minutes).

       PriorityFavorSmall
               Specifies that small jobs should be given preferential
              scheduling       priority.        Applicable       only       if
              PriorityType=priority/multifactor.   Supported  values are "YES"
              and "NO".  The default value is "NO".

       PriorityMaxAge
              Specifies the job age which will be given the maximum age factor
               in computing priority. For example, a value of 30 minutes
               would result in all jobs over 30 minutes old getting the
               same
              age-based        priority.        Applicable       only       if
              PriorityType=priority/multifactor.  The unit is  a  time  string
              (i.e.  min, hr:min:00, days-hr:min:00, or days-hr).  The default
              value is 7-0 (7 days).

       PriorityUsageResetPeriod
              At this interval the usage of associations will be reset  to  0.
              This  is  used  if you want to enforce hard limits of time usage
              per association.  If PriorityDecayHalfLife is set  to  be  0  no
              decay  will  happen  and this is the only way to reset the usage
               accumulated by running jobs.  By default this is turned off,
               and it is advised to use the PriorityDecayHalfLife option to
               avoid a situation in which nothing can run on your cluster;
               but if your scheme is set up to allow only fixed amounts of
               time on your system, this is the way to do it.  Applicable
               only if
              PriorityType=priority/multifactor.

              NONE        Never clear historic usage. The default value.

              NOW         Clear  the  historic usage now.  Executed at startup
                          and reconfiguration time.

              DAILY       Cleared every day at midnight.

              WEEKLY      Cleared every week on Sunday at time 00:00.

              MONTHLY     Cleared on the first  day  of  each  month  at  time
                          00:00.

              QUARTERLY   Cleared  on  the  first  day of each quarter at time
                          00:00.

              YEARLY      Cleared on the first day of each year at time 00:00.

       PriorityType
              This  specifies  the  plugin  to be used in establishing a job’s
              scheduling priority. Supported values are "priority/basic" (jobs
              are   prioritized   by  order  of  arrival,  also  suitable  for
              sched/wiki and sched/wiki2) and "priority/multifactor" (jobs are
              prioritized  based  upon  size,  age,  fair-share of allocation,
              etc).  The default value is "priority/basic".

       PriorityWeightAge
              An integer value that sets the degree to which  the  queue  wait
              time  component  contributes  to the job’s priority.  Applicable
              only if PriorityType=priority/multifactor.  The default value is
              0.

       PriorityWeightFairshare
              An  integer  value  that sets the degree to which the fair-share
              component contributes to the job’s priority.  Applicable only if
              PriorityType=priority/multifactor.  The default value is 0.

       PriorityWeightJobSize
              An  integer  value  that  sets  the degree to which the job size
              component contributes to the job’s priority.  Applicable only if
              PriorityType=priority/multifactor.  The default value is 0.

       PriorityWeightPartition
              An  integer  value  that  sets  the  degree  to  which  the node
              partition  component  contributes   to   the   job’s   priority.
              Applicable   only   if  PriorityType=priority/multifactor.   The
              default value is 0.

       PriorityWeightQOS
              An integer value that sets the degree to which  the  Quality  Of
              Service component contributes to the job’s priority.  Applicable
              only if PriorityType=priority/multifactor.  The default value is
              0.
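
               As an illustration, a multifactor priority configuration
               (all weight values below are arbitrary and chosen only for
               the example) might look like:

                   PriorityType=priority/multifactor
                   PriorityDecayHalfLife=7-0
                   PriorityMaxAge=7-0
                   PriorityWeightAge=1000
                   PriorityWeightFairshare=10000
                   PriorityWeightJobSize=1000
                   PriorityWeightPartition=1000
                   PriorityWeightQOS=0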

       PrivateData
              This  controls  what  type of information is hidden from regular
              users.  By default, all information is  visible  to  all  users.
              User  SlurmUser  and  root  can  always  view  all  information.
              Multiple  values  may  be  specified  with  a  comma  separator.
              Acceptable values include:

              accounts
                     (NON-SLURMDBD   ACCOUNTING   ONLY)  prevents  users  from
                     viewing  any  account   definitions   unless   they   are
                     coordinators of them.

              jobs   prevents  users  from viewing jobs or job steps belonging
                     to other users. (NON-SLURMDBD ACCOUNTING  ONLY)  prevents
                     users  from  viewing job records belonging to other users
                     unless they are coordinators of the  association  running
                     the job when using sacct.

              nodes  prevents users from viewing node state information.

              partitions
                     prevents  users from viewing partition state information.

              reservations
                     prevents regular users from viewing reservations.

              usage  (NON-SLURMDBD  ACCOUNTING  ONLY)  prevents   users   from
                     viewing  usage  of  any  other  user.   This  applies  to
                     sreport.

              users  (NON-SLURMDBD  ACCOUNTING  ONLY)  prevents   users   from
                     viewing  information  of  any user other than themselves,
                     this also makes it so users  can  only  see  associations
                     they deal with.  Coordinators can see associations of all
                     users  they  are  coordinator  of,  but  can   only   see
                     themselves when listing users.
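
               For example, to hide other users’ jobs and usage data from
               regular users (a minimal sketch):

                   PrivateData=jobs,usage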

       ProctrackType
              Identifies  the  plugin  to  be  used for process tracking.  The
              slurmd daemon uses this  mechanism  to  identify  all  processes
              which  are  children of processes it spawns for a user job.  The
              slurmd daemon must be restarted for a change in ProctrackType to
              take  effect.   NOTE: "proctrack/linuxproc" and "proctrack/pgid"
              can fail to identify all processes associated with a  job  since
              processes  can  become  a  child  of  the init process (when the
              parent process terminates) or change their  process  group.   To
              reliably  track  all  processes,  one  of  the  other mechanisms
              utilizing   kernel   modifications   is    preferable.     NOTE:
              "proctrack/linuxproc"  is  not  compatible  with  "switch/elan."
              Acceptable values at present include:

              proctrack/aix which uses an AIX kernel extension and is
                     the default for AIX systems

              proctrack/linuxproc which uses linux process tree using
                     parent process IDs

              proctrack/rms which uses Quadrics kernel patch and is the
                     default if "SwitchType=switch/elan"

              proctrack/sgi_job which uses SGI’s Process Aggregates (PAGG)
                     kernel module, see http://oss.sgi.com/projects/pagg/  for
                     more information

              proctrack/pgid which uses process group IDs and is the
                     default for all other systems

       Prolog Fully  qualified pathname of a program for the slurmd to execute
              whenever it is asked to run a job step from a new job allocation
              (e.g.   "/usr/local/slurm/prolog").   The  slurmd  executes  the
              script before starting the first job step.  This may be used  to
              purge  files,  enable  user  login, etc.  By default there is no
              prolog. Any configured script is expected to complete  execution
              quickly  (in  less  time  than  MessageTimeout).  See Prolog and
              Epilog Scripts for more information.

       PrologSlurmctld
              Fully qualified pathname of  a  program  for  the  slurmctld  to
              execute   before   granting   a   new   job   allocation   (e.g.
              "/usr/local/slurm/prolog_controller").  The program executes  as
              SlurmUser,  which gives it permission to drain nodes and requeue
              the job if a failure occurs or cancel the  job  if  appropriate.
              The program can be used to reboot nodes or perform other work to
              prepare resources for use.  While this program is  running,  the
               nodes associated with the job will have a
              POWER_UP/CONFIGURING flag set  in  their  state,  which  can  be
              readily  viewed.   A  non-zero  exit code will result in the job
              being requeued (where  possible)  or  killed.   See  Prolog  and
              Epilog Scripts for more information.

       PropagatePrioProcess
               Setting PropagatePrioProcess to "1" will cause a user’s job
               to run with the same priority (aka nice value) as the
               user’s process which launched the job on the submit node.
               If set to "0", or left unset, the user’s job will inherit
               the scheduling priority from the slurm daemon.

       PropagateResourceLimits
              A  list  of  comma  separated  resource limit names.  The slurmd
              daemon uses these names to obtain the  associated  (soft)  limit
               values from the user’s process environment on the submit node.
              These limits are then propagated and applied to  the  jobs  that
              will  run  on  the  compute nodes.  This parameter can be useful
              when system limits vary among nodes.  Any resource  limits  that
              do not appear in the list are not propagated.  However, the user
              can  override  this  by  specifying  which  resource  limits  to
               propagate with the srun command’s "--propagate" option.  If
              neither  of  the  ’propagate  resource  limit’  parameters   are
              specified,  then  the default action is to propagate all limits.
              Only one of the parameters,  either  PropagateResourceLimits  or
              PropagateResourceLimitsExcept,  may be specified.  The following
              limit names are supported by SLURM (although  some  options  may
              not be supported on some systems):

              ALL       All limits listed below

              NONE      No limits listed below

               AS        The maximum address space for a process

              CORE      The maximum size of core file

              CPU       The maximum amount of CPU time

              DATA      The maximum size of a process’s data segment

              FSIZE     The maximum size of files created

              MEMLOCK   The maximum size that may be locked into memory

              NOFILE    The maximum number of open files

              NPROC     The maximum number of processes available

              RSS       The maximum resident set size

              STACK     The maximum stack size

       PropagateResourceLimitsExcept
              A list of comma separated resource limit names.  By default, all
              resource  limits  will  be  propagated,  (as  described  by  the
              PropagateResourceLimits   parameter),   except  for  the  limits
              appearing  in  this  list.    The  user  can  override  this  by
              specifying  which  resource  limits  to  propagate with the srun
              commands  "--propagate"  option.   See   PropagateResourceLimits
              above for a list of valid limit names.
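
               For example, to propagate every limit except the locked
               memory and core file size limits (a minimal sketch):

                   PropagateResourceLimitsExcept=MEMLOCK,CORE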

       ResumeProgram
              SLURM  supports a mechanism to reduce power consumption on nodes
              that remain idle for  an  extended  period  of  time.   This  is
              typically  accomplished  by  reducing  voltage  and frequency or
              powering the node down.  ResumeProgram is the program that  will
              be  executed  when a node in power save mode is assigned work to
              perform.  For reasons of reliability, ResumeProgram may  execute
              more  than once for a node when the slurmctld daemon crashes and
              is restarted.  If ResumeProgram is unable to restore a  node  to
               service, it should requeue any job associated with the node and
              set the node state to DRAIN.  The program executes as SlurmUser.
              The  argument  to  the  program will be the names of nodes to be
              removed  from  power  savings  mode  (using   SLURM’s   hostlist
              expression  format).   By  default  no  program is run.  Related
              configuration   options   include   ResumeTimeout,   ResumeRate,
              SuspendRate,    SuspendTime,   SuspendTimeout,   SuspendProgram,
              SuspendExcNodes,  and  SuspendExcParts.   More  information   is
              available        at        the        SLURM       web       site
              (https://computing.llnl.gov/linux/slurm/power_save.html).

       ResumeRate
              The rate at which nodes in  power  save  mode  are  returned  to
               normal operation by ResumeProgram.  The value is the number of nodes
              per minute and it can be used to prevent power surges if a large
              number of nodes in power save mode are assigned work at the same
              time (e.g. a large job starts).  A value of zero results  in  no
              limits  being  imposed.   The  default  value  is  300 nodes per
              minute.  Related configuration  options  include  ResumeTimeout,
              ResumeProgram,    SuspendRate,    SuspendTime,   SuspendTimeout,
              SuspendProgram, SuspendExcNodes, and SuspendExcParts.

       ResumeTimeout
               Maximum time permitted (in seconds) between when a node resume
              request  is  issued  and when the node is actually available for
              use.  Nodes which fail to respond in  this  time  frame  may  be
              marked  DOWN  and  the jobs scheduled on the node requeued.  The
              default value is  60  seconds.   Related  configuration  options
              include  ResumeProgram,  ResumeRate,  SuspendRate,  SuspendTime,
              SuspendTimeout,     SuspendProgram,     SuspendExcNodes      and
              SuspendExcParts.  More information is available at the SLURM web
              site (https://computing.llnl.gov/linux/slurm/power_save.html).
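
               As an illustrative sketch of a power saving setup (the
               script paths and values below are hypothetical), the
               related parameters might be set as:

                   SuspendProgram=/usr/local/slurm/node_suspend.sh
                   ResumeProgram=/usr/local/slurm/node_resume.sh
                   SuspendTime=1800
                   SuspendRate=60
                   ResumeRate=300
                   ResumeTimeout=60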

       ResvOverRun
              Describes how long a job already running in a reservation should
              be  permitted  to  execute after the end time of the reservation
              has been reached.  The time period is specified in  minutes  and
              the  default  value  is 0 (kill the job immediately).  The value
              may not exceed 65533 minutes, although a value of "UNLIMITED" is
              supported  to  permit  a  job  to  run  indefinitely  after  its
              reservation is terminated.

       ReturnToService
              Controls when a DOWN node will  be  returned  to  service.   The
              default value is 0.  Supported values include

              0   A  node  will  remain  in  the  DOWN  state  until  a system
                  administrator explicitly changes  its  state  (even  if  the
                  slurmd daemon registers and resumes communications).

              1   A  DOWN node will become available for use upon registration
                  with a valid configuration only if it was set  DOWN  due  to
                  being  non-responsive.   If  the  node  was set DOWN for any
                  other reason (low memory, prolog  failure,  epilog  failure,
                  etc.), its state will not automatically be changed.

              2   A  DOWN node will become available for use upon registration
                  with a valid configuration.  The node could  have  been  set
                  DOWN for any reason.

       SallocDefaultCommand
              Normally,  salloc(1)  will  run  the user’s default shell when a
              command to execute is not specified on the salloc command  line.
              If  SallocDefaultCommand  is  specified, salloc will instead run
              the configured command. The command is passed to  ’/bin/sh  -c’,
              so  shell metacharacters are allowed, and commands with multiple
              arguments should be quoted. For instance:

                  SallocDefaultCommand = "$SHELL"

               would run the shell named in the user’s $SHELL environment
               variable, and

                  SallocDefaultCommand = "xterm -T Job_$SLURM_JOB_ID"

              would run xterm with the title set to the SLURM jobid.

       SchedulerParameters
              The  interpretation  of  this parameter varies by SchedulerType.
              Multiple options may be comma separated.  The following  options
              apply only to SchedulerType=sched/backfill.

              interval=#
                     The  number of seconds between iterations.  Higher values
                      result in less overhead but less responsiveness.  The
                      default value is 5 seconds on BlueGene systems and 10
                      seconds
                     otherwise.

              max_job_bf=#
                     The maximum number of jobs to attempt backfill scheduling
                     for (i.e. the queue depth).  Higher values result in more
                     overhead and less responsiveness.  Until  an  attempt  is
                     made  to backfill schedule a job, its expected initiation
                     time value will not be set.  The default value is 50.  In
                     the  case  of  large  clusters  (more  than  1000  nodes)
                     configured  with  SelectType=select/cons_res,  setting  a
                     smaller value may be desirable.
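
               As an illustration only (the values shown are arbitrary and
               should be tuned to the local workload), a site using the
               backfill scheduler might set:

                   SchedulerType=sched/backfill
                   SchedulerParameters=interval=30,max_job_bf=100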

       SchedulerPort
              The  port number on which slurmctld should listen for connection
              requests.  This value is only used by the  Maui  Scheduler  (see
              SchedulerType).  The default value is 7321.

       SchedulerRootFilter
              Identifies whether or not RootOnly partitions should be filtered
              from any external scheduling  activities.  If  set  to  0,  then
              RootOnly partitions are treated like any other partition. If set
              to 1, then RootOnly partitions  are  exempt  from  any  external
              scheduling  activities.  The  default value is 1. Currently only
              used by the built-in backfill scheduling module "sched/backfill"
              (see SchedulerType).

       SchedulerTimeSlice
              Number     of     seconds    in    each    time    slice    when
              SchedulerType=sched/gang.  The default value is 30 seconds.

       SchedulerType
              Identifies the type of scheduler to be used.  Note the slurmctld
              daemon  must  be  restarted  for  a  change in scheduler type to
              become effective (reconfiguring a running daemon has  no  effect
              for  this  parameter).   The  scontrol  command  can  be used to
              manually change job priorities if  desired.   Acceptable  values
              include:

              sched/builtin
                     for  the  built-in  FIFO  (First In First Out) scheduler.
                     This is the default.

              sched/backfill
                     for a backfill scheduling module to augment  the  default
                     FIFO   scheduling.   Backfill  scheduling  will  initiate
                     lower-priority jobs  if  doing  so  does  not  delay  the
                     expected  initiation  time  of  any  higher priority job.
                     Effectiveness of backfill scheduling  is  dependent  upon
                     users specifying job time limits, otherwise all jobs will
                     have the same time limit and backfilling  is  impossible.
                      See the documentation for the SchedulerParameters
                      option above.

              sched/gang
                     Defunct option. See PreemptType and PreemptMode  options.

              sched/hold
                     to   hold   all   newly   arriving   jobs   if   a   file
                     "/etc/slurm.hold" exists otherwise use the built-in  FIFO
                     scheduler

              sched/wiki
                     for the Wiki interface to the Maui Scheduler

              sched/wiki2
                     for the Wiki interface to the Moab Cluster Suite

       SelectType
              Identifies  the type of resource selection algorithm to be used.
              Acceptable values include

              select/linear
                     for allocation of entire nodes assuming a one-dimensional
                     array  of  nodes  in which sequentially ordered nodes are
                     preferable.  This is the default value  for  non-BlueGene
                     systems.

              select/cons_res
                     The resources within a node are individually allocated as
                     consumable resources.   Note  that  whole  nodes  can  be
                     allocated  to  jobs  for selected partitions by using the
                     Shared=Exclusive  option.   See  the   partition   Shared
                     parameter for more information.

              select/bluegene
                     for  a  three-dimensional  BlueGene  system.  The default
                     value is "select/bluegene" for BlueGene systems.

       SelectTypeParameters
              The permitted values of  SelectTypeParameters  depend  upon  the
              configured   value  of  SelectType.   SelectType=select/bluegene
              supports no SelectTypeParameters.  The only supported option for
              SelectType=select/linear  is CR_Memory, which treats memory as a
              consumable resource and prevents memory over  subscription  with
              job  preemption  or  gang  scheduling.  The following values are
              supported for SelectType=select/cons_res:

              CR_CPU CPUs are consumable resources.  There  is  no  notion  of
                     sockets,  cores or threads; do not define those values in
                     the node specification.  If these are defined, unexpected
                      results will happen when hyper-threading is enabled;
                      Procs= should be used instead.  On a multi-core system,
                     each  core will be considered a CPU.  On a multi-core and
                     hyper-threaded system, each thread will be  considered  a
                      CPU.  On single-core systems, each physical processor
                      will be considered a CPU.

              CR_CPU_Memory
                     CPUs and memory are consumable resources.   There  is  no
                     notion  of sockets, cores or threads; do not define those
                     values in the node specification.  If these are  defined,
                     unexpected  results  will  happen when hyper-threading is
                      enabled; Procs= should be used instead.  Setting a value
                     for DefMemPerCPU is strongly recommended.

              CR_Core
                     Cores   are   consumable   resources.    On   nodes  with
                     hyper-threads, each thread is counted as a CPU to satisfy
                     a  job’s  resource requirement, but multiple jobs are not
                     allocated threads on the same core.

              CR_Core_Memory
                     Cores and memory are consumable resources.  On nodes with
                     hyper-threads, each thread is counted as a CPU to satisfy
                     a job’s resource requirement, but multiple jobs  are  not
                     allocated  threads on the same core.  Setting a value for
                     DefMemPerCPU is strongly recommended.

              CR_Socket
                     Sockets are consumable resources.  On nodes with multiple
                     cores, each core or thread is counted as a CPU to satisfy
                     a job’s resource requirement, but multiple jobs  are  not
                     allocated  resources  on the same socket.  Note that jobs
                     requesting one CPU will only be given access to that  one
                     CPU, but no other job will share the socket.

              CR_Socket_Memory
                     Memory  and  sockets  are consumable resources.  On nodes
                     with multiple cores, each core or thread is counted as  a
                     CPU to satisfy a job’s resource requirement, but multiple
                     jobs are not allocated  resources  on  the  same  socket.
                     Note  that  jobs  requesting  one  CPU will only be given
                     access to that one CPU, but no other job will  share  the
                     socket.   Setting  a  value  for DefMemPerCPU is strongly
                     recommended.

              CR_Memory
                     Memory is a  consumable  resource.   NOTE:  This  implies
                     Shared=YES or Shared=FORCE for all partitions.  Setting a
                     value for DefMemPerCPU is strongly recommended.
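
               For example (the values are illustrative only), a cluster
               scheduling individual cores and memory as consumable resources
               might use:

                   SelectType=select/cons_res
                   SelectTypeParameters=CR_Core_Memory
                   DefMemPerCPU=1024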

       SlurmUser
              The name of the user that the slurmctld daemon executes as.  For
              security  purposes,  a  user  other  than "root" is recommended.
              This  user  must  exist  on  all  nodes  of  the   cluster   for
              authentication  of communications between SLURM components.  The
              default value is "root".

       SlurmdUser
              The name of the user that the slurmd daemon executes  as.   This
              user  must  exist on all nodes of the cluster for authentication
              of communications between SLURM components.  The  default  value
              is "root".

       SlurmctldDebug
               The level of detail to provide in the slurmctld daemon’s
               logs.  Values
              from 0 to 9 are legal, with ‘0’ being "quiet" operation and  ‘9’
              being insanely verbose.  The default value is 3.

       SlurmctldLogFile
              Fully  qualified  pathname  of  a  file into which the slurmctld
              daemon’s logs are written.  The default value is none  (performs
              logging via syslog).

       SlurmctldPidFile
              Fully  qualified  pathname  of  a file into which the  slurmctld
              daemon may write its process id. This may be used for  automated
              signal      processing.       The      default      value     is
              "/var/run/slurmctld.pid".

       SlurmctldPort
              The port number that the SLURM controller, slurmctld, listens to
              for  work. The default value is SLURMCTLD_PORT as established at
              system build time. If none is explicitly specified, it  will  be
               set to 6817.  NOTE: If the slurmctld and slurmd daemons execute
               on the same nodes, the values of SlurmctldPort and SlurmdPort
               must be different.

       SlurmctldTimeout
              The  interval,  in seconds, that the backup controller waits for
              the primary controller to respond before assuming control.   The
              default value is 120 seconds.  May not exceed 65533.

       SlurmdDebug
               The level of detail to provide in the slurmd daemon’s logs.
               Values
              from 0 to 9 are legal, with ‘0’ being "quiet" operation and  ‘9’
              being insanely verbose.  The default value is 3.

       SlurmdLogFile
              Fully  qualified  pathname  of  a  file  into  which the  slurmd
              daemon’s logs are written.  The default value is none  (performs
              logging  via syslog).  Any "%h" within the name is replaced with
              the hostname on which the slurmd is running.
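
               For example, the following illustrative setting (the path is
               hypothetical) writes a separate log file for each compute node:

                   SlurmdLogFile=/var/log/slurm/slurmd.%h.log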

       SlurmdPidFile
              Fully qualified pathname of a file into which the  slurmd daemon
              may  write its process id. This may be used for automated signal
              processing.  The default value is "/var/run/slurmd.pid".

       SlurmdPort
              The port number that the  SLURM  compute  node  daemon,  slurmd,
              listens  to  for  work.  The  default  value  is  SLURMD_PORT as
              established  at  system  build  time.  If  none  is   explicitly
               specified, its value will be 6818.  NOTE: If the slurmctld and
               slurmd daemons execute on the same nodes, the values of
               SlurmctldPort and SlurmdPort must be different.

       SlurmdSpoolDir
              Fully  qualified  pathname  of a directory into which the slurmd
              daemon’s state information and batch job script information  are
              written.  This  must  be  a  common  pathname for all nodes, but
              should represent  a  directory  which  is  local  to  each  node
              (reference   a   local   file  system).  The  default  value  is
              "/var/spool/slurmd." NOTE: This directory is also used to  store
              slurmd’s  shared  memory  lockfile,  and  should  not be changed
              unless the system is being cleanly restarted. If the location of
              SlurmdSpoolDir  is  changed  and  slurmd  is  restarted, the new
              daemon will attach to a different shared memory region and  lose
              track of any running jobs.

       SlurmdTimeout
              The  interval,  in  seconds, that the SLURM controller waits for
              slurmd to respond before configuring that node’s state to  DOWN.
              A  value  of  zero  indicates  the  node  will  not be tested by
              slurmctld to confirm the state of slurmd, the node will  not  be
              automatically  set  to  a DOWN state indicating a non-responsive
              slurmd,  and  some  other  tool  will  take  responsibility  for
              monitoring the state of each compute node and its slurmd daemon.
              SLURM’s hierarchical communication mechanism is used to ping the
              slurmd  daemons  in order to minimize system noise and overhead.
              The default value is 300 seconds.   The  value  may  not  exceed
              65533 seconds.

       SrunEpilog
              Fully  qualified  pathname  of  an  executable to be run by srun
              following the completion  of  a  job  step.   The  command  line
              arguments  for  the executable will be the command and arguments
              of the job step.  This configuration parameter may be overridden
              by srun’s --epilog parameter.

       SrunProlog
              Fully  qualified  pathname  of  an  executable to be run by srun
              prior to the launch of a job step.  The command  line  arguments
              for  the executable will be the command and arguments of the job
              step.  This configuration parameter may be overridden by  srun’s
              --prolog parameter.

       StateSaveLocation
              Fully  qualified  pathname  of  a directory into which the SLURM
              controller,     slurmctld,     saves     its     state     (e.g.
              "/usr/local/slurm/checkpoint").   SLURM state will saved here to
              recover from system failures.  SlurmUser must be able to  create
              files  in  this  directory.   If  you  have  a  BackupController
              configured, this location should be  readable  and  writable  by
              both  systems.  Since all running and pending job information is
              stored here, the use of a reliable file system  (e.g.  RAID)  is
              recommended.  The default value is "/tmp".  If any slurm daemons
              terminate abnormally, their core files will also be written into
              this directory.

       SuspendExcNodes
               Specifies the nodes which are not to be placed in power save
              mode, even if the node remains idle for an  extended  period  of
              time.   Use  SLURM’s  hostlist expression to identify nodes.  By
              default no nodes are excluded.   Related  configuration  options
              include      ResumeTimeout,      ResumeProgram,      ResumeRate,
              SuspendProgram, SuspendRate,  SuspendTime,  SuspendTimeout,  and
              SuspendExcParts.

       SuspendExcParts
               Specifies the partitions whose nodes are not to be placed in
              power save mode, even if the node remains idle for  an  extended
              period  of  time.   Multiple  partitions  can  be identified and
              separated by commas.  By default no nodes are excluded.  Related
              configuration   options  include  ResumeTimeout,  ResumeProgram,
               ResumeRate, SuspendProgram, SuspendRate, SuspendTime,
              SuspendTimeout, and SuspendExcNodes.

       SuspendProgram
              SuspendProgram  is the program that will be executed when a node
              remains idle for an extended period of time.   This  program  is
              expected  to place the node into some power save mode.  This can
              be used to reduce  the  frequency  and  voltage  of  a  node  or
              completely   power  the  node  off.   The  program  executes  as
              SlurmUser.  The argument to the program will  be  the  names  of
              nodes  to  be  placed  into  power  savings  mode (using SLURM’s
              hostlist expression format).  By default,  no  program  is  run.
              Related    configuration    options    include    ResumeTimeout,
              ResumeProgram,     ResumeRate,     SuspendRate,     SuspendTime,
              SuspendTimeout, SuspendExcNodes, and SuspendExcParts.

       SuspendRate
               The rate at which nodes are placed into power save mode by
               SuspendProgram.  The value is the number of nodes per minute
               and it can be used to prevent a large drop in power consumption
              (e.g. after a large job completes).  A value of zero results  in
              no  limits  being  imposed.   The  default value is 60 nodes per
              minute.  Related configuration  options  include  ResumeTimeout,
              ResumeProgram,    ResumeRate,    SuspendProgram,    SuspendTime,
              SuspendTimeout, SuspendExcNodes, and SuspendExcParts.

       SuspendTime
              Nodes which remain idle for  this  number  of  seconds  will  be
              placed  into  power  save mode by SuspendProgram.  A value of -1
              disables  power  save  mode  and  is   the   default.    Related
              configuration   options  include  ResumeTimeout,  ResumeProgram,
              ResumeRate,   SuspendProgram,    SuspendRate,    SuspendTimeout,
              SuspendExcNodes, and SuspendExcParts.

       SuspendTimeout
               Maximum time permitted (in seconds) between when a node suspend
               request is issued and when the node is shut down.  At that time
               the node must be ready for a resume request to be issued as
               needed for
              new  work.   The  default  value   is   30   seconds.    Related
              configuration   options   include   ResumeProgram,   ResumeRate,
              ResumeTimeout,   SuspendRate,    SuspendTime,    SuspendProgram,
              SuspendExcNodes   and   SuspendExcParts.   More  information  is
              available       at       the        SLURM        web        site
              (https://computing.llnl.gov/linux/slurm/power_save.html).
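
               As a sketch only (the program paths, thresholds and names are
               site specific and purely illustrative), a power saving
               configuration combining these options might look like:

                   SuspendTime=1800
                   SuspendRate=60
                   ResumeRate=300
                   SuspendProgram=/usr/local/sbin/node_suspend
                   ResumeProgram=/usr/local/sbin/node_resume
                   SuspendTimeout=30
                   ResumeTimeout=60
                   SuspendExcParts=debug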

       SwitchType
              Identifies   the   type  of  switch  or  interconnect  used  for
              application   communications.    Acceptable    values    include
              "switch/none"  for switches not requiring special processing for
              job launch or termination (Myrinet, Ethernet,  and  InfiniBand),
              "switch/elan"  for  Quadrics Elan 3 or Elan 4 interconnect.  The
              default value is "switch/none".  All SLURM daemons, commands and
              running  jobs  must  be  restarted for a change in SwitchType to
              take effect.  If running jobs exist at  the  time  slurmctld  is
              restarted with a new value of SwitchType, records of all jobs in
              any state may be lost.

       TaskEpilog
               Fully qualified pathname of a program to be executed as the slurm
              job’s  owner after termination of each task.  See TaskProlog for
              execution order details.

       TaskPlugin
              Identifies the type of task launch  plugin,  typically  used  to
              provide resource management within a node (e.g. pinning tasks to
              specific processors).  Acceptable values include "task/none" for
              systems  requiring  no  special  handling and "task/affinity" to
              enable the  --cpu_bind  and/or  --mem_bind  srun  options.   The
              default  value  is  "task/none".   If  you  "task/affinity"  and
              encounter problems, it may be due to the variety of system calls
              used  to implement task affinity on different operating systems.
              If that is the case, you may want to use Portable Linux  Process
              Affinity   (PLPA,   see  http://www.open-mpi.org/software/plpa),
              which is supported by SLURM.

       TaskPluginParam
              Optional parameters  for  the  task  plugin.   Multiple  options
               should be comma separated.  If None, Sockets, Cores, Threads,
              and/or Verbose are specified, they will override the  --cpu_bind
              option  specified  by  the  user  in  the  srun  command.  None,
              Sockets, Cores and Threads are mutually exclusive and since they
              decrease  scheduling  flexibility  are not generally recommended
              (select no more than  one  of  them).   Cpusets  and  Sched  are
              mutually exclusive (select only one of them).

              Cores     Always  bind  to  cores.   Overrides  user  options or
                        automatic binding.

              Cpusets   Use cpusets to perform task  affinity  functions.   By
                        default, Sched task binding is performed.

              None      Perform  no  task  binding.  Overrides user options or
                        automatic binding.

              Sched     Use sched_setaffinity  or  plpa_sched_setaffinity  (if
                        available) to bind tasks to processors.

              Sockets   Always  bind  to  sockets.   Overrides user options or
                        automatic binding.

              Threads   Always bind to threads.   Overrides  user  options  or
                        automatic binding.

              Verbose   Verbosely  report binding before tasks run.  Overrides
                        user options.
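
               For illustration only, binding tasks to cores using cpusets
               and reporting the binding might be configured as:

                   TaskPlugin=task/affinity
                   TaskPluginParam=Cpusets,Cores,Verbose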

       TaskProlog
               Fully qualified pathname of a program to be executed as the slurm
              job’s  owner  prior  to  initiation  of  each task.  Besides the
              normal environment variables, this has SLURM_TASK_PID  available
              to  identify the process ID of the task being started.  Standard
              output from this program of the form "export NAME=value" will be
              used  to  set  environment variables for the task being spawned.
              Standard output from this program of the form "print  ..."  will
              cause  that line (without the leading "print ") to be printed to
              the job’s standard output.   The  order  of  task  prolog/epilog
              execution is as follows:

              1. pre_launch(): function in TaskPlugin

              2.   TaskProlog:   system-wide   per  task  program  defined  in
              slurm.conf

              3. user prolog: job step specific task program defined using
                     srun’s    --task-prolog   option   or   SLURM_TASK_PROLOG
                     environment variable

              4. Execute the job step’s task

              5. user epilog: job step specific task program defined using
                     srun’s   --task-epilog   option   or    SLURM_TASK_EPILOG
                     environment variable

              6.   TaskEpilog:   system-wide   per  task  program  defined  in
              slurm.conf

              7. post_term(): function in TaskPlugin
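
               As an illustrative sketch (the path, variable name and message
               are hypothetical), a site might configure

                   TaskProlog=/usr/local/sbin/task_prolog

               and have that program write the following lines to its
               standard output:

                   export STEP_SCRATCH=/tmp/scratch
                   print task prolog complete

               which would set STEP_SCRATCH in the task’s environment and
               write "task prolog complete" to the job’s standard output.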

       TmpFS  Fully  qualified  pathname  of the file system available to user
              jobs  for  temporary  storage.  This  parameter   is   used   in
              establishing  a  node’s  TmpDisk  space.   The  default value is
              "/tmp".

       TopologyPlugin
              Identifies the plugin to be used  for  determining  the  network
              topology  and  optimizing  job  allocations  to minimize network
              contention.   Acceptable  values   include   "topology/3d_torus"
              (default  for  Cray  XT,  IBM  BlueGene  and  Sun  Constellation
               systems, best-fit logic over three-dimensional topology),
              "topology/none"  (default for other systems, best-fit logic over
              one-dimensional topology)  and  "topology/tree"  (determine  the
              network   topology   based   upon  information  contained  in  a
              topology.conf file).  See NETWORK TOPOLOGY  below  for  details.
              Additional  plugins  may  be provided in the future which gather
              topology information directly from the network.

       TrackWCKey
               Boolean yes or no.  Used to enable the display and tracking of
               the Workload Characterization Key.  Must be set to track wckey
               usage.

       TreeWidth
              Slurmd daemons use a virtual tree  network  for  communications.
              TreeWidth  specifies  the  width  of the tree (i.e. the fanout).
              The  default  value  is  50,  meaning  each  slurmd  daemon  can
              communicate  with  up  to  50 other slurmd daemons and over 2500
              nodes can be contacted with two message hops.  The default value
              will  work  well  for most clusters.  Optimal system performance
              can typically be achieved if TreeWidth is set to the square root
              of the number of nodes in the cluster for systems having no more
              than 2500 nodes or the cube root for larger systems.
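
               For example (illustrative only), on a cluster of roughly 900
               compute nodes the square root guideline above suggests:

                   TreeWidth=30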

       UnkillableStepProgram
              If the processes in a job step are determined to  be  unkillable
              for  a  period  of  time  specified by the UnkillableStepTimeout
              variable, the program specified by UnkillableStepProgram will be
              executed.   This  program can be used to take special actions to
              clean  up  the  unkillable  processes  and/or  notify   computer
               administrators.  The program will be run as SlurmdUser (usually
              "root").  By default no program is run.

       UnkillableStepTimeout
              The length of time, in seconds,  that  SLURM  will  wait  before
              deciding that processes in a job step are unkillable (after they
              have    been    signaled    with    SIGKILL)     and     execute
              UnkillableStepProgram  as  described above.  The default timeout
              value is 60 seconds.

       UsePAM If set to 1, PAM (Pluggable Authentication  Modules  for  Linux)
              will  be enabled.  PAM is used to establish the upper bounds for
              resource  limits.  With  PAM  support  enabled,   local   system
              administrators can dynamically configure system resource limits.
              Changing the upper bound of a resource limit will not alter  the
              limits  of  running  jobs,  only jobs started after a change has
              been made will pick up the new limits.  The default value  is  0
              (not to enable PAM support).  Remember that PAM also needs to be
              configured to support SLURM as a service.  For sites using PAM’s
              directory based configuration option, a configuration file named
              slurm should be created.  The  module-type,  control-flags,  and
              module-path names that should be included in the file are:
              auth        required      pam_localuser.so
              auth        required      pam_shells.so
              account     required      pam_unix.so
              account     required      pam_access.so
              session     required      pam_unix.so
              For sites configuring PAM with a general configuration file, the
              appropriate lines (see above), where slurm is the  service-name,
              should be added.

       WaitTime
              Specifies  how  many  seconds the srun command should by default
              wait after the first  task  terminates  before  terminating  all
              remaining  tasks.  The  "--wait" option on the srun command line
              overrides this value.  If set to 0, this  feature  is  disabled.
              May not exceed 65533 seconds.

       The configuration of nodes (or machines) to be managed by SLURM is also
       specified in /etc/slurm.conf.   Changes  in  node  configuration  (e.g.
       adding  nodes, changing their processor count, etc.) require restarting
       the slurmctld daemon.  Only  the  NodeName  must  be  supplied  in  the
       configuration  file.   All  other  node  configuration  information  is
       optional.  It is advisable to establish baseline  node  configurations,
       especially  if  the  cluster is heterogeneous.  Nodes which register to
       the system with less than the configured  resources  (e.g.  too  little
        memory) will be placed in the "DOWN" state to avoid scheduling jobs on
       them.  Establishing baseline configurations  will  also  speed  SLURM’s
       scheduling process by permitting it to compare job requirements against
       these (relatively few)  configuration  parameters  and  possibly  avoid
       having  to  check  job  requirements  against  every  individual node’s
       configuration.  The resources checked at node  registration  time  are:
       Procs, RealMemory and TmpDisk.  While baseline values for each of these
       can be established in the configuration file, the  actual  values  upon
       node  registration are recorded and these actual values may be used for
        scheduling purposes (depending upon the value of FastSchedule in the
        configuration file).

       Default  values  can  be specified with a record in which "NodeName" is
       "DEFAULT".  The default entry values will apply only to lines following
       it  in  the  configuration  file  and  the  default values can be reset
       multiple times in the configuration file with  multiple  entries  where
       "NodeName=DEFAULT".   The  "NodeName="  specification must be placed on
       every line describing the configuration  of  nodes.   In  fact,  it  is
       generally  possible  and  desirable to define the configurations of all
       nodes in  only  a  few  lines.   This  convention  permits  significant
       optimization in the scheduling of larger clusters.  In order to support
       the concept of jobs requiring consecutive nodes on some  architectures,
        node specifications should be placed in this file in consecutive order.
       No single node name may be listed more than once in  the  configuration
       file.   Use  "DownNodes="  to  record  the  state  of  nodes  which are
       temporarily  in  a  DOWN,  DRAIN  or  FAILING  state  without  altering
       permanent  configuration information.  A job step’s tasks are allocated
        to nodes in the order the nodes appear in the configuration file.
        There is presently no capability within SLURM to arbitrarily order a
        job step’s
       tasks.

       Multiple node names may be comma  separated  (e.g.  "alpha,beta,gamma")
       and/or a simple node range expression may optionally be used to specify
       numeric ranges of nodes to avoid building  a  configuration  file  with
       large  numbers  of  entries.  The node range expression can contain one
       pair of square brackets with a  sequence  of  comma  separated  numbers
       and/or ranges of numbers separated by a "-" (e.g. "linux[0-64,128]", or
       "lx[15,18,32-33]").  Note that the numeric ranges can  include  one  or
       more  leading  zeros to indicate the numeric portion has a fixed number
       of digits (e.g. "linux[0000-1023]").  Up to two numeric ranges  can  be
       included  in the expression (e.g. "rack[0-63]_blade[0-41]").  If one or
       more numeric expressions are included, one of them must be at  the  end
       of the name (e.g. "unit[0-31]rack" is invalid), but arbitrary names can
       always be used in a comma separated list.
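
        For example (node names and values are illustrative only), the single
        line

            NodeName=lx[0-15,32] Procs=8

        defines seventeen nodes named lx0 through lx15 plus lx32, each with
        eight processors.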

       On BlueGene systems only, the square brackets should contain  pairs  of
        three digit numbers separated by an "x".  These numbers indicate the
       boundaries of a rectangular prism (e.g.  "bgl[000x144,400x544]").   See
       BlueGene  documentation  for  more  details.   The  node  configuration
        specifies the following information:

       NodeName
              Name that SLURM uses to refer to a node (or base  partition  for
              BlueGene  systems).   Typically  this  would  be the string that
              "/bin/hostname -s" returns.  It may also be the fully  qualified
              domain   name   as   returned   by   "/bin/hostname   -f"  (e.g.
              "foo1.bar.com"), or any valid domain name  associated  with  the
              host through the host database (/etc/hosts) or DNS, depending on
              the resolver settings.  Note that  if  the  short  form  of  the
              hostname is not used, it may prevent use of hostlist expressions
              (the numeric portion in brackets must  be  at  the  end  of  the
              string).   Only  short  hostname  forms  are compatible with the
              switch/elan and switch/federation plugins at this time.  It  may
              also  be  an  arbitrary string if NodeHostname is specified.  If
              the NodeName is "DEFAULT", the values specified with that record
              will  apply  to subsequent node specifications unless explicitly
              set to other values in that  node  record  or  replaced  with  a
              different set of default values.  For architectures in which the
              node order is significant, nodes will be considered  consecutive
              in  the  order  defined.   For example, if the configuration for
              "NodeName=charlie" immediately  follows  the  configuration  for
              "NodeName=baker"   they  will  be  considered  adjacent  in  the
              computer.

       NodeHostname
              Typically this would  be  the  string  that  "/bin/hostname  -s"
              returns.   It  may  also  be  the fully qualified domain name as
              returned by "/bin/hostname -f"  (e.g.  "foo1.bar.com"),  or  any
              valid  domain  name  associated  with  the host through the host
              database  (/etc/hosts)  or  DNS,  depending  on   the   resolver
              settings.   Note  that  if the short form of the hostname is not
              used, it may prevent use of hostlist  expressions  (the  numeric
              portion  in  brackets  must  be at the end of the string).  Only
              short hostname forms are compatible  with  the  switch/elan  and
              switch/federation plugins at this time.  A node range expression
              can be used to specify a set of  nodes.   If  an  expression  is
              used,  the  number of nodes identified by NodeHostname on a line
              in the configuration file must be identical  to  the  number  of
              nodes identified by NodeName.  By default, the NodeHostname will
              be identical in value to NodeName.

       NodeAddr
              Name that a  node  should  be  referred  to  in  establishing  a
              communications  path.   This name will be used as an argument to
              the gethostbyname() function  for  identification.   If  a  node
              range  expression is used to designate multiple nodes, they must
              exactly   match   the   entries   in    the    NodeName    (e.g.
              "NodeName=lx[0-7]   NodeAddr="elx[0-7]").    NodeAddr  may  also
              contain  IP  addresses.   By  default,  the  NodeAddr  will   be
              identical in value to NodeName.

       CoresPerSocket
              Number  of  cores  in  a  single physical processor socket (e.g.
              "2").  The CoresPerSocket value describes  physical  cores,  not
              the  logical number of processors per socket.  NOTE: If you have
              multi-core processors, you will  likely  need  to  specify  this
              parameter in order to optimize scheduling.  The default value is
              1.

       Feature
              A comma delimited list of arbitrary strings indicative  of  some
              characteristic  associated  with  the  node.   There is no value
               associated with a feature at this time; a node either has a
              feature  or  it  does  not.   If desired a feature may contain a
              numeric component indicating, for example, processor speed.   By
              default a node has no features.

       Procs  Number  of  logical processors on the node (e.g. "2").  If Procs
               is omitted, it will be set equal to the product of Sockets,
              CoresPerSocket, and ThreadsPerCore.  The default value is 1.

       RealMemory
              Size of real memory on the node in MegaBytes (e.g. "2048").  The
              default value is 1.

       Reason Identifies  the  reason  for  a  node  being  in  state  "DOWN",
              "DRAINED"  "DRAINING",  "FAIL"  or  "FAILING".   Use  quotes  to
              enclose a reason having more than one word.

       Sockets
              Number of physical processor sockets/chips  on  the  node  (e.g.
              "2").   If  Sockets  is omitted, it will be inferred from Procs,
              CoresPerSocket,  and  ThreadsPerCore.    NOTE:   If   you   have
              multi-core  processors,  you  will  likely need to specify these
              parameters.  The default value is 1.

       State  State of the node with respect to the initiation of  user  jobs.
              Acceptable  values  are  "DOWN",  "DRAIN", "FAIL", "FAILING" and
              "UNKNOWN".  "DOWN" indicates the node failed and is  unavailable
              to be allocated work.  "DRAIN" indicates the node is unavailable
              to be allocated work.  "FAIL" indicates the node is expected  to
              fail  soon,  has  no  jobs  allocated  to  it,  and  will not be
              allocated to any new jobs.   "FAILING"  indicates  the  node  is
              expected to fail soon, has one or more jobs allocated to it, but
              will not be allocated to any new jobs.  "UNKNOWN" indicates  the
              node’s   state   is  undefined  (BUSY  or  IDLE),  but  will  be
              established when the slurmd daemon on that node registers.   The
              default  value  is  "UNKNOWN".  Also see the DownNodes parameter
              below.

       ThreadsPerCore
              Number of logical threads in a single physical core (e.g.  "2").
               Note that SLURM can allocate resources to jobs down to the
               resolution of a core.  If your system is configured with more
               than one thread per core, execution of a different job on each
               thread is not supported.  A job can execute one task per
               thread from within one job step or execute a distinct job step
              on each of the threads.  Note also if you are running with  more
              than  1  thread  per core and running the select/cons_res plugin
              you will  want  to  set  the  SelectTypeParameters  variable  to
              something  other  than  CR_CPU to avoid unexpected results.  The
              default value is 1.

       TmpDisk
              Total size of temporary disk storage in TmpFS in MegaBytes (e.g.
              "16384").  TmpFS  (for  "Temporary  File System") identifies the
              location which jobs should use for temporary storage.  Note this
              does not indicate the amount of free space available to the user
               on the node, only the total file system size.  The system
               administrator should ensure this file system is purged as
              needed so that user jobs have access to most of this space.  The
              Prolog  and/or  Epilog  programs (specified in the configuration
               file) might be used to ensure the file system is kept clean.
              The default value is 0.

       Weight The  priority  of  the node for scheduling purposes.  All things
              being equal, jobs will be allocated the nodes  with  the  lowest
              weight  which  satisfies  their  requirements.   For  example, a
              heterogeneous collection of nodes might be placed into a  single
              partition  for  greater  system  utilization, responsiveness and
              capability. It would be preferable to  allocate  smaller  memory
              nodes  rather  than larger memory nodes if either will satisfy a
              job’s requirements.  The units  of  weight  are  arbitrary,  but
              larger weights should be assigned to nodes with more processors,
              memory, disk space, higher processor speed, etc.  Note that if a
              job allocation request can not be satisfied using the nodes with
              the lowest weight, the set of nodes with the next lowest  weight
              is added to the set of nodes under consideration for use (repeat
              as needed for higher weight values). If you absolutely  want  to
              minimize  the  number  of higher weight nodes allocated to a job
              (at a cost of higher scheduling  overhead),  give  each  node  a
              distinct  Weight  value  and  they  will be added to the pool of
              nodes being considered for scheduling individually.  The default
              value is 1.
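
        As an illustration (the names and values are examples only), a
        default record combined with node range expressions keeps the node
        configuration compact:

            NodeName=DEFAULT Sockets=2 CoresPerSocket=4 RealMemory=16384
            NodeName=linux[0001-0128]
            NodeName=bigmem[01-04] RealMemory=65536 Weight=10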

       The  "DownNodes=" configuration permits you to mark certain nodes as in
       a DOWN, DRAIN, FAIL, or FAILING state without  altering  the  permanent
       configuration information listed under a "NodeName=" specification.

       DownNodes
              Any  node  name,  or  list  of  node names, from the "NodeName="
              specifications.

       Reason Identifies the reason for a node being in state "DOWN", "DRAIN",
              "FAIL"  or "FAILING.  Use quotes to enclose a reason having more
              than one word.

       State  State of the node with respect to the initiation of  user  jobs.
              Acceptable values are "BUSY", "DOWN", "DRAIN", "FAIL", "FAILING,
              "IDLE", and "UNKNOWN".  "DOWN" indicates the node failed and  is
              unavailable to be allocated work.  "DRAIN" indicates the node is
              unavailable to be allocated work.  "FAIL" indicates the node  is
              expected to fail soon, has no jobs allocated to it, and will not
              be allocated to any new jobs.  "FAILING" indicates the  node  is
              expected to fail soon, has one or more jobs allocated to it, but
              will not be allocated to any new jobs.  "FUTURE"  indicates  the
              node is defined for future use and need not exist when the SLURM
              daemons are started. These nodes can be made available  for  use
              simply  by  updating  the  node state using the scontrol command
              rather than restarting the slurmctld daemon. After  these  nodes
              are  made  available, change their State in the slurm.conf file.
              Until these nodes are made available,  they  will  not  be  seen
               using any SLURM commands, nor will any attempt be made to
              contact them.  "UNKNOWN" indicates the node’s state is undefined
              (BUSY  or  IDLE), but will be established when the slurmd daemon
              on that node registers.  The default value is "UNKNOWN".
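
        For example (the node names and reason are illustrative only), the
        line

            DownNodes=linux[10-12] State=DOWN Reason="power supply failed"

        marks three nodes DOWN without altering their permanent "NodeName="
        configuration.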

       The partition configuration permits  you  to  establish  different  job
       limits  or access controls for various groups (or partitions) of nodes.
       Nodes may be in more than one partition,  making  partitions  serve  as
       general  purpose queues.  For example one may put the same set of nodes
       into two different partitions, each with  different  constraints  (time
       limit, job sizes, groups allowed to use the partition, etc.).  Jobs are
       allocated resources within a single partition.  Default values  can  be
       specified  with  a  record  in which "PartitionName" is "DEFAULT".  The
       default entry values will apply only  to  lines  following  it  in  the
       configuration  file  and the default values can be reset multiple times
       in   the   configuration   file    with    multiple    entries    where
       "PartitionName=DEFAULT".   The  "PartitionName="  specification must be
       placed on every line describing the configuration of partitions.  NOTE:
       Put  all  parameters for each partition on a single line.  Each line of
       partition  configuration  information  should  represent  a   different
       partition.   The  partition  configuration  file contains the following
       information:

       AllocNodes
              Comma separated list of nodes from which users can execute  jobs
              in  the  partition.   Node names may be specified using the node
              range expression syntax described above.  The default  value  is
              "ALL".

       AllowGroups
              Comma  separated list of group IDs which may execute jobs in the
              partition.  If at least  one  group  associated  with  the  user
              attempting  to  execute  the  job  is in AllowGroups, he will be
              permitted to use this partition.  Jobs executed as user root can
              use  any  partition  without regard to the value of AllowGroups.
              If user root attempts to execute a job  as  another  user  (e.g.
               using srun’s --uid option), this other user must be in one of the
              groups identified by AllowGroups for  the  job  to  successfully
              execute.  The default value is "ALL".

       Default
              If  this  keyword  is  set,  jobs  submitted without a partition
              specification will utilize this partition.  Possible values  are
              "YES" and "NO".  The default value is "NO".

       DisableRootJobs
              If  set  to  "YES" then user root will be prevented from running
              any jobs on this partition.  The default value will be the value
              of  DisableRootJobs  set  outside  of  a partition specification
              (which is "NO", allowing user root to execute jobs).

       Hidden Specifies if the partition and its jobs  are  to  be  hidden  by
              default.   Hidden  partitions will by default not be reported by
              the SLURM APIs or commands.  Possible values are "YES" and "NO".
              The default value is "NO".

       MaxNodes
              Maximum  count of nodes (c-nodes for BlueGene systems) which may
              be  allocated  to  any  single  job.   The  default   value   is
              "UNLIMITED",  which is represented internally as -1.  This limit
              does not apply to jobs executed by SlurmUser or user root.

       MaxTime
              Maximum  run  time  limit  for   jobs.    Format   is   minutes,
              minutes:seconds,        hours:minutes:seconds,       days-hours,
              days-hours:minutes, days-hours:minutes:seconds  or  "UNLIMITED".
              Time  resolution  is one minute and second values are rounded up
              to the next minute.  This limit does not apply to jobs  executed
              by SlurmUser or user root.

       DefaultTime
              Run  time limit used for jobs that don’t specify a value. If not
              set then MaxTime will be  used.   Format  is  the  same  as  for
              MaxTime.

       MinNodes
              Minimum count of nodes (or base partitions for BlueGene systems)
              which may be allocated to any single job.  The default value  is
              1.   This  limit does not apply to jobs executed by SlurmUser or
              user root.

       Nodes  Comma separated list of nodes (or base partitions  for  BlueGene
              systems)  which  are associated with this partition.  Node names
              may  be  specified  using  the  node  range  expression   syntax
              described  above.  A blank list of nodes (i.e. "Nodes= ") can be
              used if one wants a partition to exist, but  have  no  resources
              (possibly on a temporary basis).

       PartitionName
              Name   by   which   the   partition   may  be  referenced  (e.g.
              "Interactive").  This  name  can  be  specified  by  users  when
              submitting  jobs.  If the PartitionName is "DEFAULT", the values
              specified with that record will apply  to  subsequent  partition
              specifications  unless  explicitly  set  to other values in that
              partition record or replaced with a  different  set  of  default
              values.

       Priority
              Jobs submitted to a higher priority partition will be dispatched
              before pending jobs in lower priority partitions and if possible
              they  will  preempt running jobs from lower priority partitions.
              Note that a partition’s priority takes precedence over  a  job’s
              priority.  The value may not exceed 65533.

       RootOnly
              Specifies  if  only  user  ID zero (i.e. user root) may allocate
              resources in this partition. User root  may  allocate  resources
              for  any  other  user, but the request must be initiated by user
              root.  This option can be useful for a partition to  be  managed
              by  some  external  entity (e.g. a higher-level job manager) and
              prevents users from directly using  those  resources.   Possible
              values are "YES" and "NO".  The default value is "NO".

       Shared Controls  the  ability of the partition to execute more than one
              job at a time on each resource (node, socket or  core  depending
              upon the value of SelectTypeParameters).  If resources are to be
              shared, avoiding memory  over-subscription  is  very  important.
              SelectTypeParameters  should  be configured to treat memory as a
              consumable resource and the --mem option should be used for  job
              allocations.  Sharing of resources is typically useful only when
              using gang scheduling (PreemptMode=suspend or PreemptMode=kill).
              Possible  values for Shared are "EXCLUSIVE", "FORCE", "YES", and
              "NO".  The default value is "NO".  For more information see  the
              following web pages:
              https://computing.llnl.gov/linux/slurm/cons_res.html,
              https://computing.llnl.gov/linux/slurm/cons_res_share.html,
              https://computing.llnl.gov/linux/slurm/gang_scheduling.html, and
              https://computing.llnl.gov/linux/slurm/preempt.html.

              EXCLUSIVE   Allocates   entire   nodes   to   jobs   even   with
                          select/cons_res   configured.    Jobs  that  run  in
                          partitions   with   "Shared=EXCLUSIVE"   will   have
                          exclusive access to all allocated nodes.

              FORCE       Makes  all  resources in the partition available for
                          sharing without any means for users to  disable  it.
                          May  be  followed with a colon and maximum number of
                          jobs in running or  suspended  state.   For  example
                          "Shared=FORCE:4"  enables  each node, socket or core
                          to execute up to four  jobs  at  once.   Recommended
                          only  for  BlueGene  systems  configured  with small
                          blocks or for systems running with  gang  scheduling
                          (SchedulerType=sched/gang).

              YES         Makes  all  resources in the partition available for
                          sharing, but honors a user’s request  for  dedicated
                          resources.    If   SelectType=select/cons_res,  then
                          resources will be over-subscribed unless  explicitly
                          disabled   in  the  job  submit  request  using  the
                          "--exclusive"             option.               With
                          SelectType=select/bluegene                        or
                          SelectType=select/linear,  resources  will  only  be
                          over-subscribed  when  explicitly  requested  by the
                          user using the "--share" option on  job  submission.
                          May  be  followed with a colon and maximum number of
                          jobs in running or  suspended  state.   For  example
                          "Shared=YES:4"  enables each node, socket or core to
                          execute up to four jobs at once.   Recommended  only
                          for    systems    running   with   gang   scheduling
                          (SchedulerType=sched/gang).

              NO          Selected resources are allocated to a single job. No
                          resource will be allocated to more than one job.

       State  State of partition or availability for use.  Possible values are
              "UP" or "DOWN". The default value is "UP".

Prolog and Epilog Scripts

       There are a variety of prolog and epilog program options  that  execute
       with  various  permissions and at various times.  The four options most
       likely to be used are: Prolog and Epilog (executed once on each compute
       node  for  each job) plus PrologSlurmctld and EpilogSlurmctld (executed
       once on the ControlMachine for each job).

       NOTE:  Standard output and error messages are normally  not  preserved.
       Explicitly  write  output and error messages to an appropriate location
        if you wish to preserve that information.

       NOTE:  The Prolog script is run on an individual node only when that
       node first sees a job step from a new allocation; it is not run when
       the allocation is granted.  If no job steps from an allocation are
       ever run on a node, the Prolog is never run on that node for that
       allocation.  The Epilog, on the other hand, always runs on every node
       of an allocation when the allocation is released.

       Information  about  the  job  is passed to the script using environment
       variables.  Unless otherwise specified, these environment variables are
       available to all of the programs.  Sample scripts follow the list of
       variables below.

       BASIL_RESERVATION_ID
              Basil reservation ID.  Available on Cray XT systems only.

       MPIRUN_PARTITION
              BlueGene partition name.  Available on BlueGene systems only.

       SLURM_JOB_ACCOUNT
              Account name used for the job.  Available in PrologSlurmctld and
              EpilogSlurmctld only.

       SLURM_JOB_CONSTRAINTS
              Features required to run the job.  Available in  PrologSlurmctld
              and EpilogSlurmctld only.

       SLURM_JOB_GID
              Group  ID  of the job’s owner.  Available in PrologSlurmctld and
              EpilogSlurmctld only.

       SLURM_JOB_GROUP
              Group name of the job’s owner.  Available in PrologSlurmctld and
              EpilogSlurmctld only.

       SLURM_JOB_ID
              Job ID.

       SLURM_JOB_NAME
              Name   of   the   job.    Available   in   PrologSlurmctld   and
              EpilogSlurmctld only.

        SLURM_JOB_NODELIST
               Nodes assigned to the job, as a SLURM hostlist expression.
               "scontrol show hostnames" can be used to convert this to a
               list of individual host names.  Available in PrologSlurmctld
               and EpilogSlurmctld only.

        SLURM_JOB_PARTITION
               Partition that the job runs in.  Available in PrologSlurmctld
               and EpilogSlurmctld only.

       SLURM_JOB_UID
              User ID of the job’s owner.

       SLURM_JOB_USER
              User name of the job’s owner.
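
       The scripts themselves are site specific.  The sketches below are for
       illustration only (the log file locations are arbitrary) and show how
       the environment variables above might be used.  A Prolog run once per
       job on each compute node:

        #!/bin/sh
        # Hypothetical Prolog script (runs on each compute node).
        # Standard output is not preserved, so write to a file explicitly.
        echo "`date` start job $SLURM_JOB_ID uid $SLURM_JOB_UID on `hostname`" \
             >> /var/log/slurm/prolog.log
        exit 0

       A PrologSlurmctld run once per job on the ControlMachine, which uses
       "scontrol show hostnames" to expand the hostlist expression in
       SLURM_JOB_NODELIST:

        #!/bin/sh
        # Hypothetical PrologSlurmctld script (runs on the ControlMachine).
        for host in `scontrol show hostnames "$SLURM_JOB_NODELIST"`; do
            echo "`date` job $SLURM_JOB_ID ($SLURM_JOB_USER) assigned $host" \
                 >> /var/log/slurm/jobs.log
        done
        exit 0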

NETWORK TOPOLOGY

       SLURM is able to optimize job allocations to minimize network
       contention.  Special SLURM logic is used to optimize allocations on
       systems with a three-dimensional interconnect (BlueGene, Sun
       Constellation, etc.); information about configuring those systems is
       available at <https://computing.llnl.gov/linux/slurm/>.  For a
       hierarchical network, SLURM needs detailed information about how nodes
       are connected to the network switches.

       Given  network  topology  information,  SLURM  allocates all of a job’s
       resources onto a single leaf of  the  network  (if  possible)  using  a
       best-fit  algorithm.  Otherwise it will allocate a job’s resources onto
       multiple leaf switches so  as  to  minimize  the  use  of  higher-level
       switches.   The  TopologyPlugin parameter controls which plugin is used
       to collect network topology information.   The  only  values  presently
       supported  are  "topology/3d_torus"  (default  for  IBM  BlueGene,  Sun
       Constellation  and  Cray  XT  systems,  performs  best-fit  logic  over
       three-dimensional topology), "topology/none" (default for other
       systems, best-fit logic over a one-dimensional topology), and
       "topology/tree" (determines the network topology based upon information
       contained in a topology.conf file; see "man topology.conf" for more
       information).
       Future  plugins  may  gather  topology  information  directly  from the
       network.  The topology information is optional.  If not provided, SLURM
       will  perform  a  best-fit  algorithm  assuming  the  nodes  are  in  a
       one-dimensional array as configured  and  the  communications  cost  is
       related to the node distance in this array.
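
       As an illustration only (the switch and node names are hypothetical),
       a small tree network could be described by selecting the plugin in
       slurm.conf and listing the switch hierarchy in topology.conf:

        # In slurm.conf
        TopologyPlugin=topology/tree
        #
        # In topology.conf (hypothetical switch and node names)
        SwitchName=leaf1 Nodes=dev[0-12]
        SwitchName=leaf2 Nodes=dev[13-25]
        SwitchName=spine Switches=leaf[1-2]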

RELOCATING CONTROLLERS

       If  the  cluster’s  computers used for the primary or backup controller
       will be out of service for an  extended  period  of  time,  it  may  be
       desirable to relocate them.  In order to do so, follow this procedure:

       1. Stop the SLURM daemons
       2. Modify the slurm.conf file appropriately
       3. Distribute the updated slurm.conf file to all nodes
       4. Restart the SLURM daemons

       There should be no loss of any running or pending jobs.  Ensure that
       any nodes added to the cluster have the current slurm.conf file
       installed.
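
       The following is a sketch of that procedure (the host names and the
       copy method are examples only, assuming the controller is moved from
       dev0 to a hypothetical host dev2):

        # 1. Stop the SLURM daemons cluster-wide
        scontrol shutdown
        # 2. Edit slurm.conf, e.g. change "ControlMachine=dev0" to
        #    "ControlMachine=dev2"
        # 3. Distribute the updated file to all nodes, e.g.
        #    scp /etc/slurm.conf dev2:/etc/   (repeat for each node)
        # 4. Restart the daemons
        slurmctld                # on the new controller (dev2)
        slurmd                   # on each compute node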

       CAUTION:  If two nodes are simultaneously configured as the primary
       controller (two nodes on which ControlMachine specifies the local host
       and on each of which the slurmctld daemon is executing), system
       behavior will be destructive.  If a compute node has an incorrect
       ControlMachine or BackupController parameter, that node may be rendered
       unusable, but no other harm will result.

EXAMPLE

       #
       # Sample /etc/slurm.conf for dev[0-25].llnl.gov
       # Author: John Doe
       # Date: 11/06/2001
       #
       ControlMachine=dev0
       ControlAddr=edev0
       BackupController=dev1
       BackupAddr=edev1
       #
       AuthType=auth/munge
       Epilog=/usr/local/slurm/epilog
       Prolog=/usr/local/slurm/prolog
       FastSchedule=1
       FirstJobId=65536
       InactiveLimit=120
       JobCompType=jobcomp/filetxt
       JobCompLoc=/var/log/slurm/jobcomp
       KillWait=30
       MaxJobCount=10000
       MinJobAge=3600
       PluginDir=/usr/local/lib:/usr/local/slurm/lib
       ReturnToService=0
       SchedulerType=sched/backfill
       SlurmctldLogFile=/var/log/slurm/slurmctld.log
       SlurmdLogFile=/var/log/slurm/slurmd.log
       SlurmctldPort=7002
       SlurmdPort=7003
       SlurmdSpoolDir=/usr/local/slurm/slurmd.spool
       StateSaveLocation=/usr/local/slurm/slurm.state
       SwitchType=switch/elan
       TmpFS=/tmp
       WaitTime=30
       JobCredentialPrivateKey=/usr/local/slurm/private.key
       JobCredentialPublicCertificate=/usr/local/slurm/public.cert
       #
       # Node Configurations
       #
       NodeName=DEFAULT Procs=2 RealMemory=2000 TmpDisk=64000
       NodeName=DEFAULT State=UNKNOWN
       NodeName=dev[0-25] NodeAddr=edev[0-25] Weight=16
       # Update records for specific DOWN nodes
       DownNodes=dev20 State=DOWN Reason="power,ETA=Dec25"
       #
       # Partition Configurations
       #
       PartitionName=DEFAULT MaxTime=30 MaxNodes=10 State=UP
       PartitionName=debug Nodes=dev[0-8,18-25] Default=YES
       PartitionName=batch Nodes=dev[9-17]  MinNodes=4
       PartitionName=long Nodes=dev[9-17] MaxTime=120 AllowGroups=admin

COPYING

       Copyright (C) 2002-2007 The Regents of the  University  of  California.
       Copyright (C) 2008-2009 Lawrence Livermore National Security.  Produced
       at   Lawrence   Livermore   National   Laboratory   (cf.  DISCLAIMER).
       CODE-OCEC-09-009. All rights reserved.

       This  file  is  part  of  SLURM,  a  resource  management program.  For
       details, see <https://computing.llnl.gov/linux/slurm/>.

       SLURM is free software; you can redistribute it and/or modify it  under
       the  terms  of  the GNU General Public License as published by the Free
       Software Foundation; either version 2  of  the  License,  or  (at  your
       option) any later version.

       SLURM  is  distributed  in the hope that it will be useful, but WITHOUT
       ANY WARRANTY; without even the implied warranty of  MERCHANTABILITY  or
       FITNESS  FOR  A PARTICULAR PURPOSE.  See the GNU General Public License
       for more details.

FILES

       /etc/slurm.conf

SEE ALSO

       bluegene.conf(5),     gethostbyname(3),     getrlimit(2),     group(5),
       hostname(1),   scontrol(1),   slurmctld(8),   slurmd(8),   slurmdbd(8),
       slurmdbd.conf(5),  srun(1),  spank(8),   syslog(2),   topology.conf(5),
       wiki.conf(5)