Provided by: slurm-llnl_2.3.2-1ubuntu1_amd64

NAME

       slurm.conf - Slurm configuration file

DESCRIPTION

       slurm.conf  is  an ASCII file which describes general SLURM configuration information, the
       nodes to be managed, information about how those nodes are grouped  into  partitions,  and
       various  scheduling  parameters  associated  with  those  partitions.  This file should be
       consistent across all nodes in the cluster.

       The file location can be modified  at  system  build  time  using  the  DEFAULT_SLURM_CONF
       parameter  or  at execution time by setting the SLURM_CONF environment variable. The SLURM
       daemons also allow you to override both the  built-in  and  environment-provided  location
       using the "-f" option on the command line.

       The  contents  of  the  file  are  case  insensitive  except  for  the  names of nodes and
       partitions. Any text following a "#" in the configuration file is  treated  as  a  comment
       through  the  end  of  that  line.   The  size of each line in the file is limited to 1024
       characters.  Changes to the configuration file take effect upon restart of SLURM  daemons,
       daemon  receipt  of  the SIGHUP signal, or execution of the command "scontrol reconfigure"
       unless otherwise noted.

       If a line begins with the word "Include" followed by whitespace and then a file name, that
       file will be included inline with the current configuration file.
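
        For example, site-specific node or partition definitions could be kept in a separate file
        and pulled in with an Include line (the file name below is purely illustrative):

               Include /etc/slurm-llnl/nodes.conf

        The included file is parsed as if its contents appeared at that point in slurm.conf.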

       Note on file permissions:

       The  slurm.conf  file  must be readable by all users of SLURM, since it is used by many of
       the SLURM commands.  Other files that are defined in the  slurm.conf  file,  such  as  log
       files and job accounting files, may need to be created/owned by the user "SlurmUser" to be
       successfully accessed.  Use the "chown" and "chmod" commands  to  set  the  ownership  and
       permissions appropriately.  See the section FILE AND DIRECTORY PERMISSIONS for information
       about the various files and directories used by SLURM.

PARAMETERS

       The overall configuration parameters available include:

       AccountingStorageBackupHost
              The name of the backup machine hosting the accounting storage  database.   If  used
              with  the  accounting_storage/slurmdbd  plugin,  this  is where the backup slurmdbd
              would be running.  Only used for database type storage plugins, ignored otherwise.

       AccountingStorageEnforce
              This controls  what  level  of  association-based  enforcement  to  impose  on  job
              submissions.   Valid  options are any combination of associations, limits, qos, and
              wckeys, or all for all things.  If limits, qos, or  wckeys  are  set,  associations
              will  automatically  be  set.   In  addition,  if  wckeys  is  set, TrackWCKey will
              automatically be set.  By enforcing Associations no  new  job  is  allowed  to  run
              unless  a  corresponding  association exists in the system.  If limits are enforced
              users can be limited by association to whatever job size or  run  time  limits  are
              defined.  With qos and/or wckeys enforced jobs will not be scheduled unless a valid
              qos    and/or    workload    characterization    key    is     specified.      When
              AccountingStorageEnforce  is changed, a restart of the slurmctld daemon is required
              (not just a "scontrol reconfig").
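
               For example, to require that jobs run only under known associations and honor both
               association limits and QOS rules, a configuration might include:

                      AccountingStorageEnforce=associations,limits,qos

               Because limits and qos are listed, associations is implied and need not be given
               explicitly.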

       AccountingStorageHost
              The name of the machine hosting the accounting storage  database.   Only  used  for
              database type storage plugins, ignored otherwise.  Also see DefaultStorageHost.

       AccountingStorageLoc
              The  fully  qualified  file  name  where  accounting  records  are written when the
              AccountingStorageType is "accounting_storage/filetxt"  or  else  the  name  of  the
              database  where  accounting  records are stored when the AccountingStorageType is a
              database.  Also see DefaultStorageLoc.

       AccountingStoragePass
              The password used to gain access to the database  to  store  the  accounting  data.
              Only  used  for  database  type storage plugins, ignored otherwise.  In the case of
              SLURM DBD (Database Daemon) with MUNGE authentication this can be configured to use
              a  MUNGE  daemon specifically configured to provide authentication between clusters
              while the default MUNGE daemon provides authentication within a cluster.   In  that
              case,   AccountingStoragePass  should  specify  the  named  port  to  be  used  for
              communications      with      the      alternate      MUNGE      daemon       (e.g.
              "/var/run/munge/global.socket.2").   The   default   value   is   NULL.   Also  see
              DefaultStoragePass.

       AccountingStoragePort
              The listening port of the  accounting  storage  database  server.   Only  used  for
              database type storage plugins, ignored otherwise.  Also see DefaultStoragePort.

       AccountingStorageType
              The  accounting  storage  mechanism  type.   Acceptable  values  at present include
              "accounting_storage/filetxt",                           "accounting_storage/mysql",
              "accounting_storage/none",              "accounting_storage/pgsql",             and
              "accounting_storage/slurmdbd".  The  "accounting_storage/filetxt"  value  indicates
              that   accounting   records   will   be  written  to  the  file  specified  by  the
              AccountingStorageLoc parameter.   The  "accounting_storage/mysql"  value  indicates
              that  accounting  records  will  be  written  to  a MySQL database specified by the
              AccountingStorageLoc parameter.   The  "accounting_storage/pgsql"  value  indicates
              that  accounting  records will be written to a PostgreSQL database specified by the
              AccountingStorageLoc parameter.  The "accounting_storage/slurmdbd" value  indicates
              that  accounting  records  will  be  written  to  the  SLURM  DBD, which manages an
              underlying MySQL or PostgreSQL database. See "man slurmdbd" for  more  information.
              The  default  value is "accounting_storage/none" and indicates that account records
               The  default  value is "accounting_storage/none" and indicates that account records
               are not maintained.  Note: the PostgreSQL plugin is not complete and should not  be
               used if you want to use associations.  It will, however, work with basic accounting
               of
              jobs  and  job  steps.   If  interested  in   completing,   please   email   slurm-
              dev@lists.llnl.gov.  Also see DefaultStorageType.
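
               For example, a cluster reporting to a SLURM DBD running on a separate database host
               might use settings along these lines (host name, port and cluster name are
               illustrative):

                      AccountingStorageType=accounting_storage/slurmdbd
                      AccountingStorageHost=dbhost
                      AccountingStoragePort=6819
                      ClusterName=mycluster

               With the slurmdbd plugin the underlying database is managed by the SLURM DBD; see
               "man slurmdbd".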

       AccountingStorageUser
              The  user  account  for  accessing  the accounting storage database.  Only used for
              database type storage plugins, ignored otherwise.  Also see DefaultStorageUser.

       AccountingStoreJobComment
              If set to "YES" then include the job's comment field in the  job  complete  message
              sent to the Accounting Storage database.  The default is "YES".

       AuthType
              The  authentication method for communications between SLURM components.  Acceptable
              values at present include "auth/none", "auth/authd", and "auth/munge".  The default
              value  is "auth/munge".  "auth/none" includes the UID in each communication, but it
              is not verified.  This may be fine for testing purposes, but do not use "auth/none"
              if  you  desire any security.  "auth/authd" indicates that Brett Chun's authd is to
              be used (see "http://www.theether.org/authd/" for more information. Note that authd
              is  no  longer actively supported).  "auth/munge" indicates that LLNL's MUNGE is to
              be used (this is  the  best  supported  authentication  mechanism  for  SLURM,  see
              "http://munge.googlecode.com/"  for  more  information).   All  SLURM  daemons  and
              commands must be terminated prior to changing  the  value  of  AuthType  and  later
              restarted (SLURM jobs can be preserved).

       BackupAddr
               The name by which BackupController should be referred to in  establishing  a
               communications path. This name will be used as an argument to  the  gethostbyname()
              function  for identification. For example, "elx0000" might be used to designate the
              Ethernet address for node "lx0000".  By default the BackupAddr will be identical in
              value to BackupController.

       BackupController
              The  name  of  the  machine where SLURM control functions are to be executed in the
              event that ControlMachine fails. This node may also be used as a compute server  if
              so  desired.  It  will  come  into service as a controller only upon the failure of
              ControlMachine and will revert to a "standby" mode when the ControlMachine  becomes
              available  once  again.   This  should be a node name without the full domain name.
              I.e., the hostname returned by the gethostname() function  cut  at  the  first  dot
              (e.g.  use  "tux001"  rather  than  "tux001.my.com").   While  not essential, it is
              recommended that you specify a backup controller.  See  the RELOCATING  CONTROLLERS
              section if you change this.

       BatchStartTimeout
               The maximum time (in seconds) that a batch job is permitted to wait for launch
               before being considered missing and its allocation released.  The default value is 10
              (seconds).  Larger  values  may be required if more time is required to execute the
              Prolog, load user environment variables (for Moab spawned jobs), or if  the  slurmd
              daemon gets paged from memory.

       CacheGroups
               If  set  to  1, the slurmd daemon will cache /etc/group entries.  This can improve
              performance for highly parallel jobs if NIS servers are used and unable to  respond
              very quickly.  The default value is 0 to disable caching group data.

       CheckpointType
              The  system-initiated  checkpoint  method  to be used for user jobs.  The slurmctld
              daemon must be restarted for a change in CheckpointType to take effect.   Supported
              values presently include:

              checkpoint/aix    for AIX systems only

              checkpoint/blcr   Berkeley Lab Checkpoint Restart (BLCR).  NOTE: If a file is found
                                at sbin/scch (relative to the SLURM  installation  location),  it
                                will be executed upon completion of the checkpoint. This can be a
                                script used for managing the checkpoint files.

              checkpoint/none   no checkpoint support (default)

              checkpoint/ompi   OpenMPI (version 1.3 or higher)

       ClusterName
              The name by which this SLURM managed cluster is known in the  accounting  database.
               This is needed to distinguish accounting records when multiple clusters  report  to
               the same database.

       CompleteWait
              The time, in seconds, given for a job to remain  in  COMPLETING  state  before  any
              additional  jobs  are  scheduled.   If set to zero, pending jobs will be started as
              soon as possible.  Since a COMPLETING job's resources are released for use by other
              jobs  as  soon  as the Epilog completes on each individual node, this can result in
              very fragmented resource allocations.  To provide jobs with  the  minimum  response
              time,  a  value  of zero is recommended (no waiting).  To minimize fragmentation of
              resources, a value equal to KillWait  plus  two  is  recommended.   In  that  case,
              setting  KillWait  to  a  small  value  may  be  beneficial.   The default value of
              CompleteWait is zero seconds.  The value may not exceed 65533.

       ControlAddr
               The name by which ControlMachine should be referred to in establishing a
               communications path. This name will be used as an argument to the gethostbyname()
               function for
              identification. For example, "elx0000" might be  used  to  designate  the  Ethernet
              address  for  node "lx0000".  By default the ControlAddr will be identical in value
              to ControlMachine.

       ControlMachine
              The short hostname of the machine where SLURM control functions are executed  (i.e.
              the  name  returned  by  the  command  "hostname  -s",  use  "tux001"  rather  than
              "tux001.my.com").  This value must be specified.  In order  to  support  some  high
              availability  architectures, multiple hostnames may be listed with comma separators
               and one ControlAddr must be specified. The high availability system  must  ensure
              that the slurmctld daemon is running on only one of these hosts at a time.  See the
              RELOCATING CONTROLLERS section if you change this.
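
               For example, a primary and backup controller reachable over a dedicated Ethernet
               network might be described as follows (host names are illustrative):

                      ControlMachine=tux001
                      ControlAddr=etux001
                      BackupController=tux002
                      BackupAddr=etux002

               If ControlAddr and BackupAddr are omitted, they default to ControlMachine and
               BackupController respectively.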

       CryptoType
              The  cryptographic  signature  tool  to  be  used  in  the  creation  of  job  step
              credentials.   The slurmctld daemon must be restarted for a change in CryptoType to
              take  effect.   Acceptable   values   at   present   include   "crypto/munge"   and
              "crypto/openssl".  The default value is "crypto/munge".

       DebugFlags
              Defines  specific  subsystems  which  should  provide  more detailed event logging.
              Multiple subsystems can be  specified  with  comma  separators.   Valid  subsystems
              available today (with more to come) include:

              Backfill         Backfill scheduler details

              BGBlockAlgo      BlueGene block selection details

              BGBlockAlgoDeep  BlueGene block selection, more details

              BGBlockPick      BlueGene block selection for jobs

              BGBlockWires     BlueGene block wiring (switch state details)

              CPU_Bind         CPU binding details for jobs and steps

              FrontEnd         Front end node details

              Gres             Generic resource details

              Gang             Gang scheduling details

               NO_CONF_HASH     Do not log when the slurm.conf file differs between SLURM daemons

              Priority         Job prioritization

              Reservation      Advanced reservations

              SelectType       Resource selection plugin

              Steps            Slurmctld resource allocation for job steps

              Triggers         Slurmctld triggers

              Wiki             Sched/wiki and wiki2 communications

       DefMemPerCPU
              Default  real  memory size available per allocated CPU in MegaBytes.  Used to avoid
              over-subscribing memory and causing paging.  DefMemPerCPU would generally  be  used
              if  individual  processors are allocated to jobs (SelectType=select/cons_res).  The
              default  value  is  0  (unlimited).   Also  see  DefMemPerNode  and   MaxMemPerCPU.
              DefMemPerCPU and DefMemPerNode are mutually exclusive.  NOTE: Enforcement of memory
              limits currently requires enabling of accounting, which samples  memory  use  on  a
              periodic basis (data need not be stored, just collected).

       DefMemPerNode
              Default  real memory size available per allocated node in MegaBytes.  Used to avoid
              over-subscribing memory and causing paging.  DefMemPerNode would generally be  used
              if  whole  nodes are allocated to jobs (SelectType=select/linear) and resources are
              shared (Shared=yes or Shared=force).  The default value is 0 (unlimited).  Also see
              DefMemPerCPU  and  MaxMemPerNode.   DefMemPerCPU  and  DefMemPerNode  are  mutually
              exclusive.  NOTE: Enforcement of  memory  limits  currently  requires  enabling  of
              accounting,  which samples memory use on a periodic basis (data need not be stored,
              just collected).
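
               For example, on a cluster allocating individual processors to jobs, a site might
               set a per-CPU default and cap such as the following (values are illustrative):

                      SelectType=select/cons_res
                      DefMemPerCPU=1024
                      MaxMemPerCPU=2048

               Remember that enforcement of these limits requires accounting to be enabled so that
               memory use can be sampled.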

       DefaultStorageHost
              The default name of the machine hosting the accounting storage and  job  completion
              databases.    Only   used   for   database   type  storage  plugins  and  when  the
              AccountingStorageHost and JobCompHost have not been defined.

       DefaultStorageLoc
              The fully qualified file  name  where  accounting  records  and/or  job  completion
              records  are  written  when  the DefaultStorageType is "filetxt" or the name of the
              database where accounting records and/or job completion records are stored when the
              DefaultStorageType is a database.  Also see AccountingStorageLoc and JobCompLoc.

       DefaultStoragePass
              The  password  used  to gain access to the database to store the accounting and job
              completion data.  Only used for database type storage plugins,  ignored  otherwise.
              Also see AccountingStoragePass and JobCompPass.

       DefaultStoragePort
              The listening port of the accounting storage and/or job completion database server.
              Only  used  for  database  type  storage  plugins,  ignored  otherwise.   Also  see
              AccountingStoragePort and JobCompPort.

       DefaultStorageType
              The  accounting  and  job  completion storage mechanism type.  Acceptable values at
              present include "filetxt", "mysql", "none", "pgsql",  and  "slurmdbd".   The  value
              "filetxt"  indicates  that  records  will  be written to a file.  The value "mysql"
              indicates that accounting records will be written to a mysql database.  The default
              value  is  "none",  which means that records are not maintained.  The value "pgsql"
              indicates that records will  be  written  to  a  PostgreSQL  database.   The  value
              "slurmdbd" indicates that records will be written to the SLURM DBD, which maintains
              its  own  database.  See  "man  slurmdbd"   for   more   information.    Also   see
              AccountingStorageType and JobCompType.

       DefaultStorageUser
              The  user  account  for  accessing  the  accounting  storage  and/or job completion
              database.  Only used for database type storage plugins,  ignored  otherwise.   Also
              see AccountingStorageUser and JobCompUser.

       DisableRootJobs
              If  set  to  "YES"  then  user  root  will be prevented from running any jobs.  The
              default  value  is  "NO",  meaning  user  root  will  be  able  to  execute   jobs.
              DisableRootJobs may also be set by partition.

       EnforcePartLimits
              If  set  to "YES" then jobs which exceed a partition's size and/or time limits will
              be rejected at submission time. If set to "NO" then the job will  be  accepted  and
              remain  queued  until the partition limits are altered.  The default value is "NO".
              NOTE: If set, then a job's QOS can not be used to exceed partition limits.

       Epilog Fully qualified pathname of a script to execute as user root on every node  when  a
              user's  job  completes  (e.g. "/usr/local/slurm/epilog"). This may be used to purge
              files, disable user login, etc.  By default there is no  epilog.   See  Prolog  and
              Epilog Scripts for more information.

       EpilogMsgTime
              The  number of microseconds that the slurmctld daemon requires to process an epilog
               completion message from the slurmd daemons. This parameter can be used to prevent a
              burst  of  epilog completion messages from being sent at the same time which should
              help prevent lost messages and improve throughput  for  large  jobs.   The  default
              value  is  2000  microseconds.   For  a  1000  node  job,  this  spreads the epilog
              completion messages out over two seconds.

       EpilogSlurmctld
              Fully qualified pathname of a program for the slurmctld to execute upon termination
              of  a  job  allocation  (e.g.   "/usr/local/slurm/epilog_controller").  The program
              executes as SlurmUser, which gives it permission to drain nodes and requeue the job
              if  a  failure occurs or cancel the job if appropriate.  The program can be used to
              reboot nodes or perform other work to prepare resources for use.   See  Prolog  and
              Epilog Scripts for more information.

       FastSchedule
              Controls  how a node's configuration specifications in slurm.conf are used.  If the
              number of node configuration entries in the  configuration  file  is  significantly
              lower  than  the number of nodes, setting FastSchedule to 1 will permit much faster
              scheduling decisions to be made.  (The scheduler can just check the values in a few
              configuration records instead of possibly thousands of node records.)  Note that on
              systems with hyper-threading, the processor count reported  by  the  node  will  be
              twice  the  actual  processor  count.  Consider which value you want to be used for
              scheduling purposes.

              1 (default)
                   Consider the configuration of each node to be that specified in the slurm.conf
                   configuration  file  and any node with less than the configured resources will
                   be set DOWN.

              0    Base scheduling decisions upon the actual  configuration  of  each  individual
                   node  except  that  the  node's  processor count in SLURM's configuration must
                   match  the  actual  hardware  configuration  if  SchedulerType=sched/gang   or
                   SelectType=select/cons_res  are  configured  (both  of  those plugins maintain
                   resource allocation information using bitmaps for the cores in the system  and
                   must  remain static, while the node's memory and disk space can be established
                   later).

              2    Consider the configuration of each node to be that specified in the slurm.conf
                   configuration  file  and any node with less than the configured resources will
                   not be set DOWN.  This can be useful for testing purposes.
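
               For example, a homogeneous cluster might rely on the configured node definitions
               for scheduling (node names and resource values are illustrative):

                      FastSchedule=1
                      NodeName=tux[001-032] CPUs=8 RealMemory=16000 State=UNKNOWN

               With FastSchedule=1, any node reporting fewer resources than configured above will
               be set DOWN.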

       FirstJobId
               The job id to be used for the first job submitted to SLURM  without  a  specific
               requested value.  Generated job id values will be incremented by 1 for each
               subsequent job.  This may be used to provide a meta-scheduler with a job id  space
               which is disjoint from the interactive jobs.  The default value is 1.  Also see
               MaxJobId.

       GetEnvTimeout
               Used for Moab scheduled jobs only. Controls how long a job should wait, in seconds,
               for loading the user's environment before attempting to load it from a cache  file.
              Applies  when  the  srun  or sbatch --get-user-env option is used. If set to 0 then
              always load the user's environment from the cache file.  The  default  value  is  2
              seconds.

       GresTypes
              A comma delimited list of generic resources to be managed.  These generic resources
              may have an associated plugin available to provide  additional  functionality.   No
               generic  resources  are  managed  by  default.  Ensure this parameter is consistent
              across all nodes in the cluster for proper operation.  The slurmctld daemon must be
              restarted for changes to this parameter to become effective.
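
               For example, a cluster with GPU nodes might list the resource here and on the node
               definitions (a matching gres.conf on each node is also typically required; names
               and counts are illustrative):

                      GresTypes=gpu
                      NodeName=tux[001-016] Gres=gpu:2 CPUs=8 State=UNKNOWN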

       GroupUpdateForce
              If  set  to  a  non-zero  value,  then information about which users are members of
              groups allowed to use a partition will be updated  periodically,  even  when  there
              have  been  no  changes to the /etc/group file.  Otherwise group member information
               will  be  updated  periodically only after the /etc/group file is updated.  The
               default value is 0.  Also see the GroupUpdateTime parameter.

       GroupUpdateTime
              Controls how frequently information about which users are members of groups allowed
              to use a partition will be updated.  The time interval is given in seconds  with  a
              default  value of 600 seconds and a maximum value of 4095 seconds.  A value of zero
              will prevent periodic updating of  group  membership  information.   Also  see  the
              GroupUpdateForce parameter.

       HealthCheckInterval
              The  interval  in  seconds  between  executions of HealthCheckProgram.  The default
              value is zero, which disables execution.

       HealthCheckProgram
              Fully qualified pathname of a script to execute as user root  periodically  on  all
              compute  nodes that are not in the NOT_RESPONDING state. This may be used to verify
              the node is fully operational and DRAIN the node or send  email  if  a  problem  is
              detected.  Any action to be taken must be explicitly performed by the program (e.g.
              execute "scontrol update NodeName=foo State=drain  Reason=tmp_file_system_full"  to
              drain a node).  The interval is controlled using the HealthCheckInterval parameter.
              Note that the HealthCheckProgram will be executed at the same time on all nodes  to
               minimize  its  impact upon parallel programs.  This program will be killed if it
              does not terminate normally within 60 seconds.  By  default,  no  program  will  be
              executed.
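
               For example, to run a site health check script every five minutes (the script path
               is illustrative):

                      HealthCheckInterval=300
                      HealthCheckProgram=/usr/local/sbin/node_health.sh

               The script itself must take any corrective action, such as draining the node with
               "scontrol update".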

       InactiveLimit
              The interval, in seconds, after which a non-responsive job allocation command (e.g.
              srun or salloc) will result in the job being terminated. If the node on  which  the
              command is executed fails or the command abnormally terminates, this will terminate
              its job allocation.  This option has no effect upon batch  jobs.   When  setting  a
              value,  take into consideration that a debugger using srun to launch an application
              may leave the srun command in a stopped state for extended periods of  time.   This
              limit  is  ignored  for  jobs running in partitions with the RootOnly flag set (the
              scheduler running as root will be responsible for the job).  The default  value  is
              unlimited (zero) and may not exceed 65533 seconds.

       JobAcctGatherType
              The   job   accounting  mechanism  type.   Acceptable  values  at  present  include
              "jobacct_gather/aix" (for AIX operating system), "jobacct_gather/linux" (for  Linux
              operating  system)  and  "jobacct_gather/none" (no accounting data collected).  The
              default  value  is  "jobacct_gather/none".   In  order  to  use  the  sstat   tool,
              "jobacct_gather/aix" or "jobacct_gather/linux" must be configured.

       JobAcctGatherFrequency
              The  job  accounting  sampling interval.  For jobacct_gather/none this parameter is
               ignored.  For  jobacct_gather/aix  and  jobacct_gather/linux  the  parameter is the
               number of seconds between samples of job state.  The default value is 30 seconds.  A
               value of zero disables the periodic  job  sampling  and  provides  accounting
              information  only  on  job  termination (reducing SLURM interference with the job).
              Smaller (non-zero) values have a greater impact upon job performance, but  a  value
              of  30  seconds  is  not  likely to be noticeable for applications having less than
              10,000 tasks.  Users can  override  this  value  on  a  per  job  basis  using  the
              --acctg-freq option when submitting the job.
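
               For example, on a Linux cluster a site might collect accounting samples once per
               minute (values are illustrative):

                      JobAcctGatherType=jobacct_gather/linux
                      JobAcctGatherFrequency=60

               Individual users can still request a different interval with the --acctg-freq
               option at submission time.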

       JobCheckpointDir
              Specifies  the default directory for storing or reading job checkpoint information.
              The data stored here is only a few thousand bytes per job and includes  information
               needed  to  resubmit the job request, not the job's memory image. The directory must
               be readable and writable by SlurmUser, but not writable by regular  users.  The  job
               memory images may be in a different location, as specified by the --checkpoint-dir
               option at job submit time or scontrol's ImageDir option.

       JobCompHost
              The name of the machine  hosting  the  job  completion  database.   Only  used  for
              database type storage plugins, ignored otherwise.  Also see DefaultStorageHost.

       JobCompLoc
              The  fully  qualified  file  name where job completion records are written when the
              JobCompType is "jobcomp/filetxt" or the database where job completion  records  are
              stored when the JobCompType is a database.  Also see DefaultStorageLoc.

       JobCompPass
              The  password used to gain access to the database to store the job completion data.
              Only  used  for  database  type  storage  plugins,  ignored  otherwise.   Also  see
              DefaultStoragePass.

       JobCompPort
              The  listening  port of the job completion database server.  Only used for database
              type storage plugins, ignored otherwise.  Also see DefaultStoragePort.

       JobCompType
              The job completion logging mechanism type.  Acceptable values  at  present  include
              "jobcomp/none",    "jobcomp/filetxt",    "jobcomp/mysql",    "jobcomp/pgsql",   and
              "jobcomp/script"".  The default value is "jobcomp/none", which means that upon  job
              completion  the  record  of  the  job  is  purged  from  the  system.  If using the
              accounting infrastructure this plugin may not be of interest since the  information
              here  is redundant.  The value "jobcomp/filetxt" indicates that a record of the job
              should be written to a text file specified by the JobCompLoc parameter.  The  value
              "jobcomp/mysql"  indicates  that  a  record of the job should be written to a mysql
              database  specified  by  the  JobCompLoc  parameter.   The  value   "jobcomp/pgsql"
              indicates  that  a  record  of  the  job should be written to a PostgreSQL database
              specified by the JobCompLoc parameter.  The value "jobcomp/script" indicates that a
              script  specified  by  the  JobCompLoc parameter is to be executed with environment
              variables indicating the job information.
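
               For example, to keep a simple text log of completed jobs (the file name is
               illustrative):

                      JobCompType=jobcomp/filetxt
                      JobCompLoc=/var/log/slurm/job_completions

               The file must be writable by SlurmUser; see the note on file permissions above.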

       JobCompUser
              The user account for accessing the job completion database.  Only used for database
              type storage plugins, ignored otherwise.  Also see DefaultStorageUser.

       JobCredentialPrivateKey
              Fully qualified pathname of a file containing a private key used for authentication
              by SLURM daemons.  This parameter is ignored if CryptoType=crypto/munge.

       JobCredentialPublicCertificate
              Fully qualified pathname of a file containing a public key used for  authentication
              by SLURM daemons.  This parameter is ignored if CryptoType=crypto/munge.

       JobFileAppend
               This option controls what to do if a job's output or error files exist when the job
              is started.  If JobFileAppend is set to a value of 1, then append to  the  existing
              file.  By default, any existing file is truncated.

       JobRequeue
              This  option controls what to do by default after a node failure.  If JobRequeue is
              set to a value of 1, then any batch job running on the failed node will be requeued
              for  execution  on different nodes.  If JobRequeue is set to a value of 0, then any
              job running on the failed node will be terminated.  Use the sbatch --no-requeue  or
              --requeue  option  to change the default behavior for individual jobs.  The default
              value is 1.

       JobSubmitPlugins
              A comma delimited list of job submission plugins to be used.  The specified plugins
              will  be  executed  in  the  order  listed.  These are intended to be site-specific
              plugins which can be used to set default  job  parameters  and/or  logging  events.
              Sample   plugins   available  in  the  distribution  include  "cnode",  "defaults",
              "logging", "lua", and "partition".  For examples of use,  see  the  SLURM  code  in
              "src/plugins/job_submit" and "contribs/lua/job_submit*.lua" then modify the code to
              satisfy your needs.  No job submission plugins are used by default.

       KillOnBadExit
               If set to 1, the job will be terminated immediately when one of its  processes
               crashes or is aborted. With the default value of 0, if one of the processes crashes
               or is aborted, the other processes will continue to run. The  user  can  override
               this configuration parameter by using srun's -K, --kill-on-bad-exit option.

       KillWait
              The  interval,  in  seconds,  given  to  a  job's processes between the SIGTERM and
              SIGKILL signals upon reaching its time  limit.   If  the  job  fails  to  terminate
              gracefully  in the interval specified, it will be forcibly terminated.  The default
              value is 30 seconds.  The value may not exceed 65533.

       Licenses
              Specification of licenses (or  other  resources  available  on  all  nodes  of  the
              cluster)  which can be allocated to jobs.  License names can optionally be followed
              by an asterisk and count with a default  count  of  one.   Multiple  license  names
              should  be  comma separated (e.g.  "Licenses=foo*4,bar").  Note that SLURM prevents
              jobs from being scheduled if their required license specification is not available.
              SLURM  does  not prevent jobs from using licenses that are not explicitly listed in
              the job submission specification.
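
               For example, to make ten floating licenses of a product named "matlab" and one
               license named "foo" available to jobs (names and counts are illustrative):

                      Licenses=matlab*10,foo

               Jobs then request licenses at submission time (e.g. with the --licenses option),
               and SLURM will not schedule a job whose license request cannot be satisfied.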

       MailProg
              Fully qualified pathname to the program used to send email per user  request.   The
              default value is "/usr/bin/mail".

       MaxJobCount
              The  maximum  number of jobs SLURM can have in its active database at one time. Set
               the values of MaxJobCount and MinJobAge to ensure the  slurmctld  daemon  does  not
              exhaust  its  memory  or  other  resources. Once this limit is reached, requests to
              submit additional jobs will fail. The default value is 10000 jobs. This  value  may
              not  be  reset  via  "scontrol  reconfig". It only takes effect upon restart of the
              slurmctld daemon.

       MaxJobId
              The maximum job id to be used for  jobs  submitted  to  SLURM  without  a  specific
               requested value. Generated job id values will be incremented by 1 for each subsequent
              job. This may be used to provide a meta-scheduler with a  job  id  space  which  is
              disjoint from the interactive jobs.  Once MaxJobId is reached, the next job will be
              assigned FirstJobId.  The default  value  is  4294901760  (0xffff0000).   Also  see
              FirstJobId.
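
               For example, to reserve job ids below 100000 for a meta-scheduler and have SLURM
               generate its own ids from 100000 upward (the value is illustrative):

                      FirstJobId=100000

               Generated ids then start at 100000 and wrap back to FirstJobId after MaxJobId is
               reached.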

       MaxMemPerCPU
              Maximum  real  memory size available per allocated CPU in MegaBytes.  Used to avoid
              over-subscribing memory and causing paging.  MaxMemPerCPU would generally  be  used
              if  individual  processors are allocated to jobs (SelectType=select/cons_res).  The
              default  value  is  0  (unlimited).   Also  see  DefMemPerCPU  and   MaxMemPerNode.
              MaxMemPerCPU and MaxMemPerNode are mutually exclusive.  NOTE: Enforcement of memory
              limits currently requires enabling of accounting, which samples  memory  use  on  a
              periodic basis (data need not be stored, just collected).

       MaxMemPerNode
              Maximum  real memory size available per allocated node in MegaBytes.  Used to avoid
              over-subscribing memory and causing paging.  MaxMemPerNode would generally be  used
              if  whole  nodes are allocated to jobs (SelectType=select/linear) and resources are
              shared (Shared=yes or Shared=force).  The default value is 0 (unlimited).  Also see
              DefMemPerNode  and  MaxMemPerCPU.   MaxMemPerCPU  and  MaxMemPerNode  are  mutually
              exclusive.  NOTE: Enforcement of  memory  limits  currently  requires  enabling  of
              accounting,  which samples memory use on a periodic basis (data need not be stored,
              just collected).

       MaxStepCount
              The maximum number of steps that any job can initiate. This parameter  is  intended
              to limit the effect of bad batch scripts.  The default value is 40000 steps.

       MaxTasksPerNode
              Maximum  number of tasks SLURM will allow a job step to spawn on a single node. The
              default MaxTasksPerNode is 128.

       MessageTimeout
              Time permitted for a round-trip communication to complete in seconds. Default value
              is  10 seconds. For systems with shared nodes, the slurmd daemon could be paged out
              and necessitate higher values.

       MinJobAge
              The minimum age of a completed job before its record is purged from SLURM's  active
               database.  Set  the  values  of  MaxJobCount  and MinJobAge to ensure the slurmctld
              daemon does not exhaust its memory or other resources. The  default  value  is  300
              seconds.  A value of zero prevents any job record purging.  May not exceed 65533.

       MpiDefault
              Identifies   the  default  type  of  MPI  to  be  used.   Srun  may  override  this
              configuration parameter in any case.  Currently supported  versions  include:  lam,
              mpich1_p4,  mpich1_shmem, mpichgm, mpichmx, mvapich, none (default, which works for
              many other versions of MPI)  and  openmpi.   More  information  about  MPI  use  is
              available here <http://www.schedmd.com/slurmdocs/mpi_guide.html>.

       MpiParams
              MPI  parameters.   Used to identify ports used by OpenMPI only and the input format
              is "ports=12000-12999" to identify a range of communication ports to be used.

       OverTimeLimit
              Number of minutes by which a job can exceed its time limit before  being  canceled.
              The  configured job time limit is treated as a soft limit.  Adding OverTimeLimit to
              the soft limit provides a hard limit, at which point the job is canceled.  This  is
               particularly  useful for backfill scheduling, which is based upon each job's soft
               time limit.  The default value is zero.  May not exceed 65533 minutes.   A  value
              of "UNLIMITED" is also supported.

       PluginDir
              Identifies   the   places   in  which  to  look  for  SLURM  plugins.   This  is  a
              colon-separated list of directories,  like  the  PATH  environment  variable.   The
              default value is "/usr/local/lib/slurm".

       PlugStackConfig
              Location  of  the  config  file  for SLURM stackable plugins that use the Stackable
              Plugin Architecture for Node job (K)control (SPANK).  This provides support  for  a
              highly  configurable  set  of plugins to be called before and/or after execution of
              each  task  spawned  as  part  of  a  user's  job  step.    Default   location   is
              "plugstack.conf"  in  the  same  directory  as  the  system  slurm.conf.  For  more
              information on SPANK plugins, see the spank(8) manual.

       PreemptMode
              Enables gang scheduling and/or controls the mechanism used to preempt  jobs.   When
              the  PreemptType parameter is set to enable preemption, the PreemptMode selects the
              mechanism used to preempt the lower priority jobs.  The  GANG  option  is  used  to
              enable   gang   scheduling  independent  of  whether  preemption  is  enabled  (the
              PreemptType  setting).   The  GANG  option  can  be  specified  in  addition  to  a
              PreemptMode  setting  with  the  two  options  comma separated.  The SUSPEND option
               requires that gang scheduling be enabled (i.e. "PreemptMode=SUSPEND,GANG").  A
               combined example is shown after the PreemptType description below.

              OFF         is the default value and disables job preemption and  gang  scheduling.
                          This  is  the  only  option compatible with SchedulerType=sched/wiki or
                          SchedulerType=sched/wiki2 (used by Maui and  Moab  respectively,  which
                          provide their own job preemption functionality).

              CANCEL      always cancel the job.

              CHECKPOINT  preempts jobs by checkpointing them (if possible) or canceling them.

              GANG        enables  gang  scheduling (time slicing) of jobs in the same partition.
                          NOTE: Gang scheduling is performed independently for each partition, so
                          configuring  partitions  with  overlapping nodes and gang scheduling is
                          generally not recommended.

              REQUEUE     preempts jobs by requeuing them (if possible) or canceling them.

              SUSPEND     preempts  jobs  by  suspending  them.   A  suspended  job  will  resume
                          execution  once  the  high  priority  job preempting it completes.  The
                          SUSPEND may only be used with  the  GANG  option  (the  gang  scheduler
                          module    performs    the    job    resume    operation)    and    with
                          PreemptType=preempt/partition_prio (the logic  to  suspend  and  resume
                           jobs currently only has the data structures to support partitions).

       PreemptType
              This  specifies the plugin used to identify which jobs can be preempted in order to
              start a pending job.

              preempt/none
                     Job preemption is disabled.  This is the default.

              preempt/partition_prio
                     Job preemption is based upon partition priority.  Jobs  in  higher  priority
                     partitions (queues) may preempt jobs from lower priority partitions.

              preempt/qos
                     Job   preemption   rules   are   specified   by  Quality  Of  Service  (QOS)
                      specifications in the SLURM database.   This  is  not  compatible
                     with  PreemptMode=OFF  or  PreemptMode=SUSPEND  (i.e. preempted jobs must be
                     removed from the resources).
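
               For example, to let jobs in higher priority partitions suspend lower priority jobs,
               with gang scheduling performing the resume (as required for SUSPEND), the two
               parameters might be combined as follows:

                      PreemptType=preempt/partition_prio
                      PreemptMode=SUSPEND,GANG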

       PriorityDecayHalfLife
               This controls how long prior resource use is considered in determining how over- or
               under-serviced an association is (user, bank account and cluster)  when  computing
               job priority.  If set to 0 no decay will be applied.  This is helpful if  you  want
              to  enforce hard time limits per association.  If set to 0 PriorityUsageResetPeriod
              must     be     set     to     some     interval.      Applicable      only      if
              PriorityType=priority/multifactor.  The unit is a time string (i.e. min, hr:min:00,
              days-hr:min:00, or days-hr).  The default value is 7-0 (7 days).

       PriorityCalcPeriod
              The period of time in minutes in which the half-life decay will  be  re-calculated.
              Applicable  only  if  PriorityType=priority/multifactor.   The  default  value is 5
              (minutes).

       PriorityFavorSmall
              Specifies that  small  jobs  should  be  given  preferential  scheduling  priority.
              Applicable  only  if PriorityType=priority/multifactor.  Supported values are "YES"
              and "NO".  The default value is "NO".

       PriorityMaxAge
              Specifies the job age which will be given  the  maximum  age  factor  in  computing
               priority.  For example, with a value of 30 minutes, all jobs more than  30  minutes
               old   would   get   the   same   age-based   priority.    Applicable   only   if
              PriorityType=priority/multifactor.  The unit is a time string (i.e. min, hr:min:00,
              days-hr:min:00, or days-hr).  The default value is 7-0 (7 days).

       PriorityUsageResetPeriod
              At this interval the usage of associations will be reset to 0.  This is used if you
              want    to   enforce   hard   limits   of   time   usage   per   association.    If
              PriorityDecayHalfLife is set to be 0 no decay will happen and this is the only  way
              to  reset the usage accumulated by running jobs.  By default this is turned off and
              it is advised to use the PriorityDecayHalfLife option to avoid not having  anything
              running on your cluster, but if your schema is set up to only allow certain amounts
              of  time  on  your  system  this  is  the  way  to  do  it.   Applicable  only   if
              PriorityType=priority/multifactor.

              NONE        Never clear historic usage. The default value.

              NOW         Clear  the historic usage now.  Executed at startup and reconfiguration
                          time.

              DAILY       Cleared every day at midnight.

              WEEKLY      Cleared every week on Sunday at time 00:00.

              MONTHLY     Cleared on the first day of each month at time 00:00.

              QUARTERLY   Cleared on the first day of each quarter at time 00:00.

              YEARLY      Cleared on the first day of each year at time 00:00.

       PriorityType
              This specifies the plugin to be used in establishing a job's  scheduling  priority.
              Supported  values  are  "priority/basic" (jobs are prioritized by order of arrival,
              also suitable for sched/wiki and sched/wiki2) and "priority/multifactor" (jobs  are
              prioritized  based  upon  size,  age,  fair-share of allocation, etc).  The default
              value is "priority/basic".

       PriorityWeightAge
              An integer value that sets the degree  to  which  the  queue  wait  time  component
              contributes      to     the     job's     priority.      Applicable     only     if
              PriorityType=priority/multifactor.  The default value is 0.

       PriorityWeightFairshare
              An integer value that sets the degree to which the fair-share component contributes
              to  the job's priority.  Applicable only if PriorityType=priority/multifactor.  The
              default value is 0.

       PriorityWeightJobSize
              An integer value that sets the degree to which the job size  component  contributes
              to  the job's priority.  Applicable only if PriorityType=priority/multifactor.  The
              default value is 0.

       PriorityWeightPartition
              An integer value that sets  the  degree  to  which  the  node  partition  component
              contributes      to     the     job's     priority.      Applicable     only     if
              PriorityType=priority/multifactor.  The default value is 0.

       PriorityWeightQOS
              An integer value that sets the degree to which the  Quality  Of  Service  component
              contributes      to     the     job's     priority.      Applicable     only     if
              PriorityType=priority/multifactor.  The default value is 0.
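
               For example, a site using the multifactor plugin might weight fair-share most
               heavily, with smaller contributions from age, job size, partition and QOS (the
               weights below are illustrative and should be tuned per site):

                      PriorityType=priority/multifactor
                      PriorityDecayHalfLife=7-0
                      PriorityMaxAge=7-0
                      PriorityWeightFairshare=10000
                      PriorityWeightAge=1000
                      PriorityWeightJobSize=1000
                      PriorityWeightPartition=1000
                      PriorityWeightQOS=1000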

       PrivateData
              This controls what type of information is hidden from regular users.   By  default,
              all  information  is visible to all users.  User SlurmUser and root can always view
              all information.   Multiple  values  may  be  specified  with  a  comma  separator.
              Acceptable values include:

              accounts
                     (NON-SLURMDBD  ACCOUNTING  ONLY)  prevents  users  from  viewing any account
                     definitions unless they are coordinators of them.

              jobs   prevents users from viewing jobs or job  steps  belonging  to  other  users.
                     (NON-SLURMDBD  ACCOUNTING  ONLY)  prevents  users  from  viewing job records
                     belonging to other users unless they are  coordinators  of  the  association
                     running the job when using sacct.

              nodes  prevents users from viewing node state information.

              partitions
                     prevents users from viewing partition state information.

              reservations
                     prevents regular users from viewing reservations.

              usage  (NON-SLURMDBD  ACCOUNTING  ONLY)  prevents  users  from viewing usage of any
                     other user.  This applies to sreport.

               users  (NON-SLURMDBD ACCOUNTING ONLY) prevents users from  viewing  information  of
                      any user other than themselves; it also restricts users to seeing  only  the
                      associations they deal with.  Coordinators can see associations of all users
                     they are coordinator of, but can only see themselves when listing users.
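
               For example, to hide other users' jobs and accounting usage while leaving node and
               partition state visible:

                      PrivateData=jobs,usage

               SlurmUser and root are unaffected and can always view all information.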

       ProctrackType
              Identifies the plugin to be used for process tracking.  The slurmd daemon uses this
              mechanism to identify all processes which are children of processes it spawns for a
              user  job.   The  slurmd  daemon must be restarted for a change in ProctrackType to
              take effect.  NOTE: "proctrack/linuxproc" and "proctrack/pgid" can fail to identify
              all  processes associated with a job since processes can become a child of the init
              process (when the parent process terminates) or change  their  process  group.   To
              reliably  track  all  processes,  one  of  the  other  mechanisms  utilizing kernel
              modifications is preferable.  NOTE: "proctrack/linuxproc" is  not  compatible  with
              "switch/elan."  Acceptable values at present include:

              proctrack/aix       which  uses  an AIX kernel extension and is the default for AIX
                                  systems

              proctrack/cgroup    which uses linux cgroups  to  constrain  and  track  processes.
                                  NOTE: see "man cgroup.conf" for configuration details

               proctrack/linuxproc which uses the Linux process tree, based on parent process IDs

              proctrack/lua       which uses a site-specific LUA script to track processes

               proctrack/rms       which uses the Quadrics kernel patch  and  is  the  default  if
                                  "SwitchType=switch/elan"

              proctrack/sgi_job   which uses SGI's Process Aggregates (PAGG) kernel  module,  see
                                  http://oss.sgi.com/projects/pagg/ for more information

              proctrack/pgid      which  uses  process group IDs and is the default for all other
                                  systems
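
               For example, on a Linux cluster where reliable process tracking is desired without
               a vendor kernel module, the cgroup plugin might be selected (see "man cgroup.conf"
               for the additional configuration it requires):

                      ProctrackType=proctrack/cgroup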

       Prolog Fully qualified pathname of a program for the slurmd  to  execute  whenever  it  is
              asked    to    run    a    job    step    from   a   new   job   allocation   (e.g.
              "/usr/local/slurm/prolog").  The slurmd executes the  script  before  starting  the
              first  job  step.   This  may  be  used to purge files, enable user login, etc.  By
              default there is no prolog. Any configured script is expected to complete execution
              quickly (in less time than MessageTimeout).  See Prolog and Epilog Scripts for more
              information.

       PrologSlurmctld
              Fully qualified pathname of a program for the slurmctld to execute before  granting
              a  new  job  allocation  (e.g.  "/usr/local/slurm/prolog_controller").  The program
              executes as SlurmUser, which gives it permission to drain nodes and requeue the job
              if  a  failure occurs or cancel the job if appropriate.  The program can be used to
              reboot nodes or perform other work  to  prepare  resources  for  use.   While  this
               program   is   running,   the  nodes  associated  with  the  job  will  have  a
              POWER_UP/CONFIGURING flag set in their state,  which  can  be  readily  viewed.   A
              non-zero  exit  code  will  result  in  the  job being requeued (where possible) or
              killed.  See Prolog and Epilog Scripts for more information.

       PropagatePrioProcess
              Controls the scheduling priority (nice value) of user spawned tasks.

              0    The tasks will inherit the scheduling priority from the slurm daemon.  This is
                   the default value.

              1    The  tasks  will inherit the scheduling priority of the command used to submit
                   them (e.g. srun or sbatch).  Unless the job is submitted  by  user  root,  the
                   tasks will have a scheduling priority no higher than the slurm daemon spawning
                   them.

              2    The tasks will inherit the scheduling priority of the command used  to  submit
                   them  (e.g.  srun  or  sbatch) with the restriction that their nice value will
                    always be one higher than the slurm daemon (i.e.  the  tasks'  scheduling
                    priority will be lower than the slurm daemon's).

       PropagateResourceLimits
              A list of comma separated resource limit names.  The slurmd daemon uses these names
               to obtain the associated (soft) limit values from the user's process environment  on
              the  submit  node.   These  limits are then propagated and applied to the jobs that
              will run on the compute nodes.  This parameter can be  useful  when  system  limits
              vary  among  nodes.   Any  resource  limits  that do not appear in the list are not
              propagated.  However, the user can  override  this  by  specifying  which  resource
               limits to propagate with the srun command's "--propagate" option.  If neither of the
              'propagate resource limit' parameters are specified, then the default action is  to
              propagate  all  limits.  Only one of the parameters, either PropagateResourceLimits
              or PropagateResourceLimitsExcept, may be specified.  The following limit names  are
              supported by SLURM (although some options may not be supported on some systems):

              ALL       All limits listed below

              NONE      No limits listed below

              AS        The maximum address space for a process

              CORE      The maximum size of core file

              CPU       The maximum amount of CPU time

              DATA      The maximum size of a process's data segment

              FSIZE     The  maximum  size  of files created. Note that if the user sets FSIZE to
                        less than the current size of the slurmd.log, job launches will fail with
                        a 'File size limit exceeded' error.

              MEMLOCK   The maximum size that may be locked into memory

              NOFILE    The maximum number of open files

              NPROC     The maximum number of processes available

              RSS       The maximum resident set size

              STACK     The maximum stack size

       PropagateResourceLimitsExcept
              A  list  of  comma separated resource limit names.  By default, all resource limits
              will be propagated, (as described by the PropagateResourceLimits parameter), except
              for  the  limits appearing in this list.   The user can override this by specifying
               which resource limits to propagate with the srun command's "--propagate" option.
              See PropagateResourceLimits above for a list of valid limit names.
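
               As an illustration only (the particular limits chosen here are arbitrary), a site
               could propagate everything except the core file size limit, or instead propagate
               only the locked memory and open file limits, with one of the following entries
               (never both):

                   PropagateResourceLimitsExcept=CORE
                   PropagateResourceLimits=MEMLOCK,NOFILE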

       ResumeProgram
              SLURM  supports  a  mechanism to reduce power consumption on nodes that remain idle
              for an extended period of time.  This is typically accomplished by reducing voltage
              and frequency or powering the node down.  ResumeProgram is the program that will be
              executed when a node in power save mode is assigned work to perform.   For  reasons
              of  reliability,  ResumeProgram  may  execute  more  than  once for a node when the
              slurmctld daemon crashes and is restarted.  If ResumeProgram is unable to restore a
               node to service, it should requeue any job associated with the node and set the
              node state to DRAIN.  The program executes  as  SlurmUser.   The  argument  to  the
              program  will  be  the  names of nodes to be removed from power savings mode (using
              SLURM's hostlist expression format).   By  default  no  program  is  run.   Related
              configuration  options include ResumeTimeout, ResumeRate, SuspendRate, SuspendTime,
              SuspendTimeout,  SuspendProgram,  SuspendExcNodes,   and   SuspendExcParts.    More
              information      is      available      at     the     SLURM     web     site     (
              http://www.schedmd.com/slurmdocs/power_save.html ).

       ResumeRate
              The rate at which nodes in power save mode are  returned  to  normal  operation  by
              ResumeProgram.   The  value  is  number  of  nodes per minute and it can be used to
              prevent power surges if a large number of nodes in power  save  mode  are  assigned
              work  at  the  same  time (e.g. a large job starts).  A value of zero results in no
              limits being imposed.   The  default  value  is  300  nodes  per  minute.   Related
              configuration    options   include   ResumeTimeout,   ResumeProgram,   SuspendRate,
              SuspendTime, SuspendTimeout, SuspendProgram, SuspendExcNodes, and SuspendExcParts.

       ResumeTimeout
               Maximum time permitted (in seconds) between when a node resume request is issued
              and  when  the  node is actually available for use.  Nodes which fail to respond in
              this time frame may be marked DOWN and the jobs scheduled  on  the  node  requeued.
              The   default   value   is  60  seconds.   Related  configuration  options  include
              ResumeProgram,    ResumeRate,     SuspendRate,     SuspendTime,     SuspendTimeout,
              SuspendProgram, SuspendExcNodes and SuspendExcParts.  More information is available
              at the SLURM web site ( http://www.schedmd.com/slurmdocs/power_save.html ).

       ResvOverRun
              Describes how long a job already running in a reservation should  be  permitted  to
              execute after the end time of the reservation has been reached.  The time period is
              specified in minutes and the default value is 0 (kill the  job  immediately).   The
              value may not exceed 65533 minutes, although a value of "UNLIMITED" is supported to
              permit a job to run indefinitely after its reservation is terminated.

       ReturnToService
              Controls when a DOWN node will be returned to service.  The  default  value  is  0.
              Supported values include

              0   A  node  will  remain in the DOWN state until a system administrator explicitly
                  changes  its  state  (even  if  the  slurmd  daemon   registers   and   resumes
                  communications).

              1   A  DOWN  node  will  become  available  for  use upon registration with a valid
                  configuration only if it was set DOWN due to being non-responsive.  If the node
                  was  set DOWN for any other reason (low memory, prolog failure, epilog failure,
                  unexpected reboot, etc.), its state will not automatically be changed.

              2   A DOWN node will become available  for  use  upon  registration  with  a  valid
                  configuration.  The node could have been set DOWN for any reason.  (Disabled on
                  Cray systems.)

       SallocDefaultCommand
              Normally, salloc(1) will run the user's default shell when a command to execute  is
              not  specified  on  the salloc command line.  If SallocDefaultCommand is specified,
              salloc will instead run the configured command. The command is passed  to  '/bin/sh
              -c',  so  shell  metacharacters  are  allowed, and commands with multiple arguments
              should be quoted. For instance:

                  SallocDefaultCommand = "$SHELL"

               would run the shell specified by the user's $SHELL environment variable, and

                  SallocDefaultCommand = "xterm -T Job_$SLURM_JOB_ID"

              would run xterm with the title set to the SLURM jobid.

       SchedulerParameters
              The interpretation of this parameter varies by SchedulerType.  Multiple options may
              be comma separated.

              default_queue_depth=#
                     The default number of jobs to attempt scheduling (i.e. the queue depth) when
                     a running job completes or other routine actions occur. The full queue  will
                     be  tested  on a less frequent basis. The default value is 100.  In the case
                     of large clusters (more than 1000 nodes),  configuring  a  relatively  small
                     value may be desirable.

              defer  Setting  this option will avoid attempting to schedule each job individually
                     at job submit time, but defer it until a later time when scheduling multiple
                     jobs  simultaneously  may  be  possible.   This  option  may  improve system
                     responsiveness when large numbers of jobs (many hundreds) are  submitted  at
                     the  same  time,  but  it will delay the initiation time of individual jobs.
                     Also see default_queue_depth above.

              bf_interval=#
                     The number of seconds between iterations.   Higher  values  result  in  less
                     overhead  and better responsiveness.  The default value is 30 seconds.  This
                     option applies only to SchedulerType=sched/backfill.

              bf_resolution=#
                     The number of seconds in the resolution of data maintained about  when  jobs
                     begin   and   end.   Higher  values  result  in  less  overhead  and  better
                     responsiveness.  The default value is 60 seconds.  This option applies  only
                     to SchedulerType=sched/backfill.

              bf_window=#
                     The  number  of  minutes  into  the  future to look when considering jobs to
                     schedule.  Higher values result in more overhead  and  less  responsiveness.
                     The  default  value  is 1440 minutes (one day).  This option applies only to
                     SchedulerType=sched/backfill.

              max_job_bf=#
                     The maximum number of jobs to attempt  backfill  scheduling  for  (i.e.  the
                     queue   depth).    Higher   values   result   in   more  overhead  and  less
                     responsiveness.  Until an attempt is made to backfill schedule  a  job,  its
                     expected  initiation  time  value will not be set.  The default value is 50.
                     In the case of  large  clusters  (more  than  1000  nodes)  configured  with
                     SelectType=select/cons_res,  configuring  a  relatively  small  value may be
                     desirable.  This option applies only to SchedulerType=sched/backfill.

              max_switch_wait=#
                     Maximum number of seconds that a job can delay  execution  waiting  for  the
                     specified desired switch count. The default value is 60 seconds.
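
               For example, a backfill configuration that scans the queue every 60 seconds and
               looks two days into the future might use entries such as the following (the
               specific values are arbitrary and should be tuned for the site):

                   SchedulerType=sched/backfill
                   SchedulerParameters=bf_interval=60,bf_window=2880,default_queue_depth=50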

       SchedulerPort
              The  port  number  on  which slurmctld should listen for connection requests.  This
              value is only used by the Maui Scheduler (see SchedulerType).  The default value is
              7321.

       SchedulerRootFilter
              Identifies  whether or not RootOnly partitions should be filtered from any external
              scheduling activities. If set to 0, then RootOnly partitions are treated  like  any
              other partition. If set to 1, then RootOnly partitions are exempt from any external
              scheduling activities. The default value is 1. Currently only used by the  built-in
              backfill scheduling module "sched/backfill" (see SchedulerType).

       SchedulerTimeSlice
              Number   of   seconds   in   each  time  slice  when  gang  scheduling  is  enabled
              (PreemptMode=GANG).  The value must be between 5 seconds and  65533  seconds.   The
              default value is 30 seconds.

       SchedulerType
              Identifies  the  type  of  scheduler to be used.  Note the slurmctld daemon must be
              restarted for a change in scheduler  type  to  become  effective  (reconfiguring  a
              running daemon has no effect for this parameter).  The scontrol command can be used
              to manually change job priorities if desired.  Acceptable values include:

              sched/builtin
                     for the built-in FIFO (First In First Out) scheduler.  This is the default.

              sched/backfill
                     for a backfill scheduling module to augment  the  default  FIFO  scheduling.
                     Backfill  scheduling  will initiate lower-priority jobs if doing so does not
                     delay  the  expected  initiation  time   of   any   higher   priority   job.
                     Effectiveness  of backfill scheduling is dependent upon users specifying job
                     time  limits,  otherwise  all  jobs  will  have  the  same  time  limit  and
                      backfilling is impossible.  See the documentation for the SchedulerParameters
                     option above.

              sched/gang
                     Defunct option. See PreemptType and PreemptMode options.

              sched/hold
                      to hold all newly arriving jobs if a file "/etc/slurm.hold" exists, otherwise
                     use the built-in FIFO scheduler

              sched/wiki
                     for the Wiki interface to the Maui Scheduler

              sched/wiki2
                     for the Wiki interface to the Moab Cluster Suite

       SelectType
              Identifies  the  type  of  resource  selection algorithm to be used.  Changing this
              value can only be done by restarting the slurmctld daemon and will  result  in  the
              loss  of  all job information (running and pending) since the job state save format
              used by each plugin is different.  Acceptable values include

              select/linear
                     for allocation of entire nodes assuming a one-dimensional array of nodes  in
                     which  sequentially ordered nodes are preferable.  This is the default value
                     for non-BlueGene systems.

              select/cons_res
                     The resources  within  a  node  are  individually  allocated  as  consumable
                     resources.   Note  that  whole  nodes  can be allocated to jobs for selected
                     partitions by using the Shared=Exclusive option.  See the  partition  Shared
                     parameter for more information.

              select/bluegene
                     for   a   three-dimensional   BlueGene   system.    The   default  value  is
                     "select/bluegene" for BlueGene systems.

              select/cray
                     for a Cray system.  The default value is "select/cray" for all Cray systems.

       SelectTypeParameters
              The permitted values of SelectTypeParameters depend upon the  configured  value  of
               SelectType.  SelectType=select/bluegene supports no SelectTypeParameters.  The only
               supported options for SelectType=select/linear are CR_ONE_TASK_PER_CORE and
               CR_Memory; the latter treats memory as a consumable resource and prevents memory
               oversubscription with job preemption or gang scheduling.  The following values are
              supported for SelectType=select/cons_res:

               CR_CPU CPUs are consumable resources.  There is no notion of sockets, cores or
                      threads; do not define those values in the node specification.  If these are
                      defined, unexpected results will occur when hyper-threading is enabled;
                      CPUs= should be used instead.  On a multi-core system, each core will be
                      considered a CPU.  On a multi-core and hyper-threaded system, each thread
                      will be considered a CPU.  On single-core systems, each physical processor
                      will be considered a CPU.

              CR_CPU_Memory
                     CPUs  and  memory  are consumable resources.  There is no notion of sockets,
                      cores or threads; do not define those values in the node specification.  If
                      these are defined, unexpected results will occur when hyper-threading is
                      enabled; CPUs= should be used instead.  Setting a value for DefMemPerCPU is
                     strongly recommended.

              CR_Core
                     Cores are consumable resources.  On nodes with hyper-threads, each thread is
                     counted as a CPU to satisfy a job's resource requirement, but multiple  jobs
                     are not allocated threads on the same core.

              CR_Core_Memory
                     Cores  and  memory  are  consumable resources.  On nodes with hyper-threads,
                     each thread is counted as a CPU to satisfy a job's resource requirement, but
                     multiple  jobs  are not allocated threads on the same core.  Setting a value
                     for DefMemPerCPU is strongly recommended.

              CR_ONE_TASK_PER_CORE
                     Allocate one task per core by default.  Without this option, by default  one
                     task will be allocated per thread on nodes with more than one ThreadsPerCore
                     configured.

              CR_CORE_DEFAULT_DIST_BLOCK
                     Allocate cores using block distribution by default.  This  default  behavior
                      can be overridden by specifying a particular "-m" parameter with
                      srun/salloc/sbatch.  Without this option, cores will be allocated cyclically
                     across the sockets.

              CR_Socket
                     Sockets  are  consumable resources.  On nodes with multiple cores, each core
                     or thread is counted as a CPU to satisfy a job's resource  requirement,  but
                     multiple  jobs  are  not  allocated resources on the same socket.  Note that
                     jobs requesting one CPU will only be allocated that one CPU,  but  no  other
                     job will share the socket.

              CR_Socket_Memory
                     Memory  and sockets are consumable resources.  On nodes with multiple cores,
                     each core or thread is  counted  as  a  CPU  to  satisfy  a  job's  resource
                     requirement,  but  multiple  jobs  are  not  allocated resources on the same
                     socket.  Note that jobs requesting one CPU will only be allocated  that  one
                     CPU,  but  no  other  job  will  share  the  socket.   Setting  a  value for
                     DefMemPerCPU is strongly recommended.

              CR_Memory
                     Memory  is  a  consumable  resource.   NOTE:  This  implies  Shared=YES   or
                     Shared=FORCE  for  all  partitions.   Setting  a  value  for DefMemPerCPU is
                     strongly recommended.
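
               For example, to allocate individual cores and memory as consumable resources (the
               memory value shown is purely illustrative), one might configure:

                   SelectType=select/cons_res
                   SelectTypeParameters=CR_Core_Memory
                   DefMemPerCPU=1024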

       SlurmUser
              The name of the user that the slurmctld daemon executes as.  For security purposes,
              a  user other than "root" is recommended.  This user must exist on all nodes of the
              cluster for authentication of communications between SLURM components.  The default
              value is "root".

       SlurmdUser
              The  name  of the user that the slurmd daemon executes as.  This user must exist on
              all nodes of  the  cluster  for  authentication  of  communications  between  SLURM
              components.  The default value is "root".

       SlurmctldDebug
               The level of detail to provide in the slurmctld daemon's logs.  Values from 0 to 9 are
              legal, with `0' being "quiet"  operation  and  `9'  being  insanely  verbose.   The
              default value is 3.

       SlurmctldLogFile
              Fully  qualified  pathname  of  a  file  into which the slurmctld daemon's logs are
              written.  The default value is none (performs logging via syslog).
              See the section LOGGING if a pathname is specified.

       SlurmctldPidFile
              Fully qualified pathname of a file into which the  slurmctld daemon may  write  its
              process id. This may be used for automated signal processing.  The default value is
              "/var/run/slurmctld.pid".

       SlurmctldPort
              The port number that the SLURM controller, slurmctld,  listens  to  for  work.  The
              default  value  is  SLURMCTLD_PORT  as established at system build time. If none is
              explicitly specified, it will be set to 6817.  SlurmctldPort may also be configured
              to  support  a  range  of port numbers in order to accept larger bursts of incoming
              messages   by   specifying   two   numbers    separated    by    a    dash    (e.g.
               SlurmctldPort=6817-6818).  NOTE: Either the slurmctld and slurmd daemons must not
               execute on the same nodes, or the values of SlurmctldPort and SlurmdPort must be
               different.

       SlurmctldTimeout
              The  interval,  in  seconds,  that  the  backup  controller  waits  for the primary
              controller to respond before assuming control.  The default value is  120  seconds.
              May not exceed 65533.

       SlurmdDebug
               The level of detail to provide in the slurmd daemon's logs.  Values from 0 to 9 are legal,
              with `0' being "quiet" operation and `9' being insanely verbose.  The default value
              is 3.

       SlurmdLogFile
              Fully  qualified  pathname  of  a  file  into  which  the  slurmd daemon's logs are
              written.  The default value is none (performs logging via syslog).  Any "%h" within
              the name is replaced with the hostname on which the slurmd is running.
              See the section LOGGING if a pathname is specified.

       SlurmdPidFile
              Fully  qualified  pathname  of  a  file into which the  slurmd daemon may write its
              process id. This may be used for automated signal processing.  The default value is
              "/var/run/slurmd.pid".

       SlurmdPort
              The  port  number  that the SLURM compute node daemon, slurmd, listens to for work.
              The default value is SLURMD_PORT as established at system build time.  If  none  is
               explicitly specified, its value will be 6818.  NOTE: Either the slurmctld and slurmd
               daemons must not execute on the same nodes, or the values of SlurmctldPort and
              SlurmdPort must be different.

       SlurmdSpoolDir
              Fully  qualified  pathname  of  a  directory  into  which the slurmd daemon's state
              information and batch job script information are written. This  must  be  a  common
              pathname  for  all  nodes,  but should represent a directory which is local to each
              node (reference a local file system). The  default  value  is  "/var/spool/slurmd."
              NOTE:  This  directory  is  also used to store slurmd's shared memory lockfile, and
              should not be changed unless the system is being cleanly restarted. If the location
              of SlurmdSpoolDir is changed and slurmd is restarted, the new daemon will attach to
              a different shared memory region and lose track of any running jobs.

       SlurmdTimeout
              The interval, in seconds, that the SLURM controller waits  for  slurmd  to  respond
              before  configuring  that node's state to DOWN.  A value of zero indicates the node
              will not be tested by slurmctld to confirm the state of slurmd, the node  will  not
              be  automatically  set to a DOWN state indicating a non-responsive slurmd, and some
              other tool will take responsibility for monitoring the state of each  compute  node
              and  its  slurmd  daemon.   SLURM's hierarchical communication mechanism is used to
              ping the slurmd daemons in order  to  minimize  system  noise  and  overhead.   The
              default value is 300 seconds.  The value may not exceed 65533 seconds.

       SlurmSchedLogFile
              Fully  qualified pathname of the scheduling event logging file.  The syntax of this
              parameter is the same as for SlurmctldLogFile.  In  order  to  configure  scheduler
              logging, set both the SlurmSchedLogFile and SlurmSchedLogLevel parameters.

       SlurmSchedLogLevel
               The initial level of scheduling event logging, similar to the SlurmctldDebug
              parameter used to control the initial level of slurmctld logging.  Valid values for
              SlurmSchedLogLevel  are "0" (scheduler logging disabled) and "1" (scheduler logging
              enabled).  If this parameter is omitted, the value defaults to "0" (disabled).   In
              order   to   configure  scheduler  logging,  set  both  the  SlurmSchedLogFile  and
              SlurmSchedLogLevel  parameters.   The  scheduler  logging  level  can  be   changed
              dynamically using scontrol.
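
               For example, scheduler event logging could be enabled with entries such as the
               following (the log file path is only illustrative):

                   SlurmSchedLogFile=/var/log/slurm/slurmsched.log
                   SlurmSchedLogLevel=1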

       SrunEpilog
              Fully  qualified  pathname  of  an  executable  to  be  run  by  srun following the
              completion of a job step.  The command line arguments for the  executable  will  be
              the  command  and  arguments  of the job step.  This configuration parameter may be
              overridden by srun's  --epilog  parameter.  Note  that  while  the  other  "Epilog"
              executables  (e.g.,  TaskEpilog)  are  run by slurmd on the compute nodes where the
              tasks are executed, the SrunEpilog runs on the node where the "srun" is executing.

       SrunProlog
              Fully qualified pathname of an executable to be run by srun prior to the launch  of
              a  job step.  The command line arguments for the executable will be the command and
              arguments of the job step.  This  configuration  parameter  may  be  overridden  by
              srun's  --prolog  parameter.  Note that while the other "Prolog" executables (e.g.,
              TaskProlog) are run by slurmd on the compute nodes where the  tasks  are  executed,
              the SrunProlog runs on the node where the "srun" is executing.

       StateSaveLocation
              Fully qualified pathname of a directory into which the SLURM controller, slurmctld,
               saves its state (e.g. "/usr/local/slurm/checkpoint").  SLURM state will be saved here
              to  recover  from  system failures.  SlurmUser must be able to create files in this
              directory.  If you have a BackupController  configured,  this  location  should  be
              readable  and  writable  by  both  systems.   Since  all  running  and  pending job
              information is stored here, the use of  a  reliable  file  system  (e.g.  RAID)  is
              recommended.   The  default  value  is  "/tmp".   If  any  slurm  daemons terminate
              abnormally, their core files will also be written into this directory.

       SuspendExcNodes
              Specifies the nodes which are to not be placed in power save mode, even if the node
              remains  idle  for  an extended period of time.  Use SLURM's hostlist expression to
              identify nodes.  By default no nodes are excluded.  Related  configuration  options
              include  ResumeTimeout,  ResumeProgram,  ResumeRate,  SuspendProgram,  SuspendRate,
              SuspendTime, SuspendTimeout, and SuspendExcParts.

       SuspendExcParts
              Specifies the partitions whose nodes are to not be placed in power save mode,  even
              if  the  node remains idle for an extended period of time.  Multiple partitions can
              be identified and separated by commas.  By default no nodes are excluded.   Related
              configuration    options    include   ResumeTimeout,   ResumeProgram,   ResumeRate,
               SuspendProgram, SuspendRate, SuspendTime, SuspendTimeout, and SuspendExcNodes.

       SuspendProgram
              SuspendProgram is the program that will be executed when a node remains idle for an
              extended  period  of  time.   This  program is expected to place the node into some
              power save mode.  This can be used to reduce the frequency and voltage of a node or
              completely power the node off.  The program executes as SlurmUser.  The argument to
              the program will be the names of nodes to be placed into power savings mode  (using
              SLURM's  hostlist  expression  format).   By  default,  no program is run.  Related
              configuration   options   include   ResumeTimeout,    ResumeProgram,    ResumeRate,
              SuspendRate, SuspendTime, SuspendTimeout, SuspendExcNodes, and SuspendExcParts.

       SuspendRate
               The rate at which nodes are placed into power save mode by SuspendProgram.  The
               value is number of nodes per minute and it can be used to prevent a large drop in
               power consumption (e.g. after a large job completes).  A value of zero
              results in no limits being imposed.  The default value  is  60  nodes  per  minute.
              Related  configuration  options  include  ResumeTimeout, ResumeProgram, ResumeRate,
              SuspendProgram, SuspendTime, SuspendTimeout, SuspendExcNodes, and SuspendExcParts.

       SuspendTime
              Nodes which remain idle for this number of seconds will be placed into  power  save
              mode by SuspendProgram.  A value of -1 disables power save mode and is the default.
              Related configuration options  include  ResumeTimeout,  ResumeProgram,  ResumeRate,
              SuspendProgram, SuspendRate, SuspendTimeout, SuspendExcNodes, and SuspendExcParts.

       SuspendTimeout
               Maximum time permitted (in seconds) between when a node suspend request is issued
               and when the node is shut down.  At that time the node must be ready for a resume request
              to  be  issued  as  needed for new work.  The default value is 30 seconds.  Related
              configuration   options   include   ResumeProgram,    ResumeRate,    ResumeTimeout,
              SuspendRate,  SuspendTime,  SuspendProgram,  SuspendExcNodes  and  SuspendExcParts.
              More    information    is    available    at    the    SLURM     web     site     (
              http://www.schedmd.com/slurmdocs/power_save.html ).
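
               As an overall illustration of the power saving parameters described above (the
               program paths, node names and timings are hypothetical and site specific), a
               configuration might resemble:

                   SuspendTime=1800
                   SuspendProgram=/usr/local/slurm/sbin/node_suspend
                   SuspendRate=60
                   ResumeProgram=/usr/local/slurm/sbin/node_resume
                   ResumeRate=300
                   SuspendExcNodes=login[0-1]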

       SwitchType
              Identifies  the type of switch or interconnect used for application communications.
              Acceptable  values  include  "switch/none"  for  switches  not  requiring   special
               processing for job launch or termination (Myrinet, Ethernet, and InfiniBand), and
               "switch/elan" for a Quadrics Elan 3 or Elan 4 interconnect.  The default value is
              "switch/none".   All SLURM daemons, commands and running jobs must be restarted for
              a change in SwitchType to take effect.  If running jobs exist at the time slurmctld
              is  restarted  with a new value of SwitchType, records of all jobs in any state may
              be lost.

       TaskEpilog
               Fully qualified pathname of a program to be executed as the slurm job's owner after
              termination of each task.  See TaskProlog for execution order details.

       TaskPlugin
              Identifies  the  type  of  task  launch  plugin, typically used to provide resource
              management within a node (e.g. pinning tasks to specific processors). More than one
              task  plugin  can  be specified in a comma separated list. The prefix of "task/" is
              optional. Acceptable values include:

              task/affinity  enables  resource  containment  using  CPUSETs.   This  enables  the
                             --cpu_bind   and/or   --mem_bind   srun   options.    If   you   use
                             "task/affinity" and encounter problems, it may be due to the variety
                             of  system  calls  used  to  implement  task  affinity  on different
                             operating systems.  If that is the case, you  may  want  to  install
                             Portable   Linux   Process   Affinity  (PLPA,  see  http://www.open-
                             mpi.org/software/plpa), which is supported by SLURM.

               task/cgroup    enables resource containment using Linux control groups (cgroups).  This
                             enables  the  --cpu_bind  and/or --mem_bind srun options.  NOTE: see
                             "man cgroup.conf" for configuration details.

              task/none      for systems requiring no special  handling  of  user  tasks.   Lacks
                             support  for  the  --cpu_bind  and/or  --mem_bind srun options.  The
                             default value is "task/none".

       TaskPluginParam
               Optional parameters for the task plugin.  Multiple options should be comma
               separated.  If None, Sockets, Cores, Threads, and/or Verbose are specified, they will
              override the --cpu_bind option specified by the user in the  srun  command.   None,
              Sockets,  Cores  and  Threads  are  mutually  exclusive  and  since  they  decrease
              scheduling flexibility are not generally recommended (select no more  than  one  of
              them).  Cpusets and Sched are mutually exclusive (select only one of them).

              Cores     Always bind to cores.  Overrides user options or automatic binding.

              Cpusets   Use  cpusets  to perform task affinity functions.  By default, Sched task
                        binding is performed.

              None      Perform no task binding.  Overrides user options or automatic binding.

              Sched     Use sched_setaffinity or plpa_sched_setaffinity (if  available)  to  bind
                        tasks to processors.

              Sockets   Always bind to sockets.  Overrides user options or automatic binding.

              Threads   Always bind to threads.  Overrides user options or automatic binding.

              Verbose   Verbosely report binding before tasks run.  Overrides user options.
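
               For example, to bind tasks to cores using cpusets and report the resulting binding
               (an illustrative combination only; as noted above, forcing Cores binding reduces
               scheduling flexibility), one might set:

                   TaskPlugin=task/affinity
                   TaskPluginParam=Cpusets,Cores,Verbose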

       TaskProlog
               Fully qualified pathname of a program to be executed as the slurm job's owner prior
              to initiation of each task.  Besides the normal  environment  variables,  this  has
              SLURM_TASK_PID  available  to  identify  the  process ID of the task being started.
              Standard output from this program can be used to control the environment  variables
              and output for the user program.

              export NAME=value   Will  set  environment  variables  for  the task being spawned.
                                  Everything after the equal sign to the end of the line will  be
                                  used  as  the value for the environment variable.  Exporting of
                                  functions is not currently supported.

              print ...           Will cause that line (without  the  leading  "print  ")  to  be
                                  printed to the job's standard output.

              unset NAME          Will clear environment variables for the task being spawned.
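
               For example, a TaskProlog whose standard output contained the lines below (the
               variable names and values are purely illustrative) would set one environment
               variable for the task, print a message to the job's output, and clear another
               variable:

                   export SCRATCH_DIR=/tmp/scratch
                   print task prolog completed
                   unset DISPLAY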

              The order of task prolog/epilog execution is as follows:

              1. pre_launch()     Function in TaskPlugin

              2. TaskProlog       System-wide per task program defined in slurm.conf

              3. user prolog      Job   step   specific   task   program   defined  using  srun's
                                  --task-prolog option or SLURM_TASK_PROLOG environment variable

              4. Execute the job step's task

              5. user epilog      Job  step  specific   task   program   defined   using   srun's
                                  --task-epilog option or SLURM_TASK_EPILOG environment variable

              6. TaskEpilog       System-wide per task program defined in slurm.conf

              7. post_term()      Function in TaskPlugin

       TmpFS  Fully  qualified  pathname  of the file system available to user jobs for temporary
              storage. This parameter is used  in  establishing  a  node's  TmpDisk  space.   The
              default value is "/tmp".

       TopologyPlugin
              Identifies  the  plugin  to  be  used  for  determining  the  network  topology and
              optimizing job allocations to minimize network contention.   See  NETWORK  TOPOLOGY
              below  for  details.  Additional plugins may be provided in the future which gather
              topology information directly from the network.  Acceptable values include:

              topology/3d_torus    default for Sun Constellation  systems,  best-fit  logic  over
                                   three-dimensional topology

               topology/node_rank   orders nodes based upon information in a node_rank field in the
                                    node record as generated by a select plugin. SLURM performs a
                                   best-fit algorithm over those ordered nodes

              topology/none        default for other systems, best-fit logic over one-dimensional
                                   topology

              topology/tree        used  for  a  hierarchical   network   as   described   in   a
                                   topology.conf file

       TrackWCKey
               Boolean yes or no.  Used to enable the display and tracking of the Workload
               Characterization Key.  Must be set to track wckey usage.

       TreeWidth
              Slurmd daemons use a virtual tree network for communications.  TreeWidth  specifies
              the  width  of  the tree (i.e. the fanout).  On architectures with a front end node
              running the slurmd daemon, the value must always be equal to or  greater  than  the
              number  of front end nodes which eliminates the need for message forwarding between
              the slurmd daemons.  On other architectures the default value is 50,  meaning  each
              slurmd  daemon  can  communicate  with  up to 50 other slurmd daemons and over 2500
              nodes can be contacted with two message hops.  The default value will work well for
              most  clusters.   Optimal system performance can typically be achieved if TreeWidth
              is set to the square root of the number of nodes in the cluster for systems  having
              no more than 2500 nodes or the cube root for larger systems.
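
               For example, on a hypothetical cluster of 900 nodes, the square root rule above
               suggests:

                   TreeWidth=30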

       UnkillableStepProgram
              If the processes in a job step are determined to be unkillable for a period of time
              specified  by  the  UnkillableStepTimeout  variable,  the  program   specified   by
              UnkillableStepProgram  will  be executed.  This program can be used to take special
              actions to clean up the unkillable processes and/or notify computer administrators.
               The program will be run as SlurmdUser (usually "root").  By default no program is run.

       UnkillableStepTimeout
              The length of time, in seconds, that SLURM will wait before deciding that processes
              in a job step are unkillable (after they  have  been  signaled  with  SIGKILL)  and
              execute  UnkillableStepProgram as described above.  The default timeout value is 60
              seconds.
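
               For example (the program path and timeout shown are only illustrative):

                   UnkillableStepProgram=/usr/local/slurm/sbin/notify_admins
                   UnkillableStepTimeout=120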

       UsePAM If set to 1, PAM (Pluggable Authentication Modules for Linux) will be enabled.  PAM
              is  used  to  establish  the  upper  bounds  for  resource limits. With PAM support
              enabled, local system administrators  can  dynamically  configure  system  resource
              limits.  Changing  the upper bound of a resource limit will not alter the limits of
              running jobs, only jobs started after a change has been made will pick up  the  new
              limits.   The  default  value  is 0 (not to enable PAM support).  Remember that PAM
              also needs to be configured to support SLURM as a service.  For sites  using  PAM's
              directory  based  configuration  option, a configuration file named slurm should be
              created. The module-type, control-flags,  and  module-path  names  that  should  be
              included in the file are:
              auth        required      pam_localuser.so
              auth        required      pam_shells.so
              account     required      pam_unix.so
              account     required      pam_access.so
              session     required      pam_unix.so
              For  sites configuring PAM with a general configuration file, the appropriate lines
              (see above), where slurm is the service-name, should be added.

       VSizeFactor
              Memory specifications in job requests apply to real  memory  size  (also  known  as
              resident  set  size). It is possible to enforce virtual memory limits for both jobs
              and job steps by limiting their virtual memory to some  percentage  of  their  real
              memory  allocation.  The  VSizeFactor  parameter  specifies the job's or job step's
              virtual memory limit as a percentage of its real memory limit. For  example,  if  a
              job's real memory limit is 500MB and VSizeFactor is set to 101 then the job will be
              killed if its real memory exceeds 500MB or its virtual memory  exceeds  505MB  (101
               percent of the real memory limit).  The default value is 0, which disables
              enforcement of virtual memory limits.  The value may not exceed 65533 percent.

       WaitTime
              Specifies how many seconds the srun command should by default wait after the  first
              task  terminates before terminating all remaining tasks. The "--wait" option on the
              srun command line overrides this value.  If set to 0,  this  feature  is  disabled.
              May not exceed 65533 seconds.

       The  configuration  of  nodes  (or  machines)  to be managed by SLURM is also specified in
       /etc/slurm.conf.  Changes  in  node  configuration  (e.g.  adding  nodes,  changing  their
       processor count, etc.) require restarting the slurmctld daemon.  Only the NodeName must be
       supplied in the configuration file.  All other node configuration information is optional.
       It  is  advisable  to establish baseline node configurations, especially if the cluster is
       heterogeneous.  Nodes which register to the system with less than the configured resources
       (e.g.  too  little memory), will be placed in the "DOWN" state to avoid scheduling jobs on
       them.  Establishing baseline configurations will also speed SLURM's scheduling process  by
       permitting  it  to  compare  job requirements against these (relatively few) configuration
       parameters and possibly avoid having to check job requirements  against  every  individual
       node's  configuration.   The  resources  checked  at  node  registration  time  are: CPUs,
       RealMemory and TmpDisk.  While baseline values for each of these can be established in the
       configuration file, the actual values upon node registration are recorded and these actual
       values may be used for scheduling purposes (depending upon the value  of  FastSchedule  in
        the configuration file).

       Default  values  can  be  specified  with  a record in which "NodeName" is "DEFAULT".  The
       default entry values will apply only to lines following it in the configuration  file  and
       the  default  values  can  be reset multiple times in the configuration file with multiple
       entries where "NodeName=DEFAULT".  The "NodeName=" specification must be placed  on  every
       line  describing  the  configuration  of  nodes.   In  fact,  it is generally possible and
       desirable to define the configurations of all nodes in only a few lines.  This  convention
       permits  significant  optimization  in  the  scheduling  of  larger clusters.  In order to
       support the concept of jobs  requiring  consecutive  nodes  on  some  architectures,  node
        specifications should be placed in this file in consecutive order.  No single node name may
       be listed more than once in the configuration file.  Use "DownNodes=" to record the  state
       of  nodes  which  are  temporarily  in  a  DOWN,  DRAIN  or FAILING state without altering
        permanent configuration information.  A job step's tasks are allocated to nodes in the order
       the  nodes appear in the configuration file. There is presently no capability within SLURM
       to arbitrarily order a job step's tasks.

       Multiple node names may be comma separated (e.g. "alpha,beta,gamma") and/or a simple  node
       range  expression  may  optionally  be  used  to  specify numeric ranges of nodes to avoid
       building a configuration file with large numbers of entries.  The  node  range  expression
       can contain one  pair of square brackets with a sequence of comma separated numbers and/or
       ranges of numbers separated by a "-" (e.g. "linux[0-64,128]", or "lx[15,18,32-33]").  Note
       that  the  numeric  ranges  can  include one or more leading zeros to indicate the numeric
       portion has a fixed number of digits (e.g. "linux[0000-1023]").  Up to two numeric  ranges
       can be included in the expression (e.g. "rack[0-63]_blade[0-41]").  If one or more numeric
       expressions  are  included,  one  of  them  must  be  at  the  end  of  the   name   (e.g.
       "unit[0-31]rack"  is invalid), but arbitrary names can always be used in a comma separated
       list.

       On BlueGene systems only, the square brackets should contain pairs of three digit  numbers
       separated  by  a  "x".  These numbers indicate the boundaries of a rectangular prism (e.g.
       "bgl[000x144,400x544]").   See  BlueGene  documentation  for  more  details.    The   node
        configuration specifies the following information:

       NodeName
              Name  that  SLURM uses to refer to a node (or base partition for BlueGene systems).
              Typically this would be the string that "/bin/hostname -s" returns.  It may also be
              the   fully   qualified  domain  name  as  returned  by  "/bin/hostname  -f"  (e.g.
              "foo1.bar.com"), or any valid domain name associated with the host through the host
              database (/etc/hosts) or DNS, depending on the resolver settings.  Note that if the
              short form of the hostname is not used, it may prevent use of hostlist  expressions
              (the  numeric  portion  in  brackets must be at the end of the string).  Only short
              hostname forms are compatible with the switch/elan and switch/federation plugins at
              this  time.   It  may also be an arbitrary string if NodeHostname is specified.  If
              the NodeName is "DEFAULT", the values specified with  that  record  will  apply  to
              subsequent  node  specifications unless explicitly set to other values in that node
              record or replaced with a different set of default values.   For  architectures  in
              which  the  node  order is significant, nodes will be considered consecutive in the
              order  defined.   For  example,  if  the   configuration   for   "NodeName=charlie"
              immediately  follows the configuration for "NodeName=baker" they will be considered
              adjacent in the computer.

       NodeHostname
              Typically this would be the string that "/bin/hostname -s" returns.  It may also be
              the   fully   qualified  domain  name  as  returned  by  "/bin/hostname  -f"  (e.g.
              "foo1.bar.com"), or any valid domain name associated with the host through the host
              database (/etc/hosts) or DNS, depending on the resolver settings.  Note that if the
              short form of the hostname is not used, it may prevent use of hostlist  expressions
              (the  numeric  portion  in  brackets must be at the end of the string).  Only short
              hostname forms are compatible with the switch/elan and switch/federation plugins at
              this  time.   A node range expression can be used to specify a set of nodes.  If an
              expression is used, the number of nodes identified by NodeHostname on a line in the
              configuration file must be identical to the number of nodes identified by NodeName.
              By default, the NodeHostname will be identical in value to NodeName.

       NodeAddr
              Name that a node should be referred to in establishing a communications path.  This
              name   will   be   used   as  an  argument  to  the  gethostbyname()  function  for
              identification.  If a node range expression is used to  designate  multiple  nodes,
               they must exactly match the entries in the NodeName (e.g. "NodeName=lx[0-7]
               NodeAddr=elx[0-7]").  NodeAddr may also contain IP addresses.  By default, the
              NodeAddr will be identical in value to NodeName.

       CoresPerSocket
              Number   of   cores  in  a  single  physical  processor  socket  (e.g.  "2").   The
              CoresPerSocket value describes physical cores, not the logical number of processors
              per  socket.   NOTE:  If  you  have  multi-core processors, you will likely need to
              specify this parameter in order to optimize scheduling.  The default value is 1.

       CPUs   Number of logical processors on the node (e.g. "2").  If CPUs is omitted,  it  will
               be set equal to the product of Sockets, CoresPerSocket, and ThreadsPerCore.  The
              default value is 1.

       Feature
              A comma delimited list of  arbitrary  strings  indicative  of  some  characteristic
              associated  with  the  node.   There  is no value associated with a feature at this
               time; a node either has a feature or it does not.  If desired, a feature may contain
              a  numeric  component  indicating, for example, processor speed.  By default a node
              has no features.  Also see Gres.

       Gres   A comma delimited list of  generic  resources  specifications  for  a  node.   Each
              resource  specification  consists  of  a  name followed by an optional colon with a
              numeric value (default value is one) (e.g. "Gres=bandwidth:10000,gpu:2").  A suffix
               of "K", "M" or "G" may be used to multiply the number by 1024, 1048576 or
               1073741824 respectively (e.g. "Gres=bandwidth:4G,gpu:4").  By default a node has
              no generic resources.  Also see Feature.

       Port   The  port number that the SLURM compute node daemon, slurmd, listens to for work on
              this particular node. By default there is a  single  port  number  for  all  slurmd
              daemons  on all compute nodes as defined by the SlurmdPort configuration parameter.
              Use of this option is not generally recommended except for development  or  testing
              purposes.

       Procs  See CPUs.

       RealMemory
              Size  of  real memory on the node in MegaBytes (e.g. "2048").  The default value is
              1.

        Reason Identifies the reason for a node being in state "DOWN", "DRAINED", "DRAINING",
              "FAIL" or "FAILING".  Use quotes to enclose a reason having more than one word.

       Sockets
              Number  of  physical processor sockets/chips on the node (e.g. "2").  If Sockets is
              omitted, it will be inferred from CPUs, CoresPerSocket, and ThreadsPerCore.   NOTE:
              If  you  have  multi-core  processors,  you  will  likely  need  to  specify  these
              parameters.  The default value is 1.

       State  State of the node with respect to the initiation of user jobs.   Acceptable  values
              are  "DOWN",  "DRAIN",  "FAIL", "FAILING" and "UNKNOWN".  "DOWN" indicates the node
              failed and is unavailable to be allocated work.   "DRAIN"  indicates  the  node  is
              unavailable  to  be  allocated work.  "FAIL" indicates the node is expected to fail
              soon, has no jobs allocated to it, and will not  be  allocated  to  any  new  jobs.
              "FAILING"  indicates  the  node  is  expected  to  fail  soon, has one or more jobs
              allocated to it, but will not be allocated to any new  jobs.   "UNKNOWN"  indicates
              the  node's  state  is  undefined  (BUSY or IDLE), but will be established when the
              slurmd daemon on that node registers.  The default value is  "UNKNOWN".   Also  see
              the DownNodes parameter below.

       ThreadsPerCore
               Number of logical threads in a single physical core (e.g. "2").  Note that
               SLURM can allocate resources to jobs down to the resolution of a core.  If your
              system  is  configured with more than one thread per core, execution of a different
              job    on    each    thread    is    not    supported    unless    you    configure
              SelectTypeParameters=CR_CPU  plus CPUs; do not configure Sockets, CoresPerSocket or
               ThreadsPerCore.  A job can execute one task per thread from within one job step
              or  execute  a  distinct  job  step  on  each of the threads.  Note also if you are
              running with more than 1 thread per core and running the select/cons_res plugin you
              will  want  to set the SelectTypeParameters variable to something other than CR_CPU
              to avoid unexpected results.  The default value is 1.

       TmpDisk
              Total size of temporary disk storage in TmpFS in MegaBytes  (e.g.  "16384").  TmpFS
              (for  "Temporary  File  System")  identifies the location which jobs should use for
              temporary storage.  Note this does not indicate the amount of free space  available
               to the user on the node, only the total file system size. The system administrator
               should ensure this file system is purged as needed so that user jobs have access to
               most of this space.  The Prolog and/or Epilog programs (specified in the
               configuration file) might be used to ensure the file system is kept clean.  The
              default value is 0.

       Weight The  priority  of  the  node for scheduling purposes.  All things being equal, jobs
               will be allocated the nodes with the lowest weight which satisfy their
              requirements.   For  example,  a  heterogeneous collection of nodes might be placed
              into  a  single  partition  for  greater  system  utilization,  responsiveness  and
              capability.  It  would  be  preferable to allocate smaller memory nodes rather than
              larger memory nodes if either will satisfy a  job's  requirements.   The  units  of
              weight  are  arbitrary,  but  larger  weights should be assigned to nodes with more
              processors, memory, disk space, higher processor speed, etc.  Note that  if  a  job
              allocation request can not be satisfied using the nodes with the lowest weight, the
              set of nodes with the next lowest weight  is  added  to  the  set  of  nodes  under
              consideration  for  use  (repeat  as  needed  for  higher  weight  values).  If you
              absolutely want to minimize the number of higher weight nodes allocated  to  a  job
              (at  a  cost of higher scheduling overhead), give each node a distinct Weight value
              and they will be added to  the  pool  of  nodes  being  considered  for  scheduling
              individually.  The default value is 1.
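
        As an illustration only (the node names and hardware values are hypothetical), a small
        homogeneous cluster might be described with a DEFAULT record followed by a single node
        range expression:

            NodeName=DEFAULT Sockets=2 CoresPerSocket=4 ThreadsPerCore=1 RealMemory=16384
            NodeName=linux[0-63] NodeAddr=elinux[0-63] Weight=1 Feature=fast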

       The  "DownNodes="  configuration  permits  you  to mark certain nodes as in a DOWN, DRAIN,
       FAIL, or FAILING state without altering the  permanent  configuration  information  listed
       under a "NodeName=" specification.

       DownNodes
              Any node name, or list of node names, from the "NodeName=" specifications.

       Reason Identifies  the  reason  for  a  node  being  in  state  "DOWN", "DRAIN", "FAIL" or
              "FAILING.  Use quotes to enclose a reason having more than one word.

       State  State of the node with respect to the initiation of user jobs.   Acceptable  values
              are "BUSY", "DOWN", "DRAIN", "FAIL", "FAILING, "IDLE", and "UNKNOWN".

              DOWN      Indicates the node failed and is unavailable to be allocated work.

               DRAIN     Indicates the node is unavailable to be allocated work.

              FAIL      Indicates the node is expected to fail soon, has no jobs allocated to it,
                        and will not be allocated to any new jobs.

              FAILING   Indicates the node is expected  to  fail  soon,  has  one  or  more  jobs
                        allocated to it, but will not be allocated to any new jobs.

              FUTURE    Indicates  the node is defined for future use and need not exist when the
                        SLURM daemons are started. These nodes can  be  made  available  for  use
                        simply  by updating the node state using the scontrol command rather than
                        restarting the slurmctld daemon. After these nodes  are  made  available,
                        change  their  State  in  the slurm.conf file. Until these nodes are made
                         available, they will not be seen using any SLURM commands, nor will
                         any attempt be made to contact them.

              UNKNOWN   Indicates  the  node's  state  is  undefined  (BUSY or IDLE), but will be
                        established when the slurmd daemon on that node registers.   The  default
                        value is "UNKNOWN".
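
        For example, nodes that are temporarily out of service (the node names and reason are
        hypothetical) could be recorded as:

            DownNodes=linux[10-12] State=DRAIN Reason="bad DIMM"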

       On  computers  where  frontend nodes are used to execute batch scripts rather than compute
       nodes (BlueGene or Cray systems), one may configure one or more frontend nodes  using  the
       configuration  parameters  defined  below. These options are very similar to those used in
       configuring compute nodes. These options may only be used on systems configured and  built
       with  the  appropriate  parameters  (--have-front-end,  --enable-bluegene-emulation)  or a
       system determined to have the appropriate architecture by the configure  script  (BlueGene
       or Cray systems).  The front end configuration specifies the following information:

       FrontendName
              Name  that  SLURM  uses  to  refer to a frontend node.  Typically this would be the
              string that "/bin/hostname -s" returns.  It may also be the fully qualified  domain
              name  as  returned by "/bin/hostname -f" (e.g. "foo1.bar.com"), or any valid domain
              name associated with the host  through  the  host  database  (/etc/hosts)  or  DNS,
              depending on the resolver settings.  Note that if the short form of the hostname is
              not used, it may prevent use  of  hostlist  expressions  (the  numeric  portion  in
              brackets  must be at the end of the string).  If the FrontendName is "DEFAULT", the
              values specified with that record will  apply  to  subsequent  node  specifications
              unless explicitly set to other values in that frontend node record or replaced with
              a different set of default values.  Note that since the naming of front  end  nodes
              would  typically  not  follow  that  of  the compute nodes (e.g. lacking X, Y and Z
              coordinates found in the compute node naming scheme),  each  front  end  node  name
              should   be   listed   separately   and   without   a   hostlist  expression  (i.e.
              frontend00,frontend01" rather than "frontend[00-01]").</p>

       FrontendAddr
              Name that a frontend node should be referred to in  establishing  a  communications
              path.  This  name  will  be used as an argument to the gethostbyname() function for
              identification.  As with FrontendName, list the individual  node  addresses  rather
              than using a hostlist expression.  The number of FrontendAddr records per line must
               equal the number of FrontendName records per line (i.e. you can't map two node
               names to one address).  FrontendAddr may also contain IP addresses.  By default,
               the FrontendAddr will be identical in value to FrontendName.

       Port   The port number that the SLURM compute node daemon, slurmd, listens to for work  on
              this  particular  frontend  node.  By default there is a single port number for all
              slurmd daemons on all frontend nodes as defined  by  the  SlurmdPort  configuration
              parameter.  Use  of this option is not generally recommended except for development
              or testing purposes.

        Reason Identifies the reason for a frontend node being in state "DOWN", "DRAINED",
               "DRAINING", "FAIL" or "FAILING".  Use quotes to enclose a reason having more than
              one word.

       State  State of the frontend node with respect to the initiation of user jobs.  Acceptable
              values  are "DOWN", "DRAIN", "FAIL", "FAILING" and "UNKNOWN".  "DOWN" indicates the
              frontend node has  failed  and  is  unavailable  to  be  allocated  work.   "DRAIN"
              indicates  the frontend node is unavailable to be allocated work.  "FAIL" indicates
              the frontend node is expected to fail soon, has no jobs allocated to it,  and  will
              not  be  allocated  to  any  new  jobs.   "FAILING"  indicates the frontend node is
              expected to fail soon, has one or more jobs  allocated  to  it,  but  will  not  be
              allocated  to  any  new  jobs.   "UNKNOWN"  indicates  the frontend node's state is
              undefined (BUSY or IDLE), but will be established when the slurmd  daemon  on  that
              node  registers.  The default value is "UNKNOWN".  Also see the DownNodes parameter
              below.
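
        As a hedged sketch (the names are hypothetical), a system with two front end nodes would
        list each one individually rather than using a hostlist expression:

             FrontendName=frontend00 FrontendAddr=efrontend00
             FrontendName=frontend01 FrontendAddr=efrontend01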

       The partition configuration permits you  to  establish  different  job  limits  or  access
       controls  for  various  groups  (or  partitions)  of nodes.  Nodes may be in more than one
       partition, making partitions serve as general purpose queues.  For example one may put the
       same  set  of  nodes  into two different partitions, each with different constraints (time
       limit, job sizes, groups  allowed  to  use  the  partition,  etc.).   Jobs  are  allocated
       resources  within  a  single  partition.  Default values can be specified with a record in
       which "PartitionName" is "DEFAULT".  The default entry values will  apply  only  to  lines
       following  it in the configuration file and the default values can be reset multiple times
       in the configuration  file  with  multiple  entries  where  "PartitionName=DEFAULT".   The
       "PartitionName="  specification  must be placed on every line describing the configuration
       of partitions.  If a partition that is in use is deleted from the configuration and  slurm
       is  restarted  or  reconfigured  (scontrol  reconfigure),  jobs  using  the  partition are
       canceled.  NOTE: Put all parameters for each partition on a single  line.   Each  line  of
       partition configuration information should represent a different partition.  The partition
       configuration file contains the following information:

       AllocNodes
              Comma separated list of nodes from which users can execute jobs in  the  partition.
              Node names may be specified using the node range expression syntax described above.
              The default value is "ALL".

       AllowGroups
              Comma separated list of group names which may execute jobs in the partition.  If at
              least  one  group  associated  with  the  user  attempting to execute the job is in
              AllowGroups, he will be permitted to use this partition.   Jobs  executed  as  user
              root  can  use  any  partition without regard to the value of AllowGroups.  If user
              root attempts to execute a job as another user (e.g. using  srun's  --uid  option),
              this  other  user must be in one of groups identified by AllowGroups for the job to
              successfully execute.  The default value is "ALL".  NOTE: For performance  reasons,
              SLURM  maintains  a  list  of  user  IDs  allowed to use each partition and this is
              checked at job submission time.   This  list  of  user  IDs  is  updated  when  the
              slurmctld  daemon  is  restarted,  reconfigured  (e.g.  "scontrol reconfig") or the
               partition's AllowGroups value is reset, even if its value is unchanged (e.g.
               "scontrol update PartitionName=name AllowGroups=group").  For a user's access to a
               partition to change, both the group membership must change and SLURM's internal
               user ID list must be updated using one of the methods described above.
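
               For example, after changing a user's group membership, the cached user ID list can
               be refreshed by resetting AllowGroups to its current value (the partition and group
               names below are only illustrative):

                    scontrol update PartitionName=debug AllowGroups=students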

       Alternate
              Partition  name of alternate partition to be used if the state of this partition is
              "DRAIN" or "INACTIVE."

       Default
              If this keyword is set, jobs  submitted  without  a  partition  specification  will
              utilize  this partition.  Possible values are "YES" and "NO".  The default value is
              "NO".

       DefMemPerCPU
              Default real memory size available per allocated CPU in MegaBytes.  Used  to  avoid
              over-subscribing  memory  and causing paging.  DefMemPerCPU would generally be used
              if individual processors are allocated to  jobs  (SelectType=select/cons_res).   If
              not  set,  the  DefMemPerCPU  value  for the entire cluster will be used.  Also see
              DefMemPerNode  and  MaxMemPerCPU.   DefMemPerCPU  and  DefMemPerNode  are  mutually
              exclusive.   NOTE:  Enforcement  of  memory  limits  currently requires enabling of
              accounting, which samples memory use on a periodic basis (data need not be  stored,
              just collected).

       DefMemPerNode
              Default  real memory size available per allocated node in MegaBytes.  Used to avoid
              over-subscribing memory and causing paging.  DefMemPerNode would generally be  used
              if  whole  nodes are allocated to jobs (SelectType=select/linear) and resources are
              shared (Shared=yes or Shared=force).  If not set, the DefMemPerNode value  for  the
              entire   cluster   will   be   used.   Also  see  DefMemPerCPU  and  MaxMemPerNode.
              DefMemPerCPU and DefMemPerNode are mutually exclusive.  NOTE: Enforcement of memory
              limits  currently  requires  enabling  of accounting, which samples memory use on a
              periodic basis (data need not be stored, just collected).
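
               As an illustrative sketch (names and sizes are arbitrary), a partition of shared
               whole nodes might default to 2000 MegaBytes per allocated node:

                    PartitionName=shared Nodes=dev[0-7] Shared=FORCE:2 DefMemPerNode=2000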

       DefaultTime
              Run time limit used for jobs that don't specify a value. If not  set  then  MaxTime
              will be used.  Format is the same as for MaxTime.

       DisableRootJobs
              If  set  to  "YES"  then  user root will be prevented from running any jobs on this
              partition.  The default value will be the value of DisableRootJobs set outside of a
              partition specification (which is "NO", allowing user root to execute jobs).

       GraceTime
              Specifies,  in  units of seconds, the preemption grace time to be extended to a job
               which has been selected for preemption.  The default value is zero, meaning no
               preemption grace time is allowed on this partition.  (Meaningful only for
               PreemptMode=CANCEL.)

       Hidden Specifies  if  the  partition  and  its  jobs  are to be hidden by default.  Hidden
              partitions will by default not be reported by the SLURM APIs or commands.  Possible
              values are "YES" and "NO".  The default value is "NO".  Note that partitions that a
              user lacks access to by virtue of the AllowGroups parameter will also be hidden  by
              default.

       MaxMemPerCPU
              Maximum  real  memory size available per allocated CPU in MegaBytes.  Used to avoid
              over-subscribing memory and causing paging.  MaxMemPerCPU would generally  be  used
              if  individual  processors  are allocated to jobs (SelectType=select/cons_res).  If
              not set, the MaxMemPerCPU value for the entire cluster  will  be  used.   Also  see
              DefMemPerCPU  and  MaxMemPerNode.   MaxMemPerCPU  and  MaxMemPerNode  are  mutually
              exclusive.  NOTE: Enforcement of  memory  limits  currently  requires  enabling  of
              accounting,  which samples memory use on a periodic basis (data need not be stored,
              just collected).

       MaxMemPerNode
              Maximum real memory size available per allocated node in MegaBytes.  Used to  avoid
              over-subscribing  memory and causing paging.  MaxMemPerNode would generally be used
              if whole nodes are allocated to jobs (SelectType=select/linear) and  resources  are
              shared  (Shared=yes  or Shared=force).  If not set, the MaxMemPerNode value for the
              entire  cluster  will  be  used.   Also   see   DefMemPerNode   and   MaxMemPerCPU.
              MaxMemPerCPU and MaxMemPerNode are mutually exclusive.  NOTE: Enforcement of memory
              limits currently requires enabling of accounting, which samples  memory  use  on  a
              periodic basis (data need not be stored, just collected).

       MaxNodes
              Maximum  count  of  nodes  which  may be allocated to any single job.  For BlueGene
              systems this will be a  c-nodes count and will be converted  to  a  midplane  count
              with  a  reduction  in  resolution.   The  default  value  is "UNLIMITED", which is
              represented internally as -1.  This limit  does  not  apply  to  jobs  executed  by
              SlurmUser or user root.

       MaxTime
              Maximum   run   time   limit   for   jobs.   Format  is  minutes,  minutes:seconds,
              hours:minutes:seconds, days-hours,  days-hours:minutes,  days-hours:minutes:seconds
              or  "UNLIMITED".  Time resolution is one minute and second values are rounded up to
              the next minute.  This limit does not apply to jobs executed by SlurmUser  or  user
              root.
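
               For example, the following lines (values illustrative) all impose the same two hour
               limit using different formats:

                    MaxTime=120        # 120 minutes
                    MaxTime=2:00:00    # hours:minutes:seconds
                    MaxTime=0-2        # days-hours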

       MinNodes
              Minimum  count  of  nodes  which  may be allocated to any single job.  For BlueGene
              systems this will be a  c-nodes count and will be converted  to  a  midplane  count
              with a reduction in resolution.  The default value is 1.  This limit does not apply
              to jobs executed by SlurmUser or user root.

       Nodes  Comma separated list of nodes (or base partitions for BlueGene systems)  which  are
              associated  with  this partition.  Node names may be specified using the node range
              expression syntax described above. A blank list of nodes (i.e. "Nodes=  ")  can  be
              used  if  one  wants  a  partition  to  exist, but have no resources (possibly on a
              temporary basis).

       PartitionName
              Name by which the partition may be referenced (e.g. "Interactive").  This name  can
              be specified by users when submitting jobs.  If the PartitionName is "DEFAULT", the
              values specified with that record will apply to subsequent partition specifications
              unless  explicitly  set to other values in that partition record or replaced with a
              different set of default values.

       PreemptMode
              Mechanism    used    to    preempt    jobs     from     this     partition     when
              PreemptType=preempt/partition_prio   is   configured.    This   partition  specific
              PreemptMode configuration parameter will  override  the  PreemptMode  configuration
              parameter  set  for  the  cluster  as  a whole.  The cluster-level PreemptMode must
              include the GANG option if PreemptMode is configured to SUSPEND for any  partition.
              The  cluster-level  PreemptMode  must  not be OFF if PreemptMode is enabled for any
              partition.  See the description  of  the  cluster-level  PreemptMode  configuration
              parameter above for further information.
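
               As a hedged sketch (partition names and priorities are hypothetical), jobs in one
               partition might be requeued upon preemption while the cluster as a whole uses gang
               scheduling with suspension:

                    PreemptType=preempt/partition_prio
                    PreemptMode=SUSPEND,GANG
                    PartitionName=low  Nodes=dev[0-7] Priority=1  PreemptMode=REQUEUE
                    PartitionName=high Nodes=dev[0-7] Priority=10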

       Priority
              Jobs  submitted  to  a  higher priority partition will be dispatched before pending
              jobs in lower priority partitions and if possible they will  preempt  running  jobs
              from  lower priority partitions.  Note that a partition's priority takes precedence
              over a job's priority.  The value may not exceed 65533.

       RootOnly
              Specifies if only user ID zero (i.e. user root)  may  allocate  resources  in  this
              partition.  User  root  may  allocate resources for any other user, but the request
              must be initiated by user root.  This option can be useful for a  partition  to  be
              managed  by  some  external  entity  (e.g. a higher-level job manager) and prevents
              users from directly using those resources.  Possible values  are  "YES"  and  "NO".
              The default value is "NO".

       Shared Controls  the  ability  of  the partition to execute more than one job at a time on
              each   resource   (node,   socket   or   core   depending   upon   the   value   of
              SelectTypeParameters).    If   resources   are   to   be  shared,  avoiding  memory
              over-subscription is very important.  SelectTypeParameters should be configured  to
              treat  memory  as a consumable resource and the --mem option should be used for job
              allocations.  Sharing of  resources  is  typically  useful  only  when  using  gang
              scheduling  (PreemptMode=suspend  or PreemptMode=kill).  Possible values for Shared
              are "EXCLUSIVE", "FORCE", "YES", and "NO".  The default value is  "NO".   For  more
              information see the following web pages:
              http://www.schedmd.com/slurmdocs/cons_res.html,
              http://www.schedmd.com/slurmdocs/cons_res_share.html,
              http://www.schedmd.com/slurmdocs/gang_scheduling.html, and
              http://www.schedmd.com/slurmdocs/preempt.html.

              EXCLUSIVE   Allocates  entire  nodes  to jobs even with select/cons_res configured.
                          Jobs that run in partitions with "Shared=EXCLUSIVE" will have exclusive
                          access to all allocated nodes.

              FORCE       Makes  all resources in the partition available for sharing without any
                          means for users to disable it.   May  be  followed  with  a  colon  and
                          maximum  number  of  jobs  in  running or suspended state.  For example
                          "Shared=FORCE:4" enables each node, socket or core  to  execute  up  to
                          four  jobs  at  once.  Recommended only for BlueGene systems configured
                          with  small  blocks  or  for  systems  running  with  gang   scheduling
                          (SchedulerType=sched/gang).

              YES         Makes  all resources in the partition available for sharing, but honors
                          a     user's     request     for     dedicated      resources.       If
                          SelectType=select/cons_res,  then  resources  will  be  over-subscribed
                          unless  explicitly  disabled  in  the  job  submit  request  using  the
                          "--exclusive"     option.      With    SelectType=select/bluegene    or
                          SelectType=select/linear, resources will only be  over-subscribed  when
                          explicitly  requested  by  the  user  using the "--share" option on job
                          submission.  May be followed with a colon and maximum number of jobs in
                          running  or  suspended  state.  For example "Shared=YES:4" enables each
                          node, socket or core to execute up to four jobs at  once.   Recommended
                          only      for      systems     running     with     gang     scheduling
                          (SchedulerType=sched/gang).

              NO          Selected resources are allocated to a single job. No resource  will  be
                          allocated to more than one job.
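
               For example (partition and node names are illustrative), gang scheduling of up to
               four jobs per resource could be requested with:

                    PartitionName=gang Nodes=dev[0-7] Shared=FORCE:4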

       State  State  of  partition  or  availability  for use.  Possible values are "UP", "DOWN",
              "DRAIN"  and  "INACTIVE".  The  default  value  is  "UP".   See  also  the  related
              "Alternate" keyword.

               UP        Designates that new jobs may be queued on the partition, and that jobs
                         may be allocated nodes and run from the partition.

              DOWN      Designates that new jobs may be queued on the partition, but queued  jobs
                        may  not  be  allocated  nodes  and  run from the partition. Jobs already
                        running on the partition continue to run. The  jobs  must  be  explicitly
                        canceled to force their termination.

              DRAIN     Designates  that  no  new  jobs  may  be  queued  on  the  partition (job
                        submission requests will be denied  with  an  error  message),  but  jobs
                        already queued on the partition may be allocated nodes and run.  See also
                        the "Alternate" partition specification.

              INACTIVE  Designates that no new jobs may be queued  on  the  partition,  and  jobs
                        already  queued  may  not  be  allocated  nodes  and  run.   See also the
                        "Alternate" partition specification.

Prolog and Epilog Scripts

       There are a variety of prolog  and  epilog  program  options  that  execute  with  various
       permissions and at various times.  The four options most likely to be used are: Prolog and
       Epilog (executed once on  each  compute  node  for  each  job)  plus  PrologSlurmctld  and
       EpilogSlurmctld (executed once on the ControlMachine for each job).

       NOTE:   Standard  output  and error messages are normally not preserved.  Explicitly write
       output and error messages to  an  appropriate  location  if  you  wish  to  preserve  that
       information.

       NOTE:   The Prolog script is ONLY run on any individual node when it first sees a job step
       from a new allocation; it does not run  the  Prolog  immediately  when  an  allocation  is
       granted.   If  no  job  steps  from an allocation are run on a node, it will never run the
       Prolog for that allocation.  The Epilog, on the other hand, always runs on every  node  of
       an allocation when the allocation is released.
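
        As a hedged sketch (the script paths are hypothetical), the four options might be
        configured as follows:

             Prolog=/etc/slurm/prolog.sh
             Epilog=/etc/slurm/epilog.sh
             PrologSlurmctld=/etc/slurm/prolog_slurmctld.sh
             EpilogSlurmctld=/etc/slurm/epilog_slurmctld.sh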

       Information  about  the  job  is passed to the script using environment variables.  Unless
       otherwise specified, these environment variables are available to all of the programs.

       BASIL_RESERVATION_ID
              Basil reservation ID.  Available on Cray XT/XE systems only.

       MPIRUN_PARTITION
              BlueGene partition name.  Available on BlueGene systems only.

       SLURM_JOB_ACCOUNT
              Account name used for the job.  Available in  PrologSlurmctld  and  EpilogSlurmctld
              only.

       SLURM_JOB_CONSTRAINTS
              Features required to run the job.  Available in PrologSlurmctld and EpilogSlurmctld
              only.

       SLURM_JOB_DERIVED_EC
              The highest exit code of all of the job steps.  Available in EpilogSlurmctld only.

       SLURM_JOB_EXIT_CODE
              The exit code of the job script (or salloc).  Available in EpilogSlurmctld only.

       SLURM_JOB_GID
              Group ID of the job's owner.   Available  in  PrologSlurmctld  and  EpilogSlurmctld
              only.

       SLURM_JOB_GROUP
              Group  name  of  the job's owner.  Available in PrologSlurmctld and EpilogSlurmctld
              only.

       SLURM_JOB_ID
              Job ID.

       SLURM_JOB_NAME
              Name of the job.  Available in PrologSlurmctld and EpilogSlurmctld only.

       SLURM_JOB_NODELIST
              Nodes assigned to job. A SLURM hostlist expression.  "scontrol show hostnames"  can
              be  used  to  convert  this  to  a  list  of  individual  host names.  Available in
              PrologSlurmctld and EpilogSlurmctld only.

       SLURM_JOB_PARTITION
              Partition that job runs in.  Available in PrologSlurmctld and EpilogSlurmctld only.

       SLURM_JOB_UID
              User ID of the job's owner.

       SLURM_JOB_USER
              User name of the job's owner.

NETWORK TOPOLOGY

       SLURM is able to optimize job allocations to minimize network contention.   Special  SLURM
       logic  is  used  to  optimize allocations on systems with a three-dimensional interconnect
        (BlueGene, Sun Constellation, etc.), and information about configuring those systems is
        available at <http://www.schedmd.com/slurmdocs/>.  For a
       hierarchical network, SLURM needs  to  have  detailed  information  about  how  nodes  are
       configured on the network switches.

       Given network topology information, SLURM allocates all of a job's resources onto a single
       leaf of the network (if possible) using a best-fit algorithm.  Otherwise it will  allocate
       a  job's  resources  onto multiple leaf switches so as to minimize the use of higher-level
       switches.  The TopologyPlugin parameter controls which plugin is used to  collect  network
       topology  information.   The  only  values  presently  supported  are  "topology/3d_torus"
       (default for IBM BlueGene, Sun Constellation and Cray  XT/XE  systems,  performs  best-fit
       logic  over  three-dimensional  topology),  "topology/none"  (default  for  other systems,
       best-fit logic over one-dimensional  topology),  "topology/tree"  (determine  the  network
       topology based upon information contained in a topology.conf file, see "man topology.conf"
       for more information).  Future plugins may gather topology information directly  from  the
       network.   The  topology  information  is optional.  If not provided, SLURM will perform a
       best-fit algorithm assuming the nodes are in a one-dimensional array as configured and the
       communications cost is related to the node distance in this array.
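
        For example, a cluster with a hierarchical switch network could select the tree plugin;
        the switch layout itself would then be described in topology.conf:

             TopologyPlugin=topology/tree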

RELOCATING CONTROLLERS

       If  the  cluster's  computers  used  for  the  primary or backup controller will be out of
       service for an extended period of time, it may be desirable to relocate them.  In order to
       do so, follow this procedure:

       1. Stop the SLURM daemons
       2. Modify the slurm.conf file appropriately
       3. Distribute the updated slurm.conf file to all nodes
       4. Restart the SLURM daemons

        There should be no loss of any running or pending jobs.  Ensure that any nodes added to
       the cluster have the current slurm.conf file installed.
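
        As a hedged sketch (host names are hypothetical), step 2 might change only the controller
        entries while leaving all node and partition records untouched:

             ControlMachine=newctl
             ControlAddr=enewctl
             BackupController=dev1
             BackupAddr=edev1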

       CAUTION: If two nodes are simultaneously configured as the primary controller  (two  nodes
        on which ControlMachine specifies the local host and the slurmctld daemon is executing on
       each), system  behavior  will  be  destructive.   If  a  compute  node  has  an  incorrect
       ControlMachine  or  BackupController parameter, that node may be rendered unusable, but no
       other harm will result.

EXAMPLE

       #
       # Sample /etc/slurm.conf for dev[0-25].llnl.gov
       # Author: John Doe
       # Date: 11/06/2001
       #
       ControlMachine=dev0
       ControlAddr=edev0
       BackupController=dev1
       BackupAddr=edev1
       #
       AuthType=auth/munge
       Epilog=/usr/local/slurm/epilog
       Prolog=/usr/local/slurm/prolog
       FastSchedule=1
       FirstJobId=65536
       InactiveLimit=120
       JobCompType=jobcomp/filetxt
       JobCompLoc=/var/log/slurm/jobcomp
       KillWait=30
       MaxJobCount=10000
       MinJobAge=3600
       PluginDir=/usr/local/lib:/usr/local/slurm/lib
       ReturnToService=0
       SchedulerType=sched/backfill
       SlurmctldLogFile=/var/log/slurm/slurmctld.log
       SlurmdLogFile=/var/log/slurm/slurmd.log
       SlurmctldPort=7002
       SlurmdPort=7003
       SlurmdSpoolDir=/usr/local/slurm/slurmd.spool
       StateSaveLocation=/usr/local/slurm/slurm.state
       SwitchType=switch/elan
       TmpFS=/tmp
       WaitTime=30
       JobCredentialPrivateKey=/usr/local/slurm/private.key
       JobCredentialPublicCertificate=/usr/local/slurm/public.cert
       #
       # Node Configurations
       #
       NodeName=DEFAULT CPUs=2 RealMemory=2000 TmpDisk=64000
       NodeName=DEFAULT State=UNKNOWN
       NodeName=dev[0-25] NodeAddr=edev[0-25] Weight=16
       # Update records for specific DOWN nodes
       DownNodes=dev20 State=DOWN Reason="power,ETA=Dec25"
       #
       # Partition Configurations
       #
       PartitionName=DEFAULT MaxTime=30 MaxNodes=10 State=UP
       PartitionName=debug Nodes=dev[0-8,18-25] Default=YES
       PartitionName=batch Nodes=dev[9-17]  MinNodes=4
       PartitionName=long Nodes=dev[9-17] MaxTime=120 AllowGroups=admin

FILE AND DIRECTORY PERMISSIONS

       There are three classes of files: Files used by  slurmctld  must  be  accessible  by  user
       SlurmUser and accessible by the primary and backup control machines.  Files used by slurmd
       must be accessible by user root and accessible from every compute node.  A few files  need
       to  be  accessible  by  normal users on all login and compute nodes.  While many files and
       directories are listed below, most of them will not be used with most configurations.
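
        As an illustrative sketch (assuming SlurmUser=slurm and hypothetical paths), ownership and
        permissions can be set with the standard chown and chmod commands:

             chown slurm /var/spool/slurmctld /var/log/slurm/slurmctld.log
             chmod 700 /var/spool/slurmctld
             chmod 644 /var/log/slurm/slurmctld.log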

       AccountingStorageLoc
              If this specifies a file, it must be writable by user SlurmUser.  The file must  be
              accessible  by the primary and backup control machines.  It is recommended that the
              file be readable by all users from login and compute nodes.

       Epilog Must be executable by user root.  It is recommended that the file  be  readable  by
              all users.  The file must exist on every compute node.

       EpilogSlurmctld
              Must  be executable by user SlurmUser.  It is recommended that the file be readable
              by all users.  The file must be  accessible  by  the  primary  and  backup  control
              machines.

       HealthCheckProgram
              Must  be  executable  by user root.  It is recommended that the file be readable by
              all users.  The file must exist on every compute node.

       JobCheckpointDir
              Must be writable by user SlurmUser and no other users.  The file must be accessible
              by the primary and backup control machines.

       JobCompLoc
              If  this specifies a file, it must be writable by user SlurmUser.  The file must be
              accessible by the primary and backup control machines.

       JobCredentialPrivateKey
              Must be readable only by user SlurmUser and writable by no other users.   The  file
              must be accessible by the primary and backup control machines.

       JobCredentialPublicCertificate
              Readable to all users on all nodes.  Must not be writable by regular users.

       MailProg
              Must  be executable by user SlurmUser.  Must not be writable by regular users.  The
              file must be accessible by the primary and backup control machines.

       Prolog Must be executable by user root.  It is recommended that the file  be  readable  by
              all users.  The file must exist on every compute node.

       PrologSlurmctld
              Must  be executable by user SlurmUser.  It is recommended that the file be readable
              by all users.  The file must be  accessible  by  the  primary  and  backup  control
              machines.

       ResumeProgram
              Must  be  executable by user SlurmUser.  The file must be accessible by the primary
              and backup control machines.

       SallocDefaultCommand
              Must be executable by all users.  The file must exist on every  login  and  compute
              node.

       slurm.conf
              Readable to all users on all nodes.  Must not be writable by regular users.

       SlurmctldLogFile
              Must be writable by user SlurmUser.  The file must be accessible by the primary and
              backup control machines.

       SlurmctldPidFile
              Must be writable by user root.  Preferably writable  and  removable  by  SlurmUser.
              The file must be accessible by the primary and backup control machines.

       SlurmdLogFile
              Must be writable by user root.  A distinct file must exist on each compute node.

       SlurmdPidFile
              Must be writable by user root.  A distinct file must exist on each compute node.

       SlurmdSpoolDir
              Must be writable by user root.  A distinct file must exist on each compute node.

       SrunEpilog
              Must  be  executable  by all users.  The file must exist on every login and compute
              node.

       SrunProlog
              Must be executable by all users.  The file must exist on every  login  and  compute
              node.

       StateSaveLocation
              Must be writable by user SlurmUser.  The file must be accessible by the primary and
              backup control machines.

       SuspendProgram
              Must be executable by user SlurmUser.  The file must be accessible by  the  primary
              and backup control machines.

       TaskEpilog
              Must be executable by all users.  The file must exist on every compute node.

       TaskProlog
              Must be executable by all users.  The file must exist on every compute node.

       UnkillableStepProgram
              Must  be  executable by user SlurmUser.  The file must be accessible by the primary
              and backup control machines.

LOGGING

        Note that while SLURM daemons create log files and other files as needed, they treat the
       lack  of  parent  directories as a fatal error.  This prevents the daemons from running if
       critical file systems are  not  mounted  and  will  minimize  the  risk  of  cold-starting
       (starting without preserving jobs).

        Log files and job accounting files may need to be created/owned by the "SlurmUser" uid to
       be successfully accessed.  Use the "chown" and "chmod" commands to set the  ownership  and
       permissions appropriately.  See the section FILE AND DIRECTORY PERMISSIONS for information
       about the various files and directories used by SLURM.

        It is recommended that the logrotate utility be used to ensure that various log files do
       not  become  too  large.   This  also  applies  to text files used for accounting, process
       tracking, and the slurmdbd log if they are used.

       Here is a sample logrotate configuration. Make appropriate site modifications and save  as
       /etc/logrotate.d/slurm on all nodes.  See the logrotate man page for more details.

       ##
       # SLURM Logrotate Configuration
       ##
       /var/log/slurm/*log {
           compress
           missingok
           nocopytruncate
           nocreate
           nodelaycompress
           nomail
           notifempty
           noolddir
           rotate 5
           sharedscripts
           size=5M
           create 640 slurm root
           postrotate
               /etc/init.d/slurm reconfig
           endscript
       }

COPYING

       Copyright  (C)  2002-2007  The  Regents  of  the  University of California.  Copyright (C)
       2008-2010 Lawrence Livermore National  Security.   Portions  Copyright  (C)  2010  SchedMD
       <http://www.sched-md.com>.   Produced  at  Lawrence  Livermore  National  Laboratory  (cf,
       DISCLAIMER).  CODE-OCEC-09-009. All rights reserved.

       This  file  is  part  of  SLURM,  a  resource  management  program.   For   details,   see
       <http://www.schedmd.com/slurmdocs/>.

       SLURM  is  free  software; you can redistribute it and/or modify it under the terms of the
       GNU General Public License as published by the Free Software Foundation; either version  2
       of the License, or (at your option) any later version.

       SLURM is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without
       even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
       GNU General Public License for more details.

FILES

       /etc/slurm.conf

SEE ALSO

       bluegene.conf(5),  cgroup.conf(5),  gethostbyname  (3), getrlimit (2), gres.conf(5), group
       (5), hostname (1), scontrol(1), slurmctld(8),  slurmd(8),  slurmdbd(8),  slurmdbd.conf(5),
       srun(1), spank(8), syslog (2), topology.conf(5), wiki.conf(5)