Provided by: slurm-llnl_2.2.7-1_i386

NAME

        scontrol - Used to view and modify Slurm configuration and state.

SYNOPSIS

       scontrol [OPTIONS...] [COMMAND...]

DESCRIPTION

       scontrol  is used to view or modify Slurm configuration including: job,
       job  step,   node,   partition,   reservation,   and   overall   system
       configuration.  Most of the commands can only be executed by user root.
       If an attempt to view or modify configuration information is made by an
       unauthorized  user,  an error message will be printed and the requested
       action will not occur. If no command is entered on  the  execute  line,
       scontrol  will  operate in an interactive mode and prompt for input. It
       will  continue  prompting  for  input  and  executing  commands   until
       explicitly  terminated.  If  a  command is entered on the execute line,
       scontrol will execute that command  and  terminate.  All  commands  and
       options are case-insensitive, although node names, partition names, and
       reservation names are case-sensitive (node  names  "LX"  and  "lx"  are
       distinct).   All  commands and options can be abbreviated to the extent
       that the specification is unique.

OPTIONS

       -a, --all
               When the show command is used, then display all partitions,
               their jobs and job steps.  This causes information to be
               displayed about partitions that are configured as hidden and
               partitions that are unavailable to the user's group.

       -d, --details
              Causes  the  show  command  to  provide additional details where
              available.

       -h, --help
              Print a help message describing the usage of scontrol.

        --hide Do not display information about hidden partitions, their jobs
               and job steps.  Partitions that are configured as hidden or
               that are unavailable to the user's group are not displayed
               (this is the default behavior).

       -M, --clusters=<string>
              Cluster to issue commands to.

       -o, --oneliner
              Print information one line per record.

       -Q, --quiet
              Print  no  warning  or  informational messages, only fatal error
              messages.

       -v, --verbose
              Print  detailed  event  logging.  Multiple  -v's  will   further
              increase  the  verbosity of logging. By default only errors will
              be displayed.

        -V, --version
              Print version information and exit.

COMMANDS

        all    Show all partitions, their jobs and job steps.  This causes
               information to be displayed about partitions that are configured
               as hidden and partitions that are unavailable to the user's
               group.

       abort  Instruct the  Slurm  controller  to  terminate  immediately  and
              generate a core file.  See "man slurmctld" for information about
              where the core file will be written.

       checkpoint CKPT_OP ID
              Perform a checkpoint  activity  on  the  job  step(s)  with  the
              specified identification.  ID can be used to identify a specific
              job (e.g. "<job_id>", which  applies  to  all  of  its  existing
              steps)  or  a  specific  job  step  (e.g. "<job_id>.<step_id>").
              Acceptable values for CKPT_OP include:

              able        Test if presently not disabled, report start time if
                          checkpoint in progress

              create      Create a checkpoint and continue the job or job step

              disable     Disable future checkpoints

              enable      Enable future checkpoints

              error       Report  the  result for the last checkpoint request,
                          error code and message

              restart     Restart execution of the previously checkpointed job
                          or job step

              requeue     Create  a  checkpoint  and  requeue  the  batch job,
                          combines vacate and restart operations

              vacate      Create a checkpoint and terminate  the  job  or  job
                          step

               Additional options for the checkpoint command include:

              MaxWait=<seconds>   Maximum  time  for checkpoint to be written.
                                  Default value is  10  seconds.   Valid  with
                                  create and vacate options only.

              ImageDir=<directory_name>
                                  Location  of  checkpoint  file.   Valid with
                                  create, vacate  and  restart  options  only.
                                   This value takes precedence over any
                                  --checkpoint-dir  value  specified  at   job
                                  submission time.

               StickToNodes        If set, resume the job on the same nodes as
                                   previously used.  Valid with the restart
                                  option only.
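
               For example, a hypothetical job 1234 could be checkpointed to
               a local directory and later restarted on the same nodes (the
               job ID and directory are illustrative):

                      scontrol checkpoint create 1234 ImageDir=/tmp/ckpt
                      scontrol checkpoint restart 1234 StickToNodes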

       create SPECIFICATION
              Create  a  new  partition  or reservation.  See the full list of
              parameters below.  Include the tag "res" to create a reservation
              without specifying a reservation name.

       completing
              Display  all  jobs  in  a COMPLETING state along with associated
              nodes in either a COMPLETING or DOWN state.

       delete SPECIFICATION
              Delete the entry with  the  specified  SPECIFICATION.   The  two
              SPECIFICATION     choices     are    PartitionName=<name>    and
               Reservation=<name>.  On dynamically laid out BlueGene systems
               BlockName=<name> also works. Reservations and partitions should
               have no associated jobs at the time of their deletion (modify
               the jobs first).

       details
              Causes  the  show  command  to  provide additional details where
              available, namely the specific CPUs and NUMA memory allocated on
              each  node.   Note that on computers with hyperthreading enabled
              and  SLURM  configured  to  allocate  cores,  each  listed   CPU
              represents one physical core.  Each hyperthread on that core can
              be allocated a separate task, so a  job's  CPU  count  and  task
              count  may  differ.   See  the  --cpu_bind and --mem_bind option
              descriptions in  srun  man  pages  for  more  information.   The
              details  option  is  currently  only  supported for the show job
              command.

        exit   Terminate the execution of scontrol.  This is an independent
               command with no options, meant for use in interactive mode.

       help   Display a description of scontrol options and commands.

        hide   Do not display partition, job or job step information for
              partitions that are configured as hidden or partitions that  are
              unavailable to the user's group.  This is the default behavior.

       hold job_id
               Prevent a pending job from being started (sets its priority
              to 0).  Use  the  release  command  to  permit  the  job  to  be
              scheduled.    Note   that  when  a  job  is  held  by  a  system
              administrator  using   the   hold   command,   only   a   system
              administrator  may  release  the job for execution (also see the
              uhold command). When the job is held by its owner, it  may  also
              be released by the job's owner.
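
               For example, to hold and later release a hypothetical pending
               job 1234 (the job ID is illustrative):

                      scontrol hold 1234
                      scontrol release 1234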

       notify job_id message
              Send  a  message to standard error of the salloc or srun command
              or batch job associated with the specified job_id.
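
               For example (the job ID and message text are illustrative):

                      scontrol notify 1234 "scratch file system is nearly full"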

       oneliner
              Print information one line per record.

       pidinfo proc_id
              Print  the  Slurm  job  id  and   scheduled   termination   time
              corresponding  to  the  supplied  process  id,  proc_id,  on the
               current node.  This will work only with processes on the node on
              which  scontrol  is run, and only for those processes spawned by
              SLURM and their descendants.

       listpids [job_id[.step_id]] [NodeName]
              Print  a  listing  of  the  process  IDs  in  a  job  step   (if
              JOBID.STEPID  is provided), or all of the job steps in a job (if
              job_id is provided), or all of the job steps in all of the  jobs
              on  the local node (if job_id is not provided or job_id is "*").
              This will work only with processes on the node on which scontrol
              is  run, and only for those processes spawned by SLURM and their
              descendants. Note that some SLURM configurations  (ProctrackType
              value  of  pgid  or  aix)  are  unable to identify all processes
              associated with a job or job step.

              Note that the NodeName option is only  really  useful  when  you
              have  multiple  slurmd daemons running on the same host machine.
              Multiple slurmd daemons on one host are, in general,  only  used
              by SLURM developers.
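
               For example, to list the process IDs of step 0 of a
               hypothetical job 1234 (the IDs are illustrative):

                      scontrol listpids 1234.0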

        ping   Ping the primary and secondary slurmctld daemons and report
               whether they are responding.

       quiet  Print no warning or informational  messages,  only  fatal  error
              messages.

       quit   Terminate the execution of scontrol.

       reconfigure
               Instruct all Slurm daemons to re-read the configuration file.
               This command does not restart the daemons.  This mechanism can
               be used to modify configuration parameters (Epilog, Prolog,
               SlurmctldLogFile, SlurmdLogFile, etc.), register the physical
               addition or removal of nodes from the cluster, or recognize the
               change of a node's configuration, such as the addition of memory
               or processors.  The Slurm controller (slurmctld) forwards the
               request to all other daemons (slurmd daemon on each compute
               node).  Running jobs continue execution.  Most configuration
               parameters can be changed by just running this command;
               however, SLURM daemons should be shut down and restarted if any
               of these parameters are to be changed: AuthType, BackupAddr,
               BackupController, ControlAddr, ControlMach, PluginDir,
               StateSaveLocation, SlurmctldPort or SlurmdPort.

       release job_id
              Release a previously held job to begin execution. Also see hold.

       requeue job_id
              Requeue a running or pending SLURM batch job.

       resume job_id
              Resume a previously suspended job. Also see suspend.

       schedloglevel LEVEL
              Enable or disable scheduler logging.  LEVEL  may  be  "0",  "1",
              "disable" or "enable". "0" has the same effect as "disable". "1"
              has the same effect as "enable".  This value  is  temporary  and
              will   be  overwritten  when  the  slurmctld  daemon  reads  the
              slurm.conf configuration file (e.g. when the daemon is restarted
              or  scontrol  reconfigure is executed) if the SlurmSchedLogLevel
              parameter is present.

       setdebug LEVEL
              Change the debug level of the slurmctld daemon.  LEVEL may be an
              integer  value  between  zero and nine (using the same values as
              SlurmctldDebug in the slurm.conf file) or the name of  the  most
              detailed  message type to be printed: "quiet", "fatal", "error",
              "info", "verbose", "debug",  "debug2",  "debug3",  "debug4",  or
              "debug5".   This  value  is  temporary  and  will be overwritten
              whenever the slurmctld daemon reads the slurm.conf configuration
              file  (e.g. when the daemon is restarted or scontrol reconfigure
              is executed).
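
               For example, to raise the slurmctld logging level temporarily
               and later lower it again:

                      scontrol setdebug debug2
                      scontrol setdebug info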

       show ENTITY ID
              Display the state of the specified  entity  with  the  specified
              identification.   ENTITY  may  be aliases, config, daemons, job,
              node, partition, reservation, slurmd, step,  topology,  hostlist
              or  hostnames (also block or subbp on BlueGene systems).  ID can
              be used to identify a specific element of the identified entity:
              the  configuration  parameter name, job ID, node name, partition
              name, reservation name, or job step ID for  config,  job,  node,
              partition, or step respectively.  For an ENTITY of topology, the
              ID may be a node or switch name.  If one node name is specified,
              all  switches connected to that node (and their parent switches)
              will be shown.  If more than one node name  is  specified,  only
              switches that connect to all named nodes will be shown.  aliases
              will  return  all  NodeName  values  associated   to   a   given
              NodeHostname (useful to get the list of virtual nodes associated
              with a real  node  in  a  configuration  where  multiple  slurmd
              daemons  execute  on a single compute node).  hostnames takes an
              optional hostlist expression as  input  and  writes  a  list  of
              individual  host  names to standard output (one per line). If no
              hostlist  expression  is   supplied,   the   contents   of   the
              SLURM_NODELIST   environment   variable  is  used.  For  example
              "tux[1-3]" is mapped to "tux1","tux2" and "tux3"  (one  hostname
              per  line).   hostlist takes a list of host names and prints the
              hostlist  expression  for  them  (the  inverse  of   hostnames).
              hostlist   can  also  take  the  absolute  pathname  of  a  file
              (beginning  with  the  character  '/')  containing  a  list   of
              hostnames.   Multiple  node  names may be specified using simple
              node range expressions (e.g. "lx[10-20]"). All other  ID  values
              must  identify  a single element. The job step ID is of the form
              "job_id.step_id", (e.g. "1234.1").  slurmd reports  the  current
              status  of  the  slurmd  daemon  executing on the same node from
              which the scontrol command is executed (the local host). It  can
              be useful to diagnose problems.  By default, all elements of the
              entity type specified are printed.  For an ENTITY of job, if the
              job   does  not  specify  socket-per-node,  cores-per-socket  or
              threads-per-core then it  will  display  '*'  in  ReqS:C:T=*:*:*
              field.
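
               For example, to convert between a host range expression and a
               list of individual host names (the host names are illustrative):

                      scontrol show hostnames tux[1-3]
                      scontrol show hostlist tux1,tux2,tux3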

       shutdown OPTION
              Instruct  Slurm daemons to save current state and terminate.  By
               default, the Slurm controller (slurmctld) forwards the request
               to all other daemons (slurmd daemon on each compute node).  An
              OPTION of slurmctld or controller results in only the  slurmctld
              daemon being shutdown and the slurmd daemons remaining active.

       suspend job_id
              Suspend  a  running  job.   Use the resume command to resume its
              execution.  User processes  must  stop  on  receipt  of  SIGSTOP
              signal  and resume upon receipt of SIGCONT for this operation to
              be effective.  Not all architectures and configurations  support
              job suspension.
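
               For example, to suspend and later resume a hypothetical running
               job 1234 (the job ID is illustrative):

                      scontrol suspend 1234
                      scontrol resume 1234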

       takeover
              Instruct  SLURM's  backup  controller  (slurmctld)  to take over
              system control.  SLURM's backup controller requests control from
              the  primary  and  waits  for  its  termination.  After that, it
               switches from backup mode to controller mode.  If the primary
               controller cannot be contacted, it switches directly to
               controller mode.  This can be used to speed up the SLURM
               controller fail-over mechanism when the primary node is down,
               and to minimize disruption if the computer executing the
               primary SLURM controller is scheduled down.  (Note: SLURM's
               primary controller will take control back at startup.)

       uhold job_id
               Prevent a pending job from being started (sets its priority to
               0).  Use the release command to permit the job to be scheduled.
               This command is designed for a system administrator to hold a
               job so that the job owner may release it rather than requiring
               the intervention of a system administrator (also see the hold
               command).

       update SPECIFICATION
              Update job, step, node, partition, or reservation  configuration
              per  the  supplied  specification.  SPECIFICATION is in the same
              format as the Slurm configuration file and  the  output  of  the
               show command described above. It may be desirable to execute
               the show command (described above) on the specific entity you
               wish to update, then use cut-and-paste tools to enter the
               updated configuration values into the update command. Note
               that while most
              configuration  values can be changed using this command, not all
              can be changed using this mechanism. In particular, the hardware
              configuration  of  a node or the physical addition or removal of
              nodes from the cluster may only be accomplished through  editing
              the  Slurm  configuration  file  and  executing  the reconfigure
              command (described above).

       verbose
              Print detailed event logging.  This includes time-stamps on data
              structures, record counts, etc.

       version
              Display the version number of scontrol being executed.

       wait_job job_id
               Wait until a job and all of its nodes are ready for use or the
              job  has  entered  some  termination  state.  This   option   is
              particularly  useful  in the SLURM Prolog or in the batch script
              itself if nodes are powered down and restarted automatically  as
              needed.
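
               For example, a batch script could wait for its own allocation
               to become fully usable before launching work (SLURM_JOB_ID is
               set in the batch job's environment):

                      scontrol wait_job $SLURM_JOB_ID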

       !!     Repeat the last command executed.

SPECIFICATIONS FOR UPDATE COMMAND, JOBS

       Account=<account>
              Account  name  to be changed for this job's resource use.  Value
              may be cleared with blank data value, "Account=".

       Conn-Type=<type>
              Reset the node connection type.  Possible values  on  Blue  Gene
              are "MESH", "TORUS" and "NAV" (mesh else torus).

       Contiguous=<yes|no>
              Set  the job's requirement for contiguous (consecutive) nodes to
              be allocated.  Possible values are "YES" and "NO".

       Dependency=<dependency_list>
              Defer  job's   initiation   until   specified   job   dependency
              specification  is  satisfied.   Cancel  dependency with an empty
              dependency_list (e.g. "Dependency=").  <dependency_list>  is  of
              the  form  <type:job_id[:job_id][,type:job_id[:job_id]]>.   Many
              jobs can share the same  dependency  and  these  jobs  may  even
              belong to different  users.

              after:job_id[:jobid...]
                     This  job  can  begin  execution after the specified jobs
                     have begun execution.

              afterany:job_id[:jobid...]
                     This job can begin execution  after  the  specified  jobs
                     have terminated.

              afternotok:job_id[:jobid...]
                     This  job  can  begin  execution after the specified jobs
                     have terminated in some failed state (non-zero exit code,
                     node failure, timed out, etc).

              afterok:job_id[:jobid...]
                     This  job  can  begin  execution after the specified jobs
                     have successfully executed (ran  to  completion  with  an
                     exit code of zero).

              singleton
                     This   job  can  begin  execution  after  any  previously
                     launched jobs sharing the same job  name  and  user  have
                     terminated.
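
               For example, to make a hypothetical job 1236 start only after
               jobs 1234 and 1235 have completed successfully (the job IDs
               are illustrative):

                      scontrol update JobId=1236 Dependency=afterok:1234:1235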

       EligibleTime=<time_spec>
              See StartTime.

       ExcNodeList=<nodes>
               Set the job's list of excluded nodes. Multiple node names may be
              specified   using   simple   node   range   expressions    (e.g.
              "lx[10-20]").   Value  may  be  cleared  with  blank data value,
              "ExcNodeList=".

       Features=<features>
              Set the job's required node features.  The list of features  may
              include  multiple  feature  names  separated  by ampersand (AND)
              and/or   vertical   bar   (OR)    operators.     For    example:
              Features="opteron&video"   or  Features="fast|faster".   In  the
              first example, only nodes having both the feature "opteron"  AND
              the  feature  "video"  will  be  used.  There is no mechanism to
              specify that you  want  one  node  with  feature  "opteron"  and
              another  node  with  feature  "video"  in  case no node has both
              features.  If only one of a set of possible  options  should  be
              used  for  all  allocated  nodes,  then  use the OR operator and
              enclose  the  options  within  square  brackets.   For  example:
              "Features=[rack1|rack2|rack3|rack4]"  might  be  used to specify
              that all nodes must  be  allocated  on  a  single  rack  of  the
              cluster, but any of those four racks can be used.  A request can
              also specify the number of nodes needed  with  some  feature  by
              appending  an  asterisk  and  count after the feature name.  For
              example  "Features=graphics*4"  indicates  that  at  least  four
              allocated  nodes  must have the feature "graphics."  Constraints
              with node counts may only be combined with AND operators.  Value
              may be cleared with blank data value, for example "Features=".

       Geometry=<geo>
              Reset  the required job geometry.  On Blue Gene the value should
              be three digits separated by "x" or ",".  The  digits  represent
              the allocation size in X, Y and Z dimensions (e.g. "2x3x4").

       Gres=<list>
              Specifies   a   comma   delimited  list  of  generic  consumable
              resources.   The  format  of  each  entry   on   the   list   is
              "name[:count[*cpu]]".   The  name  is  that  of  the  consumable
              resource.  The count is the number of  those  resources  with  a
              default  value  of 1.  The specified resources will be allocated
              to the job on each node allocated unless "*cpu" is appended,  in
              which  case  the resources will be allocated on a per cpu basis.
               The available generic consumable resources are configurable by
               the system administrator.  A list of available generic
              consumable resources will be printed and the command  will  exit
              if  the  option  argument  is  "help".   Examples of use include
              "Gres=gpus:2*cpu,disk=40G" and "Gres=help".

       JobId=<id>
              Identify the job to be updated. This specification is required.

       Licenses=<name>
              Specification of licenses (or other resources available  on  all
              nodes  of  the  cluster)  as described in salloc/sbatch/srun man
              pages.

       MinCPUsNode=<count>
              Set the job's minimum number of CPUs per node to  the  specified
              value.

       MinMemoryCPU=<megabytes>
              Set  the job's minimum real memory required per allocated CPU to
              the specified value.  Either MinMemoryCPU or  MinMemoryNode  may
              be set, but not both.

       MinMemoryNode=<megabytes>
              Set  the  job's  minimum  real  memory  required per node to the
              specified value.  Either MinMemoryCPU or  MinMemoryNode  may  be
              set, but not both.

       MinTmpDiskNode=<megabytes>
              Set  the job's minimum temporary disk space required per node to
              the specified value.

       Name=<name>
              Set the job's name to the specified value.

       Nice[=delta]
              Adjust job's priority by the specified value. Default  value  is
              100.   The adjustment range is from -10000 (highest priority) to
              10000 (lowest priority).  Nice value changes are  not  additive,
              but  overwrite any prior nice value and are applied to the job's
              base priority.  Only privileged users  can  specify  a  negative
              adjustment.

       NodeList=<nodes>
               Change the nodes allocated to a running job to shrink its size.
              The specified list of nodes  must  be  a  subset  of  the  nodes
              currently  allocated  to  the  job.  Multiple  node names may be
              specified   using   simple   node   range   expressions    (e.g.
              "lx[10-20]").  After  a  job's allocation is reduced, subsequent
              srun commands must explicitly specify node and task counts which
              are valid for the new allocation.

       NumCPUs=<min_count>[-<max_count>]
              Set the job's minimum and optionally maximum count of CPUs to be
              allocated.

       NumNodes=<min_count>[-<max_count>]
              Set the job's minimum and optionally maximum count of  nodes  to
              be  allocated.   If  the  job  is  already  running, use this to
              specify a node count less than currently allocated and resources
              previously  allocated  to  the job will be relinquished. After a
              job's allocation  is  reduced,  subsequent  srun  commands  must
              explicitly  specify node and task counts which are valid for the
              new allocation. Also see the NodeList parameter above.
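
               For example, to shrink a hypothetical running job 1234 to two
               nodes (the values are illustrative):

                      scontrol update JobId=1234 NumNodes=2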

       NumTasks=<count>
              Set the job's count of required tasks to the specified value.

       Partition=<name>
              Set the job's partition to the specified value.

       Priority=<number>
              Set the job's priority to the specified value.  Note that a  job
              priority of zero prevents the job from ever being scheduled.  By
              setting a job's priority to zero it is held.  Set  the  priority
              to  a  non-zero value to permit it to run.  Explicitly setting a
              job's priority clears any previously set nice value.

       QOS=<name>
              Set the job's QOS (Quality Of Service) to the  specified  value.
              Value may be cleared with blank data value, "QOS=".

       ReqCores=<count>
              Set the job's count of cores per socket to the specified value.

       ReqNodeList=<nodes>
               Set the job's list of required nodes. Multiple node names may be
              specified   using   simple   node   range   expressions    (e.g.
              "lx[10-20]").   Value  may  be  cleared  with  blank data value,
              "ReqNodeList=".

       ReqSockets=<count>
              Set the job's count of sockets per node to the specified value.

       ReqThreads=<count>
              Set the job's count of threads per core to the specified value.

       Requeue=<0|1>
              Stipulates whether  a  job  should  be  requeued  after  a  node
              failure: 0 for no, 1 for yes.

       ReservationName=<name>
              Set  the job's reservation to the specified value.  Value may be
              cleared with blank data value, "ReservationName=".

       Rotate=<yes|no>
              Permit the job's geometry to be rotated.   Possible  values  are
              "YES" and "NO".

       Shared=<yes|no>
              Set  the  job's ability to share nodes with other jobs. Possible
              values are "YES" and "NO".

       StartTime=<time_spec>
              Set the job's earliest initiation time.  It accepts times of the
              form  HH:MM:SS  to  run a job at a specific time of day (seconds
              are optional).  (If that time is already past, the next  day  is
              assumed.)  You may also specify midnight, noon, or teatime (4pm)
              and you can have a  time-of-day  suffixed  with  AM  or  PM  for
              running  in  the  morning or the evening.  You can also say what
              day the job will be run, by specifying a date of the form MMDDYY
              or   MM/DD/YY   or   MM.DD.YY,   or   a   date   and   time   as
              YYYY-MM-DD[THH:MM[:SS]].  You can also give  times  like  now  +
              count  time-units,  where  the time-units can be minutes, hours,
              days, or weeks and you can tell SLURM to run the job today  with
              the  keyword  today and to run the job tomorrow with the keyword
              tomorrow.

              Notes on date/time specifications:
               -  although  the  'seconds'  field   of   the   HH:MM:SS   time
              specification is allowed by the code, note that the poll time of
              the SLURM scheduler is not precise enough to guarantee  dispatch
              of  the  job  on  the exact second.  The job will be eligible to
              start on the next poll following the specified time.  The  exact
              poll  interval  depends on the SLURM scheduler (e.g., 60 seconds
              with the default sched/builtin).
               -  if  no  time  (HH:MM:SS)  is  specified,  the   default   is
              (00:00:00).
               -  if a date is specified without a year (e.g., MM/DD) then the
              current year is assumed, unless the  combination  of  MM/DD  and
              HH:MM:SS  has  already  passed  for that year, in which case the
              next year is used.
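
               For example, to defer a hypothetical pending job 1234 until
               6 PM (the values are illustrative):

                      scontrol update JobId=1234 StartTime=18:00:00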

       TimeLimit=<time>
              The     job's     time     limit.      Output     format      is
               [days-]hours:minutes:seconds or "UNLIMITED".  Input format (for
               update command) is minutes, minutes:seconds,
              hours:minutes:seconds,    days-hours,    days-hours:minutes   or
              days-hours:minutes:seconds.  Time resolution is one  minute  and
              second values are rounded up to the next minute.
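
               For example, to give a hypothetical job 1234 a time limit of
               one hour and thirty minutes (the job ID is illustrative):

                      scontrol update JobId=1234 TimeLimit=1:30:00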

       WCKey=<key>
              Set  the  job's  workload  characterization key to the specified
              value.

       NOTE: The "show" command, when used with the "job" or "job <jobid>"
              entity displays detailed information about a job or jobs.   Much
              of  this  information  may  be  modified  using the "update job"
              command as  described  above.   However,  the  following  fields
              displayed  by  the  show job command are read-only and cannot be
              modified:

       AllocNode:Sid
              Local node and system id making the resource allocation.

       EndTime
              The time the job is expected to terminate  based  on  the  job's
              time  limit.   When  the  job  ends  sooner,  this field will be
              updated with the actual end time.

       ExitCode=<exit>:<sig>
              Exit status reported for the job by the  wait()  function.   The
              first  number  is  the exit code, typically as set by the exit()
               function.  The second number is the signal that caused the
               process to terminate if it was terminated by a signal.

       JobState
              The current state of the job.

       NodeList
              The list of nodes allocated to the job.

       NodeListIndices
              The  NodeIndices expose the internal indices into the node table
              associated with the node(s) allocated to the job.

       PreSusTime
              Time the job ran prior to last suspend.

        Reason The reason the job is not running, e.g., waiting for "Resources".

       SuspendTime
              Time the job was last suspended or resumed.

       UserId  GroupId
              The user and group under which the job was submitted.

       NOTE on information displayed for various job states:
              When you submit a  request  for  the  "show  job"  function  the
              scontrol  process  makes an RPC request call to slurmctld with a
              REQUEST_JOB_INFO message type.  If  the  state  of  the  job  is
              PENDING,  then  it  returns  some  detail  information  such as:
              min_nodes, min_procs, cpus_per_task, etc. If the state is  other
              than PENDING the code assumes that it is in a further state such
              as RUNNING, COMPLETE, etc. In these cases  the  code  explicitly
              returns zero for these values. These values are meaningless once
              the job resources have been allocated and the job has started.

SPECIFICATIONS FOR UPDATE COMMAND, STEPS

       StepId=<job_id>[.<step_id>]
              Identify the step to be updated.  If the job_id is given, but no
              step_id  is  specified then all steps of the identified job will
              be modified.  This specification is required.

       TimeLimit=<time>
              The     job's     time     limit.      Output     format      is
               [days-]hours:minutes:seconds or "UNLIMITED".  Input format (for
               update command) is minutes, minutes:seconds,
              hours:minutes:seconds,    days-hours,    days-hours:minutes   or
              days-hours:minutes:seconds.  Time resolution is one  minute  and
              second values are rounded up to the next minute.

SPECIFICATIONS FOR UPDATE COMMAND, NODES

       NodeName=<name>
              Identify  the  node(s) to be updated. Multiple node names may be
              specified   using   simple   node   range   expressions    (e.g.
              "lx[10-20]"). This specification is required.

       Features=<features>
              Identify  feature(s)  to  be associated with the specified node.
              Any previously defined feature(s) will be overwritten  with  the
              new  value.   Features  assigned  via scontrol will only persist
              across the restart of the slurmctld daemon with  the  -R  option
              and  state  files  preserved or slurmctld's receipt of a SIGHUP.
              Update slurm.conf with any changes meant to be persistent across
              normal  restarts  of  slurmctld  or  the  execution  of scontrol
              reconfig.

       Gres=<gres>
              Identify generic resources to be associated with  the  specified
              node.    Any   previously  defined  generic  resources  will  be
              overwritten with the new  value.   Specifications  for  multiple
              generic  resources  should  be  comma  separated.  Each resource
              specification consists of a name followed by an  optional  colon
              with   a   numeric   value   (default   value   is   one)  (e.g.
              "Gres=bandwidth:10000,gpus").  Generic  resources  assigned  via
              scontrol  will  only persist across the restart of the slurmctld
              daemon  with  the  -R  option  and  state  files  preserved   or
              slurmctld's  receipt  of  a  SIGHUP.  Update slurm.conf with any
              changes  meant  to  be  persistent  across  normal  restarts  of
              slurmctld or the execution of scontrol reconfig.

       Reason=<reason>
               Identify the reason the node is in a "DOWN", "DRAINED",
              "DRAINING", "FAILING" or "FAIL" state.  Use quotes to enclose  a
              reason having more than one word.

       State=<state>
              Identify  the  state to be assigned to the node. Possible values
              are  "NoResp", "ALLOC", "ALLOCATED",  "DOWN",  "DRAIN",  "FAIL",
              "FAILING",  "IDLE",  "MIXED", "MAINT", "POWER_DOWN", "POWER_UP",
              or "RESUME".  If a node is in a "MIXED" state it  usually  means
              the  node  is  in multiple states.  For instance if only part of
              the node is "ALLOCATED" and the rest of the node is  "IDLE"  the
              state  will  be  "MIXED".   If  you  want  to remove a node from
               service, you typically want to set its state to "DRAIN".
              "FAILING"  is  similar  to "DRAIN" except that some applications
              will seek to relinquish those nodes before  the  job  completes.
              "RESUME"  is  not  an  actual  node  state,  but  will  return a
              "DRAINED", "DRAINING", or "DOWN" node to service, either  "IDLE"
              or "ALLOCATED" state as appropriate.  Setting a node "DOWN" will
              cause all  running  and  suspended  jobs  on  that  node  to  be
              terminated.  "POWER_DOWN" and "POWER_UP" will use the configured
              SuspendProg and ResumeProg programs to explicitly place  a  node
              in  or out of a power saving mode.  The "NoResp" state will only
              set the "NoResp" flag for a node without changing its underlying
              state.   While  all  of the above states are valid, some of them
              are  not  valid  new  node  states  given  their  prior   state.
              Generally only "DRAIN", "FAIL" and "RESUME" should be used.
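
               For example, to drain a hypothetical set of nodes for
               maintenance and later return them to service (the node names
               and reason are illustrative):

                      scontrol update NodeName=lx[10-20] State=DRAIN Reason="repair"
                      scontrol update NodeName=lx[10-20] State=RESUME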

       Weight=<weight>
              Identify  weight  to  be  associated  with specified nodes. This
              allows dynamic changes to weight associated  with  nodes,  which
              will  be  used  for  the  subsequent  node allocation decisions.
              Weight assigned  via  scontrol  will  only  persist  across  the
              restart  of  the  slurmctld  daemon with the -R option and state
              files preserved or slurmctld's  receipt  of  a  SIGHUP.   Update
              slurm.conf with any changes meant to be persistent across normal
              restarts of slurmctld or the execution of scontrol reconfig.

SPECIFICATIONS FOR CREATE, UPDATE, AND DELETE COMMANDS, PARTITIONS

       AllowGroups=<name>
              Identify the user groups which may use this partition.  Multiple
              groups  may  be  specified in a comma separated list.  To permit
              all groups to use the partition specify "AllowGroups=ALL".

       AllocNodes=<name>
              Comma separated list of nodes from which users can execute  jobs
              in  the  partition.   Node names may be specified using the node
              range expression syntax described above.  The default  value  is
              "ALL".

       Alternate=<partition name>
              Alternate partition to be used if the state of this partition is
              "DRAIN" or "INACTIVE."  The value "NONE" will clear a previously
              set alternate partition.

       Default=<yes|no>
              Specify  if  this  partition  is to be used by jobs which do not
              explicitly identify a partition to use.  Possible output  values
              are "YES" and "NO".  In order to change the default partition of
              a running system,  use  the  scontrol  update  command  and  set
              Default=yes  for  the  partition that you want to become the new
              default.

       DefaultTime=<time>
              Run time limit used for jobs that don't specify a value. If  not
              set  then  MaxTime  will  be  used.   Format  is the same as for
              MaxTime.

       DisableRootJobs=<yes|no>
              Specify if jobs can be executed as user root.   Possible  values
              are "YES" and "NO".

       Hidden=<yes|no>
              Specify  if  the  partition  and  its jobs should be hidden from
              view.  Hidden partitions will by  default  not  be  reported  by
              SLURM APIs or commands.  Possible values are "YES" and "NO".

       MaxNodes=<count>
              Set  the  maximum number of nodes which will be allocated to any
              single job in the partition. Specify  a  number,  "INFINITE"  or
              "UNLIMITED".   (On  a  Bluegene  type  system  this represents a
              c-node count.)

       MaxTime=<time>
              The   maximum   run   time   for   jobs.    Output   format   is
              [days-]hours:minutes:seconds  or "UNLIMITED".  Input format (for
              update      command)      is      minutes,      minutes:seconds,
              hours:minutes:seconds,    days-hours,    days-hours:minutes   or
              days-hours:minutes:seconds.  Time resolution is one  minute  and
              second values are rounded up to the next minute.

       MinNodes=<count>
              Set  the  minimum number of nodes which will be allocated to any
              single job in the partition.   (On a Bluegene type  system  this
              represents a c-node count.)

       Nodes=<name>
              Identify  the  node(s)  to  be  associated  with this partition.
              Multiple node names may be specified  using  simple  node  range
              expressions  (e.g.  "lx[10-20]").   Note  that  jobs may only be
              associated with one partition at any time.  Specify a blank data
              value to remove all nodes from a partition: "Nodes=".

       PartitionName=<name>
              Identify  the  partition  to  be  updated. This specification is
              required.

       PreemptMode=<mode>
              Reset the mechanism used to preempt jobs in  this  partition  if
              PreemptType is configured to preempt/partition_prio. The default
              preemption  mechanism   is   specified   by   the   cluster-wide
              PreemptMode configuration parameter.  Possible values are "OFF",
              "CANCEL", "CHECKPOINT", "REQUEUE" and "SUSPEND".

       Priority=<count>
              Jobs submitted to a higher priority partition will be dispatched
              before pending jobs in lower priority partitions and if possible
              they will preempt running jobs from lower  priority  partitions.
              Note  that  a partition's priority takes precedence over a job's
              priority.  The value may not exceed 65533.

       RootOnly=<yes|no>
              Specify if only allocation requests initiated by user root  will
              be  satisfied.   This  can  be  used  to restrict control of the
              partition to some meta-scheduler.  Possible values are "YES" and
              "NO".

       Shared=<yes|no|exclusive|force>[:<job_count>]
              Specify  if  nodes  in  this partition can be shared by multiple
              jobs.  Possible values are "YES", "NO", "EXCLUSIVE" and "FORCE".
              An  optional  job count specifies how many jobs can be allocated
              to use each resource.

       State=<up|down|drain|inactive>
              Specify if jobs  can  be  allocated  nodes  or  queued  in  this
              partition.   Possible  values  are  "UP",  "DOWN",  "DRAIN"  and
              "INACTIVE".

               UP        Designates that new jobs may be queued on the partition,
                        and  that jobs may be allocated nodes and run from the
                        partition.

              DOWN      Designates  that  new  jobs  may  be  queued  on   the
                        partition,  but queued jobs may not be allocated nodes
                        and run from the partition. Jobs  already  running  on
                        the  partition  continue  to  run.  The  jobs  must be
                        explicitly canceled to force their termination.

              DRAIN     Designates that no new  jobs  may  be  queued  on  the
                        partition (job submission requests will be denied with
                        an error message), but  jobs  already  queued  on  the
                        partition  may  be  allocated nodes and run.  See also
                        the "Alternate" partition specification.

              INACTIVE  Designates that no new  jobs  may  be  queued  on  the
                        partition,   and   jobs  already  queued  may  not  be
                        allocated nodes and run.   See  also  the  "Alternate"
                        partition specification.
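
        For example, a hypothetical partition could be created and later
        drained so that no new jobs may be submitted to it (the partition
        name, node list and time limit are illustrative):

               scontrol create PartitionName=batch Nodes=lx[10-20] MaxTime=60
               scontrol update PartitionName=batch State=DRAIN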

SPECIFICATIONS FOR CREATE, UPDATE, AND DELETE COMMANDS, RESERVATIONS

       Reservation=<name>
              Identify  the  name  of  the  reservation  to be created,
              updated, or deleted.   This  parameter  is  required  for
              update and is the only parameter for delete.  For create,
              if you do not  want  to  give  a  reservation  name,  use
              "scontrol  create  res  ..."  and  a name will be created
              automatically.

       Accounts=<account list>
              List of accounts permitted to  use  the  reserved  nodes.
              E.g.  Accounts=physcode1,physcode2.  A user in any of the
              accounts may use the reserved nodes.  A  new  reservation
              must specify Users and/or Accounts.

       Licenses=<license>
              Specification  of  licenses (or other resources available
              on all nodes of the cluster) which are  to  be  reserved.
              License  names  can  be followed by an asterisk and count
              (the default  count  is  one).   Multiple  license  names
              should be comma separated (e.g. "Licenses=foo*4,bar").  A
               new reservation must specify one or more resources to be
              included: NodeCnt, Nodes and/or Licenses.

       NodeCnt=<num>
              Identify  number  of  nodes  to be reserved.  On BlueGene
              systems, this number represents a  cnode  (compute  node)
              count and will be rounded up as needed to represent whole
              nodes (midplanes).  A new reservation must specify one or
               more resources to be included: NodeCnt, Nodes and/or
              Licenses.

       Nodes=<name>
              Identify the node(s) to be reserved. Multiple node  names
              may  be  specified  using  simple  node range expressions
              (e.g. "Nodes=lx[10-20]").  Specify a blank data value  to
              remove  all  nodes  from  a reservation: "Nodes=".  A new
               reservation must specify one or more resources to be
              included: NodeCnt, Nodes and/or Licenses.

       StartTime=<time_spec>
              The  start  time  for the reservation.  A new reservation
              must specify a start time.  It accepts times of the  form
              HH:MM:SS   for  a  specific  time  of  day  (seconds  are
              optional).  (If that time is already past, the  next  day
              is  assumed.)   You  may  also specify midnight, noon, or
              teatime (4pm) and you can  have  a  time-of-day  suffixed
              with  AM or PM for running in the morning or the evening.
              You can also say  what  day  the  job  will  be  run,  by
              specifying  a  date  of  the  form  MMDDYY or MM/DD/YY or
              MM.DD.YY, or a date and time as  YYYY-MM-DD[THH:MM[:SS]].
              You  can  also  give  times  like now + count time-units,
              where the time-units can  be  minutes,  hours,  days,  or
              weeks  and  you  can tell SLURM to run the job today with
              the keyword today and to run the job  tomorrow  with  the
              keyword tomorrow.

       EndTime=<time_spec>
              The end time for the reservation.  A new reservation must
              specify an end time or a duration.  Valid formats are the
              same as for StartTime.

       Duration=<time>
              The  length  of  a  reservation.   A new reservation must
              specify an end time or a  duration.   Valid  formats  are
              minutes,      minutes:seconds,     hours:minutes:seconds,
              days-hours,                           days-hours:minutes,
              days-hours:minutes:seconds,     or    UNLIMITED.     Time
              resolution is one minute and second values are rounded up
              to   the   next   minute.   Output   format   is   always
              [days-]hours:minutes:seconds.

       PartitionName=<name>
              Identify the partition to be reserved.

       Flags=<flags>
              Flags associated  with  the  reservation.   In  order  to
              remove  a  flag  with the update option, precede the name
              with a minus sign. For example: Flags=-DAILY (NOTE:  this
              option  is  not  supported  for  all  flags).   Currently
              supported flags include:

               MAINT       Maintenance mode, receives special accounting
                           treatment.  This reservation is permitted to
                           use resources that are already in another
                           reservation.

              OVERLAP     This  reservation  can be allocated resources
                          that are already in another reservation.

              IGNORE_JOBS Ignore currently running jobs  when  creating
                          the  reservation.   This  can  be  especially
                          useful when reserving all nodes in the system
                          for maintenance.

              DAILY       Repeat the reservation at the same time every
                          day

              WEEKLY      Repeat the reservation at the same time every
                          week

              SPEC_NODES  Reservation  is  for  specific  nodes (output
                          only)

       Features=<features>
              Set the reservation's required  node  features.  Multiple
              values  may be "&" separated if all features are required
              (AND operation)  or  separated  by  "|"  if  any  of  the
              specified  features  are  required (OR operation).  Value
              may be cleared with blank data value, "Features=".

       Users=<user list>
              List of users permitted to use the reserved nodes.   E.g.
              Users=jones1,smith2.   A  new  reservation  must  specify
              Users and/or Accounts.
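
        For example, an unnamed maintenance reservation could be created
        on ten nodes, ignoring the jobs currently running on them (all
        values are illustrative):

               scontrol create res StartTime=2011-06-01T08:00:00 \
                   Duration=2:00:00 Users=root NodeCnt=10 \
                   Flags=MAINT,IGNORE_JOBS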

SPECIFICATIONS FOR UPDATE, BLOCK

       Bluegene systems only!

       BlockName=<name>
              Identify  the  bluegene  block  to   be   updated.   This
              specification is required.

       State=<free|error|remove>
              This  will update the state of a bluegene block to either
               FREE or ERROR (e.g. "update BlockName=RMP0 State=ERROR").
              State  error  will  not  allow  jobs to run on the block.
              WARNING!!!! This will  cancel  any  running  job  on  the
              block!   On dynamically laid out systems REMOVE will free
              and remove the block from the system.  If  the  block  is
              smaller than a midplane every block on that midplane will
              be removed.

       SubBPName=<name>
              Identify  the  bluegene  ionodes  to  be  updated   (i.e.
              bg000[0-3]). This specification is required.

ENVIRONMENT VARIABLES

       Some  scontrol  options  may  be  set via environment variables.
       These environment  variables,  along  with  their  corresponding
       options,  are  listed  below.  (Note:  Commandline  options will
       always override these settings.)

       SCONTROL_ALL        -a, --all

       SLURM_CONF          The  location  of  the  SLURM  configuration
                           file.

AUTHORIZATION

        When using the SLURM db, users who have an AdminLevel defined
        (Operator or Admin) and users who are account coordinators are
       given  the  authority  to  view  and  modify jobs, reservations,
       nodes, etc., as defined in the following table -  regardless  of
       whether  a  PrivateData  restriction  has  been  defined  in the
       slurm.conf file.

       scontrol show job(s):        Admin, Operator, Coordinator
       scontrol update job:         Admin, Operator, Coordinator
       scontrol requeue:            Admin, Operator, Coordinator
       scontrol show step(s):       Admin, Operator, Coordinator
       scontrol update step:        Admin, Operator, Coordinator

       scontrol show block:         Admin, Operator
       scontrol update block:       Admin

       scontrol show node:          Admin, Operator
       scontrol update node:        Admin

       scontrol create partition:   Admin
       scontrol show partition:     Admin, Operator
       scontrol update partition:   Admin
       scontrol delete partition:   Admin

       scontrol create reservation: Admin, Operator
       scontrol show reservation:   Admin, Operator
       scontrol update reservation: Admin, Operator
       scontrol delete reservation: Admin, Operator

       scontrol reconfig:           Admin
       scontrol shutdown:           Admin
       scontrol takeover:           Admin

EXAMPLES

       # scontrol
       scontrol: show part debug
       PartitionName=debug
          AllocNodes=ALL AllowGroups=ALL Default=YES
          DefaultTime=NONE DisableRootJobs=NO Hidden=NO
          MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=1
          Nodes=snowflake[0-48]
          Priority=1 RootOnly=NO Shared=YES:4
          State=UP TotalCPUs=694 TotalNodes=49
       scontrol: update PartitionName=debug MaxTime=60:00 MaxNodes=4
       scontrol: show job 71701
       JobId=71701 Name=hostname
          UserId=da(1000) GroupId=da(1000)
          Priority=66264 Account=none QOS=normal WCKey=*123
          JobState=COMPLETED Reason=None Dependency=(null)
           TimeLimit=UNLIMITED Requeue=1 Restarts=0 BatchFlag=0 ExitCode=0:0
           SubmitTime=2010-01-05T10:58:40 EligibleTime=2010-01-05T10:58:40
           StartTime=2010-01-05T10:58:40 EndTime=2010-01-05T10:58:40
          SuspendTime=None SecsPreSuspend=0
          Partition=debug AllocNode:Sid=snowflake:4702
          ReqNodeList=(null) ExcNodeList=(null)
          NodeList=snowflake0
          NumNodes=1 NumCPUs=10 CPUs/Task=2 ReqS:C:T=1:1:1
          MinCPUsNode=2 MinMemoryNode=0 MinTmpDiskNode=0
          Features=(null) Reservation=(null)
          Shared=OK Contiguous=0 Licenses=(null) Network=(null)
       scontrol: update JobId=71701 TimeLimit=30:00 Priority=500
       scontrol: show hostnames tux[1-3]
       tux1
       tux2
       tux3
        scontrol: create res StartTime=2009-04-01T08:00:00 Duration=5:00:00 Users=dbremer NodeCnt=10
       Reservation created: dbremer_1
       scontrol: update Reservation=dbremer_1 Flags=Maint NodeCnt=20
       scontrol: delete Reservation=dbremer_1
       scontrol: quit

COPYING

       Copyright  (C)  2002-2007  The  Regents  of  the  University  of
       California.  Copyright (C) 2008-2010 Lawrence Livermore National
       Security.      Portions     Copyright     (C)    2010    SchedMD
       <http://www.schedmd.com>.   Produced   at   Lawrence   Livermore
       National  Laboratory  (cf,  DISCLAIMER).   CODE-OCEC-09-009. All
       rights reserved.

       This file is part of SLURM, a resource management program.   For
       details, see <https://computing.llnl.gov/linux/slurm/>.

       SLURM is free software; you can redistribute it and/or modify it
       under the terms of the GNU General Public License  as  published
       by  the  Free  Software  Foundation;  either  version  2  of the
       License, or (at your option) any later version.

       SLURM is distributed in the hope that it  will  be  useful,  but
       WITHOUT  ANY  WARRANTY;  without  even  the  implied warranty of
       MERCHANTABILITY or FITNESS FOR A PARTICULAR  PURPOSE.   See  the
       GNU General Public License for more details.

FILES

       /etc/slurm.conf

SEE ALSO

       scancel(1),     sinfo(1),     squeue(1),    slurm_checkpoint(3),
       slurm_create_partition(3),            slurm_delete_partition(3),
       slurm_load_ctl_conf(3),  slurm_load_jobs(3), slurm_load_node(3),
       slurm_load_partitions(3),                  slurm_reconfigure(3),
       slurm_requeue(3),       slurm_resume(3),      slurm_shutdown(3),
       slurm_suspend(3),    slurm_takeover(3),     slurm_update_job(3),
       slurm_update_node(3),  slurm_update_partition(3), slurm.conf(5),
       slurmctld(8)