Provided by: torque-scheduler_2.4.16+dfsg-1.3ubuntu1.1_amd64

NAME

       pbs_sched_tcl - PBS Tcl scheduler

SYNOPSIS

       pbs_sched  [-a alarm]  [-b file]  [-d home]  [-i file]  [-L logfile]  [-p file]  [-S port]
       [-t file] [-v] [-c file]

DESCRIPTION

       The pbs_sched program runs in conjunction with the PBS  server.   It  queries  the  server
       about  the  state of PBS and communicates with pbs_mom to get information about the status
       of running jobs, available memory, and so on.  It then decides which jobs to run.

       pbs_sched must be executed with root permission.

OPTIONS

       -a alarm       This specifies the time in seconds to wait for a schedule  run  to  finish.
                      If  a  script  takes  too  long to finish, an alarm signal is sent, and the
                      scheduler is restarted.  If a core file  does  not  exist  in  the  current
                      directory, abort() is called and a core file is generated.  The default for
                      alarm is 180 seconds.

       -b file        This specifies the "body" file.  The file given is read into memory once at
                      program start or after the program receives a SIGHUP and executed each time
                      the scheduler is awakened by the server.  If this option is not given,  the
                      file  "sched_tcl" in the directory PBS_HOME/sched_priv is read for the body
                      code.

       -d home        This specifies the PBS  home  directory,  PBS_HOME.   The  current  working
                      directory  of  the scheduler is PBS_HOME/sched_priv.  If this option is not
                      given, PBS_HOME defaults to $PBS_SERVER_HOME  as  defined  during  the  PBS
                      build procedure.

       -i file        This  specifies  the  "initialize"  file.   The file given is executed once
                      before the main processing loop is entered.  If this option is  not  given,
                      no initialization code is executed.

       -L logfile     Specifies an absolute path name of the file to use as the log file.  If not
                      specified, the scheduler will open a file named for the current date in the
                      PBS_HOME/sched_logs directory (see the -d option).

       -p file        This  specifies  the  "print"  file.  Any output from the Tcl code which is
                      written to standard out or standard error will be written to this file.  If
                      this option is not given, the file used will be
                      PBS_HOME/sched_priv/sched_out.  See the -d option.

       -S port        This specifies the port to use.  If this option is not given,  the  default
                      port for the PBS scheduler is used.

       -t file        This  specifies  the "terminator" file.  If a QUIT command is sent from the
                      server, this code is executed before the scheduler exits.  If  this  option
                      is not given, no special termination handling is done.

       -v             This puts the scheduler into "verbose" mode.  Errors are always logged,
                      whether or not this flag is given; with it, some "uninteresting" events are
                      logged as well.  An example is a message each time the server contacts the
                      scheduler.

       -c file        Specifies a configuration file; see CONFIGURATION FILE below.  If the file
                      name is relative, it is taken relative to PBS_HOME/sched_priv (see the -d
                      option).  If the -c option is not supplied, pbs_sched will not attempt to
                      open a configuration file.

       The  options  that  specify file names may be absolute or relative.  If they are relative,
       their root directory will be PBS_HOME/sched_priv.

USAGE

       This version of the scheduler requires knowledge of the Tcl language.  A set of  functions
       to  communicate with the PBS server and resource monitor have been added to those normally
       available with Tcl.  All these calls will set the Tcl variable "pbs_errno" to a  value  to
       indicate if an error occurred.  In all cases, the value "0" means no error.  If a call to a
       Resource Monitor function is made, any error value will  come  from  the  system  supplied
       errno  variable.   If  the function call communicates with the PBS Server, any error value
       will come from the error number returned by the server.

       openrm host ?port?
             Creates a connection to the PBS Resource Monitor on host  using  port  as  the  port
             number  or  the  standard  port  for  the  resource  monitor  if it is not given.  A
             connection handle is returned.  If the open is  successful,  this  will  be  a  non-
             negative integer.  If not, an error occurred.

       closerm connection
             The  parameter  connection  is  a  handle to a resource monitor which was previously
             returned from openrm.  This connection is closed.  Nothing is returned.

       downrm connection
             Sends a command to the connected resource monitor to shut down.  Nothing is returned.

       configrm connection filename
             Sends a command to the connected resource monitor to  read  the  configuration  file
             given  by  filename.   If  this is successful, a "0" is returned, otherwise, "-1" is
             returned.

       addreq connection request
             A resource  request  is  sent  to  the  connected  resource  monitor.   If  this  is
             successful, a "0" is returned, otherwise, "-1" is returned.

       getreq connection
             One  resource  request response from the connected resource monitor is returned.  If
             an error occurred or there are no more responses, an empty string is returned.

       allreq request
             A resource request is sent to  all  connected  resource  monitors.   The  number  of
             streams acted upon is returned.

       flushreq
             All resource requests previously sent to all connected resource monitors are flushed
             out to the network.  Nothing is returned.

       activereq
             The connection number of the next stream with something to  read  is  returned.   If
             there is nothing to read from any of the connections, a negative number is returned.

       fullresp flag
             Evaluates  flag as a boolean value and sets the response mode used by getreq to full
             if flag evaluates to "true".  The full return from a resource monitor  includes  the
             original  request  followed  by an equal sign followed by the response.  The default
             situation is only to return the response following the  equal  sign.   If  a  script
             needs to "see" the entire line, this function may be used.
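
       The resource monitor calls above can be combined as in the following sketch.  The proc
       name query_mom is hypothetical and not part of PBS, and which request strings a given
       pbs_mom accepts is outside the scope of this page.  The sketch assumes that it runs
       inside pbs_sched, that openrm yields a negative value on failure, and that getreq sends
       any pending request on its own connection, so flushreq is not called.

```tcl
# Hypothetical helper: ask one resource monitor a single question.
# Returns the response string, or "" if anything went wrong.
proc query_mom {host request} {
    # openrm returns a non-negative handle on success.
    # ASSUMPTION: a failed open shows up as a negative value.
    set con [openrm $host]
    if {$con < 0} {
        return ""
    }

    set answer ""
    # addreq returns "0" when the request was accepted.
    if {[addreq $con $request] == 0} {
        # getreq returns one response, or "" on error or when no
        # responses remain.
        set answer [getreq $con]
    }

    closerm $con
    return $answer
}
```

       A call such as "query_mom $host $request" then yields only the text after the equal
       sign unless fullresp has been used to switch to full responses.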

       pbsstatserv
             The server is sent a status request for information about the server itself.  If the
             request succeeds, a list with three elements is returned, otherwise an empty  string
             is  returned.   The  first  element  is  the server's name.  The second is a list of
             attributes.  The third is the "text" associated with the server (usually blank).

       pbsstatjob
             The server is sent a status request for information about all jobs resident
             within  the server.  If the request succeeds, a list is returned, otherwise an empty
             string is returned.  The list contains an entry for each job.   Each  element  is  a
             list  with  three  elements.  The first is the job's jobid.  The second is a list of
             attributes.  The attribute names which specify resources will have  a  name  of  the
             form  "Resource_List:name"  where  "name"  is  the  resource name.  The third is the
             "text" associated with the job (usually blank).

       pbsstatque
             The server is sent a status request for information about all queues resident within
             the  server.  If the request succeeds, a list is returned, otherwise an empty string
             is returned.  The list contains an entry for each queue.  Each  element  is  a  list
             with three elements.  The first is the queue's name.  The second is a list of
             attributes similar to pbsstatjob.  The third is the "text" associated with the queue
             (usually blank).

       pbsstatnode
             The  server  is sent a status request for information about all nodes defined within
             the server.  If the request succeeds, a list is returned, otherwise an empty  string
             is returned.  The list contains an entry for each node.  Each element is a list with
             three elements.  The first is the node's name.  The second is a list of attributes
             similar  to  pbsstatjob.   The third is the "text" associated with the node (usually
             blank).

       pbsselstat
             The server is sent a status request for information about all runnable jobs
             resident  within  the server.  If the request succeeds, a list similar to pbsstatjob
             is returned, otherwise an empty string is returned.
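
       The nested lists returned by the status calls can be walked with ordinary Tcl list
       commands.  The following is a minimal sketch; the proc names job_attr and
       runnable_walltimes are hypothetical, and it assumes that each entry in the attribute
       list is a two-element list of the form {name value}, which this page does not spell
       out.

```tcl
# Hypothetical helper: look up one attribute of a job entry as
# returned by pbsstatjob or pbsselstat.
# ASSUMPTION: each attribute is a two-element list {name value}.
proc job_attr {job name} {
    # Element 0 is the jobid, 1 the attribute list, 2 the "text".
    foreach attr [lindex $job 1] {
        if {[string equal [lindex $attr 0] $name]} {
            return [lindex $attr 1]
        }
    }
    return ""
}

# Hypothetical helper: collect {jobid walltime} pairs for all
# runnable jobs.  Jobs without a walltime request report "".
proc runnable_walltimes {} {
    set result {}
    # An empty string from pbsselstat (request failed) simply
    # yields no loop iterations.
    foreach job [pbsselstat] {
        set jobid [lindex $job 0]
        set wall  [job_attr $job "Resource_List:walltime"]
        lappend result [list $jobid $wall]
    }
    return $result
}
```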

       pbsrunjob jobid ?location?
             Run the job given by jobid at the location given by location.  If  location  is  not
             given,  the  default  location  is  used.  If this is successful, a "0" is returned,
             otherwise, "-1" is returned.

       pbsasyrunjob jobid ?location?
             Run the job given by jobid at the location given by location without waiting  for  a
             positive  response that the job has actually started.  If location is not given, the
             default location is used.  If this is successful, a "0" is returned, otherwise, "-1"
             is returned.
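
       As an illustration of the return value and pbs_errno conventions described above, the
       following sketch tries to run a list of jobs at the default location.  The proc name
       run_jobs is hypothetical; the sketch assumes it is called from the "body" code, where a
       server connection already exists.

```tcl
# Hypothetical helper: try to run every job id in jobids at the
# default location.  Returns the list of job ids that failed.
proc run_jobs {jobids} {
    global pbs_errno
    set failed {}
    foreach jobid $jobids {
        # pbsrunjob returns "0" on success and "-1" otherwise.
        if {[pbsrunjob $jobid] != 0} {
            # pbs_errno holds the error number from the server.
            puts stderr "run of $jobid failed, pbs_errno=$pbs_errno"
            lappend failed $jobid
        }
    }
    return $failed
}
```

       Output written with puts ends up in the "print" file described under the -p option.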

       pbsrerunjob jobid
             Re-runs  the  job  given  by  jobid.   If  this  is  successful,  a "0" is returned,
             otherwise, "-1" is returned.

       pbsdeljob jobid
             Delete the job given by jobid.  If this is successful, a "0" is returned, otherwise,
             "-1" is returned.

       pbsholdjob jobid
             Place  a  hold on the job given by jobid.  If this is successful, a "0" is returned,
             otherwise, "-1" is returned.

       pbsmovejob jobid ?location?
             Move the job given by jobid to the location given by location.  If location  is  not
             given,  the  default  location  is  used.  If this is successful, a "0" is returned,
             otherwise, "-1" is returned.

       pbsqenable queue
             Set the "enabled" attribute for the queue given  by  queue  to  true.   If  this  is
             successful, a "0" is returned, otherwise, "-1" is returned.

       pbsqdisable queue
             Set  the  "enabled"  attribute  for  the  queue given by queue to false.  If this is
             successful, a "0" is returned, otherwise, "-1" is returned.

       pbsqstart queue
             Set the "started" attribute for the queue given  by  queue  to  true.   If  this  is
             successful, a "0" is returned, otherwise, "-1" is returned.

       pbsqstop queue
             Set  the  "started"  attribute  for  the  queue given by queue to false.  If this is
             successful, a "0" is returned, otherwise, "-1" is returned.

       pbsalterjob jobid attribute_list
             Alter the attributes for a job specified by jobid.  The parameter attribute_list  is
             the  list  of attributes to be altered.  There can be more than one.  Each attribute
             consists of a list of three elements.   The  first  is  the  name,  the  second  the
             resource  and  the  third  is  the  new value.  If the alter is successful, a "0" is
             returned, otherwise, "-1" is returned.
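
       The shape of attribute_list can be sketched as follows.  The proc name set_walltime is
       hypothetical, and the use of "Resource_List" with the "walltime" resource is an
       assumption made for illustration; whether the server accepts a given change depends on
       the job and on site policy.

```tcl
# Hypothetical helper: change the walltime request of a job.
proc set_walltime {jobid walltime} {
    # One attribute is a three-element list {name resource value};
    # the outer list may hold several such attributes.
    set attrs [list [list Resource_List walltime $walltime]]
    # Returns "0" on success and "-1" otherwise.
    return [pbsalterjob $jobid $attrs]
}
```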

       pbsrescquery resource_list
             Obtain information about the resources specified by resource_list.  This will  be  a
             list  of  strings.  If the request succeeds, a list with the same number of elements
             as resource_list is returned.  Each element in this list will be a  list  with  four
             numbers.   The  numbers  specify  available,  allocated,  reserved, and down in that
             order.

       pbsrescreserve resource_id resource_list
             Make (or extend) a reservation for the resources specified  by  resource_list  which
             will  be  given  as  a list of strings.  The parameter resource_id is a number which
             provides a unique identifier for a reservation being  tracked  by  the  server.   If
             resource_id  is  given  as  "0",  a new reservation is created.  In this case, a new
             identifier is generated and returned by the function.  If an old identifier is used,
             that  same  number  will  be  returned.  The Tcl variable "pbs_errno" will be set to
             indicate the success or failure of the reservation.

       pbsrescrelease resource_id
             The reservation specified by resource_id is released.

       The following two commands are not normally used by the scheduler.  They are included here
       because there could be a need for a scheduler to contact a server other than the one which
       it normally communicates with.  Also, these commands are used by the Tcl tools.

       pbsconnect ?server?
             Make a connection to the named server or the default server if a  parameter  is  not
             given.  Only one connection to a server is allowed at any one time.

       pbsdisconnect
             Disconnect from the currently connected server.

       The  above Tcl functions use PBS interface library calls for communication with the server
       and the PBS resource monitor library to communicate with pbs_mom.

       datetime ?day? ?time?
             The number of arguments used determines the type of date to be calculated.  With no
             arguments, the current POSIX date is returned.  This is an integer in seconds.

             With  one  argument  there  are  two  possible formats.  The first is a 12 (or more)
             character string specifying a complete date in the following format:
             YYMMDDhhmmss

             All characters must be digits.  The year (YY) is given by the first  two  (or  more)
             characters  and  is the number of years since 1900.  The month (MM) is the number of
             the month [01-12].  The day (DD) is the day of the month [01-31].  The hour (hh)  is
             the  hour  of  the  day [00-23].  The minute (mm) is minutes after the hour [00-59].
             The second (ss) is seconds after the minute [00-59].  The POSIX date for  the  given
             date/time is returned.

             The second option with one argument is a relative time.  The format for this is
             HH:MM:SS

             With  hours  (HH), minutes (MM) and seconds (SS) being separated by colons ":".  The
             number returned in this  case  will  be  the  number  of  seconds  in  the  interval
             specified, not an absolute POSIX date.

             With  two  arguments  a relative date is calculated.  The first argument specifies a
             day of the week and must be one of  the  following  strings:  "Sun",  "Mon",  "Tue",
             "Wed",  "Thr",  "Fri",  or  "Sat".   The second argument is a relative time as given
             above.  The POSIX date calculated will be the day of the week  given  which  follows
             the  current  day,  and  the time given in the second argument.  For example, if the
             current day was Monday, and the two arguments were "Fri" and  "04:30:00",  the  date
             calculated  would  be the POSIX date for the Friday following the current Monday, at
             four-thirty in the morning.  If the day specified and the current day are the  same,
             the current day is used, not the day one week later.

       strftime format time
              This function calls the POSIX function strftime().  It requires two arguments.  The
              first is a format string.  The format conventions are the same  as  those  for  the
              POSIX function strftime().  The second argument is POSIX calendar time in seconds as
              returned by datetime.  It returns a string based on the format given.   This  gives
              the ability to extract information about a time, or format it for printing.
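
       The two time functions can be used together, as in the following sketch.  The proc name
       next_occurrence is hypothetical, and the format string is only one possible choice.

```tcl
# Hypothetical helper: describe the next occurrence of a weekday
# and time of day.  Returns {seconds_from_now formatted_date}.
proc next_occurrence {day hms} {
    # No arguments: current POSIX time in seconds.
    set now [datetime]
    # Two arguments: POSIX time of the given weekday and time.
    set when [datetime $day $hms]
    set text [strftime "%Y-%m-%d %H:%M:%S" $when]
    return [list [expr {$when - $now}] $text]
}
```

       Because the current day is used when it matches the day given, the first element may be
       negative if the requested time of day has already passed.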

       The Tcl interpreter is started at program initialization and after a reset (the receipt of
       a SIGHUP signal).  It is not deleted between scheduling runs, so variables set in one run
       can be accessed in later runs.

       The "initialize" and "terminator" files are run with no supplied connection to the server.
       This means that none of the above functions which talk to  the  server  will  work  unless
       pbsconnect  is  called  first.   The  "body"  file  is run with a connection to the server
       already established.
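
       Code intended for the "initialize" or "terminator" file therefore has to manage its own
       connection.  The following is a minimal sketch; the proc name count_jobs_standalone is
       hypothetical, and it assumes that pbsconnect sets pbs_errno in the same way as the
       other calls.

```tcl
# Hypothetical helper for "initialize" or "terminator" code:
# count the jobs known to the default server.
# Returns the count, or -1 if the connection could not be made.
proc count_jobs_standalone {} {
    global pbs_errno
    # No argument: connect to the default server.
    pbsconnect
    # ASSUMPTION: a failed connect leaves pbs_errno non-zero.
    if {$pbs_errno != 0} {
        return -1
    }
    # pbsstatjob returns one list element per job.
    set n [llength [pbsstatjob]]
    pbsdisconnect
    return $n
}
```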

CONFIGURATION FILE

       A configuration file may be specified with the -c  option.   This  file  may  be  used  to
       specify  the  hosts  (servers)  which  are allowed to connect to pbs_sched.  The hosts are
       specified in the configuration file in a manner identical to that used in  pbs_mom.   There
       is one line per host with the syntax:
       $clienthost   hostname
       where clienthost and hostname are separated by white space.

       Two host names are always allowed to connect to pbs_sched: "localhost" and the name
       returned to pbs_sched by the system call gethostname().  These names need not be specified
       in the configuration file.

       The  configuration file must be "secure".  It must be owned by a user id and group id less
       than 10 and not be world writable.

FILES

       $PBS_SERVER_HOME/sched_priv
                 the default directory for configuration files, typically
                 (/usr/spool/pbs)/sched_priv.

SIGNAL HANDLING

       pbs_sched handles the following signals:

       SIGHUP The scheduler will close and reopen its log file and reread the config file if one
              exists.

       SIGALRM
              If the site-supplied scheduling module exceeds the time limit, the alarm will cause
              the scheduler to attempt to core dump and restart itself.

       SIGINT and SIGTERM
              Will result in an orderly shutdown of the scheduler.

       All other signals have the default action installed.

EXIT STATUS

       Upon normal termination, an exit status of zero is returned.

SEE ALSO

       pbs_scheduler_cc(8B), pbs_scheduler_rule(8B), pbs_server(8B), and pbs_mom(8B).
       PBS Internal Design Specification