Ubuntu Manpage: sched_setscheduler, sched_getscheduler - set and get scheduling policy/parameters

Provided by: manpages-dev_3.54-1ubuntu1_all

NAME

       sched_setscheduler, sched_getscheduler - set and get scheduling policy/parameters

SYNOPSIS

       #include <sched.h>

       int sched_setscheduler(pid_t pid, int policy,
                              const struct sched_param *param);

       int sched_getscheduler(pid_t pid);

       struct sched_param {
           ...
           int sched_priority;
           ...
       };

DESCRIPTION

       sched_setscheduler()  sets  both the scheduling policy and the associated parameters for the thread whose
       ID is specified in pid.  If pid equals zero, the scheduling policy and parameters of the  calling  thread
       will  be set.  The interpretation of the argument param depends on the selected policy.  Currently, Linux
       supports the following "normal" (i.e., non-real-time) scheduling policies:

       SCHED_OTHER   the standard round-robin time-sharing policy;

       SCHED_BATCH   for "batch" style execution of processes; and

       SCHED_IDLE    for running very low priority background jobs.

       The following "real-time" policies are also supported, for special time-critical applications  that  need
       precise control over the way in which runnable threads are selected for execution:

       SCHED_FIFO    a first-in, first-out policy; and

       SCHED_RR      a round-robin policy.

       The semantics of each of these policies are detailed below.

       sched_getscheduler() queries the scheduling policy currently applied to the thread identified by pid.  If
       pid equals zero, the policy of the calling thread will be retrieved.
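
       As an illustration of the query side, the following minimal sketch maps the value returned by
       sched_getscheduler() to a printable name.  The helper names policy_name() and thread_policy_name()
       are hypothetical and not part of any library.  The fallback definitions of SCHED_BATCH and
       SCHED_IDLE are only used if the C library headers do not expose them.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* asks glibc to expose SCHED_BATCH and SCHED_IDLE */
#endif
#include <sched.h>
#include <stddef.h>
#include <sys/types.h>

/* Fallbacks (values from the Linux kernel headers) in case the C library
   headers did not define these Linux-specific policies. */
#ifndef SCHED_BATCH
#define SCHED_BATCH 3
#endif
#ifndef SCHED_IDLE
#define SCHED_IDLE 5
#endif

/* Hypothetical helper: map a policy value to a printable name.
   Values carrying extra flag bits are reported as "unknown". */
const char *policy_name(int policy)
{
    switch (policy) {
    case SCHED_OTHER: return "SCHED_OTHER";
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    case SCHED_BATCH: return "SCHED_BATCH";
    case SCHED_IDLE:  return "SCHED_IDLE";
    default:          return "unknown";
    }
}

/* Hypothetical helper: name of the policy of the thread `pid`
   (0 means the calling thread), or NULL if the query fails
   (errno is then set by sched_getscheduler()). */
const char *thread_policy_name(pid_t pid)
{
    int policy = sched_getscheduler(pid);

    if (policy == -1)
        return NULL;
    return policy_name(policy);
}
```

       Calling thread_policy_name(0) from an ordinary shell-started program would typically yield
       "SCHED_OTHER".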

   Scheduling policies
       The scheduler is the kernel component that decides which runnable thread will  be  executed  by  the  CPU
       next.   Each thread has an associated scheduling policy and a static scheduling priority, sched_priority;
       these are the settings that are modified by sched_setscheduler().  The scheduler makes its decisions based
       on knowledge of the scheduling policy and static priority of all threads on the system.

       For threads scheduled under one of the normal scheduling policies (SCHED_OTHER, SCHED_IDLE, SCHED_BATCH),
       sched_priority is not used in scheduling decisions (it must be specified as 0).

       Threads scheduled under one of the real-time policies  (SCHED_FIFO,  SCHED_RR)  have  a  sched_priority
       value  in  the  range  1 (low) to 99 (high).  (As the numbers imply, real-time threads always have higher
       priority than normal threads.)  Note well: POSIX.1-2001 requires an  implementation  to  support  only  a
       minimum of 32 distinct priority levels for the real-time policies, and some systems supply just this
       minimum.  Portable programs should use sched_get_priority_min(2) and  sched_get_priority_max(2)  to  find
       the range of priorities supported for a particular policy.
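
       A portable range query might look like the following sketch.  The helper name priority_range() is
       hypothetical; only sched_get_priority_min(2) and sched_get_priority_max(2) come from the text
       above.

```c
#include <sched.h>

/* Hypothetical helper: store the valid sched_priority range for `policy`
   in *min and *max.  Returns 0 on success, or -1 if either query fails
   (errno is then set by the failing call). */
int priority_range(int policy, int *min, int *max)
{
    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);

    if (lo == -1 || hi == -1)
        return -1;

    *min = lo;
    *max = hi;
    return 0;
}
```

       A program would then choose a priority relative to the reported bounds (for example, the minimum
       plus a small offset) rather than hard-coding a value such as 50.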

       Conceptually,  the scheduler maintains a list of runnable threads for each possible sched_priority value.
       In order to determine which thread runs next, the scheduler looks for the nonempty list with the  highest
       static priority and selects the thread at the head of this list.

       A  thread's  scheduling  policy  determines where it will be inserted into the list of threads with equal
       static priority and how it will move inside this list.

       All scheduling is preemptive: if a thread with a  higher  static  priority  becomes  ready  to  run,  the
       currently  running  thread will be preempted and returned to the wait list for its static priority level.
       The scheduling policy determines the ordering only within the list of runnable threads with equal  static
       priority.

   SCHED_FIFO: First in-first out scheduling
       SCHED_FIFO  can  be  used  only  with static priorities higher than 0, which means that when a SCHED_FIFO
       thread becomes runnable, it will always immediately preempt any currently running SCHED_OTHER,
       SCHED_BATCH,  or  SCHED_IDLE  thread.   SCHED_FIFO is a simple scheduling algorithm without time slicing.
       For threads scheduled under the SCHED_FIFO policy, the following rules apply:

       *  A SCHED_FIFO thread that has been preempted by another thread of higher priority will stay at the head
          of  the  list for its priority and will resume execution as soon as all threads of higher priority are
          blocked again.

       *  When a SCHED_FIFO thread becomes runnable, it will be  inserted  at  the  end  of  the  list  for  its
          priority.

       *  A  call  to  sched_setscheduler()  or  sched_setparam(2)  will put the SCHED_FIFO (or SCHED_RR) thread
          identified by pid at the start of the list if it was runnable.  As a consequence, it may  preempt  the
          currently  running thread if it has the same priority.  (POSIX.1-2001 specifies that the thread should
          go to the end of the list.)

       *  A thread calling sched_yield(2) will be put at the end of the list.

       No other events will move a thread scheduled under the SCHED_FIFO policy in the  wait  list  of  runnable
       threads with equal static priority.

       A  SCHED_FIFO  thread  runs  until  either  it  is blocked by an I/O request, it is preempted by a higher
       priority thread, or it calls sched_yield(2).
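
       The following sketch shows one way a thread might place itself under SCHED_FIFO at the lowest
       real-time priority.  The helper name enter_fifo_lowest() is hypothetical.  On most systems the call
       is expected to fail with EPERM unless the caller is privileged or has a suitable resource limit.

```c
#include <errno.h>
#include <sched.h>
#include <string.h>

/* Hypothetical helper: switch the calling thread to SCHED_FIFO at the
   lowest priority that the system reports for that policy.
   Returns 0 on success, or the errno value of the failing call. */
int enter_fifo_lowest(void)
{
    struct sched_param param;
    int prio = sched_get_priority_min(SCHED_FIFO);

    if (prio == -1)
        return errno;

    memset(&param, 0, sizeof(param));
    param.sched_priority = prio;

    /* pid 0 means "the calling thread". */
    if (sched_setscheduler(0, SCHED_FIFO, &param) == -1)
        return errno;
    return 0;
}
```

       Because a SCHED_FIFO thread is not time-sliced, a caller of such a helper should make sure that the
       thread blocks or yields regularly.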

   SCHED_RR: Round-robin scheduling
       SCHED_RR is a simple enhancement of SCHED_FIFO.  Everything described above for SCHED_FIFO  also  applies
       to  SCHED_RR,  except  that each thread is allowed to run only for a maximum time quantum.  If a SCHED_RR
       thread has been running for a time period equal to or longer than the time quantum, it will be put at the
       end  of the list for its priority.  A SCHED_RR thread that has been preempted by a higher priority thread
       and subsequently resumes execution as a running thread will complete the unexpired portion of its  round-
       robin time quantum.  The length of the time quantum can be retrieved using sched_rr_get_interval(2).
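
       A minimal sketch of retrieving the quantum follows.  The helper name rr_quantum_ns() is
       hypothetical; it merely converts the struct timespec filled in by sched_rr_get_interval(2) to
       nanoseconds.

```c
#include <sched.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: round-robin time quantum of the thread `pid`
   (0 means the calling thread) in nanoseconds, or -1 on error
   (errno is then set by sched_rr_get_interval()). */
long long rr_quantum_ns(pid_t pid)
{
    struct timespec ts;

    if (sched_rr_get_interval(pid, &ts) == -1)
        return -1;

    return (long long)ts.tv_sec * 1000000000LL + (long long)ts.tv_nsec;
}
```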

   SCHED_OTHER: Default Linux time-sharing scheduling
       SCHED_OTHER  can  be  used  at  only  static  priority 0.  SCHED_OTHER is the standard Linux time-sharing
       scheduler that is intended for all threads that do not require the  special  real-time  mechanisms.   The
       thread  to  run  is chosen from the static priority 0 list based on a dynamic priority that is determined
       only inside  this  list.   The  dynamic  priority  is  based  on  the  nice  value  (set  by  nice(2)  or
       setpriority(2))  and increased for each time quantum the thread is ready to run, but denied to run by the
       scheduler.  This ensures fair progress among all SCHED_OTHER threads.

   SCHED_BATCH: Scheduling batch processes
       (Since Linux 2.6.16.)  SCHED_BATCH can be used only at static priority 0.   This  policy  is  similar  to
       SCHED_OTHER  in that it schedules the thread according to its dynamic priority (based on the nice value).
       The difference is that this policy will cause the scheduler to always assume  that  the  thread  is  CPU-
       intensive.   Consequently,  the  scheduler  will  apply a small scheduling penalty with respect to wakeup
       behaviour, so that this thread is mildly disfavored in scheduling decisions.

       This policy is useful for workloads that are noninteractive, but do not want to lower their  nice  value,
       and  for  workloads  that  want  a  deterministic  scheduling  policy without interactivity causing extra
       preemptions (between the workload's tasks).

   SCHED_IDLE: Scheduling very low priority jobs
       (Since Linux 2.6.23.)  SCHED_IDLE can be used only at static priority 0; the process nice  value  has  no
       influence for this policy.

       This policy is intended for running jobs at extremely low priority (lower even than a +19 nice value with
       the SCHED_OTHER or SCHED_BATCH policies).
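
       Switching to one of the normal policies requires a sched_priority of 0, as in the sketch below.
       The helper name set_normal_policy() is hypothetical.  The fallback definitions are only used if the
       C library headers do not expose the Linux-specific policies.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* asks glibc to expose SCHED_BATCH and SCHED_IDLE */
#endif
#include <sched.h>
#include <string.h>

/* Fallbacks (values from the Linux kernel headers). */
#ifndef SCHED_BATCH
#define SCHED_BATCH 3
#endif
#ifndef SCHED_IDLE
#define SCHED_IDLE 5
#endif

/* Hypothetical helper: switch the calling thread to a normal policy
   (SCHED_OTHER, SCHED_BATCH, or SCHED_IDLE).  Returns the result of
   sched_setscheduler(): 0 on success, -1 on error with errno set. */
int set_normal_policy(int policy)
{
    struct sched_param param;

    memset(&param, 0, sizeof(param));   /* sched_priority must be 0 */
    return sched_setscheduler(0, policy, &param);
}
```

       A background job could call set_normal_policy(SCHED_IDLE) early in its startup, keeping in mind the
       privilege rules for leaving SCHED_IDLE again.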

   Resetting scheduling policy for child processes
       Since Linux 2.6.32, the SCHED_RESET_ON_FORK flag can be ORed in policy when calling sched_setscheduler().
       As  a  result  of  including  this flag, children created by fork(2) do not inherit privileged scheduling
       policies.  This feature is  intended  for  media-playback  applications,  and  can  be  used  to  prevent
       applications  evading  the  RLIMIT_RTTIME  resource  limit  (see getrlimit(2)) by creating multiple child
       processes.

       More precisely, if the SCHED_RESET_ON_FORK flag is specified, the following rules apply for  subsequently
       created children:

       *  If  the  calling  thread  has  a  scheduling  policy of SCHED_FIFO or SCHED_RR, the policy is reset to
          SCHED_OTHER in child processes.

       *  If the calling process has a negative nice value, the nice value is reset to zero in child processes.

       After the SCHED_RESET_ON_FORK flag has been enabled,  it  can  be  reset  only  if  the  thread  has  the
       CAP_SYS_NICE capability.  This flag is disabled in child processes created by fork(2).

       The SCHED_RESET_ON_FORK flag is visible in the policy value returned by sched_getscheduler().
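
       Because the flag shares the return value with the policy, a caller that compares the result of
       sched_getscheduler() against a policy constant may want to mask the flag out first.  The sketch
       below assumes a successful (nonnegative) return value; the helper names base_policy() and
       has_reset_on_fork() are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* asks glibc to expose SCHED_RESET_ON_FORK */
#endif
#include <sched.h>

/* Fallback (value from the Linux kernel headers) for C libraries
   that do not define the flag. */
#ifndef SCHED_RESET_ON_FORK
#define SCHED_RESET_ON_FORK 0x40000000
#endif

/* Hypothetical helper: the scheduling policy without the flag bit.
   `value` must be a successful result of sched_getscheduler(). */
int base_policy(int value)
{
    return value & ~SCHED_RESET_ON_FORK;
}

/* Hypothetical helper: nonzero if the reset-on-fork flag is set. */
int has_reset_on_fork(int value)
{
    return (value & SCHED_RESET_ON_FORK) != 0;
}
```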

   Privileges and resource limits
       In  Linux kernels before 2.6.12, only privileged (CAP_SYS_NICE) threads can set a nonzero static priority
       (i.e., set a real-time scheduling policy).  The only change that an unprivileged thread can  make  is  to
       set  the  SCHED_OTHER  policy,  and  this  can  be  done  only  if the effective user ID of the caller of
       sched_setscheduler() matches the real or effective user  ID  of  the  target  thread  (i.e.,  the  thread
       specified by pid) whose policy is being changed.

       Since Linux 2.6.12, the RLIMIT_RTPRIO resource limit defines a ceiling on an unprivileged thread's static
       priority for the SCHED_RR and SCHED_FIFO policies.  The rules for changing scheduling policy and priority
       are as follows:

       *  If  an  unprivileged  thread has a nonzero RLIMIT_RTPRIO soft limit, then it can change its scheduling
          policy and priority, subject to the restriction that the priority cannot be set to a value higher than
          the maximum of its current priority and its RLIMIT_RTPRIO soft limit.

       *  If the RLIMIT_RTPRIO soft limit is 0, then the only permitted changes are to lower the priority, or to
          switch to a non-real-time policy.

       *  Subject to the same rules, another unprivileged thread can also make these changes,  as  long  as  the
          effective  user ID of the thread making the change matches the real or effective user ID of the target
          thread.

       *  Special rules apply for the SCHED_IDLE policy.  In Linux kernels before 2.6.39, an unprivileged thread
          operating  under  this  policy  cannot change its policy, regardless of the value of its RLIMIT_RTPRIO
          resource limit.  In Linux kernels since 2.6.39, an  unprivileged  thread  can  switch  to  either  the
          SCHED_BATCH or the SCHED_OTHER policy (named SCHED_NORMAL inside the kernel) so long as its nice value
          falls within the range permitted by its RLIMIT_NICE resource limit (see getrlimit(2)).

       Privileged (CAP_SYS_NICE) threads ignore the RLIMIT_RTPRIO limit; as with older kernels,  they  can  make
       arbitrary  changes  to  scheduling  policy  and  priority.   See  getrlimit(2) for further information on
       RLIMIT_RTPRIO.
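
       The ceiling described by the first two rules above can be sketched as follows.  The helper names
       rtprio_soft_limit() and rt_priority_ceiling() are hypothetical, and the sketch covers only the
       unprivileged case; it does not model the SCHED_IDLE rules or CAP_SYS_NICE.

```c
#include <sys/resource.h>

/* Hypothetical helper: read the RLIMIT_RTPRIO soft limit of the calling
   process into *limit.  Returns 0 on success, -1 on error (errno set). */
int rtprio_soft_limit(rlim_t *limit)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_RTPRIO, &rl) == -1)
        return -1;

    *limit = rl.rlim_cur;
    return 0;
}

/* Hypothetical helper: highest real-time priority an unprivileged thread
   may request, that is, the maximum of its current priority and its
   RLIMIT_RTPRIO soft limit.  With a soft limit of 0 this reduces to the
   current priority, so the priority can only be lowered. */
rlim_t rt_priority_ceiling(rlim_t current_priority, rlim_t soft_limit)
{
    return current_priority > soft_limit ? current_priority : soft_limit;
}
```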

   Response time
       A blocked high priority thread waiting for I/O has a certain response time before it is scheduled
       again.   The  device  driver  writer  can  greatly  reduce this response time by using a "slow interrupt"
       interrupt handler.

   Miscellaneous
       Child processes inherit the scheduling policy and parameters across a fork(2).  The scheduling policy and
       parameters are preserved across execve(2).

       Memory  locking  is  usually needed for real-time processes to avoid paging delays; this can be done with
       mlock(2) or mlockall(2).

       Since a nonblocking infinite loop in a thread scheduled under  SCHED_FIFO  or  SCHED_RR  will  block  all
       threads  with  lower priority forever, a software developer should always keep available on the console a
       shell scheduled under a higher static priority than the tested application.  This will allow an emergency
       kill  of  tested  real-time  applications  that  do  not  block  or  terminate as expected.  See also the
       description of the RLIMIT_RTTIME resource limit in getrlimit(2).

       POSIX  systems  on   which   sched_setscheduler()   and   sched_getscheduler()   are   available   define
       _POSIX_PRIORITY_SCHEDULING in <unistd.h>.
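
       A program could test for the option along the following lines.  The helper name
       has_priority_scheduling() is hypothetical, and the run-time fallback through sysconf(3) is an
       assumption about how a portable program might handle systems that do not guarantee the option at
       compile time.

```c
#include <unistd.h>

/* Hypothetical helper: returns 1 if the priority scheduling option
   (and hence sched_setscheduler()) is available, 0 otherwise. */
int has_priority_scheduling(void)
{
#if defined(_POSIX_PRIORITY_SCHEDULING) && (_POSIX_PRIORITY_SCHEDULING > 0)
    return 1;                                       /* known at compile time */
#else
    return sysconf(_SC_PRIORITY_SCHEDULING) > 0;    /* ask at run time */
#endif
}
```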

RETURN VALUE

       On  success,  sched_setscheduler() returns zero.  On success, sched_getscheduler() returns the policy for
       the thread (a nonnegative integer).  On error, -1 is returned, and errno is set appropriately.

ERRORS

       EINVAL The scheduling policy is not one of the recognized policies, param is NULL, or param does not make
              sense for the policy.

       EPERM  The calling thread does not have appropriate privileges.

       ESRCH  The thread whose ID is pid could not be found.
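
       One possible way to surface these errors to a caller is sketched below.  The helper names
       get_policy_or_errno() and sched_error_text() are hypothetical, and the message strings are
       illustrative rather than taken from any library.

```c
#include <errno.h>
#include <sched.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: return the policy of thread `pid` (0 means the
   calling thread), or the negated errno value if the query fails. */
int get_policy_or_errno(pid_t pid)
{
    int policy = sched_getscheduler(pid);

    return (policy == -1) ? -errno : policy;
}

/* Hypothetical helper: short description of the errors listed above;
   anything else is passed to strerror(). */
const char *sched_error_text(int err)
{
    switch (err) {
    case EINVAL: return "invalid policy or parameters";
    case EPERM:  return "insufficient privileges";
    case ESRCH:  return "no thread with that ID";
    default:     return strerror(err);
    }
}
```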

CONFORMING TO

       POSIX.1-2001 (but see BUGS below).  The SCHED_BATCH and SCHED_IDLE policies are Linux-specific.

NOTES

       POSIX.1  does  not  detail  the  permissions  that  an  unprivileged  thread  requires  in  order to call
       sched_setscheduler(), and details vary across systems.  For example, the Solaris 7 manual page says  that
       the real or effective user ID of the caller must match the real user ID or the saved set-user-ID of the
       target.

       The scheduling policy and parameters are in fact per-thread attributes on Linux.  The value returned from
       a call to gettid(2) can be passed in the argument pid.  Specifying pid as 0 will operate on the attribute
       for the calling thread, and passing the value returned from a call  to  getpid(2)  will  operate  on  the
       attribute  for  the  main  thread of the thread group.  (If you are using the POSIX threads API, then use
       pthread_setschedparam(3),  pthread_getschedparam(3),  and   pthread_setschedprio(3),   instead   of   the
       sched_*(2) system calls.)
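
       The sketch below passes a thread ID to sched_getscheduler().  It invokes gettid(2) through
       syscall(2), on the assumption that the C library in use may not provide a gettid() wrapper.  The
       helper names current_tid() and policy_of_current_thread() are hypothetical.

```c
#include <sched.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: kernel thread ID of the calling thread. */
pid_t current_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* Hypothetical helper: query the calling thread's policy by its thread ID
   instead of passing 0.  Returns the policy, or -1 on error (errno set). */
int policy_of_current_thread(void)
{
    return sched_getscheduler(current_tid());
}
```

       In the main thread of a process, current_tid() returns the same value as getpid(2).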

       Originally, Standard Linux was intended as a general-purpose operating system able to handle
       background processes, interactive applications, and less demanding real-time  applications  (applications
       that usually need to meet timing deadlines).  Although the Linux kernel 2.6 allowed for kernel preemption
       and the newly introduced  O(1)  scheduler  ensures  that  the  time  needed  to  schedule  is  fixed  and
       deterministic irrespective of the number of active tasks, true real-time computing was not possible up to
       kernel version 2.6.17.

   Real-time features in the mainline Linux kernel
       From kernel version  2.6.18  onward,  however,  Linux  is  gradually  becoming  equipped  with  real-time
       capabilities,  most  of  which  are  derived  from  the former realtime-preempt patches developed by Ingo
       Molnar, Thomas Gleixner, Steven Rostedt, and others.  Until the patches have been completely merged  into
       the  mainline  kernel  (this  is  expected to be around kernel version 2.6.30), they must be installed to
       achieve the best real-time performance.  These patches are named:

           patch-kernelversion-rtpatchversion

       and can be downloaded from ⟨http://www.kernel.org/pub/linux/kernel/projects/rt/⟩.

       Without the patches and prior to their full inclusion into the mainline kernel, the kernel  configuration
       offers   only   the   three   preemption   classes   CONFIG_PREEMPT_NONE,  CONFIG_PREEMPT_VOLUNTARY,  and
       CONFIG_PREEMPT_DESKTOP which respectively provide no, some, and considerable reduction of the  worst-case
       scheduling latency.

       With  the  patches  applied  or  after  their  full  inclusion  into  the mainline kernel, the additional
       configuration item CONFIG_PREEMPT_RT becomes available.  If this is selected, Linux is transformed into a
       regular  real-time  operating  system.   The  FIFO  and RR scheduling policies that can be selected using
       sched_setscheduler() are then used to run a thread with true real-time priority and a minimum  worst-case
       scheduling latency.

BUGS

       POSIX  says  that  on  success, sched_setscheduler() should return the previous scheduling policy.  Linux
       sched_setscheduler() does not conform to this requirement, since it always returns 0 on success.
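
       A program that needs the previous policy can query it before making the change, as in the
       following sketch.  The helper name set_scheduler_get_previous() is hypothetical.  The two calls are
       not atomic, so another thread could change the policy between them.

```c
#include <sched.h>
#include <sys/types.h>

/* Hypothetical helper: set the policy of thread `pid` and return the
   policy that was in effect beforehand.  Returns -1 on error (errno is
   set by the failing call).  Not atomic: the policy may change between
   the query and the update. */
int set_scheduler_get_previous(pid_t pid, int policy,
                               const struct sched_param *param)
{
    int previous = sched_getscheduler(pid);

    if (previous == -1)
        return -1;
    if (sched_setscheduler(pid, policy, param) == -1)
        return -1;
    return previous;
}
```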

COLOPHON

       This page is part of release 3.54 of the Linux man-pages project.  A  description  of  the  project,  and
       information about reporting bugs, can be found at http://www.kernel.org/doc/man-pages/.
