Ubuntu Manpage: sched_setscheduler, sched_getscheduler - set and get scheduling policy/parameters

Provided by: manpages-dev_3.54-1ubuntu1_all

NAME

       sched_setscheduler, sched_getscheduler - set and get scheduling policy/parameters

SYNOPSIS

       #include <sched.h>

       int sched_setscheduler(pid_t pid, int policy,
                              const struct sched_param *param);

       int sched_getscheduler(pid_t pid);

       struct sched_param {
           ...
           int sched_priority;
           ...
       };

DESCRIPTION

       sched_setscheduler()  sets  both the scheduling policy and the associated parameters for the thread whose
       ID is specified in pid.  If pid equals zero, the scheduling policy and parameters of the  calling  thread
       will  be set.  The interpretation of the argument param depends on the selected policy.  Currently, Linux
       supports the following "normal" (i.e., non-real-time) scheduling policies:

       SCHED_OTHER   the standard round-robin time-sharing policy;

       SCHED_BATCH   for "batch" style execution of processes; and

       SCHED_IDLE    for running very low priority background jobs.

       The following "real-time" policies are also supported, for special time-critical applications  that  need
       precise control over the way in which runnable threads are selected for execution:

       SCHED_FIFO    a first-in, first-out policy; and

       SCHED_RR      a round-robin policy.

       The semantics of each of these policies are detailed below.

       sched_getscheduler() queries the scheduling policy currently applied to the thread identified by pid.  If
       pid equals zero, the policy of the calling thread will be retrieved.
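
       As a minimal sketch of the query side described above, the following C fragment turns the value
       returned by sched_getscheduler() into a policy name.  The helper names policy_name() and
       thread_policy_name() are illustrative and not part of any API.  The numeric fallbacks for SCHED_BATCH,
       SCHED_IDLE, and SCHED_RESET_ON_FORK are the Linux values and are assumed only when <sched.h> does not
       expose the macros.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sched.h>
#include <stddef.h>
#include <sys/types.h>

/* Fallbacks (Linux values), used only if <sched.h> hides these macros. */
#ifndef SCHED_BATCH
#define SCHED_BATCH 3
#endif
#ifndef SCHED_IDLE
#define SCHED_IDLE 5
#endif
#ifndef SCHED_RESET_ON_FORK
#define SCHED_RESET_ON_FORK 0x40000000
#endif

/* Illustrative helper: name of a policy value, ignoring the
 * reset-on-fork flag that may be ORed into it. */
const char *policy_name(int policy)
{
    switch (policy & ~SCHED_RESET_ON_FORK) {
    case SCHED_OTHER: return "SCHED_OTHER";
    case SCHED_BATCH: return "SCHED_BATCH";
    case SCHED_IDLE:  return "SCHED_IDLE";
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    default:          return "unknown";
    }
}

/* Policy name of thread pid (0 = calling thread), or NULL if
 * sched_getscheduler() fails; errno then holds the reason. */
const char *thread_policy_name(pid_t pid)
{
    int policy = sched_getscheduler(pid);

    return policy == -1 ? NULL : policy_name(policy);
}
```

       A caller would typically print thread_policy_name(0) and fall back to perror(3) when it returns NULL.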

   Scheduling policies
       The  scheduler  is  the  kernel  component that decides which runnable thread will be executed by the CPU
       next.  Each thread has an associated scheduling policy and a static scheduling priority,  sched_priority;
       these are the settings that are modified by sched_setscheduler().  The scheduler makes its decisions based
       on knowledge of the scheduling policy and static priority of all threads on the system.

       For threads scheduled under one of the normal scheduling policies (SCHED_OTHER, SCHED_IDLE, SCHED_BATCH),
       sched_priority is not used in scheduling decisions (it must be specified as 0).

       Threads scheduled under one of the real-time policies (SCHED_FIFO, SCHED_RR) have a sched_priority
       value in the range 1 (low) to 99 (high).  (As the numbers imply, real-time threads always have higher
       priority than normal threads.)  Note well: POSIX.1-2001 requires an implementation to support only a
       minimum of 32 distinct priority levels for the real-time policies, and some systems supply just this
       minimum.  Portable programs should use sched_get_priority_min(2) and sched_get_priority_max(2) to find
       the range of priorities supported for a particular policy.
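
       The portability advice above can be sketched as follows.  The helper clamp_priority() is a
       hypothetical name; it asks the running system for the valid range instead of assuming 1 to 99.

```c
#include <sched.h>

/* Illustrative helper: clamp `wanted` into the priority range that the
 * running system reports for `policy`.  Returns -1 if the policy is not
 * recognized.  On Linux valid priorities are nonnegative, so -1 can only
 * mean an error here. */
int clamp_priority(int policy, int wanted)
{
    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);

    if (lo == -1 || hi == -1)
        return -1;
    if (wanted < lo)
        return lo;
    if (wanted > hi)
        return hi;
    return wanted;
}
```

       For example, clamp_priority(SCHED_FIFO, 50) yields 50 on a system whose range is 1 to 99, and the
       system maximum on a system that offers fewer levels.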

       Conceptually, the scheduler maintains a list of runnable threads for each possible sched_priority  value.
       In  order to determine which thread runs next, the scheduler looks for the nonempty list with the highest
       static priority and selects the thread at the head of this list.

       A thread's scheduling policy determines where it will be inserted into the list  of  threads  with  equal
       static priority and how it will move inside this list.

       All  scheduling  is  preemptive:  if  a  thread  with  a higher static priority becomes ready to run, the
       currently running thread will be preempted and returned to the wait list for its static  priority  level.
       The  scheduling policy determines the ordering only within the list of runnable threads with equal static
       priority.
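
       The conceptual model above (one list per static priority, with the head of the highest nonempty list
       running next) can be illustrated with a toy data structure.  This is only a sketch of the model, not
       of the kernel's implementation; struct run_lists, make_runnable(), pick_next(), and the capacity
       MAX_QUEUED are all invented for the illustration.

```c
#include <stddef.h>

#define MAX_PRIO   99   /* static priorities 0..99, as described above */
#define MAX_QUEUED 16   /* arbitrary capacity for this toy model */

/* One FIFO list of thread IDs per static priority value. */
struct run_lists {
    int    ids[MAX_PRIO + 1][MAX_QUEUED];
    size_t len[MAX_PRIO + 1];
};

/* Append a runnable thread at the end of the list for its priority.
 * Returns 0 on success, -1 if prio is out of range or the list is full. */
int make_runnable(struct run_lists *rl, int prio, int id)
{
    if (prio < 0 || prio > MAX_PRIO || rl->len[prio] == MAX_QUEUED)
        return -1;
    rl->ids[prio][rl->len[prio]++] = id;
    return 0;
}

/* Return the thread at the head of the highest-priority nonempty list,
 * or -1 if nothing is runnable.  The thread stays at the head. */
int pick_next(const struct run_lists *rl)
{
    for (int prio = MAX_PRIO; prio >= 0; prio--)
        if (rl->len[prio] > 0)
            return rl->ids[prio][0];
    return -1;
}
```

       In this model a thread added at priority 50 is chosen ahead of any thread queued at priority 10,
       regardless of the order in which they became runnable.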

   SCHED_FIFO: First in-first out scheduling
       SCHED_FIFO can be used only with static priorities higher than 0, which means that when a SCHED_FIFO
       thread becomes runnable, it will always immediately preempt any currently running SCHED_OTHER,
       SCHED_BATCH, or SCHED_IDLE thread.  SCHED_FIFO is a simple scheduling algorithm without time slicing.
       For threads scheduled under the SCHED_FIFO policy, the following rules apply:

       *  A SCHED_FIFO thread that has been preempted by another thread of higher priority will stay at the head
          of  the  list for its priority and will resume execution as soon as all threads of higher priority are
          blocked again.

       *  When a SCHED_FIFO thread becomes runnable, it will be  inserted  at  the  end  of  the  list  for  its
          priority.

       *  A  call  to  sched_setscheduler()  or  sched_setparam(2)  will put the SCHED_FIFO (or SCHED_RR) thread
          identified by pid at the start of the list if it was runnable.  As a consequence, it may  preempt  the
          currently  running thread if it has the same priority.  (POSIX.1-2001 specifies that the thread should
          go to the end of the list.)

       *  A thread calling sched_yield(2) will be put at the end of the list.

       No other events will move a thread scheduled under the SCHED_FIFO policy in the  wait  list  of  runnable
       threads with equal static priority.

       A  SCHED_FIFO  thread  runs  until  either  it  is blocked by an I/O request, it is preempted by a higher
       priority thread, or it calls sched_yield(2).
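
       A minimal sketch of requesting SCHED_FIFO follows.  The helper name switch_to_fifo() is hypothetical,
       and the choice of the lowest real-time priority is an assumption made to keep the example
       conservative.  Without the necessary privilege or resource limit the call is expected to fail,
       typically with EPERM.

```c
#include <errno.h>
#include <sched.h>
#include <string.h>
#include <sys/types.h>

/* Illustrative helper: put thread `pid` (0 = calling thread) under
 * SCHED_FIFO at the lowest real-time priority the system reports.
 * Returns 0 on success, or the errno value of the failing call. */
int switch_to_fifo(pid_t pid)
{
    struct sched_param param;
    int prio = sched_get_priority_min(SCHED_FIFO);

    if (prio == -1)
        return errno;

    memset(&param, 0, sizeof(param));   /* the struct may have other members */
    param.sched_priority = prio;

    if (sched_setscheduler(pid, SCHED_FIFO, &param) == -1)
        return errno;
    return 0;
}
```

       Because a SCHED_FIFO thread is not time-sliced, code that succeeds in making this switch should block
       or call sched_yield(2) regularly so that lower-priority threads can still run.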

   SCHED_RR: Round-robin scheduling
       SCHED_RR is a simple enhancement of SCHED_FIFO.  Everything described above for SCHED_FIFO  also  applies
       to  SCHED_RR,  except  that each thread is allowed to run only for a maximum time quantum.  If a SCHED_RR
       thread has been running for a time period equal to or longer than the time quantum, it will be put at the
       end of the list for its priority.  A SCHED_RR thread that has been preempted by a higher priority  thread
       and  subsequently resumes execution as a running thread will complete the unexpired portion of its round-
       robin time quantum.  The length of the time quantum can be retrieved using sched_rr_get_interval(2).
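
       Retrieving the quantum can be sketched as below.  The helper rr_quantum_ns() is an illustrative name.
       The value reported for a thread that is not scheduled under SCHED_RR depends on the kernel version, so
       the result is meaningful mainly for SCHED_RR threads.

```c
#include <sched.h>
#include <sys/types.h>
#include <time.h>

/* Illustrative helper: round-robin quantum of thread `pid`
 * (0 = calling thread) in nanoseconds, or -1 if
 * sched_rr_get_interval() fails. */
long long rr_quantum_ns(pid_t pid)
{
    struct timespec ts;

    if (sched_rr_get_interval(pid, &ts) == -1)
        return -1;
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}
```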

   SCHED_OTHER: Default Linux time-sharing scheduling
       SCHED_OTHER can be used only at static priority 0.  SCHED_OTHER is the standard Linux time-sharing
       scheduler  that  is  intended  for all threads that do not require the special real-time mechanisms.  The
       thread to run is chosen from the static priority 0 list based on a dynamic priority  that  is  determined
       only  inside  this  list.   The  dynamic  priority  is  based  on  the  nice  value  (set  by  nice(2) or
       setpriority(2)) and increased for each time quantum the thread is ready to run, but denied to run by  the
       scheduler.  This ensures fair progress among all SCHED_OTHER threads.
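
       Since the nice value is the only per-thread input to SCHED_OTHER that a program controls, a short
       sketch of reading and raising it may help.  The helpers read_nice() and be_nicer() are hypothetical
       names; they rely only on getpriority(2) and setpriority(2).

```c
#include <errno.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Illustrative helper: store the caller's nice value in *nice_out.
 * Returns 0 on success, -1 on failure.  getpriority() can legitimately
 * return -1, so errno is cleared first and checked afterwards. */
int read_nice(int *nice_out)
{
    int value;

    errno = 0;
    value = getpriority(PRIO_PROCESS, 0);
    if (value == -1 && errno != 0)
        return -1;
    *nice_out = value;
    return 0;
}

/* Raise the caller's nice value by `delta` (lowering its priority),
 * capped at 19.  Returns 0 on success, -1 on failure or if delta < 0. */
int be_nicer(int delta)
{
    int current, target;

    if (delta < 0 || read_nice(&current) == -1)
        return -1;
    if (delta > 39)
        delta = 39;                      /* the nice range spans 40 values */
    target = current + delta;
    if (target > 19)
        target = 19;
    return setpriority(PRIO_PROCESS, 0, target);
}
```

       Raising the nice value needs no privilege; lowering it again generally does.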

   SCHED_BATCH: Scheduling batch processes
       (Since  Linux  2.6.16.)   SCHED_BATCH  can  be used only at static priority 0.  This policy is similar to
       SCHED_OTHER in that it schedules the thread according to its dynamic priority (based on the nice  value).
       The  difference  is  that  this  policy will cause the scheduler to always assume that the thread is CPU-
       intensive.  Consequently, the scheduler will apply a small scheduling  penalty  with  respect  to  wakeup
       behaviour, so that this thread is mildly disfavored in scheduling decisions.

       This  policy  is useful for workloads that are noninteractive, but do not want to lower their nice value,
       and for workloads that want  a  deterministic  scheduling  policy  without  interactivity  causing  extra
       preemptions (between the workload's tasks).

   SCHED_IDLE: Scheduling very low priority jobs
       (Since  Linux  2.6.23.)   SCHED_IDLE can be used only at static priority 0; the process nice value has no
       influence for this policy.

       This policy is intended for running jobs at extremely low priority (lower even than a +19 nice value with
       the SCHED_OTHER or SCHED_BATCH policies).
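
       Selecting one of the normal policies can be sketched as follows.  The helper use_normal_policy() is a
       hypothetical name; the numeric fallbacks for SCHED_BATCH and SCHED_IDLE are the Linux values, assumed
       only when <sched.h> does not expose the macros.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <sched.h>
#include <string.h>

#ifndef SCHED_BATCH
#define SCHED_BATCH 3
#endif
#ifndef SCHED_IDLE
#define SCHED_IDLE 5
#endif

/* Illustrative helper: move the calling thread to one of the normal
 * policies.  sched_priority must be 0 for all of them.  Returns 0 on
 * success, EINVAL if `policy` is not a normal policy, or the errno value
 * set by sched_setscheduler(). */
int use_normal_policy(int policy)
{
    struct sched_param param;

    if (policy != SCHED_OTHER && policy != SCHED_BATCH && policy != SCHED_IDLE)
        return EINVAL;

    memset(&param, 0, sizeof(param));
    param.sched_priority = 0;

    if (sched_setscheduler(0, policy, &param) == -1)
        return errno;
    return 0;
}
```

       A background job might call use_normal_policy(SCHED_IDLE) once at startup and treat a nonzero result
       as a warning, since the job still runs correctly under its original policy.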

   Resetting scheduling policy for child processes
       Since Linux 2.6.32, the SCHED_RESET_ON_FORK flag can be ORed in policy when calling sched_setscheduler().
       As a result of including this flag, children created by fork(2)  do  not  inherit  privileged  scheduling
       policies.   This  feature  is  intended  for  media-playback  applications,  and  can  be used to prevent
       applications evading the RLIMIT_RTTIME resource limit  (see  getrlimit(2))  by  creating  multiple  child
       processes.

       More  precisely, if the SCHED_RESET_ON_FORK flag is specified, the following rules apply for subsequently
       created children:

       *  If the calling thread has a scheduling policy of SCHED_FIFO  or  SCHED_RR,  the  policy  is  reset  to
          SCHED_OTHER in child processes.

       *  If the calling process has a negative nice value, the nice value is reset to zero in child processes.

       After  the  SCHED_RESET_ON_FORK  flag  has  been  enabled,  it  can  be  reset only if the thread has the
       CAP_SYS_NICE capability.  This flag is disabled in child processes created by fork(2).

       The SCHED_RESET_ON_FORK flag is visible in the policy value returned by sched_getscheduler().
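
       Because the flag shares the policy value, callers usually need to separate the two.  The sketch below
       shows one way to do so; base_policy(), has_reset_on_fork(), and set_policy_reset_on_fork() are
       illustrative names, and the fallback value for SCHED_RESET_ON_FORK is the Linux value, assumed only
       when <sched.h> does not expose the macro.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sched.h>

#ifndef SCHED_RESET_ON_FORK
#define SCHED_RESET_ON_FORK 0x40000000
#endif

/* Policy proper, with the reset-on-fork flag masked off.  `raw` must be
 * a successful (not -1) result of sched_getscheduler(). */
int base_policy(int raw)
{
    return raw & ~SCHED_RESET_ON_FORK;
}

/* 1 if the reset-on-fork flag is set in `raw`, 0 otherwise. */
int has_reset_on_fork(int raw)
{
    return (raw & SCHED_RESET_ON_FORK) != 0;
}

/* Request `policy` plus reset-on-fork for the calling thread.
 * Returns the result of sched_setscheduler(): 0 or -1 with errno set. */
int set_policy_reset_on_fork(int policy, int priority)
{
    struct sched_param param = { .sched_priority = priority };

    return sched_setscheduler(0, policy | SCHED_RESET_ON_FORK, &param);
}
```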

   Privileges and resource limits
       In Linux kernels before 2.6.12, only privileged (CAP_SYS_NICE) threads can set a nonzero static  priority
       (i.e.,  set  a  real-time scheduling policy).  The only change that an unprivileged thread can make is to
       set the SCHED_OTHER policy, and this can be done  only  if  the  effective  user  ID  of  the  caller  of
       sched_setscheduler()  matches  the  real  or  effective  user  ID  of the target thread (i.e., the thread
       specified by pid) whose policy is being changed.

       Since Linux 2.6.12, the RLIMIT_RTPRIO resource limit defines a ceiling on an unprivileged thread's static
       priority for the SCHED_RR and SCHED_FIFO policies.  The rules for changing scheduling policy and priority
       are as follows:

       *  If an unprivileged thread has a nonzero RLIMIT_RTPRIO soft limit, then it can  change  its  scheduling
          policy and priority, subject to the restriction that the priority cannot be set to a value higher than
          the maximum of its current priority and its RLIMIT_RTPRIO soft limit.

       *  If the RLIMIT_RTPRIO soft limit is 0, then the only permitted changes are to lower the priority, or to
          switch to a non-real-time policy.

       *  Subject  to  the  same  rules, another unprivileged thread can also make these changes, as long as the
          effective user ID of the thread making the change matches the real or effective user ID of the  target
          thread.

       *  Special rules apply for the SCHED_IDLE policy.  In Linux kernels before 2.6.39, an unprivileged
          thread operating under this policy cannot change its policy, regardless of the value of its
          RLIMIT_RTPRIO resource limit.  In Linux kernels since 2.6.39, an unprivileged thread can switch to
          either the SCHED_BATCH or the SCHED_OTHER policy (named SCHED_NORMAL in the kernel sources) so long
          as its nice value falls within the range permitted by its RLIMIT_NICE resource limit (see
          getrlimit(2)).

       Privileged  (CAP_SYS_NICE)  threads  ignore the RLIMIT_RTPRIO limit; as with older kernels, they can make
       arbitrary changes to scheduling policy  and  priority.   See  getrlimit(2)  for  further  information  on
       RLIMIT_RTPRIO.
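
       The ceiling rule for unprivileged threads since Linux 2.6.12 can be written down as a small
       calculation.  This is a sketch of the rule as stated above, not of the kernel's code;
       rt_priority_ceiling() and read_rtprio_soft_limit() are hypothetical names, and policy_max is expected
       to come from sched_get_priority_max(2).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/resource.h>

/* Highest real-time priority an unprivileged thread may request: the
 * larger of its current priority and its RLIMIT_RTPRIO soft limit,
 * never more than policy_max (assumed nonnegative). */
int rt_priority_ceiling(int current_prio, rlim_t soft_limit, int policy_max)
{
    int ceiling = current_prio;

    if (soft_limit == RLIM_INFINITY || soft_limit > (rlim_t)policy_max)
        ceiling = policy_max;
    else if ((int)soft_limit > ceiling)
        ceiling = (int)soft_limit;
    return ceiling;
}

/* Read the caller's RLIMIT_RTPRIO soft limit.
 * Returns 0 on success, -1 on error with errno set. */
int read_rtprio_soft_limit(rlim_t *out)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_RTPRIO, &rl) == -1)
        return -1;
    *out = rl.rlim_cur;
    return 0;
}
```

       With a soft limit of 0 the ceiling equals the current priority, which matches the rule that such a
       thread may only lower its priority or leave the real-time policies.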

   Response time
       A blocked high priority thread waiting for I/O has a certain response time before it is scheduled
       again.  The device driver writer can greatly reduce this  response  time  by  using  a  "slow  interrupt"
       interrupt handler.

   Miscellaneous
       Child processes inherit the scheduling policy and parameters across a fork(2).  The scheduling policy and
       parameters are preserved across execve(2).

       Memory  locking  is  usually needed for real-time processes to avoid paging delays; this can be done with
       mlock(2) or mlockall(2).
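
       A minimal sketch of the memory-locking step follows.  The helper name lock_all_memory() is
       hypothetical.  The call may fail for an unprivileged process whose RLIMIT_MEMLOCK limit is too small.

```c
#include <errno.h>
#include <sys/mman.h>

/* Illustrative helper: lock all current and future pages of the calling
 * process into RAM.  Returns 0 on success, or the errno value set by
 * mlockall(). */
int lock_all_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) == -1)
        return errno;
    return 0;
}
```

       Real-time programs usually make this call once during initialization, before entering the
       time-critical part of their work.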

       Since a nonblocking infinite loop in a thread scheduled under  SCHED_FIFO  or  SCHED_RR  will  block  all
       threads  with  lower priority forever, a software developer should always keep available on the console a
       shell scheduled under a higher static priority than the tested application.  This will allow an emergency
       kill of tested real-time applications that  do  not  block  or  terminate  as  expected.   See  also  the
       description of the RLIMIT_RTTIME resource limit in getrlimit(2).

       POSIX   systems   on   which   sched_setscheduler()   and   sched_getscheduler()   are  available  define
       _POSIX_PRIORITY_SCHEDULING in <unistd.h>.

RETURN VALUE

       On success, sched_setscheduler() returns zero.  On success, sched_getscheduler() returns the  policy  for
       the thread (a nonnegative integer).  On error, -1 is returned, and errno is set appropriately.

ERRORS

       EINVAL The scheduling policy is not one of the recognized policies, param is NULL, or param does not make
              sense for the policy.

       EPERM  The calling thread does not have appropriate privileges.

       ESRCH  The thread whose ID is pid could not be found.
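
       The three documented errors can be handled as in the sketch below.  The helpers sched_error_text()
       and errno_for_null_param() are illustrative names; the second deliberately passes a NULL param
       pointer, which is documented above to fail with EINVAL.

```c
#include <errno.h>
#include <sched.h>
#include <stddef.h>

/* Illustrative helper: short description of an errno value returned by
 * sched_setscheduler() or sched_getscheduler(). */
const char *sched_error_text(int err)
{
    switch (err) {
    case 0:      return "success";
    case EINVAL: return "invalid policy or parameters";
    case EPERM:  return "insufficient privileges";
    case ESRCH:  return "no such thread";
    default:     return "unexpected error";
    }
}

/* Call sched_setscheduler() with a NULL param pointer and return the
 * errno value observed, or 0 if the call unexpectedly succeeds. */
int errno_for_null_param(void)
{
    errno = 0;
    if (sched_setscheduler(0, SCHED_OTHER, NULL) == -1)
        return errno;
    return 0;
}
```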

CONFORMING TO

       POSIX.1-2001 (but see BUGS below).  The SCHED_BATCH and SCHED_IDLE policies are Linux-specific.

NOTES

       POSIX.1  does  not  detail  the  permissions  that  an  unprivileged  thread  requires  in  order to call
       sched_setscheduler(), and details vary across systems.  For example, the Solaris 7 manual page says  that
       the real or effective user ID of the caller must match the real user ID or the saved set-user-ID of the
       target.

       The scheduling policy and parameters are in fact per-thread attributes on Linux.  The value returned from
       a call to gettid(2) can be passed in the argument pid.  Specifying pid as 0 will operate on the attribute
       for the calling thread, and passing the value returned from a call  to  getpid(2)  will  operate  on  the
       attribute  for  the  main  thread of the thread group.  (If you are using the POSIX threads API, then use
       pthread_setschedparam(3),  pthread_getschedparam(3),  and   pthread_setschedprio(3),   instead   of   the
       sched_*(2) system calls.)
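
       The per-thread behavior can be observed with a short sketch.  The helpers current_tid() and
       tid_and_zero_agree() are hypothetical names.  The thread ID is obtained with syscall(SYS_gettid)
       on the assumption that the C library may not provide a gettid() wrapper.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sched.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Kernel thread ID of the calling thread. */
pid_t current_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* Query the caller's policy once by thread ID and once with pid 0.
 * Returns 1 if both queries agree, 0 if they differ, -1 on error. */
int tid_and_zero_agree(void)
{
    int by_tid  = sched_getscheduler(current_tid());
    int by_zero = sched_getscheduler(0);

    if (by_tid == -1 || by_zero == -1)
        return -1;
    return by_tid == by_zero;
}
```

       In the main thread of a process the value returned by current_tid() equals getpid(2); in other
       threads it differs, which is why getpid(2) always addresses the main thread.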

       Originally, standard Linux was intended as a general-purpose operating system able to handle
       background processes, interactive applications, and less demanding real-time applications (applications
       that usually need to meet timing deadlines).  Although the Linux 2.6 kernel allowed for kernel
       preemption and the then newly introduced O(1) scheduler ensured that the time needed to schedule was
       fixed and deterministic irrespective of the number of active tasks, true real-time computing was not
       possible up to kernel version 2.6.17.

   Real-time features in the mainline Linux kernel
       From kernel version  2.6.18  onward,  however,  Linux  is  gradually  becoming  equipped  with  real-time
       capabilities,  most  of  which  are  derived  from  the former realtime-preempt patches developed by Ingo
       Molnar, Thomas Gleixner, Steven Rostedt, and others.  Until the patches have been completely merged  into
       the  mainline  kernel  (this  is  expected to be around kernel version 2.6.30), they must be installed to
       achieve the best real-time performance.  These patches are named:

           patch-kernelversion-rtpatchversion

       and can be downloaded from http://www.kernel.org/pub/linux/kernel/projects/rt/.

       Without the patches and prior to their full inclusion into the mainline kernel, the kernel  configuration
       offers   only   the   three   preemption   classes   CONFIG_PREEMPT_NONE,  CONFIG_PREEMPT_VOLUNTARY,  and
       CONFIG_PREEMPT_DESKTOP which respectively provide no, some, and considerable reduction of the  worst-case
       scheduling latency.

       With  the  patches  applied  or  after  their  full  inclusion  into  the mainline kernel, the additional
       configuration item CONFIG_PREEMPT_RT becomes available.  If this is selected, Linux is transformed into a
       regular real-time operating system.  The FIFO and RR scheduling  policies  that  can  be  selected  using
       sched_setscheduler()  are then used to run a thread with true real-time priority and a minimum worst-case
       scheduling latency.

BUGS

       POSIX says that on success, sched_setscheduler() should return the  previous  scheduling  policy.   Linux
       sched_setscheduler() does not conform to this requirement, since it always returns 0 on success.

COLOPHON

       This page is part of release 3.54 of the Linux man-pages project.  A  description  of  the  project,  and
       information about reporting bugs, can be found at http://www.kernel.org/doc/man-pages/.

Linux                                              2013-09-17                              SCHED_SETSCHEDULER(2)
