Provided by: manpages-dev_3.54-1ubuntu1_all

NAME

       sched_setscheduler, sched_getscheduler - set and get scheduling policy/parameters

SYNOPSIS

       #include <sched.h>

       int sched_setscheduler(pid_t pid, int policy,
                              const struct sched_param *param);

       int sched_getscheduler(pid_t pid);

       struct sched_param {
           ...
           int sched_priority;
           ...
       };

DESCRIPTION

       sched_setscheduler() sets both the scheduling policy and the associated parameters for the
       thread whose ID is specified in pid.  If  pid  equals  zero,  the  scheduling  policy  and
       parameters  of  the  calling thread will be set.  The interpretation of the argument param
       depends on the selected policy.  Currently, Linux supports the following  "normal"  (i.e.,
       non-real-time) scheduling policies:

       SCHED_OTHER   the standard round-robin time-sharing policy;

       SCHED_BATCH   for "batch" style execution of processes; and

       SCHED_IDLE    for running very low priority background jobs.

       The   following  "real-time"  policies  are  also  supported,  for  special  time-critical
       applications that need precise control over the way in which runnable threads are selected
       for execution:

       SCHED_FIFO    a first-in, first-out policy; and

       SCHED_RR      a round-robin policy.

       The semantics of each of these policies are detailed below.

       sched_getscheduler()  queries  the  scheduling  policy  currently  applied  to  the thread
       identified by pid.  If pid  equals  zero,  the  policy  of  the  calling  thread  will  be
       retrieved.
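
       As an illustration of the query side, the following minimal sketch prints the policy of  a
       thread.   The helper names policy_name() and print_policy() are hypothetical and not part
       of any library; the numeric fallbacks for SCHED_BATCH and SCHED_IDLE are the Linux values
       and are only used if the headers hide the constants.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* glibc exposes SCHED_BATCH/SCHED_IDLE only with this */
#endif
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>

/* Fallbacks (Linux values), used only if <sched.h> did not define them. */
#ifndef SCHED_BATCH
#define SCHED_BATCH 3
#endif
#ifndef SCHED_IDLE
#define SCHED_IDLE 5
#endif

/* Hypothetical helper: map a policy constant to a printable name. */
const char *policy_name(int policy)
{
    switch (policy) {
    case SCHED_OTHER: return "SCHED_OTHER";
    case SCHED_BATCH: return "SCHED_BATCH";
    case SCHED_IDLE:  return "SCHED_IDLE";
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    default:          return "unknown";   /* e.g. a value with extra flag bits set */
    }
}

/* Hypothetical helper: print the policy of thread `pid` (0 = calling thread).
   Returns the value from sched_getscheduler(), or -1 with errno set. */
int print_policy(pid_t pid)
{
    int policy = sched_getscheduler(pid);
    if (policy == -1) {
        perror("sched_getscheduler");
        return -1;
    }
    printf("policy: %s\n", policy_name(policy));
    return policy;
}
```

       Calling print_policy(0) reports the policy of the calling thread.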

   Scheduling policies
       The  scheduler is the kernel component that decides which runnable thread will be executed
       by the CPU next.  Each thread has an associated scheduling policy and a static  scheduling
       priority,    sched_priority;    these    are   the   settings   that   are   modified   by
       sched_setscheduler().  The  scheduler  makes  its decisions  based  on  knowledge  of  the
       scheduling policy and static priority of all threads on the system.

       For   threads  scheduled  under  one  of  the  normal  scheduling  policies  (SCHED_OTHER,
       SCHED_IDLE, SCHED_BATCH), sched_priority is not used in scheduling decisions (it  must  be
       specified as 0).

       Threads  scheduled  under  one  of  the  real-time  policies (SCHED_FIFO, SCHED_RR) have a
       sched_priority value in the range 1 (low) to 99 (high).  (As the numbers imply,  real-time
       threads  always  have  higher  priority  than  normal  threads.)   Note well: POSIX.1-2001
       requires an implementation to support only a minimum 32 distinct priority levels  for  the
       real-time  policies,  and some systems supply just this minimum.  Portable programs should
       use  sched_get_priority_min(2)  and  sched_get_priority_max(2)  to  find  the   range   of
       priorities supported for a particular policy.
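
       The portable approach described above might look like the following sketch.   The  helper
       names rt_priority_range() and clamp_priority() are hypothetical; only the two sched_get_*
       calls come from the system interface.

```c
#include <sched.h>

/* Hypothetical helper: fetch the priority range supported for `policy`.
   Returns 0 on success, -1 (with errno set by the failing call) on error. */
int rt_priority_range(int policy, int *lo, int *hi)
{
    int min = sched_get_priority_min(policy);
    int max = sched_get_priority_max(policy);

    if (min == -1 || max == -1)
        return -1;
    *lo = min;
    *hi = max;
    return 0;
}

/* Hypothetical helper: clamp a requested priority into the supported range
   instead of hard-coding 1..99. Returns -1 if the range cannot be queried. */
int clamp_priority(int policy, int wanted)
{
    int lo, hi;

    if (rt_priority_range(policy, &lo, &hi) == -1)
        return -1;
    if (wanted < lo)
        return lo;
    if (wanted > hi)
        return hi;
    return wanted;
}
```

       A  program  would  then  pass  the  result  of, say, clamp_priority(SCHED_FIFO, 50) as the
       sched_priority value rather than assuming that 50 is valid on every system.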

       Conceptually,  the  scheduler  maintains  a  list  of  runnable  threads for each possible
       sched_priority value.  In order to determine which thread runs next, the  scheduler  looks
       for  the nonempty list with the highest static priority and selects the thread at the head
       of this list.

       A thread's scheduling policy determines where it will be inserted into the list of threads
       with equal static priority and how it will move inside this list.

       All  scheduling  is preemptive: if a thread with a higher static priority becomes ready to
       run, the currently running thread will be preempted and returned to the wait list for  its
       static priority level.  The scheduling policy determines the ordering only within the list
       of runnable threads with equal static priority.
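
       The conceptual model of per-priority lists can be sketched as a toy data structure.   This
       is  only  an  illustration  of the selection rule described above, not the kernel's actual
       implementation; the names run_lists, enqueue_tail(), and pick_next() and the capacity of 8
       entries per list are assumptions made for the sketch.

```c
#include <stddef.h>

#define MAX_PRIO     99      /* highest static priority described in this page */
#define MAX_PER_LIST 8       /* arbitrary capacity chosen for this sketch */

/* One list of runnable thread IDs per static priority (0..MAX_PRIO). */
struct run_lists {
    int    ids[MAX_PRIO + 1][MAX_PER_LIST];
    size_t len[MAX_PRIO + 1];
};

/* Append a thread to the end of the list for its priority.
   Returns 0 on success, -1 if the priority is out of range or the list is full. */
int enqueue_tail(struct run_lists *rl, int prio, int tid)
{
    if (prio < 0 || prio > MAX_PRIO || rl->len[prio] == MAX_PER_LIST)
        return -1;
    rl->ids[prio][rl->len[prio]++] = tid;
    return 0;
}

/* Return the thread at the head of the nonempty list with the highest
   static priority, or -1 if no thread is runnable. */
int pick_next(const struct run_lists *rl)
{
    for (int prio = MAX_PRIO; prio >= 0; prio--)
        if (rl->len[prio] > 0)
            return rl->ids[prio][0];
    return -1;
}
```

       In  this  model,  enqueueing a thread at priority 50 while threads at priority 10 exist
       changes the result of pick_next() to the new thread, which mirrors preemption.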

   SCHED_FIFO: First-in, first-out scheduling
       SCHED_FIFO can be used only with static priorities higher than 0, which means that when  a
       SCHED_FIFO  thread  becomes  runnable,  it  will  always immediately preempt any currently
       running SCHED_OTHER, SCHED_BATCH, or SCHED_IDLE thread.  SCHED_FIFO is a simple scheduling
       algorithm  without  time  slicing.  For threads scheduled under the SCHED_FIFO policy, the
       following rules apply:

       *  A SCHED_FIFO thread that has been preempted by another thread of higher  priority  will
          stay  at the head of the list for its priority and will resume execution as soon as all
          threads of higher priority are blocked again.

       *  When a SCHED_FIFO thread becomes runnable, it will be inserted at the end of  the  list
          for its priority.

       *  A  call  to  sched_setscheduler()  or  sched_setparam(2)  will  put  the SCHED_FIFO (or
          SCHED_RR) thread identified by pid at the start of the list if it was runnable.   As  a
          consequence,  it  may preempt the currently running thread if it has the same priority.
          (POSIX.1-2001 specifies that the thread should go to the end of the list.)

       *  A thread calling sched_yield(2) will be put at the end of the list.

       No other events will move a thread scheduled under the SCHED_FIFO policy in the wait  list
       of runnable threads with equal static priority.

       A  SCHED_FIFO thread runs until either it is blocked by an I/O request, it is preempted by
       a higher priority thread, or it calls sched_yield(2).
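
       A minimal sketch of switching a thread to SCHED_FIFO follows.  The helper  set_fifo_min()
       is  hypothetical;  it  picks  the  lowest  real-time  priority  so  that the example is as
       unintrusive as possible.  The call is expected to fail (typically with EPERM) for a thread
       that lacks the necessary privileges or resource limits.

```c
#include <sched.h>
#include <sys/types.h>

/* Hypothetical helper: switch thread `pid` (0 = calling thread) to SCHED_FIFO
   at the lowest priority supported for that policy.
   Returns 0 on success, -1 with errno set on failure. */
int set_fifo_min(pid_t pid)
{
    struct sched_param sp = { 0 };
    int min = sched_get_priority_min(SCHED_FIFO);

    if (min == -1)
        return -1;
    sp.sched_priority = min;
    return sched_setscheduler(pid, SCHED_FIFO, &sp);
}
```

       Because a SCHED_FIFO thread is not time-sliced, code that runs after a successful call
       should block or call sched_yield(2) regularly.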

   SCHED_RR: Round-robin scheduling
       SCHED_RR is a simple enhancement of SCHED_FIFO.  Everything described above for SCHED_FIFO
       also  applies  to  SCHED_RR,  except that each thread is allowed to run only for a maximum
       time quantum.  If a SCHED_RR thread has been running for a time period equal to or  longer
       than the time quantum, it will be put at the end of the list for its priority.  A SCHED_RR
       thread that has been preempted by  a  higher  priority  thread  and  subsequently  resumes
       execution  as a running thread will complete the unexpired portion of its round-robin time
       quantum.  The length of the time quantum can be retrieved using sched_rr_get_interval(2).
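
       Retrieving the quantum might look like the following sketch.  The helper rr_quantum_ns()
       is  hypothetical;  it  merely  converts  the struct timespec filled in by
       sched_rr_get_interval(2) to nanoseconds.

```c
#include <sched.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: store the round-robin time quantum of thread `pid`
   (0 = calling thread) in *ns, in nanoseconds.
   Returns 0 on success, -1 with errno set on failure. */
int rr_quantum_ns(pid_t pid, long long *ns)
{
    struct timespec ts;

    if (sched_rr_get_interval(pid, &ts) == -1)
        return -1;
    *ns = (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
    return 0;
}
```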

   SCHED_OTHER: Default Linux time-sharing scheduling
       SCHED_OTHER can be used only at static priority 0.   SCHED_OTHER  is  the  standard  Linux
       time-sharing  scheduler  that  is intended for all threads that do not require the special
       real-time mechanisms.  The thread to run is chosen from the static priority 0  list  based
       on  a  dynamic priority that is determined only inside this list.  The dynamic priority is
       based on the nice value (set by nice(2) or setpriority(2)) and  increased  for  each  time
       quantum the thread is ready to run, but denied to run by the scheduler.  This ensures fair
       progress among all SCHED_OTHER threads.

   SCHED_BATCH: Scheduling batch processes
       (Since Linux 2.6.16.)  SCHED_BATCH can be used only at static priority 0.  This policy  is
       similar  to  SCHED_OTHER in that it schedules the thread according to its dynamic priority
       (based on the nice value).  The difference is that this policy will cause the scheduler to
       always  assume that the thread is CPU-intensive.  Consequently, the scheduler will apply a
       small scheduling penalty with respect to wakeup behaviour, so that this thread  is  mildly
       disfavored in scheduling decisions.

       This  policy  is  useful  for  workloads that are noninteractive, but do not want to lower
       their nice value, and for workloads that want a deterministic  scheduling  policy  without
       interactivity causing extra preemptions (between the workload's tasks).

   SCHED_IDLE: Scheduling very low priority jobs
       (Since  Linux 2.6.23.)  SCHED_IDLE can be used only at static priority 0; the process nice
       value has no influence for this policy.

       This policy is intended for running jobs at extremely low priority (lower even than a  +19
       nice value with the SCHED_OTHER or SCHED_BATCH policies).
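
       Selecting one of the normal policies can be sketched as  follows.   The  helper
       set_normal_policy()  is  hypothetical;  it  rejects  the real-time policies itself and
       always passes a static priority of 0, as required for  SCHED_OTHER,  SCHED_BATCH,  and
       SCHED_IDLE.   The  numeric  fallbacks  are the Linux values and are used only if the
       headers hide the constants.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* glibc exposes SCHED_BATCH/SCHED_IDLE only with this */
#endif
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/* Fallbacks (Linux values), used only if <sched.h> did not define them. */
#ifndef SCHED_BATCH
#define SCHED_BATCH 3
#endif
#ifndef SCHED_IDLE
#define SCHED_IDLE 5
#endif

/* Hypothetical helper: put thread `pid` (0 = calling thread) under one of the
   normal policies. Returns 0 on success, -1 with errno set on failure;
   a real-time policy is refused with EINVAL before any system call is made. */
int set_normal_policy(pid_t pid, int policy)
{
    struct sched_param sp = { 0 };

    if (policy != SCHED_OTHER && policy != SCHED_BATCH && policy != SCHED_IDLE) {
        errno = EINVAL;
        return -1;
    }
    sp.sched_priority = 0;      /* the only value accepted for normal policies */
    return sched_setscheduler(pid, policy, &sp);
}
```

       For example, a long-running computation might call set_normal_policy(0, SCHED_BATCH) at
       startup.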

   Resetting scheduling policy for child processes
       Since  Linux  2.6.32,  the  SCHED_RESET_ON_FORK  flag  can  be ORed in policy when calling
       sched_setscheduler().  As a result of including this flag, children created by fork(2)  do
       not  inherit  privileged scheduling policies.  This feature is intended for media-playback
       applications, and can be used to prevent applications evading the  RLIMIT_RTTIME  resource
       limit (see getrlimit(2)) by creating multiple child processes.

       More  precisely,  if  the SCHED_RESET_ON_FORK flag is specified, the following rules apply
       for subsequently created children:

       *  If the calling thread has a scheduling policy of SCHED_FIFO or SCHED_RR, the policy  is
          reset to SCHED_OTHER in child processes.

       *  If  the  calling  process has a negative nice value, the nice value is reset to zero in
          child processes.

       After the SCHED_RESET_ON_FORK flag has been enabled, it can be reset only  if  the  thread
       has  the  CAP_SYS_NICE  capability.   This  flag is disabled in child processes created by
       fork(2).

       The  SCHED_RESET_ON_FORK   flag   is   visible   in   the   policy   value   returned   by
       sched_getscheduler().
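
       Since the flag is ORed into the policy value, a program that  inspects  the  result  of
       sched_getscheduler()  may  want  to  separate the two.  The following sketch shows one
       way; the helpers split_policy() and enable_reset_on_fork() are  hypothetical,  and  the
       numeric fallback for SCHED_RESET_ON_FORK is the Linux value, used only if the headers
       hide the constant.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* glibc exposes SCHED_RESET_ON_FORK only with this */
#endif
#include <sched.h>
#include <sys/types.h>

/* Fallback (Linux value), used only if <sched.h> did not define it. */
#ifndef SCHED_RESET_ON_FORK
#define SCHED_RESET_ON_FORK 0x40000000
#endif

/* Hypothetical helper: split a raw policy value into the policy proper and a
   0/1 indicator for the SCHED_RESET_ON_FORK flag. */
void split_policy(int raw, int *policy, int *reset_on_fork)
{
    *reset_on_fork = (raw & SCHED_RESET_ON_FORK) != 0;
    *policy = raw & ~SCHED_RESET_ON_FORK;
}

/* Hypothetical helper: keep the current policy and parameters of thread `pid`
   (0 = calling thread) but add the reset-on-fork flag.
   Returns 0 on success, -1 with errno set on failure. */
int enable_reset_on_fork(pid_t pid)
{
    struct sched_param sp;
    int raw = sched_getscheduler(pid);

    if (raw == -1 || sched_getparam(pid, &sp) == -1)
        return -1;
    return sched_setscheduler(pid, raw | SCHED_RESET_ON_FORK, &sp);
}
```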

   Privileges and resource limits
       In  Linux  kernels before 2.6.12, only privileged (CAP_SYS_NICE) threads can set a nonzero
       static priority (i.e., set a real-time  scheduling  policy).   The  only  change  that  an
       unprivileged  thread  can make is to set the SCHED_OTHER policy, and this can be done only
       if the effective user ID of  the  caller  of  sched_setscheduler()  matches  the  real  or
       effective user ID of the target thread (i.e., the thread specified by pid) whose policy is
       being changed.

       Since Linux 2.6.12, the RLIMIT_RTPRIO resource limit defines a ceiling on an  unprivileged
       thread's static priority for the SCHED_RR and SCHED_FIFO policies.  The rules for changing
       scheduling policy and priority are as follows:

       *  If an unprivileged thread has a nonzero RLIMIT_RTPRIO soft limit, then  it  can  change
          its scheduling policy and priority, subject to the restriction that the priority cannot
          be set to a value higher than the maximum of its current priority and its RLIMIT_RTPRIO
          soft limit.

       *  If  the RLIMIT_RTPRIO soft limit is 0, then the only permitted changes are to lower the
          priority, or to switch to a non-real-time policy.

       *  Subject to the same rules, another unprivileged thread can also make these changes,  as
          long  as  the  effective  user  ID  of the thread making the change matches the real or
          effective user ID of the target thread.

       *  Special rules apply for the SCHED_IDLE policy.  In  Linux  kernels  before  2.6.39,  an
          unprivileged thread operating under this policy cannot change its policy, regardless of
          the value of its RLIMIT_RTPRIO resource limit.   In  Linux  kernels  since  2.6.39,  an
          unprivileged  thread  can switch to either the SCHED_BATCH or the SCHED_OTHER policy so
          long as its nice value falls within the range permitted  by  its  RLIMIT_NICE  resource
          limit (see getrlimit(2)).

       Privileged  (CAP_SYS_NICE)  threads ignore the RLIMIT_RTPRIO limit; as with older kernels,
       they can make arbitrary changes to scheduling policy and priority.  See  getrlimit(2)  for
       further information on RLIMIT_RTPRIO.
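
       The ceiling rule for unprivileged threads can be expressed  as  a  small  computation.
       The  following  sketch  is  an  illustration of the rules listed above, not kernel code;
       the helpers rtprio_ceiling() and my_rtprio_ceiling() are hypothetical, and capping  the
       result at the policy maximum is an assumption made so that an unlimited soft limit
       yields a usable number.

```c
#include <sched.h>
#include <sys/resource.h>

/* Hypothetical helper: highest priority an unprivileged thread may request,
   i.e. the larger of its current priority and the RLIMIT_RTPRIO soft limit,
   capped at policy_max (the maximum supported for the policy). */
int rtprio_ceiling(int current_prio, rlim_t soft_limit, int policy_max)
{
    int ceiling = current_prio;

    if (soft_limit == RLIM_INFINITY || soft_limit > (rlim_t)policy_max)
        ceiling = policy_max;
    else if ((int)soft_limit > ceiling)
        ceiling = (int)soft_limit;
    return ceiling;
}

/* Hypothetical helper: apply the rule to the calling thread for `policy`.
   Returns the ceiling, or -1 with errno set if a query fails. */
int my_rtprio_ceiling(int policy)
{
    struct rlimit rl;
    struct sched_param sp;
    int max = sched_get_priority_max(policy);

    if (max == -1 || getrlimit(RLIMIT_RTPRIO, &rl) == -1 ||
        sched_getparam(0, &sp) == -1)
        return -1;
    return rtprio_ceiling(sp.sched_priority, rl.rlim_cur, max);
}
```

       With a soft limit of 0 and a current priority of 0, the ceiling is 0,  which  matches
       the rule that such a thread cannot enter a real-time policy.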

   Response time
       A blocked high priority thread waiting for I/O has a certain  response  time  before  it
       is scheduled again.  The device driver writer can greatly reduce  this  response  time  by
       using a "slow interrupt" interrupt handler.

   Miscellaneous
       Child  processes  inherit  the  scheduling  policy  and  parameters across a fork(2).  The
       scheduling policy and parameters are preserved across execve(2).

       Memory locking is usually needed for real-time processes to avoid paging delays; this  can
       be done with mlock(2) or mlockall(2).
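
       A minimal sketch of locking memory before entering a time-critical phase follows.   The
       helper  lock_all_memory()  is  hypothetical.  The call may fail, for example when the
       RLIMIT_MEMLOCK resource limit is too low for an unprivileged process.

```c
#include <stdio.h>
#include <sys/mman.h>

/* Hypothetical helper: lock all current and future pages of the process into
   RAM. Returns 0 on success, -1 on failure (after reporting the error). */
int lock_all_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) == -1) {
        perror("mlockall");
        return -1;
    }
    return 0;
}
```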

       Since  a nonblocking infinite loop in a thread scheduled under SCHED_FIFO or SCHED_RR will
       block all threads with lower priority forever, a software  developer  should  always  keep
       available  on the console a shell scheduled under a higher static priority than the tested
       application.  This will allow an emergency kill of tested real-time applications  that  do
       not  block  or  terminate  as  expected.   See  also  the description of the RLIMIT_RTTIME
       resource limit in getrlimit(2).

       POSIX systems on which sched_setscheduler() and sched_getscheduler() are available  define
       _POSIX_PRIORITY_SCHEDULING in <unistd.h>.

RETURN VALUE

       On  success,  sched_setscheduler() returns zero.  On success, sched_getscheduler() returns
       the policy for the thread (a nonnegative integer).  On error, -1 is returned, and errno is
       set appropriately.

ERRORS

       EINVAL The  scheduling  policy  is  not  one of the recognized policies, param is NULL, or
              param does not make sense for the policy.

       EPERM  The calling thread does not have appropriate privileges.

       ESRCH  The thread whose ID is pid could not be found.
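
       A caller might translate these errors into diagnostics along the following lines.   The
       helpers  sched_error_hint()  and  set_policy_or_report()  are hypothetical, and the hint
       texts simply paraphrase the descriptions above.

```c
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: short explanation for an errno value that
   sched_setscheduler() or sched_getscheduler() may set. */
const char *sched_error_hint(int err)
{
    switch (err) {
    case EINVAL: return "unrecognized policy, NULL param, or param invalid for the policy";
    case EPERM:  return "calling thread does not have appropriate privileges";
    case ESRCH:  return "no thread with the given ID was found";
    default:     return strerror(err);
    }
}

/* Hypothetical helper: call sched_setscheduler() and report any failure on
   stderr. Returns 0 on success, otherwise the errno value of the failure. */
int set_policy_or_report(pid_t pid, int policy, const struct sched_param *param)
{
    if (sched_setscheduler(pid, policy, param) == -1) {
        int err = errno;        /* save before stdio can modify errno */
        fprintf(stderr, "sched_setscheduler: %s\n", sched_error_hint(err));
        return err;
    }
    return 0;
}
```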

CONFORMING TO

       POSIX.1-2001 (but see BUGS below).  The SCHED_BATCH and  SCHED_IDLE  policies  are  Linux-
       specific.

NOTES

       POSIX.1  does  not detail the permissions that an unprivileged thread requires in order to
       call sched_setscheduler(), and details vary across systems.  For example,  the  Solaris  7
       manual page says that the real or effective user ID of the caller must match the real user
       ID or the saved set-user-ID of the target.

       The scheduling policy and parameters are in fact  per-thread  attributes  on  Linux.   The
       value returned from a call to gettid(2) can be passed in the argument pid.  Specifying pid
       as 0 will operate on the attribute for the calling thread, and passing the value  returned
       from  a  call to getpid(2) will operate on the attribute for the main thread of the thread
       group.  (If you are using  the  POSIX  threads  API,  then  use  pthread_setschedparam(3),
       pthread_getschedparam(3),  and  pthread_setschedprio(3),  instead of the sched_*(2) system
       calls.)
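
       The  per-thread  nature  of the attributes can be seen by passing a thread ID explicitly.
       In the following sketch, the helpers current_tid() and policy_of_current_thread()  are
       hypothetical;  the  thread  ID  is  obtained  through  syscall(2) on the assumption that
       the C library may not provide a gettid() wrapper.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for the declaration of syscall() */
#endif
#include <sched.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: kernel thread ID of the calling thread. */
pid_t current_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* Hypothetical helper: query the policy of the calling thread by its thread
   ID; this should give the same result as sched_getscheduler(0). */
int policy_of_current_thread(void)
{
    return sched_getscheduler(current_tid());
}
```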

       Originally, Standard Linux was intended as a general-purpose  operating  system  able  to
       handle   background  processes,  interactive  applications,  and  less  demanding real-time
       applications (applications that usually need  to  meet  timing  deadlines).   Although  the
       Linux  kernel  2.6  allowed  for kernel preemption and the newly introduced O(1) scheduler
       ensures that the time needed to schedule is fixed and deterministic  irrespective  of  the
       number  of  active  tasks,  true real-time computing was not possible up to kernel version
       2.6.17.

   Real-time features in the mainline Linux kernel
       From kernel version 2.6.18 onward, however, Linux  is  gradually  becoming  equipped  with
       real-time capabilities, most of which are derived from the former realtime-preempt patches
       developed by Ingo Molnar, Thomas Gleixner, Steven Rostedt, and others.  Until the  patches
       have been completely merged into the mainline kernel (this is expected to be around kernel
       version 2.6.30), they must be installed to achieve the best real-time performance.   These
       patches are named:

           patch-kernelversion-rtpatchversion

       and can be downloaded from ⟨http://www.kernel.org/pub/linux/kernel/projects/rt/⟩.

       Without the patches and prior to their full inclusion into the mainline kernel, the kernel
       configuration   offers   only   the   three   preemption   classes    CONFIG_PREEMPT_NONE,
       CONFIG_PREEMPT_VOLUNTARY,  and CONFIG_PREEMPT_DESKTOP which respectively provide no, some,
       and considerable reduction of the worst-case scheduling latency.

       With the patches applied or after their full  inclusion  into  the  mainline  kernel,  the
       additional  configuration  item CONFIG_PREEMPT_RT becomes available.  If this is selected,
       Linux is transformed  into  a  regular  real-time  operating  system.   The  FIFO  and  RR
       scheduling policies that can be selected using sched_setscheduler() are then used to run a
       thread with true real-time priority and a minimum worst-case scheduling latency.

BUGS

       POSIX says that on success, sched_setscheduler() should  return  the  previous  scheduling
       policy.   Linux sched_setscheduler() does not conform to this requirement, since it always
       returns 0 on success.
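
       A program that needs the previous policy on Linux can query it before making the  change.
       The  following  sketch  shows  this  workaround; the helper set_policy_get_previous() is
       hypothetical, and the two calls are not atomic, so another thread could change the  policy
       in between.

```c
#include <sched.h>
#include <sys/types.h>

/* Hypothetical helper: set a new policy for thread `pid` (0 = calling thread)
   and store the policy that was in effect beforehand in *previous.
   Returns 0 on success, -1 with errno set on failure (*previous is then
   left untouched). The query and the change are two separate system calls. */
int set_policy_get_previous(pid_t pid, int policy,
                            const struct sched_param *param, int *previous)
{
    int old = sched_getscheduler(pid);

    if (old == -1)
        return -1;
    if (sched_setscheduler(pid, policy, param) == -1)
        return -1;
    *previous = old;
    return 0;
}
```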

SEE ALSO

       chrt(1), getpriority(2), mlock(2), mlockall(2), munlock(2), munlockall(2), nice(2),
       sched_get_priority_max(2), sched_get_priority_min(2), sched_getaffinity(2),
       sched_getparam(2), sched_rr_get_interval(2), sched_setaffinity(2), sched_setparam(2),
       sched_yield(2), setpriority(2), capabilities(7), cpuset(7)

       Programming  for  the  real world - POSIX.4 by Bill O. Gallmeister, O'Reilly & Associates,
       Inc., ISBN 1-56592-074-0.

       The Linux kernel source file Documentation/scheduler/sched-rt-group.txt

COLOPHON

       This page is part of release 3.54 of the Linux man-pages project.  A  description  of  the
       project,     and    information    about    reporting    bugs,    can    be    found    at
       http://www.kernel.org/doc/man-pages/.