Provided by: libnuma-dev_1.0.2-1_i386

NAME

       numa - NUMA policy library

SYNOPSIS

       #include <numa.h>

       cc ... -lnuma

       int numa_available(void);

       int numa_max_node(void);
       int numa_preferred(void);
       long numa_node_size(int node, long *freep);
       long long numa_node_size64(int node, long long *freep);

       nodemask_t numa_all_nodes;
       nodemask_t numa_no_nodes;
       int numa_node_to_cpus(int node, unsigned long *buffer, int bufferlen);

       void nodemask_zero(nodemask_t *mask);
       void nodemask_set(nodemask_t *mask, int node);
       void nodemask_clr(nodemask_t *mask, int node);
       int nodemask_isset(const nodemask_t *mask, int node);
       int nodemask_equal(const nodemask_t *a, const nodemask_t *b);

       void numa_set_interleave_mask(nodemask_t *nodemask);
       nodemask_t numa_get_interleave_mask(void);
       void numa_bind(nodemask_t *nodemask);
       void numa_set_preferred(int node);
       void numa_set_localalloc(void);
       void numa_set_membind(nodemask_t *nodemask);
       nodemask_t numa_get_membind(void);

       void *numa_alloc_interleaved_subset(size_t size, nodemask_t *nodemask);
       void *numa_alloc_interleaved(size_t size);
       void *numa_alloc_onnode(size_t size, int node);
       void *numa_alloc_local(size_t size);
       void *numa_alloc(size_t size);
       void numa_free(void *start, size_t size);

       int numa_run_on_node_mask(nodemask_t *nodemask);
       int numa_run_on_node(int node);
       nodemask_t numa_get_run_node_mask(void);

       void numa_interleave_memory(void *start, size_t size, nodemask_t *nodemask);
       void numa_tonode_memory(void *start, size_t size, int node);
       void numa_tonodemask_memory(void *start, size_t size, nodemask_t *nodemask);
       void numa_setlocal_memory(void *start, size_t size);
       void numa_police_memory(void *start, size_t size);
       int numa_distance(int node1, int node2);
       void numa_set_bind_policy(int strict);
       void numa_set_strict(int strict);
       void numa_error(char *where);
       void numa_warn(int number, char *where, ...);
       extern int numa_exit_on_error;

DESCRIPTION

       The libnuma library offers a simple programming interface to  the  NUMA
       (Non  Uniform Memory Access) policy supported by the Linux kernel. On a
       NUMA architecture some memory areas have different latency or bandwidth
       than others.

       Available  policies  are  page interleaving (i.e., allocate in a round-
       robin fashion from all, or a subset,  of  the  nodes  on  the  system),
       preferred  node  allocation  (i.e., preferably allocate on a particular
       node), local allocation (i.e., allocate on the node on which the thread
       is  currently  executing),  or allocation only on specific nodes (i.e.,
       allocate on some subset of the available nodes).  It is  also  possible
       to bind threads to specific nodes.

       NUMA memory allocation policy is  a  per-thread  attribute,  but  is
       inherited by children.

       For setting a specific policy globally for all memory allocations in  a
       process  and its children it is easiest to start it with the numactl(8)
       utility. For more fine-grained policy inside an application this li-
       brary can be used.
       can be used.

       Any NUMA memory allocation policy only takes effect  when  a  page  is
       actually faulted into the address space of a process by accessing  it.
       The numa_alloc_* functions take care of this automatically.

       A  node  is  defined  as an area where all memory has the same speed as
       seen from a particular CPU. A node can contain  multiple  CPUs.  Caches
       are ignored for this definition.

       This library is only concerned with nodes and  their  memory  and  does
       not deal with  individual  CPUs  inside  these  nodes  (except  for
       numa_node_to_cpus()).

       Before  any  other  calls  in this library can be used numa_available()
       must be called. If it returns -1, all other functions in  this  library
       are undefined.

       numa_max_node()  returns  the  highest  node  number  available  on the
       current system. If a node number or a node mask with a  bit  set  above
       the  value  returned  by this function is passed to a libnuma function,
       the result is undefined.

       numa_node_size() returns the memory size of a  node.  If  the  argument
       freep is not NULL, it is used to return the amount of free memory  on
       the node.  On error it returns -1.  numa_node_size64() works the same
       node.  On error it returns -1.  numa_node_size64() works  the  same  as
       numa_node_size()  except that it returns values as long long instead of
       long.  This is useful on 32-bit architectures with large nodes.

       Some of these functions accept or return a nodemask.   A  nodemask  has
       type nodemask_t.  It is an abstract bitmap type containing a bit set of
       nodes.  The maximum node number depends on the architecture, but is not
       larger  than  numa_max_node().  What happens in libnuma calls when bits
       above numa_max_node() are passed is  undefined.   A  nodemask_t  should
       only   be   manipulated   with   the  nodemask_zero(),  nodemask_clr(),
       nodemask_isset(), and nodemask_set() functions.  nodemask_zero() clears
       a  nodemask_t.   nodemask_isset()  returns  true  if node is set in the
       passed   nodemask.    nodemask_clr()   clears   node    in    nodemask.
       nodemask_set()   sets   node  in  nodemask.   The  predefined  variable
       numa_all_nodes has all available nodes set; numa_no_nodes is the  empty
       set.   nodemask_equal()  returns  non-zero if its two nodeset arguments
       are equal.

       numa_preferred() returns the preferred  node  of  the  current  thread.
       This  is  the  node  on  which  the kernel preferably allocates memory,
       unless some other policy overrides this.

       numa_set_interleave_mask() sets the  memory  interleave  mask  for  the
       current  thread  to  nodemask.   All  new  memory  allocations are page
       interleaved over all nodes in the interleave mask. Interleaving can  be
       turned  off  again  by passing an empty mask (numa_no_nodes).  The page
       interleaving only occurs on the actual page fault that puts a new  page
       into the current address space. It is also only a hint: the kernel will
       fall back to other nodes if no memory is available  on  the  interleave
       target.  This is a low level function, it may be more convenient to use
       the   higher   level   functions   like   numa_alloc_interleaved()   or
       numa_alloc_interleaved_subset().

       numa_get_interleave_mask() returns the current interleave mask.

       numa_bind()  binds  the  current  thread  and its children to the nodes
       specified in nodemask.  They will only run on the CPUs of the specified
       nodes  and only be able to allocate memory from them.  This function is
       equivalent  to  calling  numa_run_on_node_mask(nodemask)  followed   by
       numa_set_membind(nodemask).   If  threads should be bound to individual
       CPUs  inside   nodes   consider   using   numa_node_to_cpus   and   the
       sched_setaffinity(2) syscall.

       numa_set_preferred()  sets the preferred node for the current thread to
       node.  The preferred node is the node on  which  memory  is  preferably
       allocated  before  falling  back to other nodes.  The default is to use
       the node on which the process  is  currently  running  (local  policy).
       Passing a -1 argument is equivalent to numa_set_localalloc().

       numa_set_localalloc()  sets  a  local  memory allocation policy for the
       calling thread.  Memory is preferably allocated on the  node  on  which
       the thread is currently running.

       numa_set_membind()  sets  the  memory allocation mask.  The thread will
       only allocate memory from  the  nodes  set  in  nodemask.   Passing  an
       argument of numa_no_nodes or numa_all_nodes turns off memory binding to
       specific nodes.

       numa_get_membind() returns the mask of  nodes  from  which  memory  can
       currently be allocated.  If the returned mask is equal to numa_no_nodes
       or numa_all_nodes, then all nodes are available for memory  allocation.

       numa_alloc_interleaved()   allocates   size   bytes   of   memory  page
       interleaved on all nodes. This function is relatively slow  and  should
       only  be  used  for  large  areas  consisting  of  multiple  pages. The
       interleaving works at page level and will only show an effect when  the
       area  is  large.   The allocated memory must be freed with numa_free().
       On error, NULL is returned.

       numa_alloc_interleaved_subset() is like numa_alloc_interleaved() except
       that  it  also accepts a mask of the nodes to interleave on.  On error,
       NULL is returned.

       numa_alloc_onnode() allocates memory on a specific node. This  function
       is  relatively  slow  and allocations are rounded up to the system page
       size.  The memory must be freed with numa_free().  On  errors  NULL  is
       returned.

       numa_alloc_local()  allocates  size  bytes of memory on the local node.
       This function is relatively slow and allocations are rounded up to  the
       system  page  size.   The  memory  must  be freed with numa_free().  On
       errors NULL is returned.

       numa_alloc() allocates size bytes  of  memory  with  the  current  NUMA
       policy.   This  function is relatively slow and allocations are rounded
       up to the system page size.  The memory must be freed with numa_free().
       On errors NULL is returned.

       numa_free()  frees size bytes of memory starting at start, allocated by
       the numa_alloc_* functions above.

       numa_run_on_node() runs the  current  thread  and  its  children  on  a
       specific  node.  They will not migrate to CPUs of other nodes until the
       node affinity is reset with  a  new  call  to  numa_run_on_node_mask().
       Passing  -1  permits  the  kernel  to  schedule on all nodes again.  On
       success, 0 is returned; on error -1 is returned, and errno  is  set  to
       indicate the error.

       numa_run_on_node_mask()  runs  the current thread and its children only
       on nodes specified in nodemask.  They will not migrate to CPUs of other
       nodes   until   the   node  affinity  is  reset  with  a  new  call  to
       numa_run_on_node_mask().  Passing numa_all_nodes permits the kernel  to
       schedule on all nodes again.  On success, 0 is returned; on error -1 is
       returned, and errno is set to indicate the error.

       numa_get_run_node_mask() returns the mask of  nodes  that  the  current
       thread is allowed to run on.

       numa_interleave_memory() interleaves size bytes of memory page by page
       from start over the nodes in nodemask.  This is a lower level function
       to interleave memory that has been allocated but not yet faulted  in,
       i.e. allocated with mmap(2) or shmat(2) but not yet accessed  by  the
       current process.  The memory is  page  interleaved  over  all  nodes
       specified in nodemask.  Normally numa_alloc_interleaved()  should  be
       used for private memory instead, but this function is useful to handle
       shared memory areas. To be useful the memory area should be at  least
       several megabytes (or tens of megabytes for hugetlbfs mappings).   If
       the numa_set_strict() flag is true then the  operation  will  cause  a
       numa_error if there were already pages in the mapping that do not fol-
       low the policy.

       numa_tonode_memory() puts memory on a specific  node.  The  constraints
       described for numa_interleave_memory() apply here too.

       numa_tonodemask_memory() puts memory on a specific set  of  nodes.  The
       constraints described for numa_interleave_memory() apply here too.

       numa_setlocal_memory()  locates  memory  on  the  current   node.   The
       constraints described for numa_interleave_memory() apply here too.

       numa_police_memory()  locates  memory with the current NUMA policy. The
       constraints described for numa_interleave_memory() apply here too.

       numa_node_to_cpus() converts a  node  number  to  a  bitmask  of  CPUs.
       bufferlen  is  the size of buffer in bytes.  The user must pass a large
       enough buffer. If the buffer is not large enough errno will be  set  to
       ERANGE and -1 returned. On success 0 is returned.

       numa_set_bind_policy() specifies whether calls that bind memory  to  a
       specific node should use the preferred policy or a strict policy.  The
       preferred policy allows the kernel to allocate memory on  other  nodes
       when there isn't enough free memory on the target  node;  strict  will
       fail the allocation in that case.  Setting the argument to 1 specifies
       strict, 0 preferred.  Note that specifying more than one node  with  a
       non-strict policy may only use the first node in some kernel versions.

       numa_set_strict()  sets  a  flag  that  says  whether  the  functions
       allocating on specific nodes should use a strict policy. Strict  means
       the allocation will fail if the memory cannot be allocated on the tar-
       get node.  The default operation is to fall back to other nodes.  This
       does not apply to the interleave and default policies.

       numa_distance() reports the distance in the  machine  topology  between
       two nodes.  The factors are multiples of 10; a node has  distance  10
       to itself.  It returns 0 when the distance cannot be determined.  Re-
       porting the distance requires a Linux kernel version  of  2.6.10  or
       newer.

       numa_error() is a weak internal libnuma function that can be overridden
       by the user program.  This function is called with a  char  *  argument
       when  a libnuma function fails.  Overriding the weak library definition
       makes it possible to specify a different error handling strategy when a
       libnuma function fails. It does not affect numa_available().

       The numa_error() function defined in libnuma prints an error on stderr
       and terminates the program if numa_exit_on_error is set to a  non-zero
       value.  The default value of numa_exit_on_error is zero.

       numa_warn()  is  a  weak  internal  libnuma  function  that can be also
       overridden by the user program.  It is called to warn the user  when  a
       libnuma   function   encounters   a   non-fatal   error.   The  default
       implementation prints a warning to stderr.

       The first argument is a unique number identifying each  warning.  After
       that  there is a printf(3)-style format string and a variable number of
       arguments.

THREAD SAFETY

       numa_set_bind_policy and numa_exit_on_error  are  process  global.  The
       other calls are thread safe.

       Memory  policy  set  for  memory  areas is shared by all threads of the
       process.  Memory policy is also shared by other processes  mapping  the
       same  memory using shmat(2) or mmap(2) from shmfs/hugetlbfs.  It is not
       shared for disk backed file mappings right now although that may change
       in the future.

COPYRIGHT

       Copyright  2002, 2004, Andi Kleen, SuSE Labs.  libnuma is under the GNU
       Lesser General Public License, v2.1.

SEE ALSO

       get_mempolicy(2), getpagesize(2), mbind(2), mmap(2),  set_mempolicy(2),
       shmat(2), numactl(8), sched_setaffinity(2)