plucky (1) hwloc-distrib.1.gz

Provided by: hwloc-nox_2.12.0-1_amd64 bug

NAME

       hwloc-distrib - Build a number of cpu masks distributed on the system

SYNOPSIS

       hwloc-distrib [options] <integer>

OPTIONS

       --single
              Singlify each output to a single CPU.

       --cpuset-output-format <hwloc|list|taskset> --cof <hwloc|list|taskset>
              Change  the  format  of displayed CPU set strings.  By default, the hwloc-specific format is used.
              If list is given, the output is a comma-separated of numbers or ranges, e.g. 2,4-5,8 .  If taskset
              is  given,  the  output  is  compatible  with  the  taskset program (replaces the former --taskset
              option).

       -v --verbose
              Verbose messages.

       -i <path>, --input <path>
              Read the topology from <path> instead of discovering the topology of the local machine.

              If <path> is a file, it may be a XML file exported by a previous hwloc program.  If <path> is "-",
              the standard input may be used as a XML file.

              On  Linux,  <path>  may be a directory containing the topology files gathered from another machine
              topology with hwloc-gather-topology.

              On x86, <path> may be a directory containing a cpuid dump gathered with hwloc-gather-cpuid.

              When the archivemount program is available, <path> may also be a tarball containing such Linux  or
              x86 topology files.

       -i <specification>, --input <specification>
              Simulate  a  fake  hierarchy  (instead  of  discovering  the  topology  on  the local machine). If
              <specification> is "node:2 pu:3", the topology will contain two NUMA nodes with 3 processing units
              in each of them.  The <specification> string must end with a number of PUs.

       --if <format>, --input-format <format>
              Enforce the input in the given format, among xml, fsroot, cpuid and synthetic.

       --ignore <type>
              Ignore all objects of type <type> in the topology.

       --from <type>
              Distribute  starting  from  objects  of  the  given  type  instead of from the top of the topology
              hierarchy, i.e. ignoring the structure given by objects above.

              <type> cannot be among NUMANode, I/O or Misc types.

       --to <type>
              Distribute down to objects of the given type instead  of  down  to  the  bottom  of  the  topology
              hierarchy,  i.e.  ignoring  the  structure  given  by  objects  below.  This may be useful if some
              latitude is desired for the binding, e.g. just bind several  processes  to  each  package  without
              specifying a single core for each of them.

              <type> cannot be among NUMANode, I/O or Misc types.

       --at <type>
              Distribute among objects of the given type.  This is equivalent to specifying both --from and --to
              at the same time.

       --reverse
              Distribute by starting with the last objects first, and singlify CPU sets by keeping the last  bit
              (instead of the first bit).

       --restrict <cpuset>
              Restrict  the  topology  to  the  given  cpuset.   This  removes some PUs and their now-child-less
              parents.

              Beware that restricting the PUs in a topology may change the  logical  indexes  of  many  objects,
              including NUMA nodes.

       --restrict nodeset=<nodeset>
              Restrict   the  topology  to  the  given  nodeset  (unless  --restrict-flags  specifies  something
              different).  This removes some NUMA nodes and their now-child-less parents.

              Beware that restricting the NUMA nodes in a topology  may  change  the  logical  indexes  of  many
              objects, including PUs.

       --restrict-flags <flags>
              Enforce  flags when restricting the topology.  Flags may be given as numeric values or as a comma-
              separated list of flag names that are passed to hwloc_topology_restrict().   Those  names  may  be
              substrings  of  actual flag names as long as a single one matches, for instance bynodeset,memless.
              The default is 0 (or none).

       --disallowed
              Include objects disallowed by administrative limitations.

       --version
              Report version and exit.

       -h --help
              Display help message and exit.

DESCRIPTION

       hwloc-distrib generates a series of CPU masks corresponding to  a  distribution  of  a  given  number  of
       elements  over  the  topology  of  the  machine. The distribution is done recursively from the top of the
       hierarchy (or from the level specified by option --from) down to the bottom of the hierarchy (or down  to
       the  level specified by option --to, or until only one element remains), splitting the number of elements
       at each encountered hierarchy level not ignored by options --ignore.

       This can e.g. be used to distribute a set of processes hierarchically according  to  the  topology  of  a
       machine. These masks can be used with hwloc-bind(1).

       On hybrid CPUs (or asymmetric platforms), distribution may be suboptimal since the number of cores or PUs
       inside packages or below caches may vary (the top-down recursive partitioning ignores these numbers until
       reaching  their  levels).  Hence it is recommended to distribute only inside a single homogeneous domain.
       For instance on a CPU with energy-efficient E-cores and high-performance P-cores, one  should  distribute
       separately  N  tasks on E-cores and M tasks on P-cores instead of trying to distribute directly M+N tasks
       on the entire CPUs.

       NOTE: It is highly recommended that you read the hwloc(7) overview page before  reading  this  man  page.
       Most of the concepts described in hwloc(7) directly apply to the hwloc-bind utility.

EXAMPLES

       hwloc-distrib's operation is best described through several examples.

       If 4 processes have to be distributed across a machine, their CPU masks may be obtained with:

           $ hwloc-distrib 4
           0x0000000f
           0x00000f00
           0x000000f0
           0x0000f000

       To distribute only among the second package, the topology should be restricted:

           $ hwloc-distrib --restrict $(hwloc-calc package:1) 4
           0x00000010
           0x00000020
           0x00000040
           0x00000080

       To get a single processor of each CPU masks (prevent migration in case of binding)

           $ hwloc-distrib 4 --single
           0x00000001
           0x00000100
           0x00000010
           0x00001000

       Each output line may be converted independently with hwloc-calc:

           $ hwloc-distrib 4 --single | hwloc-calc --oo -q -I pu
           PU:0
           PU:8
           PU:4
           PU:12

       To  convert  the output into a list of processors that may be passed to dplace -c inside a mpirun command
       line:

           $ hwloc-distrib 4 --single | xargs hwloc-calc -I pu
           0,8,4,16

RETURN VALUE

       Upon successful execution, hwloc-distrib displays one or more CPU mask strings.  The return value is 0.

       hwloc-distrib will return nonzero if any kind of error occurs, such as (but not limited  to)  failure  to
       parse the command line.

SEE ALSO

       hwloc(7)