Provided by: nvidia-compute-utils-550-server_550.54.15-0ubuntu0.23.10.3_amd64

NAME

       nvidia-cuda-mps-control - NVIDIA CUDA Multi Process Service management program

SYNOPSIS

       nvidia-cuda-mps-control [-d | -f]

DESCRIPTION

       MPS is a runtime service designed to let multiple MPI processes using CUDA run
       concurrently in a way that is transparent to the MPI program. A CUDA program runs in MPS
       mode if the MPS control daemon is running on the system.

       When CUDA is first initialized in a program, the CUDA driver attempts to connect to the
       MPS control daemon. If the connection attempt fails, the program continues to run as it
       normally would without MPS. If, however, the connection attempt succeeds, the CUDA
       driver asks the daemon to start an MPS server on its behalf. If an MPS server is already
       running and its user ID (UID) matches that of the requesting client process, the control
       daemon simply notifies the client process, which then connects to the server. If no MPS
       server is running on the system, the control daemon launches an MPS server with the same
       UID as the requesting client process. If an MPS server is already running but with a
       different UID than that of the client process, the control daemon asks the existing
       server to shut down as soon as all its clients are done. Once the existing server has
       terminated, the control daemon launches a new server with the same UID as the queued
       client process.
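
       The three cases above can be sketched as a small decision function. This is only an
       illustrative model of the behavior described in this page, not driver code: the Server
       record, the decide function, and the returned action names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Server:
    """Hypothetical record of a running MPS server (not a driver type)."""
    pid: int
    uid: int


def decide(server: Optional[Server], client_uid: int) -> str:
    """Return the action the control daemon takes for a connecting client.

    The action names are made up for illustration.
    """
    if server is None:
        # No server is running: launch one under the client's UID.
        return "launch_server"
    if server.uid == client_uid:
        # UIDs match: notify the client so it connects to the existing server.
        return "notify_client"
    # UID mismatch: ask the existing server to shut down once its clients
    # are done, keep the client queued, then launch a server with its UID.
    return "drain_then_launch"
```

       For example, decide(Server(pid=4242, uid=1000), 1001) yields "drain_then_launch",
       because the running server belongs to a different user.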

       The MPS server creates the shared GPU context and manages its clients. An MPS server can
       support a finite number of CUDA contexts, determined by the hardware architecture it is
       running on. For compute capability SM 3.5 through SM 6.0 the limit is 16 clients per GPU
       at a time. Compute capability SM 7.0 has a limit of 48. MPS is transparent to CUDA
       programs, with all the complexity of communication between the client process, the server
       and the control daemon hidden within the driver binaries.
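
       The client limits quoted above can be captured in a lookup helper. This is a minimal
       sketch: the function name is hypothetical, and it deliberately returns None for any
       compute capability this page does not state a limit for.

```python
from typing import Optional, Tuple


def max_clients_per_gpu(compute_capability: Tuple[int, int]) -> Optional[int]:
    """Return the per-GPU MPS client limit stated in this page, if any.

    compute_capability is a (major, minor) pair, e.g. (7, 0) for SM 7.0.
    """
    if (3, 5) <= compute_capability <= (6, 0):
        # SM 3.5 through SM 6.0: 16 clients per GPU at a time.
        return 16
    if compute_capability == (7, 0):
        # SM 7.0: 48 clients per GPU.
        return 48
    # No limit is stated here for other compute capabilities.
    return None
```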

       Currently, CUDA MPS is available on 64-bit Linux only and requires a device that supports
       Unified Virtual Addressing (UVA) and has compute capability SM 3.5 or higher. Applications
       requiring pre-CUDA 4.0 APIs are not supported under CUDA MPS. Certain capabilities are
       only available starting with compute capability SM 7.0.

       Refer to the MPS documentation on NVIDIA Docs for more details.

OPTIONS

   -d
       Start the MPS control daemon in background mode, assuming the user has enough privilege
       (e.g. root). The parent process exits once the control daemon has started listening for
       client connections.

   -f
       Start  the  MPS  control daemon in foreground mode, assuming the user has enough privilege
       (e.g. root). The debug messages are sent to standard output.

   -h, --help
       Print a help message.

   <no arguments>
       Start the front-end management user interface to the MPS control daemon, which needs to be
       started first. The front-end UI keeps reading commands from stdin until EOF.  Commands are
       separated by the newline character. If an invalid command is issued and rejected, an error
       message  will  be  printed  to  stdout.  The  exit  status  of the front-end UI is zero if
       communication with the daemon is successful. A non-zero value is returned if the daemon is
       not found or connection to the daemon is broken unexpectedly. See the "quit" command below
       for more information about the exit status.

       Commands supported by the MPS control daemon:

       get_server_list
              Print out a list of PIDs of all MPS servers.

       get_server_status PID
              Print out the status of the server with the given PID.

       start_server -uid UID
              Start a new MPS server for the specified user (UID).

       shutdown_server PID [-f]
              Shut down the MPS server with the given PID. The MPS server exits only after all
              clients disconnect, and it may accept new clients while a client is still
              connected. The -f option forces an immediate shutdown. If a client launches a
              faulty kernel that runs forever, a forced shutdown of the MPS server may be
              required, since the MPS server creates and issues GPU work on behalf of its
              clients.

       get_client_list PID
              Print out a list of PIDs of all clients connected to the MPS server with given PID.

       quit [-t TIMEOUT]
              Shut down the MPS control daemon process and all MPS servers. The MPS control
              daemon stops accepting new clients while waiting for current MPS servers and MPS
              clients to finish. If TIMEOUT is specified (in seconds), the daemon will force MPS
              servers to shut down if they are still running after TIMEOUT seconds.

              This command is synchronous. The front-end UI waits for the daemon to shut down,
              then returns the daemon's exit status. The exit status is zero if and only if all
              MPS servers have exited gracefully.
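
       Because the front-end UI reads newline-separated commands from stdin until EOF, it can
       be driven from a script. The following is a minimal sketch, assuming the control daemon
       is already running; the helper name run_mps_commands and its binary parameter are
       hypothetical, and the output format of each command is not assumed.

```python
import subprocess
from typing import Iterable, Tuple


def run_mps_commands(commands: Iterable[str],
                     binary: str = "nvidia-cuda-mps-control") -> Tuple[int, str]:
    """Pipe commands to the front-end UI and return (exit status, stdout).

    Each command is terminated by a newline; closing stdin signals EOF,
    which ends the front-end session.
    """
    payload = "".join(f"{command}\n" for command in commands)
    result = subprocess.run([binary], input=payload, text=True,
                            capture_output=True, check=False)
    # A non-zero status means the daemon was not found or the connection
    # to it broke unexpectedly.
    return result.returncode, result.stdout
```

       A call such as run_mps_commands(["get_server_list"]) lists the server PIDs, while
       run_mps_commands(["quit -t 10"]) requests a shutdown that forces any servers still
       running after 10 seconds to stop.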

       Commands available to the Volta MPS control daemon:

       get_device_client_list [PID]
              List the devices and the PIDs of the client applications that enumerated each
              device. The server instance PID is optional.

       set_default_active_thread_percentage percentage
              Set the default active thread percentage for MPS servers. If  there  is  already  a
              server  spawned,  this  command  will only affect the next server. The set value is
              lost if a quit command is executed. The default is 100.

       get_default_active_thread_percentage
              Query the current default available thread percentage.

       set_active_thread_percentage PID percentage
              Set the active thread percentage for the MPS server instance of the given PID.  All
              clients  created  with  that server afterwards will observe the new limit. Existing
              clients are not affected.

       get_active_thread_percentage PID
              Query the current available thread percentage of the MPS  server  instance  of  the
              given PID.

       set_default_device_pinned_mem_limit dev value
              Set the default device pinned memory limit for the MPS servers. If there is
              already a server spawned, this command will only affect the next server. The value
              must be in the form of an integer followed by a qualifier, either “G” or “M”, that
              specifies the value in Gigabytes or Megabytes respectively.

       get_default_device_pinned_mem_limit dev
              Query the current default pinned memory limit for the device.

       set_device_pinned_mem_limit PID dev value
              Overrides the device pinned memory limit for the MPS server of the given  PID.  All
              the  clients  created  with  that  server  afterwards  will  observe the new limit.
              Existing clients are not affected.

       get_device_pinned_mem_limit PID dev
              Query the current device pinned memory limit of the  MPS  server  instance  of  the
              given PID for the device dev.

       terminate_client <server PID> <client PID>
              Terminate all the outstanding GPU work of the MPS client process denoted by
              <client PID> running on the MPS server denoted by <server PID>.

       ps [-p PID]
              Reports a snapshot of the current client processes.

       set_default_client_priority priority
              Set the client priority level that will be used for new  clients.  Priority  values
              are only considered as hints to the CUDA driver and can be ignored or overridden
              depending on platform.  priority follows a convention  where  smaller  numbers  are
              higher  priorities,  and  the default priority value is 0. The only other supported
              value for priority is 1, which represents a below-normal priority level.

       get_default_client_priority
              Query the current priority value that will be used for new clients.
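
       The value format required by the pinned memory limit commands (an integer followed by
       “G” or “M”) can be checked before a command is sent. This is a minimal sketch: both
       function names are hypothetical, and treating dev as an integer device index is an
       assumption not stated in this page.

```python
import re
from typing import Tuple

# An integer followed by exactly one qualifier: G (Gigabytes) or M (Megabytes).
_LIMIT_PATTERN = re.compile(r"([0-9]+)([GM])")


def parse_pinned_mem_limit(value: str) -> Tuple[int, str]:
    """Split a limit such as "2G" into (2, "G"); raise ValueError otherwise."""
    match = _LIMIT_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"invalid pinned memory limit: {value!r}")
    return int(match.group(1)), match.group(2)


def default_mem_limit_command(dev: int, value: str) -> str:
    """Build a set_default_device_pinned_mem_limit command line.

    Assumption: dev is an integer device index.
    """
    parse_pinned_mem_limit(value)  # validate before building the command
    return f"set_default_device_pinned_mem_limit {dev} {value}"
```

       Validating locally avoids a round trip to the daemon for values such as "1.5G" or
       "2GB", which do not match the documented form.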

ENVIRONMENT

       CUDA_MPS_PIPE_DIRECTORY
              Specify the directory that contains the named pipes and UNIX  domain  sockets  used
              for  communication among the MPS control, MPS server, and MPS clients. The value of
              this environment variable should be consistent in the MPS control  daemon  and  all
              MPS client processes. The default directory is /tmp/nvidia-mps.

       CUDA_MPS_LOG_DIRECTORY
              Specify the directory that contains the MPS log files. This variable is used by the
              MPS control daemon only. The default directory is /var/log/nvidia-mps.

       CUDA_VISIBLE_DEVICES
              Specify which CUDA devices are visible to the control daemon. Takes either an index
              value or UUID.

       CUDA_DEVICE_MAX_CONNECTIONS
              Specify the preferred number of connections between the host and device.

       CUDA_MPS_ACTIVE_THREAD_PERCENTAGE
              Specify  the  portion of available threads that clients can use on Volta+ hardware.
              Setting this for the control daemon will set the default active  thread  percentage
              for  all  servers  spawned,  while setting this for a client or client context will
              constrain the active thread percentage for that unit and cannot be higher than  the
              active thread percentage value set for the control daemon.

       CUDA_MPS_ENABLE_PER_CTX_DEVICE_MULTIPROCESSOR_PARTITIONING
              Specify whether individual client contexts are allowed to have different values for
              CUDA_MPS_ACTIVE_THREAD_PERCENTAGE.

       CUDA_MPS_PINNED_DEVICE_MEM_LIMIT
              Specify the amount of GPU memory available to MPS client processes.

       CUDA_MPS_CLIENT_PRIORITY
              Specify the default client priority value at initialization.
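
       The variables above are typically set in the environment of the control daemon before
       it is started. The following is a minimal sketch of preparing such an environment; the
       helper name mps_daemon_env is hypothetical, and the accepted percentage range (greater
       than 0, at most 100) is an assumption based on the stated default of 100.

```python
from typing import Dict, Mapping, Optional


def mps_daemon_env(base: Mapping[str, str],
                   pipe_dir: Optional[str] = None,
                   log_dir: Optional[str] = None,
                   active_thread_percentage: Optional[int] = None) -> Dict[str, str]:
    """Return a copy of base with the requested MPS variables set."""
    env = dict(base)
    if pipe_dir is not None:
        # Must match the value seen by every MPS client process.
        env["CUDA_MPS_PIPE_DIRECTORY"] = pipe_dir
    if log_dir is not None:
        # Used by the MPS control daemon only.
        env["CUDA_MPS_LOG_DIRECTORY"] = log_dir
    if active_thread_percentage is not None:
        # Assumed valid range: 0 < percentage <= 100.
        if not 0 < active_thread_percentage <= 100:
            raise ValueError("active thread percentage must be in (0, 100]")
        env["CUDA_MPS_ACTIVE_THREAD_PERCENTAGE"] = str(active_thread_percentage)
    return env
```

       The resulting dictionary can be passed as the env argument of subprocess.run when
       launching nvidia-cuda-mps-control -d. Clients must then be started with the same
       CUDA_MPS_PIPE_DIRECTORY value.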

FILES

       Log files created by the MPS control daemon in the directory specified by
       CUDA_MPS_LOG_DIRECTORY:

       control.log
              Record startup and shutdown of MPS control daemon, user commands issued with  their
              results, and status of MPS servers.

       server.log
              Record startup and shutdown of MPS servers, and status of MPS clients.
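
       The location of both log files follows from CUDA_MPS_LOG_DIRECTORY and its default. A
       minimal sketch of resolving the paths is shown here; the helper name mps_log_paths is
       hypothetical.

```python
import os
from typing import Dict, Mapping


def mps_log_paths(env: Mapping[str, str]) -> Dict[str, str]:
    """Map each MPS log file name to its full path for the given environment."""
    # Fall back to the documented default when the variable is unset.
    log_dir = env.get("CUDA_MPS_LOG_DIRECTORY", "/var/log/nvidia-mps")
    return {name: os.path.join(log_dir, name)
            for name in ("control.log", "server.log")}
```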