Ubuntu Manpage: Occupancy -

Provided by: nvidia-cuda-dev_9.1.85-3ubuntu1_amd64

NAME

       Occupancy - occupancy calculation functions of the CUDA runtime API

   Functions
       __cudart_builtin__ cudaError_t cudaOccupancyMaxActiveBlocksPerMultiprocessor (int *numBlocks, const void
           *func, int blockSize, size_t dynamicSMemSize)
           Returns occupancy for a device function.
       __cudart_builtin__ cudaError_t cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags (int *numBlocks,
           const void *func, int blockSize, size_t dynamicSMemSize, unsigned int flags)
           Returns occupancy for a device function with the specified flags.

Detailed Description

       Occupancy calculation functions of the CUDA runtime API (cuda_runtime_api.h).

       This section describes the occupancy calculation functions of the CUDA runtime application programming
       interface.

       Besides the occupancy calculator functions (cudaOccupancyMaxActiveBlocksPerMultiprocessor and
       cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags), there are also C++-only occupancy-based launch
       configuration functions, documented in the C++ API Routines module.

       See cudaOccupancyMaxPotentialBlockSize (C++ API), cudaOccupancyMaxPotentialBlockSizeWithFlags (C++ API),
       cudaOccupancyMaxPotentialBlockSizeVariableSMem (C++ API), and
       cudaOccupancyMaxPotentialBlockSizeVariableSMemWithFlags (C++ API).
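
       As an illustration of the C++-only launch configuration helpers, the following minimal sketch asks
       cudaOccupancyMaxPotentialBlockSize for a suggested block size and derives a grid size from it. The
       kernel addOneKernel and the helpers gridSizeFor and launchWithSuggestedBlockSize are hypothetical names
       introduced for this example only; the snippet assumes it is compiled with nvcc and that deviceData
       points to n ints of device memory.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel, used only to have something to configure.
__global__ void addOneKernel(int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] += 1;
    }
}

// Number of blocks needed to cover n elements with blockSize threads per block.
static int gridSizeFor(int n, int blockSize)
{
    if (n <= 0 || blockSize <= 0) {
        return 0;
    }
    return (n + blockSize - 1) / blockSize;
}

// Launches addOneKernel with the block size suggested by the runtime.
// Returns the first CUDA error encountered, or cudaSuccess.
static cudaError_t launchWithSuggestedBlockSize(int *deviceData, int n)
{
    int minGridSize = 0;  // smallest grid that can reach full occupancy
    int blockSize = 0;    // suggested block size

    cudaError_t err = cudaOccupancyMaxPotentialBlockSize(
        &minGridSize, &blockSize, addOneKernel,
        0 /* dynamicSMemSize */, 0 /* blockSizeLimit: no limit */);
    if (err != cudaSuccess) {
        return err;
    }
    if (blockSize <= 0 || n <= 0) {
        return cudaErrorInvalidValue;
    }

    addOneKernel<<<gridSizeFor(n, blockSize), blockSize>>>(deviceData, n);
    return cudaGetLastError();
}
```

       The suggested block size is a heuristic starting point; whether it is the fastest configuration for a
       given kernel still has to be measured.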

Function Documentation

   __cudart_builtin__ cudaError_t cudaOccupancyMaxActiveBlocksPerMultiprocessor (int * numBlocks, const void *
       func, int blockSize, size_t dynamicSMemSize)
       Returns in *numBlocks the maximum number of active blocks per streaming multiprocessor for the device
       function.

       Parameters:
           numBlocks - Returned occupancy
           func - Kernel function for which occupancy is calculated
           blockSize - Block size the kernel is intended to be launched with
           dynamicSMemSize - Per-block dynamic shared memory usage intended, in bytes

       Returns:
           cudaSuccess, cudaErrorCudartUnloading, cudaErrorInitializationError, cudaErrorInvalidDevice,
           cudaErrorInvalidDeviceFunction, cudaErrorInvalidValue, cudaErrorUnknown

       Note:
           This function may also return error codes from previous, asynchronous launches.

       See also:
           cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags, cudaOccupancyMaxPotentialBlockSize (C++ API),
           cudaOccupancyMaxPotentialBlockSizeWithFlags (C++ API), cudaOccupancyMaxPotentialBlockSizeVariableSMem
           (C++ API), cudaOccupancyMaxPotentialBlockSizeVariableSMemWithFlags (C++ API),
           cuOccupancyMaxActiveBlocksPerMultiprocessor
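
       Example:
       A minimal sketch of how the returned block count might be turned into an occupancy figure. The kernel
       scaleKernel and the helpers occupancyFraction and reportOccupancy are hypothetical names introduced
       here. The ratio is computed in threads against maxThreadsPerMultiProcessor from cudaGetDeviceProperties,
       which matches a warp-based ratio only when blockSize is a multiple of the warp size.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel used only for illustration.
__global__ void scaleKernel(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

// Pure host arithmetic: fraction of an SM's thread capacity in use when
// numBlocks blocks of blockSize threads are resident on it.
static double occupancyFraction(int numBlocks, int blockSize, int maxThreadsPerSM)
{
    if (maxThreadsPerSM <= 0) {
        return 0.0;
    }
    return static_cast<double>(numBlocks) * blockSize / maxThreadsPerSM;
}

// Queries occupancy for scaleKernel on the current device and prints it.
// Returns the first CUDA error encountered, or cudaSuccess.
static cudaError_t reportOccupancy(int blockSize, size_t dynamicSMemSize)
{
    int numBlocks = 0;
    cudaError_t err = cudaOccupancyMaxActiveBlocksPerMultiprocessor(
        &numBlocks, (const void *)scaleKernel, blockSize, dynamicSMemSize);
    if (err != cudaSuccess) {
        return err;
    }

    int device = 0;
    err = cudaGetDevice(&device);
    if (err != cudaSuccess) {
        return err;
    }

    cudaDeviceProp prop;
    err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        return err;
    }

    printf("active blocks per SM: %d, occupancy: %.1f%%\n", numBlocks,
           100.0 * occupancyFraction(numBlocks, blockSize,
                                     prop.maxThreadsPerMultiProcessor));
    return cudaSuccess;
}
```

       A caller would invoke, for instance, reportOccupancy(256, 0) and compare the result with
       cudaSuccess before trusting the printed figure.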

   __cudart_builtin__ cudaError_t cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags (int * numBlocks, const
       void * func, int blockSize, size_t dynamicSMemSize, unsigned int flags)
       Returns in *numBlocks the maximum number of active blocks per streaming multiprocessor for the device
       function.

       The flags parameter controls how special cases are handled. Valid flags include:

       • cudaOccupancyDefault: keeps the default behavior, the same as
         cudaOccupancyMaxActiveBlocksPerMultiprocessor.

       • cudaOccupancyDisableCachingOverride: This flag suppresses the default behavior on platforms where global
         caching affects occupancy. On such platforms, if caching is enabled but per-block SM resource usage
         would result in zero occupancy, the occupancy calculator calculates the occupancy as if caching were
         disabled. Setting this flag makes the occupancy calculator return 0 in such cases. More information
         about this feature can be found in the 'Unified L1/Texture Cache' section of the Maxwell tuning guide.

       Parameters:
           numBlocks - Returned occupancy
           func - Kernel function for which occupancy is calculated
           blockSize - Block size the kernel is intended to be launched with
           dynamicSMemSize - Per-block dynamic shared memory usage intended, in bytes
           flags - Requested behavior for the occupancy calculator

       Returns:
           cudaSuccess, cudaErrorCudartUnloading, cudaErrorInitializationError, cudaErrorInvalidDevice,
           cudaErrorInvalidDeviceFunction, cudaErrorInvalidValue, cudaErrorUnknown

       Note:
           This function may also return error codes from previous, asynchronous launches.

       See also:
           cudaOccupancyMaxActiveBlocksPerMultiprocessor, cudaOccupancyMaxPotentialBlockSize (C++ API),
           cudaOccupancyMaxPotentialBlockSizeWithFlags (C++ API), cudaOccupancyMaxPotentialBlockSizeVariableSMem
           (C++ API), cudaOccupancyMaxPotentialBlockSizeVariableSMemWithFlags (C++ API),
           cuOccupancyMaxActiveBlocksPerMultiprocessorWithFlags
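
       Example:
       A minimal sketch that queries the same kernel with both flags to see whether the caching override
       described above took effect. The kernel copyKernel, the struct OccupancyPair, and the helpers
       cachingOverrideApplied and queryBothFlags are hypothetical names introduced for this example. On
       platforms where global caching does not affect occupancy, both queries are expected to agree.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel used only for illustration.
__global__ void copyKernel(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = in[i];
    }
}

// Results of querying the same configuration with each flag.
struct OccupancyPair {
    int withDefault;           // cudaOccupancyDefault
    int withOverrideDisabled;  // cudaOccupancyDisableCachingOverride
};

// True when the default query only reported a non-zero value because the
// calculator fell back to computing occupancy as if caching were disabled.
static bool cachingOverrideApplied(const OccupancyPair &p)
{
    return p.withDefault > 0 && p.withOverrideDisabled == 0;
}

// Fills *out with both results. Returns the first CUDA error, or cudaSuccess.
static cudaError_t queryBothFlags(int blockSize, size_t dynamicSMemSize,
                                  OccupancyPair *out)
{
    cudaError_t err = cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags(
        &out->withDefault, (const void *)copyKernel, blockSize,
        dynamicSMemSize, cudaOccupancyDefault);
    if (err != cudaSuccess) {
        return err;
    }
    return cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags(
        &out->withOverrideDisabled, (const void *)copyKernel, blockSize,
        dynamicSMemSize, cudaOccupancyDisableCachingOverride);
}
```

       If cachingOverrideApplied returns true for a configuration, the kernel can only run with that
       configuration when caching is disabled, which may be worth knowing before tuning further.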

Author

       Generated automatically by Doxygen from the source code.