Provided by: nvidia-cuda-dev_9.1.85-3ubuntu1_amd64
NAME
Occupancy -

Functions
    __cudart_builtin__ cudaError_t cudaOccupancyMaxActiveBlocksPerMultiprocessor (int *numBlocks, const void *func, int blockSize, size_t dynamicSMemSize)
        Returns occupancy for a device function.
    __cudart_builtin__ cudaError_t cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags (int *numBlocks, const void *func, int blockSize, size_t dynamicSMemSize, unsigned int flags)
        Returns occupancy for a device function with the specified flags.
Detailed Description
Occupancy calculation functions of the CUDA runtime API (cuda_runtime_api.h)

This section describes the occupancy calculation functions of the CUDA runtime application programming interface.

Besides the occupancy calculator functions (cudaOccupancyMaxActiveBlocksPerMultiprocessor and cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags), there are also C++-only occupancy-based launch configuration functions, documented in the C++ API Routines module.

See cudaOccupancyMaxPotentialBlockSize (C++ API), cudaOccupancyMaxPotentialBlockSizeWithFlags (C++ API), cudaOccupancyMaxPotentialBlockSizeVariableSMem (C++ API), cudaOccupancyMaxPotentialBlockSizeVariableSMemWithFlags (C++ API)
Function Documentation
__cudart_builtin__ cudaError_t cudaOccupancyMaxActiveBlocksPerMultiprocessor (int * numBlocks, const void * func, int blockSize, size_t dynamicSMemSize)
    Returns in *numBlocks the maximum number of active blocks per streaming multiprocessor for the device function.

    Parameters:
        numBlocks - Returned occupancy (maximum number of active blocks per multiprocessor)
        func - Kernel function for which occupancy is calculated
        blockSize - Block size the kernel is intended to be launched with
        dynamicSMemSize - Per-block dynamic shared memory usage intended, in bytes

    Returns:
        cudaSuccess, cudaErrorCudartUnloading, cudaErrorInitializationError, cudaErrorInvalidDevice, cudaErrorInvalidDeviceFunction, cudaErrorInvalidValue, cudaErrorUnknown

    Note:
        This function may also return error codes from previous, asynchronous launches.

    See also:
        cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags, cudaOccupancyMaxPotentialBlockSize (C++ API), cudaOccupancyMaxPotentialBlockSizeWithFlags (C++ API), cudaOccupancyMaxPotentialBlockSizeVariableSMem (C++ API), cudaOccupancyMaxPotentialBlockSizeVariableSMemWithFlags (C++ API), cuOccupancyMaxActiveBlocksPerMultiprocessor

__cudart_builtin__ cudaError_t cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags (int * numBlocks, const void * func, int blockSize, size_t dynamicSMemSize, unsigned int flags)
    Returns in *numBlocks the maximum number of active blocks per streaming multiprocessor for the device function.

    The flags parameter controls how special cases are handled. Valid flags include:

    • cudaOccupancyDefault: keeps the default behavior, identical to that of cudaOccupancyMaxActiveBlocksPerMultiprocessor

    • cudaOccupancyDisableCachingOverride: suppresses the default behavior on platforms where global caching affects occupancy. On such platforms, if caching is enabled but per-block SM resource usage would result in zero occupancy, the occupancy calculator by default calculates the occupancy as if caching were disabled. Setting this flag makes the occupancy calculator return 0 in such cases. More information about this feature can be found in the 'Unified L1/Texture Cache' section of the Maxwell tuning guide.

    Parameters:
        numBlocks - Returned occupancy (maximum number of active blocks per multiprocessor)
        func - Kernel function for which occupancy is calculated
        blockSize - Block size the kernel is intended to be launched with
        dynamicSMemSize - Per-block dynamic shared memory usage intended, in bytes
        flags - Requested behavior for the occupancy calculator

    Returns:
        cudaSuccess, cudaErrorCudartUnloading, cudaErrorInitializationError, cudaErrorInvalidDevice, cudaErrorInvalidDeviceFunction, cudaErrorInvalidValue, cudaErrorUnknown

    Note:
        This function may also return error codes from previous, asynchronous launches.

    See also:
        cudaOccupancyMaxActiveBlocksPerMultiprocessor, cudaOccupancyMaxPotentialBlockSize (C++ API), cudaOccupancyMaxPotentialBlockSizeWithFlags (C++ API), cudaOccupancyMaxPotentialBlockSizeVariableSMem (C++ API), cudaOccupancyMaxPotentialBlockSizeVariableSMemWithFlags (C++ API), cuOccupancyMaxActiveBlocksPerMultiprocessorWithFlags
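
Example
    The value returned in *numBlocks is a block count, not a percentage. The sketch below shows one possible way to turn it into a warp-occupancy fraction and, when compiled with nvcc, how the count might be queried. It is an illustration under stated assumptions, not part of the CUDA runtime: occupancyFraction, scaleKernel and queryOccupancy are hypothetical names, and rounding a partial warp up to a whole warp is a choice made here. The section guarded by __CUDACC__ is only compiled by nvcc and relies on the templated C++ overload from cuda_runtime.h that accepts the kernel directly.

```cpp
#include <cstdio>

// Hypothetical helper (not part of the CUDA runtime): converts the block
// count reported by the occupancy calculator into the fraction of warp
// slots on one multiprocessor that would be occupied.
double occupancyFraction(int numBlocks, int blockSize,
                         int warpSize, int maxThreadsPerSM)
{
    // Reject inputs for which the ratio is undefined.
    if (numBlocks < 0 || blockSize <= 0 || warpSize <= 0 ||
        maxThreadsPerSM < warpSize)
        return 0.0;

    // Assumption: a partially filled warp still occupies a full warp slot.
    int warpsPerBlock = (blockSize + warpSize - 1) / warpSize;
    int activeWarps   = numBlocks * warpsPerBlock;
    int maxWarps      = maxThreadsPerSM / warpSize;
    return static_cast<double>(activeWarps) / maxWarps;
}

#ifdef __CUDACC__
#include <cuda_runtime.h>

// Hypothetical example kernel; any __global__ function could be used.
__global__ void scaleKernel(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= factor;
}

// Queries the occupancy calculator for scaleKernel on the current device.
// Returns the occupancy fraction, or a negative value on error.
double queryOccupancy(int blockSize)
{
    int device = 0;
    cudaDeviceProp prop;
    if (cudaGetDevice(&device) != cudaSuccess ||
        cudaGetDeviceProperties(&prop, device) != cudaSuccess)
        return -1.0;

    int numBlocks = 0;  // receives active blocks per multiprocessor
    cudaError_t err = cudaOccupancyMaxActiveBlocksPerMultiprocessor(
        &numBlocks, scaleKernel, blockSize, 0 /* no dynamic shared memory */);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "occupancy query failed: %s\n",
                     cudaGetErrorString(err));
        return -1.0;
    }
    return occupancyFraction(numBlocks, blockSize, prop.warpSize,
                             prop.maxThreadsPerMultiProcessor);
}
#endif
```

    For instance, 8 resident blocks of 256 threads on a device with a warp size of 32 and 2048 threads per multiprocessor would fill 64 of 64 warp slots, so occupancyFraction(8, 256, 32, 2048) evaluates to 1.0.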
Author
Generated automatically by Doxygen from the source code.