Ubuntu Manpage: Event Management

Provided by: nvidia-cuda-dev_9.1.85-3ubuntu1_amd64

NAME

       Event Management - event management functions of the low-level CUDA driver API (cuda.h)

   Functions
       CUresult cuEventCreate (CUevent *phEvent, unsigned int Flags)
           Creates an event.
       CUresult cuEventDestroy (CUevent hEvent)
           Destroys an event.
       CUresult cuEventElapsedTime (float *pMilliseconds, CUevent hStart, CUevent hEnd)
           Computes the elapsed time between two events.
       CUresult cuEventQuery (CUevent hEvent)
           Queries an event's status.
       CUresult cuEventRecord (CUevent hEvent, CUstream hStream)
           Records an event.
       CUresult cuEventSynchronize (CUevent hEvent)
           Waits for an event to complete.
       CUresult cuStreamBatchMemOp (CUstream stream, unsigned int count, CUstreamBatchMemOpParams *paramArray,
           unsigned int flags)
           Batch operations to synchronize the stream via memory operations.
       CUresult cuStreamWaitValue32 (CUstream stream, CUdeviceptr addr, cuuint32_t value, unsigned int flags)
           Wait on a memory location.
       CUresult cuStreamWaitValue64 (CUstream stream, CUdeviceptr addr, cuuint64_t value, unsigned int flags)
           Wait on a memory location.
       CUresult cuStreamWriteValue32 (CUstream stream, CUdeviceptr addr, cuuint32_t value, unsigned int flags)
           Write a value to memory.
       CUresult cuStreamWriteValue64 (CUstream stream, CUdeviceptr addr, cuuint64_t value, unsigned int flags)
           Write a value to memory.

Detailed Description

       Event management functions of the low-level CUDA driver API (cuda.h).

       This section describes the event management functions of the low-level CUDA driver application
       programming interface.

Function Documentation

   CUresult cuEventCreate (CUevent * phEvent, unsigned int Flags)
       Creates an event *phEvent for the current context with the flags specified via Flags. Valid flags
       include:

       • CU_EVENT_DEFAULT: Default event creation flag.

       • CU_EVENT_BLOCKING_SYNC:  Specifies  that  the  created event should use blocking synchronization. A CPU
         thread that uses cuEventSynchronize() to wait on an event created with this flag will block  until  the
         event has actually been recorded.

       • CU_EVENT_DISABLE_TIMING:  Specifies  that the created event does not need to record timing data. Events
         created with this flag specified and the CU_EVENT_BLOCKING_SYNC flag not  specified  will  provide  the
         best performance when used with cuStreamWaitEvent() and cuEventQuery().

       • CU_EVENT_INTERPROCESS:  Specifies  that  the  created  event  may  be  used as an interprocess event by
         cuIpcGetEventHandle(). CU_EVENT_INTERPROCESS must be specified along with CU_EVENT_DISABLE_TIMING.
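
       The flag rules above can be checked before calling cuEventCreate(). The following is a minimal
       sketch in plain C, not driver code. The EV_* constants are local stand-ins whose values are
       assumed to mirror the CU_EVENT_* flags in cuda.h (real code should include cuda.h and use the
       CU_EVENT_* names), and both helper functions are hypothetical.

```c
#include <stdbool.h>

/* Local stand-ins for the CU_EVENT_* flags. The numeric values are assumed
 * to match cuda.h; real code should include cuda.h instead. */
enum {
    EV_DEFAULT        = 0x0,
    EV_BLOCKING_SYNC  = 0x1,
    EV_DISABLE_TIMING = 0x2,
    EV_INTERPROCESS   = 0x4
};

/* Hypothetical helper: true if the combination obeys the documented rule
 * that the interprocess flag must come with the disable-timing flag. */
bool event_flags_consistent(unsigned int flags)
{
    if ((flags & EV_INTERPROCESS) && !(flags & EV_DISABLE_TIMING))
        return false;
    return true;
}

/* Hypothetical helper: true for the combination described as giving the
 * best performance with cuStreamWaitEvent() and cuEventQuery(): timing
 * disabled and blocking synchronization not requested. */
bool event_flags_lightweight(unsigned int flags)
{
    return (flags & EV_DISABLE_TIMING) && !(flags & EV_BLOCKING_SYNC);
}
```

       For example, event_flags_consistent(EV_INTERPROCESS) is false, while
       event_flags_consistent(EV_INTERPROCESS | EV_DISABLE_TIMING) is true.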

       Parameters:
           phEvent - Returns newly created event
           Flags - Event creation flags

       Returns:
           CUDA_SUCCESS,   CUDA_ERROR_DEINITIALIZED,   CUDA_ERROR_NOT_INITIALIZED,   CUDA_ERROR_INVALID_CONTEXT,
           CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_OUT_OF_MEMORY

       Note:
           Note that this function may also return error codes from previous, asynchronous launches.

       See also:
           cuEventRecord, cuEventQuery, cuEventSynchronize, cuEventDestroy, cuEventElapsedTime, cudaEventCreate,
           cudaEventCreateWithFlags

   CUresult cuEventDestroy (CUevent hEvent)
       Destroys the event specified by hEvent.

       An   event   may   be   destroyed  before  it  is  complete  (i.e.,  while  cuEventQuery()  would  return
       CUDA_ERROR_NOT_READY). In this case, the call does  not  block  on  completion  of  the  event,  and  any
       associated resources will automatically be released asynchronously at completion.

       Parameters:
           hEvent - Event to destroy

       Returns:
           CUDA_SUCCESS,   CUDA_ERROR_DEINITIALIZED,   CUDA_ERROR_NOT_INITIALIZED,   CUDA_ERROR_INVALID_CONTEXT,
           CUDA_ERROR_INVALID_HANDLE

       Note:
           Note that this function may also return error codes from previous, asynchronous launches.

       See also:
           cuEventCreate, cuEventRecord, cuEventQuery, cuEventSynchronize, cuEventElapsedTime, cudaEventDestroy

   CUresult cuEventElapsedTime (float * pMilliseconds, CUevent hStart, CUevent hEnd)
       Computes the elapsed  time  between  two  events  (in  milliseconds  with  a  resolution  of  around  0.5
       microseconds).

       If  either  event was last recorded in a non-NULL stream, the resulting time may be greater than expected
       (even if both used the same stream handle). This happens  because  the  cuEventRecord()  operation  takes
       place asynchronously and there is no guarantee that the measured latency is actually just between the two
       events. Any number of other different stream operations could execute in between the two measured events,
       thus altering the timing in a significant way.

       If  cuEventRecord()  has  not  been called on either event then CUDA_ERROR_INVALID_HANDLE is returned. If
       cuEventRecord() has been called on both events but one or both of them has not yet been  completed  (that
       is, cuEventQuery() would return CUDA_ERROR_NOT_READY on at least one of the events), CUDA_ERROR_NOT_READY
       is  returned.  If either event was created with the CU_EVENT_DISABLE_TIMING flag, then this function will
       return CUDA_ERROR_INVALID_HANDLE.
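
       The error conditions above amount to a checklist that must hold for both events. The following
       plain C sketch restates that checklist. It does not call the driver: struct event_info is
       hypothetical bookkeeping that an application would have to maintain itself, and
       elapsed_time_expected_ok() is an illustrative name.

```c
#include <stdbool.h>

/* Hypothetical per-event bookkeeping kept by the application. */
struct event_info {
    bool recorded;        /* cuEventRecord() has been called on the event  */
    bool completed;       /* cuEventQuery() would return CUDA_SUCCESS      */
    bool timing_disabled; /* created with CU_EVENT_DISABLE_TIMING          */
};

/* True when, by the documented rules, neither CUDA_ERROR_INVALID_HANDLE
 * nor CUDA_ERROR_NOT_READY is expected from cuEventElapsedTime(). */
bool elapsed_time_expected_ok(const struct event_info *start,
                              const struct event_info *end)
{
    const struct event_info *ev[2] = { start, end };
    for (int i = 0; i < 2; i++) {
        if (!ev[i]->recorded || ev[i]->timing_disabled)
            return false; /* documented as CUDA_ERROR_INVALID_HANDLE */
        if (!ev[i]->completed)
            return false; /* documented as CUDA_ERROR_NOT_READY */
    }
    return true;
}
```

       In practice, calling cuEventSynchronize() on hEnd before cuEventElapsedTime() is one way to
       satisfy the completion requirement for an event that has been recorded.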

       Parameters:
           pMilliseconds - Time between hStart and hEnd in ms
           hStart - Starting event
           hEnd - Ending event

       Returns:
           CUDA_SUCCESS,   CUDA_ERROR_DEINITIALIZED,   CUDA_ERROR_NOT_INITIALIZED,   CUDA_ERROR_INVALID_CONTEXT,
           CUDA_ERROR_INVALID_HANDLE, CUDA_ERROR_NOT_READY

       Note:
           Note that this function may also return error codes from previous, asynchronous launches.

       See also:
           cuEventCreate, cuEventRecord, cuEventQuery, cuEventSynchronize, cuEventDestroy, cudaEventElapsedTime

   CUresult cuEventQuery (CUevent hEvent)
       Queries  the  status of all work currently captured by hEvent. See cuEventRecord() for details on what is
       captured by an event.

       Returns CUDA_SUCCESS if all captured work has been completed, or  CUDA_ERROR_NOT_READY  if  any  captured
       work is incomplete.

       For  the  purposes  of  Unified  Memory,  a  return  value of CUDA_SUCCESS is equivalent to having called
       cuEventSynchronize().

       Parameters:
           hEvent - Event to query

       Returns:
           CUDA_SUCCESS,   CUDA_ERROR_DEINITIALIZED,   CUDA_ERROR_NOT_INITIALIZED,    CUDA_ERROR_INVALID_HANDLE,
           CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_NOT_READY

       Note:
           Note that this function may also return error codes from previous, asynchronous launches.

       See also:
           cuEventCreate, cuEventRecord, cuEventSynchronize, cuEventDestroy, cuEventElapsedTime, cudaEventQuery

   CUresult cuEventRecord (CUevent hEvent, CUstream hStream)
       Captures  in hEvent the contents of hStream at the time of this call. hEvent and hStream must be from the
       same context. Calls such  as  cuEventQuery()  or  cuStreamWaitEvent()  will  then  examine  or  wait  for
       completion  of the work that was captured. Uses of hStream after this call do not modify hEvent. See note
       on default stream behavior for what is captured in the default case.

       cuEventRecord() can be called multiple times on the same event and will overwrite the previously captured
       state. Other APIs such as cuStreamWaitEvent() use the most recently captured state at the time of the API
       call, and are not affected by later calls to cuEventRecord(). Before the first call  to  cuEventRecord(),
       an event represents an empty set of work, so for example cuEventQuery() would return CUDA_SUCCESS.
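
       The capture semantics described above can be illustrated with a toy model. This is a sketch in
       plain C under simplifying assumptions (a single in-order stream represented by two counters). It
       is not how the driver is implemented, and every name in it is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy in-order stream: operations complete in the order they were enqueued. */
struct toy_stream {
    unsigned long enqueued;  /* operations submitted so far */
    unsigned long completed; /* operations finished so far  */
};

/* Toy event: remembers how much of a stream was captured at record time. */
struct toy_event {
    const struct toy_stream *stream; /* NULL until the first record */
    unsigned long captured;          /* value of enqueued when recorded */
};

/* Models cuEventRecord(): overwrites any previously captured state. */
void toy_record(struct toy_event *ev, const struct toy_stream *s)
{
    ev->stream = s;
    ev->captured = s->enqueued;
}

/* Models cuEventQuery(): true stands for CUDA_SUCCESS, false for
 * CUDA_ERROR_NOT_READY. */
bool toy_query(const struct toy_event *ev)
{
    if (ev->stream == NULL)
        return true; /* never recorded: empty set of work */
    return ev->stream->completed >= ev->captured;
}
```

       In this model, work enqueued after toy_record() does not affect the result of toy_query(), and a
       second toy_record() replaces the captured state, mirroring the two rules stated above.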

       Parameters:
           hEvent - Event to record
           hStream - Stream to record event for

       Returns:
           CUDA_SUCCESS,   CUDA_ERROR_DEINITIALIZED,   CUDA_ERROR_NOT_INITIALIZED,   CUDA_ERROR_INVALID_CONTEXT,
           CUDA_ERROR_INVALID_HANDLE, CUDA_ERROR_INVALID_VALUE

       Note:
           This function uses standard default stream semantics.

           Note that this function may also return error codes from previous, asynchronous launches.

       See also:
           cuEventCreate,     cuEventQuery,     cuEventSynchronize,      cuStreamWaitEvent,      cuEventDestroy,
           cuEventElapsedTime, cudaEventRecord

   CUresult cuEventSynchronize (CUevent hEvent)
       Waits  until  the completion of all work currently captured in hEvent. See cuEventRecord() for details on
       what is captured by an event.

       Waiting for an event that was created with the CU_EVENT_BLOCKING_SYNC flag will  cause  the  calling  CPU
       thread  to block until the event has been completed by the device. If the CU_EVENT_BLOCKING_SYNC flag has
       not been set, then the CPU thread will busy-wait until the event has been completed by the device.

       Parameters:
           hEvent - Event to wait for

       Returns:
           CUDA_SUCCESS,   CUDA_ERROR_DEINITIALIZED,   CUDA_ERROR_NOT_INITIALIZED,   CUDA_ERROR_INVALID_CONTEXT,
           CUDA_ERROR_INVALID_HANDLE

       Note:
           Note that this function may also return error codes from previous, asynchronous launches.

       See also:
           cuEventCreate, cuEventRecord, cuEventQuery, cuEventDestroy, cuEventElapsedTime, cudaEventSynchronize

   CUresult cuStreamBatchMemOp (CUstream stream, unsigned int count, CUstreamBatchMemOpParams * paramArray,
       unsigned int flags)
       This is a batch version of cuStreamWaitValue32()  and  cuStreamWriteValue32().  Batching  operations  may
       avoid  some  performance overhead in both the API call and the device execution versus adding them to the
       stream in separate API calls. The operations are enqueued in the order they appear in the array.

       See  CUstreamBatchMemOpType  for  the  full  set  of  supported  operations,  and  cuStreamWaitValue32(),
       cuStreamWaitValue64(),   cuStreamWriteValue32(),  and  cuStreamWriteValue64()  for  details  of  specific
       operations.

       Basic     support     for     this     can     be     queried     with     cuDeviceGetAttribute()     and
       CU_DEVICE_ATTRIBUTE_CAN_USE_STREAM_MEM_OPS. See related APIs for details on querying support for specific
       operations.

       Parameters:
           stream - The stream to enqueue the operations in.
           count - The number of operations in the array. Must be less than 256.
           paramArray - The types and parameters of the individual operations.
           flags - Reserved for future expansion; must be 0.
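
       Because count must be less than 256, a longer list of operations has to be split across several
       calls. The arithmetic can be sketched as follows, in plain C with no driver calls. The macro and
       function names are hypothetical, and the limit of 255 operations per call is taken directly from
       the constraint on count.

```c
#include <stddef.h>

/* Largest count accepted per call, from "Must be less than 256". */
#define MAX_OPS_PER_BATCH 255u

/* Number of cuStreamBatchMemOp() calls needed for total_ops operations. */
size_t batch_calls_needed(size_t total_ops)
{
    return (total_ops + MAX_OPS_PER_BATCH - 1) / MAX_OPS_PER_BATCH;
}

/* Size of the batch with 0-based index i when splitting total_ops
 * operations into consecutive batches; 0 if i is past the last batch. */
size_t batch_size(size_t total_ops, size_t i)
{
    size_t start = i * MAX_OPS_PER_BATCH;
    if (start >= total_ops)
        return 0;
    size_t remaining = total_ops - start;
    return remaining < MAX_OPS_PER_BATCH ? remaining : MAX_OPS_PER_BATCH;
}
```

       Since the operations are enqueued in array order, issuing the batches in index order on the same
       stream keeps the original ordering.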

       Returns:
           CUDA_SUCCESS, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_NOT_SUPPORTED

       Note:
           Note that this function may also return error codes from previous, asynchronous launches.

       See also:
           cuStreamWaitValue32,       cuStreamWaitValue64,      cuStreamWriteValue32,      cuStreamWriteValue64,
           cuMemHostRegister

   CUresult cuStreamWaitValue32 (CUstream stream, CUdeviceptr addr, cuuint32_t value, unsigned int flags)
       Enqueues a synchronization of the stream on the given memory location. Work ordered after  the  operation
       will block until the given condition on the memory is satisfied. By default, the condition is to wait for
       (int32_t)(*addr  -  value)  >=  0,  a cyclic greater-or-equal. Other condition types can be specified via
       flags.
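
       The default condition is easiest to see with concrete numbers. The following host-side sketch in
       plain C restates the comparison. It only illustrates the arithmetic and does not call the driver;
       the function name cyclic_geq32 is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Host-side restatement of the default wait condition
 * (int32_t)(*addr - value) >= 0. The subtraction wraps modulo 2^32, and the
 * cast assumes the usual two's-complement conversion. */
bool cyclic_geq32(uint32_t mem, uint32_t value)
{
    return (int32_t)(uint32_t)(mem - value) >= 0;
}
```

       With this definition, cyclic_geq32(6, 5) is true and cyclic_geq32(4, 5) is false. A counter that
       has wrapped past zero still compares as reached: cyclic_geq32(2, 0xFFFFFFFE) is true because the
       wrapped difference is 4, whereas cyclic_geq32(0xFFFFFFFE, 2) is false.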

       If the memory was registered  via  cuMemHostRegister(),  the  device  pointer  should  be  obtained  with
       cuMemHostGetDevicePointer(). This function cannot be used with managed memory (cuMemAllocManaged).

       Support       for       this      can      be      queried      with      cuDeviceGetAttribute()      and
       CU_DEVICE_ATTRIBUTE_CAN_USE_STREAM_MEM_OPS. The only requirement for basic support is that on Windows,  a
       device must be in TCC mode.

       Parameters:
           stream - The stream to synchronize on the memory location.
           addr - The memory location to wait on.
           value - The value to compare with the memory location.
           flags - See CUstreamWaitValue_flags.

       Returns:
           CUDA_SUCCESS, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_NOT_SUPPORTED

       Note:
           Note that this function may also return error codes from previous, asynchronous launches.

       See also:
           cuStreamWaitValue64, cuStreamWriteValue32, cuStreamWriteValue64, cuStreamBatchMemOp,
           cuMemHostRegister, cuStreamWaitEvent

   CUresult cuStreamWaitValue64 (CUstream stream, CUdeviceptr addr, cuuint64_t value, unsigned int flags)
       Enqueues a synchronization of the stream on the given memory location. Work ordered after  the  operation
       will block until the given condition on the memory is satisfied. By default, the condition is to wait for
       (int64_t)(*addr  -  value)  >=  0,  a cyclic greater-or-equal. Other condition types can be specified via
       flags.

       If the memory was registered  via  cuMemHostRegister(),  the  device  pointer  should  be  obtained  with
       cuMemHostGetDevicePointer().

       Support       for       this      can      be      queried      with      cuDeviceGetAttribute()      and
       CU_DEVICE_ATTRIBUTE_CAN_USE_64_BIT_STREAM_MEM_OPS.  The  requirements  are  compute  capability  7.0   or
       greater, and on Windows, that the device be in TCC mode.

       Parameters:
           stream - The stream to synchronize on the memory location.
           addr - The memory location to wait on.
           value - The value to compare with the memory location.
           flags - See CUstreamWaitValue_flags.

       Returns:
           CUDA_SUCCESS, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_NOT_SUPPORTED

       Note:
           Note that this function may also return error codes from previous, asynchronous launches.

       See also:
           cuStreamWaitValue32,       cuStreamWriteValue32,       cuStreamWriteValue64,      cuStreamBatchMemOp,
           cuMemHostRegister, cuStreamWaitEvent

   CUresult cuStreamWriteValue32 (CUstream stream, CUdeviceptr addr, cuuint32_t value, unsigned int flags)
       Write a value to memory. Unless the CU_STREAM_WRITE_VALUE_NO_MEMORY_BARRIER flag is passed, the write  is
       preceded  by  a system-wide memory fence, equivalent to a __threadfence_system() but scoped to the stream
       rather than a CUDA thread.

       If the memory was registered  via  cuMemHostRegister(),  the  device  pointer  should  be  obtained  with
       cuMemHostGetDevicePointer(). This function cannot be used with managed memory (cuMemAllocManaged).

       Support       for       this      can      be      queried      with      cuDeviceGetAttribute()      and
       CU_DEVICE_ATTRIBUTE_CAN_USE_STREAM_MEM_OPS. The only requirement for basic support is that on Windows,  a
       device must be in TCC mode.

       Parameters:
           stream - The stream to do the write in.
           addr - The device address to write to.
           value - The value to write.
           flags - See CUstreamWriteValue_flags.

       Returns:
           CUDA_SUCCESS, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_NOT_SUPPORTED

       Note:
           Note that this function may also return error codes from previous, asynchronous launches.

       See also:
           cuStreamWriteValue64,       cuStreamWaitValue32,       cuStreamWaitValue64,       cuStreamBatchMemOp,
           cuMemHostRegister, cuEventRecord

   CUresult cuStreamWriteValue64 (CUstream stream, CUdeviceptr addr, cuuint64_t value, unsigned int flags)
       Write a value to memory. Unless the CU_STREAM_WRITE_VALUE_NO_MEMORY_BARRIER flag is passed, the write  is
       preceded  by  a system-wide memory fence, equivalent to a __threadfence_system() but scoped to the stream
       rather than a CUDA thread.

       If the memory was registered  via  cuMemHostRegister(),  the  device  pointer  should  be  obtained  with
       cuMemHostGetDevicePointer().

       Support       for       this      can      be      queried      with      cuDeviceGetAttribute()      and
       CU_DEVICE_ATTRIBUTE_CAN_USE_64_BIT_STREAM_MEM_OPS.  The  requirements  are  compute  capability  7.0   or
       greater, and on Windows, that the device be in TCC mode.

       Parameters:
           stream - The stream to do the write in.
           addr - The device address to write to.
           value - The value to write.
           flags - See CUstreamWriteValue_flags.

       Returns:
           CUDA_SUCCESS, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_NOT_SUPPORTED

       Note:
           Note that this function may also return error codes from previous, asynchronous launches.

       See also:
           cuStreamWriteValue32,       cuStreamWaitValue32,       cuStreamWaitValue64,       cuStreamBatchMemOp,
           cuMemHostRegister, cuEventRecord

Author

       Generated automatically by Doxygen from the source code.

Version 6.0                                        3 Nov 2017                                Event Management(3)