Low-Level CUDA Support
Device management
cupy.cuda.Device |
Object that represents a CUDA device. |
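As a minimal sketch of how `cupy.cuda.Device` is typically used, the following selects each visible device in turn with a `with` block and allocates a small array on it. The helper names `gpu_available` and `arrays_per_device` are hypothetical and exist only for this illustration; the code does nothing when CuPy or a CUDA device is not present.

```python
try:
    import cupy as cp
except ImportError:  # CuPy is not installed; the sketch becomes a no-op
    cp = None


def gpu_available():
    """Return True only if CuPy imported and at least one CUDA device is visible."""
    if cp is None:
        return False
    try:
        return cp.cuda.runtime.getDeviceCount() > 0
    except Exception:  # e.g. missing driver or no usable device
        return False


def arrays_per_device():
    """Allocate a small array on each visible device (hypothetical helper).

    Returns a list of (selected device ID, ID of the device holding the array).
    """
    pairs = []
    if not gpu_available():
        return pairs
    for dev_id in range(cp.cuda.runtime.getDeviceCount()):
        # Inside the block, dev_id is the current device for this thread.
        with cp.cuda.Device(dev_id) as dev:
            x = cp.arange(4)
            pairs.append((dev.id, x.device.id))
    return pairs


print(arrays_per_device())
```

Arrays created inside the `with` block are placed on the selected device, so the two IDs in each pair should match.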
Memory management
cupy.get_default_memory_pool |
Returns CuPy default memory pool for GPU memory. |
cupy.get_default_pinned_memory_pool |
Returns CuPy default memory pool for pinned memory. |
cupy.cuda.Memory |
Memory allocation on a CUDA device. |
cupy.cuda.UnownedMemory |
CUDA memory that is not owned by CuPy. |
cupy.cuda.PinnedMemory |
Pinned memory allocation on host. |
cupy.cuda.MemoryPointer |
Pointer to a location in device memory. |
cupy.cuda.PinnedMemoryPointer |
Pointer to pinned memory. |
cupy.cuda.alloc |
Calls the current allocator. |
cupy.cuda.alloc_pinned_memory |
Calls the current allocator for pinned memory. |
cupy.cuda.get_allocator |
Returns the current allocator for GPU memory. |
cupy.cuda.set_allocator |
Sets the current allocator for GPU memory. |
cupy.cuda.using_allocator |
Sets a thread-local allocator for GPU memory inside a context manager. |
cupy.cuda.set_pinned_memory_allocator |
Sets the current allocator for the pinned memory. |
cupy.cuda.MemoryPool |
Memory pool for all GPU devices on the host. |
cupy.cuda.PinnedMemoryPool |
Memory pool for pinned memory on the host. |
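The pool and allocator entries above can be combined. The sketch below routes allocations through a private `cupy.cuda.MemoryPool` by means of `cupy.cuda.using_allocator`, then reads the pool counters. It assumes a CuPy version that provides `using_allocator`; `gpu_available` and `private_pool_usage` are hypothetical helper names, and the code returns `None` when no GPU is usable.

```python
try:
    import cupy as cp
except ImportError:  # CuPy is not installed; the sketch becomes a no-op
    cp = None


def gpu_available():
    """Return True only if CuPy imported and at least one CUDA device is visible."""
    if cp is None:
        return False
    try:
        return cp.cuda.runtime.getDeviceCount() > 0
    except Exception:  # e.g. missing driver or no usable device
        return False


def private_pool_usage(n_floats=1024):
    """Allocate through a private pool and report its byte counters."""
    if not gpu_available():
        return None
    pool = cp.cuda.MemoryPool()
    # Only allocations made inside this block (on this thread) use `pool`.
    with cp.cuda.using_allocator(pool.malloc):
        x = cp.zeros(n_floats, dtype=cp.float32)
        used_while_alive = pool.used_bytes()
    del x  # the block returns to the pool, not to the driver
    used_after_del = pool.used_bytes()
    pool.free_all_blocks()  # release cached blocks back to the driver
    return {
        "requested": n_floats * 4,
        "used_while_alive": used_while_alive,
        "used_after_del": used_after_del,
        "total_after_free": pool.total_bytes(),
    }


print(private_pool_usage())
```

The pool may round requests up, so `used_while_alive` can be larger than the requested size. The default pool returned by `cupy.get_default_memory_pool` exposes the same counters.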
Memory hook
cupy.cuda.MemoryHook |
Base class of hooks for Memory allocations. |
cupy.cuda.memory_hooks.DebugPrintHook |
Memory hook that prints debug information. |
cupy.cuda.memory_hooks.LineProfileHook |
Line-by-line memory profiler for CuPy. |
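Memory hooks are used as context managers. As a rough sketch, `LineProfileHook` can wrap a few allocations and then write its report to a text buffer. The helper names `gpu_available` and `profile_allocations` are hypothetical, and the function returns `None` when no GPU is usable.

```python
import io

try:
    import cupy as cp
except ImportError:  # CuPy is not installed; the sketch becomes a no-op
    cp = None


def gpu_available():
    """Return True only if CuPy imported and at least one CUDA device is visible."""
    if cp is None:
        return False
    try:
        return cp.cuda.runtime.getDeviceCount() > 0
    except Exception:  # e.g. missing driver or no usable device
        return False


def profile_allocations():
    """Record allocations made inside the hook and return the report text."""
    if not gpu_available():
        return None
    from cupy.cuda import memory_hooks

    hook = memory_hooks.LineProfileHook()
    with hook:
        # Allocations in this block are attributed to the lines that made them.
        a = cp.zeros((256, 256), dtype=cp.float32)
        b = a + 1
    buf = io.StringIO()
    hook.print_report(file=buf)  # defaults to stdout when `file` is omitted
    return buf.getvalue()


print(profile_allocations())
```

`DebugPrintHook` can be swapped in the same way when a raw log of each allocation is more useful than a per-line summary.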
Streams and events
cupy.cuda.Stream |
CUDA stream. |
cupy.cuda.get_current_stream |
Gets current CUDA stream. |
cupy.cuda.Event |
CUDA event, a synchronization point of CUDA streams. |
cupy.cuda.get_elapsed_time |
Gets the elapsed time between two events. |
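Streams and events are commonly paired to time GPU work. The sketch below records two events on a non-blocking stream and measures the interval with `cupy.cuda.get_elapsed_time`, which reports milliseconds. `gpu_available` and `time_on_stream` are hypothetical helper names; the function returns `None` when no GPU is usable.

```python
try:
    import cupy as cp
except ImportError:  # CuPy is not installed; the sketch becomes a no-op
    cp = None


def gpu_available():
    """Return True only if CuPy imported and at least one CUDA device is visible."""
    if cp is None:
        return False
    try:
        return cp.cuda.runtime.getDeviceCount() > 0
    except Exception:  # e.g. missing driver or no usable device
        return False


def time_on_stream(n=1 << 20):
    """Time a small computation on a dedicated stream, in milliseconds."""
    if not gpu_available():
        return None
    stream = cp.cuda.Stream(non_blocking=True)
    start = cp.cuda.Event()
    stop = cp.cuda.Event()
    with stream:  # work launched in this block is queued on `stream`
        start.record(stream)
        x = cp.arange(n, dtype=cp.float32)
        y = cp.sqrt(x)
        stop.record(stream)
    stop.synchronize()  # wait until the second event has been reached
    return cp.cuda.get_elapsed_time(start, stop)


print(time_on_stream())
```

The events are recorded on the stream explicitly so the measurement does not depend on which stream is current by default.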
Texture memory
cupy.cuda.texture.ChannelFormatDescriptor |
A class that holds the channel format description. |
cupy.cuda.texture.CUDAarray |
Allocate a CUDA array (cudaArray_t) that can be used as texture memory. |
cupy.cuda.texture.ResourceDescriptor |
A class that holds the resource description. |
cupy.cuda.texture.TextureDescriptor |
A class that holds the texture description. |
cupy.cuda.texture.TextureObject |
A class that holds a texture object. |
cupy.cuda.texture.TextureReference |
A class that holds a texture reference. |
Profiler
cupy.cuda.profile |
Enables CUDA profiling within a with statement. |
cupy.cuda.profiler.initialize |
Initialize the CUDA profiler. |
cupy.cuda.profiler.start |
Enable profiling. |
cupy.cuda.profiler.stop |
Disable profiling. |
cupy.cuda.nvtx.Mark |
Marks an instantaneous event (marker) in the application. |
cupy.cuda.nvtx.MarkC |
Marks an instantaneous event (marker) in the application. |
cupy.cuda.nvtx.RangePush |
Starts a nested range. |
cupy.cuda.nvtx.RangePushC |
Starts a nested range. |
cupy.cuda.nvtx.RangePop |
Ends a nested range. |
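NVTX ranges label a region of code so that it appears by name in a profiler timeline. A minimal sketch, assuming CuPy was built with NVTX support: `RangePush` opens the range and `RangePop` closes it in a `finally` clause so the range always ends. `gpu_available` and `annotated_sum` are hypothetical helper names, and the function returns `None` when no GPU is usable.

```python
try:
    import cupy as cp
except ImportError:  # CuPy is not installed; the sketch becomes a no-op
    cp = None


def gpu_available():
    """Return True only if CuPy imported and at least one CUDA device is visible."""
    if cp is None:
        return False
    try:
        return cp.cuda.runtime.getDeviceCount() > 0
    except Exception:  # e.g. missing driver or no usable device
        return False


def annotated_sum(n=256):
    """Run a small computation inside a named NVTX range."""
    if not gpu_available():
        return None
    from cupy.cuda import nvtx

    nvtx.RangePush("annotated_sum")  # start of the labeled region
    try:
        a = cp.ones((n, n), dtype=cp.float32)
        total = float((a * 2).sum())  # float() copies the result to the host
    finally:
        nvtx.RangePop()  # always close the range, even on error
    return total


print(annotated_sum())
```

Without a profiler attached the markers have no visible effect; they only become visible in tools that collect NVTX data.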
NCCL
cupy.cuda.nccl.NcclCommunicator |
Initialize an NCCL communicator for one device controlled by one process. |
cupy.cuda.nccl.get_build_version |
|
cupy.cuda.nccl.get_version |
Returns the runtime version of NCCL. |
cupy.cuda.nccl.get_unique_id |
|
cupy.cuda.nccl.groupStart |
Start a group of NCCL calls. |
cupy.cuda.nccl.groupEnd |
End a group of NCCL calls. |
Runtime API
CuPy wraps the CUDA Runtime API to expose native CUDA operations. See the official CUDA Runtime API documentation for details on how to use these functions.
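The wrappers live in `cupy.cuda.runtime` and keep the names of the underlying CUDA calls. As a small sketch, the following queries a few basic facts about the runtime and the current device. `gpu_available` and `runtime_summary` are hypothetical helper names, and the function returns `None` when no GPU is usable.

```python
try:
    import cupy as cp
except ImportError:  # CuPy is not installed; the sketch becomes a no-op
    cp = None


def gpu_available():
    """Return True only if CuPy imported and at least one CUDA device is visible."""
    if cp is None:
        return False
    try:
        return cp.cuda.runtime.getDeviceCount() > 0
    except Exception:  # e.g. missing driver or no usable device
        return False


def runtime_summary():
    """Collect version, device, and memory information from the runtime."""
    if not gpu_available():
        return None
    rt = cp.cuda.runtime
    free_bytes, total_bytes = rt.memGetInfo()  # wraps cudaMemGetInfo
    props = rt.getDeviceProperties(rt.getDevice())
    name = props["name"]
    if isinstance(name, bytes):  # the name may be returned as bytes
        name = name.decode()
    return {
        "runtime_version": rt.runtimeGetVersion(),
        "driver_version": rt.driverGetVersion(),
        "device_count": rt.getDeviceCount(),
        "device_name": name,
        "free_bytes": free_bytes,
        "total_bytes": total_bytes,
    }


print(runtime_summary())
```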