Low-Level CUDA Support
Device management
Object that represents a CUDA device.
Memory management
Returns the CuPy default memory pool for GPU memory.
Returns the CuPy default memory pool for pinned memory.
Memory allocation on a CUDA device.
CUDA memory that is not owned by CuPy.
Pinned memory allocation on the host.
Pointer to a location in device memory.
Pointer to pinned memory.
Calls the current allocator to allocate GPU memory.
Calls the current allocator to allocate pinned memory.
Returns the current allocator for GPU memory.
Sets the current allocator for GPU memory.
Sets a thread-local allocator for GPU memory inside a with statement.
Sets the current allocator for pinned memory.
Memory pool for all GPU devices on the host.
Memory pool for pinned memory on the host.
Allocator that uses Python functions to perform memory allocation.
Memory hook
Base class of hooks for memory allocations.
Memory hook that prints debug information.
Line-by-line CuPy memory profiler.
Streams and events
CUDA stream.
CUDA stream not managed by CuPy (wraps an externally created stream).
Gets the current CUDA stream.
CUDA event, a synchronization point of CUDA streams.
Gets the elapsed time between two events.
Texture and surface memory
A class that holds the channel format description.
Allocates a CUDA array (cudaArray_t) that can be used as texture memory.
A class that holds the resource description.
A class that holds the texture description.
A class that holds a texture object.
A class that holds a surface object.
A class that holds a texture reference.
Profiler
Enables CUDA profiling during a with statement.
Initializes the CUDA profiler.
Enables profiling.
Disables profiling.
Marks an instantaneous event (marker) in the application.
Marks an instantaneous event (marker) in the application, with a custom color.
Starts a nested range.
Starts a nested range, with a custom color.
Ends a nested range.
NCCL
Initializes an NCCL communicator for one device controlled by one process.
Returns the runtime version of NCCL.
Starts a group of NCCL calls.
Ends a group of NCCL calls.
Runtime API
CuPy wraps the CUDA Runtime API to provide access to native CUDA operations. Please refer to the official CUDA Runtime API documentation when using these functions.