cupy.cuda.MemoryAsyncPool#

class cupy.cuda.MemoryAsyncPool(pool_handles='current')#

(Experimental) CUDA memory pool for all GPU devices on the host.

A memory pool retains allocations even after the user frees them. One instance of this class can be used for multiple devices. This class uses CUDA’s Stream Ordered Memory Allocator (supported on CUDA 11.2+). The simplest way to use this pool as CuPy’s default allocator is:

cupy.cuda.set_allocator(cupy.cuda.MemoryAsyncPool().malloc)

Using this feature requires CUDA >= 11.2 with a supported GPU and platform. If it is not supported, an error will be raised.

The current CuPy stream is used to allocate/free the memory.
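Because an error is raised when the allocator is unsupported, a program may want to try the async pool and fall back otherwise. The following is a minimal sketch, not a recommended pattern from CuPy itself: the helper name `install_async_pool` is hypothetical, and the broad `except Exception` is an assumption, since the exact exception type is not stated here.

```python
def install_async_pool():
    """Try to make MemoryAsyncPool CuPy's default allocator.

    Returns the pool on success, or None when CuPy is not installed or the
    stream-ordered allocator is unavailable on this system.
    """
    try:
        import cupy
    except ImportError:
        return None  # CuPy itself is not installed
    try:
        # Raises if the CUDA version, GPU, or platform is not supported.
        pool = cupy.cuda.MemoryAsyncPool()
    except Exception:  # assumption: the exact exception type is not documented here
        return None
    cupy.cuda.set_allocator(pool.malloc)
    return pool


pool = install_async_pool()
if pool is not None:
    print("async pool enabled")
else:
    print("keeping the existing allocator")
```

When the helper returns None, the previously configured allocator is left untouched.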

Parameters: pool_handles (str or int) – A flag indicating which mempool to use. ‘default’ selects the device’s default mempool, ‘current’ selects the current mempool (which could be the default one), and an int representing a cudaMemPool_t created elsewhere selects an external mempool. A list of these flags is also accepted, in which case the list length must equal the total number of visible devices so that the mempool for each device can be set independently.
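The list form of pool_handles needs exactly one flag per visible device. A small sketch of building such a list follows; `build_pool_handles` and `make_pool` are hypothetical helpers written for illustration, and only `make_pool` needs CuPy and a GPU.

```python
def build_pool_handles(n_devices, overrides=None, default_flag="current"):
    """Return one mempool flag per visible device.

    overrides maps a device index to 'default', 'current', or an int handle.
    """
    handles = [default_flag] * n_devices
    for device_id, flag in (overrides or {}).items():
        if not 0 <= device_id < n_devices:
            raise IndexError(
                f"device {device_id} is not among {n_devices} visible devices"
            )
        handles[device_id] = flag
    return handles


def make_pool(overrides=None):
    """Create a MemoryAsyncPool with per-device flags (requires CuPy and a GPU)."""
    import cupy

    n_devices = cupy.cuda.runtime.getDeviceCount()
    return cupy.cuda.MemoryAsyncPool(build_pool_handles(n_devices, overrides))


# Pure-Python part only: three devices, device 1 uses its default mempool.
print(build_pool_handles(3, {1: "default"}))
```

On a multi-GPU host, `make_pool({0: "default"})` would then use device 0's default mempool and the current mempool everywhere else.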

Warning

This feature is currently experimental and subject to change.

Note

MemoryAsyncPool currently cannot work with memory hooks.

See also

Physical Page Caching Behavior

free_bytes(self) → size_t#

Gets the total number of bytes acquired but not used by the pool.

Returns: The total number of bytes acquired but not used by the pool.
Return type: int

get_limit(self) → size_t#

Gets the upper limit of memory allocation of the current device.

Returns: The upper limit in bytes.
Return type: int

Note

Unlike with MemoryPool, MemoryAsyncPool’s set_limit() method can only impose a soft limit. If other (non-CuPy) applications are also allocating memory from the same mempool, this limit may not be respected.

malloc(self, size_t size) → MemoryPointer#

Allocate memory from the current device’s pool on the current stream.

This method can be used as a CuPy memory allocator. The simplest way to use a memory pool as the default allocator is the following code:

cupy.cuda.set_allocator(cupy.cuda.MemoryAsyncPool().malloc)

Parameters: size (int) – Size of the memory buffer to allocate in bytes.
Returns: Pointer to the allocated buffer.
Return type: MemoryPointer
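malloc can also be called directly, in which case the allocation is tied to whichever stream is current at the time of the call. The sketch below is illustrative only: the helper name `alloc_on_stream` is hypothetical, and it returns None when CuPy or the async allocator is unavailable so that it can run on any machine.

```python
def alloc_on_stream(n_floats=256):
    """Allocate a float32 buffer from the async pool on a dedicated stream.

    Returns the sum of the filled buffer, or None if CuPy or the
    stream-ordered allocator is unavailable.
    """
    try:
        import cupy

        pool = cupy.cuda.MemoryAsyncPool()
    except Exception:  # assumption: any failure here means "not supported"
        return None

    stream = cupy.cuda.Stream()
    with stream:  # makes `stream` the current CuPy stream
        ptr = pool.malloc(n_floats * 4)  # 4 bytes per float32
        # Wrap the raw pointer in an ndarray so it can be used normally.
        arr = cupy.ndarray((n_floats,), dtype=cupy.float32, memptr=ptr)
        arr.fill(1.0)
        total = float(arr.sum())  # copies the result to the host
    return total


result = alloc_on_stream()
print(result)
```

Wrapping the pointer with the `memptr` argument of `cupy.ndarray` keeps the buffer alive for as long as the array exists.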

n_free_blocks(self) → size_t#

set_limit(self, size=None, fraction=None)#

Sets the upper limit of memory allocation of the current device.

When fraction is specified, the limit is set to that fraction of the GPU memory available for allocation. For example, on a GPU with 2 GiB of memory, either set_limit(fraction=0.5) or set_limit(size=1024**3) limits the memory size to 1 GiB.

size and fraction cannot be specified at the same time. If neither is specified, or if 0 is specified, the limit is disabled.

Note

You can also set the limit using the CUPY_GPU_MEMORY_LIMIT environment variable; see Environment variables for details. The limit set by this method supersedes the value specified in the environment variable.

Also note that this method only changes the limit for the current device, whereas the environment variable sets the default limit for all devices.

Parameters:

size (int) – Limit size in bytes.
fraction (float) – Fraction in the range of [0, 1].
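Since set_limit only affects the current device, applying one limit everywhere means switching devices in a loop. The sketch below illustrates this together with the 2 GiB arithmetic from the description; `limit_from_fraction` and `set_limit_on_all_devices` are hypothetical helpers, and only the latter needs CuPy and a GPU.

```python
GIB = 1024 ** 3


def limit_from_fraction(total_bytes, fraction):
    """Byte limit equivalent to `fraction` of `total_bytes`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in the range [0, 1]")
    return int(total_bytes * fraction)


def set_limit_on_all_devices(pool, fraction):
    """Apply the same fractional limit on every visible device (requires CuPy)."""
    import cupy

    for device_id in range(cupy.cuda.runtime.getDeviceCount()):
        with cupy.cuda.Device(device_id):  # set_limit acts on the current device
            pool.set_limit(fraction=fraction)


# Half of a 2 GiB device is the same as size=1024**3.
half = limit_from_fraction(2 * GIB, 0.5)
print(half, half == GIB)
```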

total_bytes(self) → size_t#

Gets the total number of bytes acquired by the pool.

Returns: The total number of bytes acquired by the pool.
Return type: int

used_bytes(self) → size_t#

Gets the total number of bytes used by the pool.

Returns: The total number of bytes used by the pool.
Return type: int
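The three byte counters (used_bytes, free_bytes, total_bytes) can be read together for logging. This is a minimal sketch: `pool_stats` is a hypothetical helper, and `_FakePool` is a stand-in with made-up numbers so that the example runs without a GPU. With a real pool, pass the MemoryAsyncPool instance instead.

```python
def pool_stats(pool):
    """Collect the pool's byte counters into a dictionary."""
    return {
        "used": pool.used_bytes(),    # bytes currently in use
        "free": pool.free_bytes(),    # bytes acquired but not in use
        "total": pool.total_bytes(),  # bytes acquired by the pool
    }


class _FakePool:
    """Stand-in exposing the same three methods, with made-up values."""

    def used_bytes(self):
        return 768

    def free_bytes(self):
        return 256

    def total_bytes(self):
        return 1024


stats = pool_stats(_FakePool())
print(stats)
```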

__eq__(value, /)#: Return self==value.

__ne__(value, /)#: Return self!=value.

__lt__(value, /)#: Return self<value.

__le__(value, /)#: Return self<=value.

__gt__(value, /)#: Return self>value.

__ge__(value, /)#: Return self>=value.

Attributes

memoryAsyncHasStat#