class cupy.cuda.MemoryAsyncPool(pool_handles='current')

(Experimental) CUDA memory pool for all GPU devices on the host.

A memory pool preserves allocations even after the user frees them. One instance of this class can be used for multiple devices. This class uses CUDA’s Stream Ordered Memory Allocator (supported on CUDA 11.2+). The simplest way to use this pool as CuPy’s default allocator is to pass its malloc method to cupy.cuda.set_allocator: set_allocator(MemoryAsyncPool().malloc).
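A minimal sketch of the default-allocator setup described above. The helper name use_async_pool_as_default is hypothetical and not part of CuPy; the broad exception handling is an assumption added so the snippet also runs on machines without CuPy, without a GPU, or without stream-ordered allocator support.

```python
def use_async_pool_as_default():
    """Install a MemoryAsyncPool as CuPy's default device allocator.

    Hypothetical helper. Returns True on success and False when CuPy,
    a GPU, or stream-ordered allocation is unavailable.
    """
    try:
        import cupy

        pool = cupy.cuda.MemoryAsyncPool()
        cupy.cuda.set_allocator(pool.malloc)
    except Exception:
        # ImportError without CuPy; a runtime error on unsupported setups.
        return False
    return True


enabled = use_async_pool_as_default()
print("async pool enabled:", enabled)
```

After a successful call, later CuPy array allocations in the process go through the async pool.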


Using this feature requires CUDA >= 11.2 with a supported GPU and platform. If it is not supported, an error will be raised.

The current CuPy stream is used to allocate/free the memory.


Parameters

pool_handles (str or int) – A flag indicating which mempool to use. ‘default’ selects the device’s default mempool, ‘current’ selects the current mempool (which could be the default one), and an int is interpreted as a cudaMemPool_t handle created elsewhere, for an external mempool. A list of these flags is also accepted, in which case the list length must equal the total number of visible devices so that the mempool for each device can be set independently.
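The accepted forms of pool_handles can be sketched as follows. Both function names are hypothetical helpers introduced for illustration; only MemoryAsyncPool and cupy.cuda.runtime.getDeviceCount come from CuPy. The sketch assumes the same flag is wanted on every visible device.

```python
def build_pool_handles(n_devices, flag="current"):
    """Return one pool flag per visible device (hypothetical helper)."""
    if flag not in ("default", "current") and not isinstance(flag, int):
        raise ValueError("flag must be 'default', 'current', or an int handle")
    return [flag] * n_devices


def make_per_device_pool(flag="default"):
    """Create a MemoryAsyncPool that uses the same flag on every device.

    Returns None when CuPy or async-pool support is unavailable.
    """
    try:
        import cupy

        n_devices = cupy.cuda.runtime.getDeviceCount()
        # The list length must match the number of visible devices.
        return cupy.cuda.MemoryAsyncPool(build_pool_handles(n_devices, flag))
    except Exception:
        return None


handles = build_pool_handles(2, "default")
print(handles)
```

Passing a single string such as 'default' instead of a list applies that choice without spelling out one entry per device.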


This feature is currently experimental and subject to change.


MemoryAsyncPool currently cannot work with memory hooks.


free_all_blocks(self, stream=None)
free_bytes(self) → size_t
get_limit(self) → size_t
malloc(self, size_t size) → MemoryPointer

Allocate memory from the current device’s pool on the current stream.

This method can be used as a CuPy memory allocator. The simplest way to use a memory pool as the default allocator is to pass this method to cupy.cuda.set_allocator: set_allocator(MemoryAsyncPool().malloc).


Parameters

size (int) – Size of the memory buffer to allocate in bytes.

Returns

Pointer to the allocated buffer.

Return type

MemoryPointer
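Calling malloc directly, rather than installing it as the default allocator, might look like the sketch below. The helper try_async_malloc is hypothetical; it returns None instead of raising so that the snippet also runs where CuPy or the stream-ordered allocator is unavailable.

```python
def try_async_malloc(nbytes):
    """Allocate nbytes from a MemoryAsyncPool on the current stream.

    Hypothetical helper. Returns the MemoryPointer, or None when CuPy
    or async-pool support is unavailable.
    """
    try:
        import cupy

        pool = cupy.cuda.MemoryAsyncPool()
        return pool.malloc(nbytes)
    except Exception:
        return None


memptr = try_async_malloc(1024)
if memptr is not None:
    # ptr is the raw device address of the allocated buffer.
    print("device address:", memptr.ptr)
```

The buffer stays valid only while the returned MemoryPointer is referenced, so keep the object itself rather than just its address.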


n_free_blocks(self) → size_t
set_limit(self, size=None, fraction=None)
total_bytes(self) → size_t
used_bytes(self) → size_t
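The byte counters listed above can be gathered in one place, as in this sketch. The helper pool_stats is hypothetical. It assumes that a given counter may be unavailable on some CUDA versions and records None in that case; that assumption is not confirmed by this page.

```python
def pool_stats(pool):
    """Collect byte counters from a pool object into a dict.

    Hypothetical helper. A counter that raises is recorded as None.
    """
    stats = {}
    for name in ("used_bytes", "free_bytes", "total_bytes", "get_limit"):
        try:
            stats[name] = getattr(pool, name)()
        except RuntimeError:
            # Also covers NotImplementedError, a RuntimeError subclass.
            stats[name] = None
    return stats
```

Given a pool created with cupy.cuda.MemoryAsyncPool(), pool_stats(pool) returns a dict keyed by method name.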
__eq__(value, /)

Return self==value.

__ne__(value, /)

Return self!=value.

__lt__(value, /)

Return self<value.

__le__(value, /)

Return self<=value.

__gt__(value, /)

Return self>value.

__ge__(value, /)

Return self>=value.