cupy.cuda.MemoryAsyncPool#
- class cupy.cuda.MemoryAsyncPool(pool_handles='current')[source]#
(Experimental) CUDA memory pool for all GPU devices on the host.
A memory pool preserves any allocations even if they are freed by the user. One instance of this class can be used for multiple devices. This class uses CUDA’s Stream Ordered Memory Allocator (supported on CUDA 11.2+). The simplest way to use this pool as CuPy’s default allocator is the following code:
set_allocator(MemoryAsyncPool().malloc)
Using this feature requires CUDA >= 11.2 with a supported GPU and platform. If it is not supported, an error will be raised.
The current CuPy stream is used to allocate/free the memory.
- Parameters:
pool_handles (str or int) – A flag to indicate which mempool to use. 'default' is for the device's default mempool, 'current' is for the current mempool (which could be the default one), and an int represents a cudaMemPool_t created elsewhere for an external mempool. A list consisting of these flags can also be accepted, in which case the list length must equal the total number of visible devices so that the mempool for each device can be set independently.
Warning
This feature is currently experimental and subject to change.
Note
MemoryAsyncPool currently cannot work with memory hooks.
Methods
- free_all_blocks(self, stream=None)#
Releases free memory.
- Parameters:
stream (cupy.cuda.Stream) – Release memory freed on the given stream. If stream is None, the current stream is used.
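A minimal sketch of releasing cached memory back to the GPU, assuming a working MemoryAsyncPool on a supported device (the 1 MiB allocation is illustrative):

```python
# Sketch: return freed-but-cached memory held by the pool to the GPU.
# Requires CuPy with CUDA 11.2+ and a GPU; guarded for other setups.
try:
    import cupy

    pool = cupy.cuda.MemoryAsyncPool()
    cupy.cuda.set_allocator(pool.malloc)

    buf = cupy.zeros(1 << 20, dtype=cupy.uint8)  # 1 MiB allocation
    del buf                                      # freed, but the pool may cache it

    pool.free_all_blocks()                       # release on the current stream
    released = True
except Exception:
    released = False
```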
- free_bytes(self) size_t #
Gets the total number of bytes acquired but not used by the pool.
- Returns:
The total number of bytes acquired but not used by the pool.
- Return type:
int
- get_limit(self) size_t #
Gets the upper limit of memory allocation of the current device.
- Returns:
The number of bytes
- Return type:
int
Note
Unlike with MemoryPool, MemoryAsyncPool's set_limit() method can only impose a soft limit. If other (non-CuPy) applications are also allocating memory from the same mempool, this limit may not be respected.
- malloc(self, size_t size) MemoryPointer #
Allocate memory from the current device’s pool on the current stream.
This method can be used as a CuPy memory allocator. The simplest way to use a memory pool as the default allocator is the following code:
set_allocator(MemoryAsyncPool().malloc)
- Parameters:
size (int) – Size of the memory buffer to allocate in bytes.
- Returns:
Pointer to the allocated buffer.
- Return type:
MemoryPointer
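Besides serving as the allocator callback, malloc() can be called directly to obtain a MemoryPointer; a guarded sketch (the 256-byte size is arbitrary):

```python
# Sketch: call malloc() directly to get a cupy.cuda.MemoryPointer.
# Requires CuPy with CUDA 11.2+ and a GPU; guarded otherwise.
try:
    import cupy

    pool = cupy.cuda.MemoryAsyncPool()
    ptr = pool.malloc(256)            # 256-byte buffer on the current stream
    got_pointer = isinstance(ptr, cupy.cuda.MemoryPointer)
except Exception:
    got_pointer = False
```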
- n_free_blocks(self) size_t #
- set_limit(self, size=None, fraction=None)#
Sets the upper limit of memory allocation of the current device.
When fraction is specified, its value becomes the fraction of the amount of GPU memory that is available for allocation. For example, on a GPU with 2 GiB of memory, you can use either set_limit(fraction=0.5) or set_limit(size=1024**3) to limit the memory size to 1 GiB. size and fraction cannot be specified at the same time. If neither is specified, or 0 is specified, the limit will be disabled.
Note
Unlike with MemoryPool, MemoryAsyncPool's set_limit() method can only impose a soft limit. If other (non-CuPy) applications are also allocating memory from the same mempool, this limit may not be respected. Internally, this limit is set via the cudaMemPoolAttrReleaseThreshold attribute.
Note
You can also set the limit with the CUPY_GPU_MEMORY_LIMIT environment variable; see Environment variables for details. The limit set by this method supersedes the value specified in the environment variable. Also note that this method only changes the limit for the current device, whereas the environment variable sets the default limit for all devices.
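A guarded sketch of the size/fraction alternatives described above (remember this is only a soft limit for MemoryAsyncPool):

```python
# Sketch: cap the pool's (soft) limit for the current device.
# size and fraction are mutually exclusive; passing 0 disables the limit.
# Requires CuPy with CUDA 11.2+ and a GPU; guarded otherwise.
try:
    import cupy

    pool = cupy.cuda.MemoryAsyncPool()
    pool.set_limit(size=1024**3)      # soft limit of 1 GiB
    limit = pool.get_limit()          # read the limit back, in bytes

    pool.set_limit(fraction=0.5)      # or: half of the allocatable memory
    limit_ok = isinstance(limit, int)
except Exception:
    limit_ok = False
```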
- total_bytes(self) size_t #
Gets the total number of bytes acquired by the pool.
- Returns:
The total number of bytes acquired by the pool.
- Return type:
int
- used_bytes(self) size_t #
Gets the total number of bytes used by the pool.
- Returns:
The total number of bytes used by the pool.
- Return type:
int
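The three counters above can be used together to monitor the pool; a guarded sketch (on older drivers these statistics may be unavailable, as suggested by the memoryAsyncHasStat attribute):

```python
# Sketch: inspect pool statistics after an allocation.
# Requires CuPy with CUDA 11.2+ and a GPU whose driver exposes
# mempool statistics; guarded for other setups.
try:
    import cupy

    pool = cupy.cuda.MemoryAsyncPool()
    cupy.cuda.set_allocator(pool.malloc)

    a = cupy.ones(1 << 20, dtype=cupy.float32)  # 4 MiB allocation
    used = pool.used_bytes()    # bytes backing live allocations
    total = pool.total_bytes()  # bytes acquired from the GPU
    free = pool.free_bytes()    # bytes acquired but not currently used
    stats_ok = all(isinstance(v, int) for v in (used, total, free))
except Exception:
    stats_ok = False
```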
- __eq__(value, /)#
Return self==value.
- __ne__(value, /)#
Return self!=value.
- __lt__(value, /)#
Return self<value.
- __le__(value, /)#
Return self<=value.
- __gt__(value, /)#
Return self>value.
- __ge__(value, /)#
Return self>=value.
Attributes
- memoryAsyncHasStat#