cupyx.distributed.array.DistributedArray#
- class cupyx.distributed.array.DistributedArray(self, shape, dtype, chunks_map, mode=REPLICA, comms=None)[source]#
Multi-dimensional array distributed across multiple CUDA devices.
This class implements some elementary operations that cupy.ndarray provides. The array content is split into chunks, contiguous arrays corresponding to slices of the original array. Note that one device can hold multiple chunks.
This direct constructor is designed for internal calls. Users should create distributed arrays using distributed_array().
- Parameters:
shape (tuple of ints) – Shape of created array.
dtype (dtype_like) – Any object that can be interpreted as a numpy data type.
chunks_map (dict from int to list of chunks) – Lists of chunk objects associated with each device.
mode (mode object, optional) – Mode that determines how overlaps of the chunks are interpreted. Defaults to cupyx.distributed.array.REPLICA.
comms (optional) – Communicator objects that a distributed array holds internally. Sharing them with other distributed arrays can save time because initializing them is costly.
- Return type:
See also
DistributedArray.mode for details about modes.
Methods
- all_chunks()[source]#
Return the chunks with all buffered data flushed.
Buffered data are created in situations such as resharding and mode changing.
- Return type:
dict[int, list[cupy.ndarray]]
- change_mode(mode)[source]#
Return a view or a copy in the given mode.
- Parameters:
mode (mode object) – How overlaps of the chunks are interpreted.
- Return type:
See also
DistributedArray.mode for details about modes.
- get(stream=None, order='C', out=None, blocking=True)[source]#
Return a copy of the array on the host memory.
- Return type:
- max(axis=None, out=None, keepdims=False)[source]#
Return the maximum along a given axis.
Note
Currently, it only supports non-None values for axis and the default values for out and keepdims.
See also
- mdspan(self, *, index_type, allow_unsafe=False)#
Returns an mdspan view of the array for use in CUDA kernels.
This method creates a view of the CuPy array that is compatible with cuda::std::mdspan for use in custom CUDA kernels.
- Parameters:
index_type (dtype) – The data type for extent and stride indices. Must be either cupy.int32 or cupy.int64. If cupy.int32 is specified, the array size must not exceed INT32_MAX.
allow_unsafe (bool) – If True, allows creating an mdspan for arrays that have zero or negative strides, or one or more dimensions of size zero. Depending on the access pattern, such an mdspan may lead to undefined behavior when described with a layout_left, layout_right, or layout_stride mapping as per the C++ standard. For example, mdspan.required_span_size() might become negative. Default is False.
- Returns:
An mdspan view of the array that can be passed to CUDA kernels as a kernel argument.
- Return type:
mdspan
- Raises:
ValueError – If index_type is not cupy.int32 or cupy.int64, or if the array size exceeds the range of the specified index_type.
Note
The returned mdspan can work with either layout_stride, layout_left, or layout_right, but your kernel must declare the mdspan type with all extents being dynamic:
    template<typename T, typename IndexType>
    __global__ void my_kernel(
        cuda::std::mdspan<
            T,
            cuda::std::extents<
                IndexType,
                cuda::std::dynamic_extent,
                cuda::std::dynamic_extent>,
            cuda::std::layout_stride> arr
    ) {
        // arr is a 2D mdspan
        // Access: arr(i, j)
    }
Static extents are NOT supported. For example, using extents<int, 4, 8> will result in undefined behavior.
Example
    >>> import cupy
    >>> a = cupy.arange(12, dtype=cupy.float32).reshape(3, 4)
    >>> a_mdspan = a.mdspan(index_type=cupy.int64)
    >>> # Pass a_mdspan to a cupy.RawKernel expecting:
    >>> # mdspan<float, extents<int64_t, dyn, dyn>, layout_stride>
- min(axis=None, out=None, keepdims=False)[source]#
Return the minimum along a given axis.
Note
Currently, it only supports non-None values for axis and the default values for out and keepdims.
See also
- prod(axis=None, dtype=None, out=None, keepdims=None)[source]#
Return the product along a given axis.
Note
Currently, it only supports non-None values for axis and the default values for out and keepdims.
See also
- reshard(index_map)[source]#
Return a view or a copy having the given index_map.
Data transfers across devices are done on separate streams created internally. To make them asynchronous, transferred data is buffered and reflected to the chunks when necessary.
- Parameters:
index_map (dict from int to array indices) – Indices for the chunks that devices with designated IDs own. The current index_map of a distributed array can be obtained from DistributedArray.index_map.
- Return type:
- sum(axis=None, dtype=None, out=None, keepdims=False)[source]#
Return the sum along a given axis.
Note
Currently, it only supports non-None values for axis and the default values for out and keepdims.
See also
- __eq__(value, /)#
Return self==value.
- __ne__(value, /)#
Return self!=value.
- __lt__(value, /)#
Return self<value.
- __le__(value, /)#
Return self<=value.
- __gt__(value, /)#
Return self>value.
- __ge__(value, /)#
Return self>=value.
- __bool__()#
True if self else False
Attributes
- T#
Not supported.
- base#
Not supported.
- cstruct#
Not supported.
- data#
Not supported.
- device#
Not supported.
- devices#
A collection of device IDs holding part of the data.
- dtype#
- flags#
Not supported.
- flat#
Not supported.
- imag#
Not supported.
- index_map#
Indices for the chunks that devices with designated IDs own.
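As a rough illustration of what an index map describes, the sketch below uses a plain Python list in place of device memory, so it runs without CuPy or a GPU. The layout (device ID mapped to a list of slices) follows the description above, but `take_chunks` is a hypothetical helper and the device IDs are just dictionary keys here.

```python
# Stand-in for a 1-D array of 8 elements; no GPU or CuPy is involved.
data = list(range(8))

# Hypothetical index map: device ID -> list of slices owned by that device.
# Device 0 owns two chunks, which the class description explicitly allows.
index_map = {
    0: [slice(0, 2), slice(6, 8)],
    1: [slice(2, 6)],
}


def take_chunks(values, idx_map):
    """Return {device_id: [chunk, ...]} by applying each slice to values."""
    return {dev: [values[s] for s in slices]
            for dev, slices in idx_map.items()}


chunks = take_chunks(data, index_map)
print(chunks)  # {0: [[0, 1], [6, 7]], 1: [[2, 3, 4, 5]]}

# Check that every element is covered by at least one chunk.
covered = sorted(i for slices in index_map.values()
                 for s in slices
                 for i in range(*s.indices(len(data))))
print(covered == list(range(len(data))))  # True
```

Resharding can be pictured as calling such a helper again with a different map over the same logical data. The real method also moves data between devices, which this sketch does not model.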
- itemsize#
Size of each element in bytes.
See also
- mT#
Matrix-transpose view of the array.
If ndim < 2, raise a ValueError.
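A minimal sketch of the shape rule behind a matrix transpose, assuming the usual convention that it swaps the last two axes. `matrix_transpose_shape` is a hypothetical helper operating on shape tuples only; it is not a CuPy function.

```python
def matrix_transpose_shape(shape):
    """Shape after swapping the last two axes (hypothetical helper)."""
    if len(shape) < 2:
        # Mirrors the documented behavior for ndim < 2.
        raise ValueError("matrix transpose needs at least 2 dimensions")
    return shape[:-2] + (shape[-1], shape[-2])


print(matrix_transpose_shape((2, 3)))     # (3, 2)
print(matrix_transpose_shape((5, 2, 3)))  # (5, 3, 2)
```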
- mode#
Describe how overlaps of the chunks are interpreted.
In the replica mode, chunks are guaranteed to have identical values on their overlapping segments. In other modes, they are not necessarily identical and represent the original data as their max, sum, etc.
DistributedArray currently supports cupyx.distributed.array.REPLICA, cupyx.distributed.array.MIN, cupyx.distributed.array.MAX, cupyx.distributed.array.SUM, and cupyx.distributed.array.PROD modes.
Many operations on distributed arrays, including cupy.ufunc and matmul(), involve changing their mode beforehand. These mode conversions are done automatically, so in most cases users do not have to manage modes manually.
Example
    >>> A = distributed_array(
    ...     cupy.arange(6).reshape(2, 3),
    ...     make_2d_index_map([0, 2], [0, 1, 3],
    ...                       [[{0}, {1, 2}]]))
    >>> B = distributed_array(
    ...     cupy.arange(12).reshape(3, 4),
    ...     make_2d_index_map([0, 1, 3], [0, 2, 4],
    ...                       [[{0}, {0}],
    ...                        [{1}, {2}]]))
    >>> C = A @ B
    >>> C
    array([[20, 23, 26, 29],
           [56, 68, 80, 92]])
    >>> C.mode
    'sum'
    >>> C.all_chunks()
    {0: [array([[0, 0],
                [0, 3]]),     # left half
         array([[0, 0],
                [6, 9]])],    # right half
     1: [array([[20, 23],
                [56, 65]])],  # left half
     2: [array([[26, 29],
                [74, 83]])]}  # right half
    >>> C_replica = C.change_mode('replica')
    >>> C_replica.mode
    'replica'
    >>> C_replica.all_chunks()
    {0: [array([[20, 23],
                [56, 68]]),   # left half
         array([[26, 29],
                [80, 92]])],  # right half
     1: [array([[20, 23],
                [56, 68]])],  # left half
     2: [array([[26, 29],
                [80, 92]])]}  # right half
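The arithmetic behind the 'sum' mode in this example can be checked on the host. The sketch below is not CuPy code: it uses nested Python lists and two small hypothetical helpers (`matmul`, `add`) to reproduce the partial products that devices 0 and 1 hold for the left half of C, and shows that adding the overlapping chunks gives the replica values.

```python
def matmul(x, y):
    """Multiply two matrices given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]


def add(x, y):
    """Element-wise sum of two equally shaped nested lists."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


A = [[0, 1, 2], [3, 4, 5]]                        # arange(6).reshape(2, 3)
B = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]  # arange(12).reshape(3, 4)

# Column blocks of A and row blocks of B, following the index maps above.
A_left = [row[0:1] for row in A]    # columns 0:1, held by device 0
A_right = [row[1:3] for row in A]   # columns 1:3, held by devices 1 and 2
B_top = B[0:1]                      # rows 0:1, held by device 0
B_bottom = B[1:3]                   # rows 1:3, split across devices 1 and 2

# Partial products for the left half of C (columns 0:2).
dev0_left = matmul(A_left, [r[0:2] for r in B_top])
dev1_left = matmul(A_right, [r[0:2] for r in B_bottom])

print(dev0_left)                  # [[0, 0], [0, 3]]
print(dev1_left)                  # [[20, 23], [56, 65]]
print(add(dev0_left, dev1_left))  # [[20, 23], [56, 68]]
```

The last line matches the left-half chunks reported after `change_mode('replica')`: in 'sum' mode the overlapping chunks differ, and only their sum equals the original data.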
- nbytes#
Total size of all elements in bytes.
It does not count skips between elements.
See also
- ndim#
Number of dimensions.
a.ndim is equivalent to len(a.shape).
See also
- real#
Not supported.
- shape#
Tuple of array dimensions.
Assignment to this property is currently not supported.
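The size-related attributes are tied together by simple arithmetic. The sketch below works on a plain shape tuple and an assumed item size rather than a real distributed array, and assumes the usual NumPy-style meaning of `size` as the number of elements.

```python
import math

shape = (3, 4)  # example shape (assumed for illustration)
itemsize = 4    # bytes per element, e.g. a 32-bit float

ndim = len(shape)         # number of dimensions
size = math.prod(shape)   # number of elements
nbytes = size * itemsize  # total bytes of all elements

print(ndim, size, nbytes)  # 2 12 48
```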
- size#
- strides#
Not supported.