Custom kernels

cupy.ElementwiseKernel(in_params, …[, …])

User-defined elementwise kernel.

cupy.ReductionKernel(unicode in_params, …)

User-defined reduction kernel.
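A minimal sketch of how the two kernel classes above are typically constructed, following the pattern in the CuPy user guide. The helper names `build_kernels` and `demo_kernels` are hypothetical, and CuPy is imported inside the functions only so that the file can be loaded on a machine without a GPU.

```python
def build_kernels():
    """Build one elementwise and one reduction kernel (CuPy imported lazily)."""
    import cupy as cp

    squared_diff = cp.ElementwiseKernel(
        'float32 x, float32 y',    # input parameters
        'float32 z',               # output parameters
        'z = (x - y) * (x - y)',   # operation applied to each element
        'squared_diff')            # kernel name

    l2norm = cp.ReductionKernel(
        'T x',          # input parameters (T is a type placeholder)
        'T y',          # output parameters
        'x * x',        # map expression, applied to each element
        'a + b',        # reduce expression, combines two mapped values
        'y = sqrt(a)',  # post-reduction map, applied to the reduced value
        '0',            # identity value of the reduction
        'l2norm')       # kernel name
    return squared_diff, l2norm


def demo_kernels():
    """Run both kernels on small arrays; requires a CUDA device."""
    import cupy as cp

    squared_diff, l2norm = build_kernels()
    x = cp.arange(6, dtype=cp.float32).reshape(2, 3)
    y = cp.ones((2, 3), dtype=cp.float32)
    z = squared_diff(x, y)     # elementwise, same shape as x and y
    norms = l2norm(x, axis=1)  # one L2 norm per row
    return z, norms
```

Calling `demo_kernels()` on a machine with CuPy and a CUDA device compiles both kernels on first use.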

cupy.RawKernel(unicode code, unicode name, …)

User-defined custom kernel.
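As an illustration, a raw kernel might be written and launched as sketched below. The CUDA source, the wrapper name `vector_add`, and the default block size of 256 are assumptions made for this example; the inputs are assumed to be contiguous 1-D float32 CuPy arrays of equal length.

```python
VECTOR_ADD_SOURCE = r'''
extern "C" __global__
void vector_add(const float* x, const float* y, float* out, int n) {
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < n) {
        out[i] = x[i] + y[i];
    }
}
'''


def vector_add(x, y, threads_per_block=256):
    """Add two 1-D float32 CuPy arrays with a hand-written CUDA kernel."""
    import cupy as cp  # imported lazily so the module loads without a GPU

    kernel = cp.RawKernel(VECTOR_ADD_SOURCE, 'vector_add')
    out = cp.empty_like(x)
    n = x.size
    # Round up so that every element is covered by some thread.
    blocks = (n + threads_per_block - 1) // threads_per_block
    # A launch takes (grid, block, args); the scalar gets an explicit
    # 32-bit dtype to match the `int n` parameter in the CUDA source.
    kernel((blocks,), (threads_per_block,), (x, y, out, cp.int32(n)))
    return out
```

Unlike the elementwise and reduction classes, a raw kernel performs no type checking or broadcasting, so the caller is responsible for matching dtypes and sizes to the CUDA signature.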

cupy.RawModule(unicode code=None, *, …[, …])

User-defined custom module.

cupy.fuse(*args, **kwargs)

Decorator that fuses a function.
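A short sketch of the decorator in use. The wrapper `build_fused_squared_diff` is a hypothetical helper that only exists so CuPy is imported lazily; in ordinary code the decorated function would sit at module level.

```python
def build_fused_squared_diff():
    """Return a fused function computing (x - y)**2 elementwise."""
    import cupy as cp

    @cp.fuse()
    def squared_diff(x, y):
        # Written with ordinary array operators; fusion combines the
        # subtraction and multiplication into a single kernel.
        return (x - y) * (x - y)

    return squared_diff
```

The returned function is called like any array function, for example with two CuPy arrays of the same shape.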

JIT kernel definition

cupyx.jit.rawkernel([mode])

A decorator that compiles a Python function into a CUDA kernel.
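The following sketch shows how a JIT kernel might be defined and launched, modeled on the copy kernel in the CuPy user guide. The helper names `build_copy_kernel` and `demo_copy` are hypothetical, and the kernel is defined inside a function only so that `cupyx` is imported lazily.

```python
def build_copy_kernel():
    """Define a JIT kernel that copies x into y with a grid-stride loop."""
    from cupyx import jit  # imported lazily so the module loads without a GPU

    @jit.rawkernel()
    def elementwise_copy(x, y, size):
        # Global index of this thread and total number of threads.
        tid = jit.blockIdx.x * jit.blockDim.x + jit.threadIdx.x
        ntid = jit.gridDim.x * jit.blockDim.x
        for i in range(tid, size, ntid):
            y[i] = x[i]

    return elementwise_copy


def demo_copy(n=1 << 20):
    """Launch the kernel on n elements; requires a CUDA device."""
    import cupy as cp

    kernel = build_copy_kernel()
    x = cp.arange(n, dtype=cp.float32)
    y = cp.empty_like(x)
    # Launch with 128 blocks of 1024 threads: (grid, block, args).
    kernel((128,), (1024,), (x, y, cp.uint32(n)))
    return bool((x == y).all())
```

Compilation happens on the first launch, when the argument types are known.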

cupyx.jit.threadIdx

dim3 threadIdx

cupyx.jit.blockDim

dim3 blockDim

cupyx.jit.blockIdx

dim3 blockIdx

cupyx.jit.gridDim

dim3 gridDim

cupyx.jit.grid

Compute the thread index in the grid.

cupyx.jit.gridsize

Compute the grid size.
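For the one-dimensional case, the index arithmetic behind `jit.grid(1)` and `jit.gridsize(1)` can be sketched on the host in plain Python. This is an emulation for illustration only, not CuPy code; the function names are hypothetical.

```python
def grid_1d(thread_idx_x, block_idx_x, block_dim_x):
    """Global thread index along x, as jit.grid(1) computes it."""
    return thread_idx_x + block_idx_x * block_dim_x


def gridsize_1d(block_dim_x, grid_dim_x):
    """Total number of threads along x, as jit.gridsize(1) computes it."""
    return block_dim_x * grid_dim_x


def simulate_grid_stride(size, grid_dim_x, block_dim_x):
    """Count how many simulated threads visit each element index."""
    visits = [0] * size
    stride = gridsize_1d(block_dim_x, grid_dim_x)
    for block in range(grid_dim_x):
        for thread in range(block_dim_x):
            start = grid_1d(thread, block, block_dim_x)
            # Each thread handles start, start + stride, start + 2*stride, ...
            for i in range(start, size, stride):
                visits[i] += 1
    return visits
```

In this simulation every element is visited exactly once, which is why a grid-stride loop works for any array size regardless of the launch configuration.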

cupyx.jit.laneid

Returns the lane ID of the calling thread, in the range [0, jit.warpsize).

cupyx.jit.warpsize

Returns the number of threads in a warp.

cupyx.jit.syncthreads

Calls __syncthreads().

cupyx.jit.syncwarp

Calls __syncwarp().

cupyx.jit.shfl_sync

Calls the __shfl_sync function.

cupyx.jit.shfl_up_sync

Calls the __shfl_up_sync function.

cupyx.jit.shfl_down_sync

Calls the __shfl_down_sync function.

cupyx.jit.shfl_xor_sync

Calls the __shfl_xor_sync function.
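A common use of the shuffle functions is a warp-level reduction. The sketch below emulates it on the host in plain Python, assuming a warp of 32 lanes, a full participation mask, and the standard CUDA behaviour of `__shfl_down_sync`, where a lane whose source falls outside the warp keeps its own value. The names `shfl_down` and `warp_sum` are hypothetical.

```python
WARP_SIZE = 32  # assumed warp width for this sketch


def shfl_down(values, delta, width=WARP_SIZE):
    """Emulate __shfl_down_sync: lane i reads the value held by lane i + delta.

    Lanes whose source lane would be outside the warp keep their own value.
    """
    return [values[i + delta] if i + delta < width else values[i]
            for i in range(width)]


def warp_sum(values):
    """Tree reduction over one warp; lane 0 ends up holding the total."""
    vals = list(values)
    offset = WARP_SIZE // 2
    while offset > 0:
        shifted = shfl_down(vals, offset)
        # Every lane adds the value it received from the lane `offset` above.
        vals = [v + s for v, s in zip(vals, shifted)]
        offset //= 2
    return vals[0]
```

The loop runs five times (offsets 16, 8, 4, 2, 1), after which only lane 0 is guaranteed to hold the full sum.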

cupyx.jit.shared_memory

Allocates shared memory and returns it as a 1-dimensional array.

cupyx.jit.atomic_add

Calls the atomicAdd function to operate atomically on array[index].

cupyx.jit.atomic_sub

Calls the atomicSub function to operate atomically on array[index].

cupyx.jit.atomic_exch

Calls the atomicExch function to operate atomically on array[index].

cupyx.jit.atomic_min

Calls the atomicMin function to operate atomically on array[index].

cupyx.jit.atomic_max

Calls the atomicMax function to operate atomically on array[index].

cupyx.jit.atomic_inc

Calls the atomicInc function to operate atomically on array[index].

cupyx.jit.atomic_dec

Calls the atomicDec function to operate atomically on array[index].

cupyx.jit.atomic_cas

Calls the atomicCAS function to operate atomically on array[index].

cupyx.jit.atomic_and

Calls the atomicAnd function to operate atomically on array[index].

cupyx.jit.atomic_or

Calls the atomicOr function to operate atomically on array[index].

cupyx.jit.atomic_xor

Calls the atomicXor function to operate atomically on array[index].
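The update rules of the less obvious CUDA atomics can be illustrated with a host-side emulation in plain Python. This sketch follows the rules given in the CUDA programming guide (each operation returns the old value); the `emulate_*` names and their parameter names are hypothetical and do not mirror the `cupyx.jit` signatures, and no real atomicity is involved.

```python
def _read_modify_write(array, index, update):
    """Apply `update` to array[index] and return the previous value."""
    old = array[index]
    array[index] = update(old)
    return old


def emulate_add(array, index, value):
    """atomicAdd rule: store old + value."""
    return _read_modify_write(array, index, lambda old: old + value)


def emulate_exch(array, index, value):
    """atomicExch rule: store value unconditionally."""
    return _read_modify_write(array, index, lambda old: value)


def emulate_inc(array, index, limit):
    """atomicInc rule: wrap to 0 once old reaches limit, else old + 1."""
    return _read_modify_write(
        array, index, lambda old: 0 if old >= limit else old + 1)


def emulate_dec(array, index, limit):
    """atomicDec rule: wrap to limit if old is 0 or above limit, else old - 1."""
    return _read_modify_write(
        array, index,
        lambda old: limit if (old == 0 or old > limit) else old - 1)


def emulate_cas(array, index, compare, value):
    """atomicCAS rule: store value only if old equals compare."""
    return _read_modify_write(
        array, index, lambda old: value if old == compare else old)
```

For example, repeated calls to `emulate_inc(counter, 0, 3)` cycle the stored value through 1, 2, 3, 0, which is how `atomicInc` is used to implement a wrapping counter.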

cupyx.jit._interface._JitRawKernel(func, mode)

JIT CUDA kernel object.

Kernel binary memoization

cupy.memoize(bool for_each_device=False)

Decorator that makes a function memoize its result for each argument and device.

cupy.clear_memo()

Clears the memoized results for all functions decorated by memoize.
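A typical use is caching a function that builds a kernel, so that the kernel object is created only once per argument set. The sketch below assumes a small integer `factor`; the helper names `build_memoized_factory`, `make_scale_kernel`, and `demo_memoize` are hypothetical, and CuPy is imported lazily so the file loads without a GPU. Arguments of a memoized function need to be hashable.

```python
def build_memoized_factory():
    """Return a memoized function that builds a scaling kernel."""
    import cupy as cp

    @cp.memoize(for_each_device=True)
    def make_scale_kernel(factor):
        # The body runs on a cache miss; later calls with the same
        # argument on the same device return the stored result.
        return cp.ElementwiseKernel(
            'float32 x', 'float32 y',
            'y = x * {}'.format(factor),
            'scale_by_{}'.format(factor))

    return make_scale_kernel


def demo_memoize():
    """Show caching and cache clearing; requires a CUDA device."""
    import cupy as cp

    make_scale_kernel = build_memoized_factory()
    first = make_scale_kernel(2)
    second = make_scale_kernel(2)  # same argument and device: served from the cache
    cp.clear_memo()                # drops the results of every memoized function
    third = make_scale_kernel(2)   # cache miss, so the kernel object is rebuilt
    return first is second, first is third
```

Because `clear_memo()` affects all functions decorated with `memoize`, it is best reserved for cases such as tests or releasing resources, not routine use.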