cupy.cuda.ExternalStream#

class cupy.cuda.ExternalStream(ptr, device_id=-1)#

CUDA stream not managed by CuPy.

This class lets CuPy use an externally created stream, given the stream pointer obtained from the CUDA runtime. The user is responsible for managing the life cycle of the stream.

Parameters:
  • ptr (intptr_t) – Address of the cudaStream_t object.

  • device_id (int) – The ID of the device that the stream was created on. Default is -1, indicating it is unknown.

Variables:
  • ptr (intptr_t) – Raw stream handle.

  • device_id (int) – The ID of the device that the stream was created on. The value -1 indicates that it is unknown.

Warning

If device_id is not specified, the user is responsible for ensuring that the stream is used legally. Specifically, the stream must be used only on the device that it was created on.
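As a minimal sketch of the ownership rule above, the following wraps a raw stream handle and destroys it explicitly once the work is finished. The helper name run_on_external_stream is hypothetical, and cupy.cuda.runtime.streamCreate() merely stands in for whichever external library would normally own the stream. Running it requires CuPy and a CUDA device.

```python
def run_on_external_stream():
    # Requires CuPy and a CUDA device.
    import cupy as cp
    from cupy.cuda import runtime

    device_id = runtime.getDevice()   # device the stream is created on
    ptr = runtime.streamCreate()      # raw cudaStream_t address (stand-in for an external owner)
    try:
        # Passing device_id tells CuPy which device the stream belongs to.
        stream = cp.cuda.ExternalStream(ptr, device_id=device_id)
        with stream:
            x = cp.arange(10) ** 2    # queued on the external stream
        stream.synchronize()
        return int(x.sum())
    finally:
        # CuPy does not manage this stream, so it is destroyed here by the caller.
        runtime.streamDestroy(ptr)
```

The try/finally block is one way to make sure the handle is released even if the queued work raises.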

Methods

__enter__(self)#
__exit__(self, *args)#
add_callback(self, callback, arg)#

Adds a callback that is called when all queued work is done.

Parameters:
  • callback (function) – Callback function. It must take three arguments (Stream object, int error status, and user data object) and return nothing.

  • arg (object) – Argument to the callback.

Note

Whenever possible, use the launch_host_func() method instead, because add_callback() may be deprecated and removed from CUDA at some point.
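The three-argument callback signature can be sketched as follows. The names on_stream_done and queue_callback are hypothetical, and the example uses a CuPy-managed Stream for brevity on the assumption that the method behaves the same on an ExternalStream. Running queue_callback() requires CuPy and a CUDA device.

```python
def on_stream_done(stream, status, user_data):
    # stream: the Stream object, status: int error status (0 is cudaSuccess),
    # user_data: the object passed as `arg` to add_callback().
    user_data.append(status)


def queue_callback():
    # Requires CuPy and a CUDA device.
    import cupy as cp

    log = []
    stream = cp.cuda.Stream(non_blocking=True)
    with stream:
        x = cp.ones(1024) * 2                 # work queued ahead of the callback
    stream.add_callback(on_stream_done, log)  # runs once the queued work is done
    stream.synchronize()
    return log
```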

begin_capture(self, mode=None)#

Begin stream capture to construct a CUDA graph.

A call to this function must be paired with a call to end_capture() to complete the capture.

import cupy as cp

# create a non-blocking stream for the purpose of capturing
s1 = cp.cuda.Stream(non_blocking=True)
with s1:
    s1.begin_capture()
    # ... perform operations to construct a graph ...
    g = s1.end_capture()

# the returned graph can be launched on any stream (including s1)
g.launch(stream=s1)
s1.synchronize()

# with no stream argument, the graph is launched on the current stream (s2 here)
s2 = cp.cuda.Stream()
with s2:
    g.launch()
s2.synchronize()

Parameters:

mode (int) – The stream capture mode. Default is streamCaptureModeRelaxed.

Note

Synchronous device-host transfers are not allowed during stream capture. This matters for CuPy APIs in particular: functions that internally perform a synchronous transfer do not work during capture and raise an exception. For further constraints of CUDA stream capture, refer to the CUDA Programming Guide.

Note

Currently this capability is not supported on HIP.

end_capture(self)#

End stream capture and retrieve the constructed CUDA graph.

Returns:

A CUDA graph object that encapsulates the captured work.

Return type:

cupy.cuda.Graph

Note

Currently this capability is not supported on HIP.

is_capturing(self)#

Check if the stream is capturing.

Returns:

If the capturing status is successfully queried, the returned value indicates whether the stream is capturing. An exception may be raised if the query is illegal; refer to the CUDA Programming Guide for details.

Return type:

bool
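A minimal sketch of querying the capture status before, during, and after a capture. The helper name capture_states is hypothetical, and a CuPy-managed non-blocking Stream is used for brevity. The capture is left empty to keep the sketch clear of the capture constraints. Running it requires CuPy and a CUDA device, and stream capture is not supported on HIP.

```python
def capture_states():
    # Requires CuPy and a CUDA device (stream capture is not supported on HIP).
    import cupy as cp

    stream = cp.cuda.Stream(non_blocking=True)
    states = []
    with stream:
        states.append(stream.is_capturing())  # queried before the capture begins
        stream.begin_capture()
        states.append(stream.is_capturing())  # queried while the capture is active
        graph = stream.end_capture()          # empty graph; nothing was queued
        states.append(stream.is_capturing())  # queried after the capture ends
    return states
```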

launch_host_func(self, callback, arg)#

Launch a callback on the host when all queued work is done.

Parameters:
  • callback (function) – Callback function. It must take only one argument (user data object) and return nothing.

  • arg (object) – Argument to the callback.

Note

Whenever possible, this method is recommended over add_callback(), which may be deprecated and removed from CUDA at some point.
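The one-argument callback form can be sketched as follows. The names notify and run_with_host_func are hypothetical, and a CuPy-managed Stream is used for brevity. Running run_with_host_func() requires CuPy and a CUDA device.

```python
def notify(user_data):
    # Receives only the object passed as `arg` to launch_host_func().
    user_data["done"] = True


def run_with_host_func():
    # Requires CuPy and a CUDA device.
    import cupy as cp

    state = {"done": False}
    stream = cp.cuda.Stream(non_blocking=True)
    with stream:
        total = cp.arange(1000).sum()       # work queued ahead of the callback
    stream.launch_host_func(notify, state)  # runs on the host after the queued work
    stream.synchronize()
    return state["done"]
```

A mutable object such as a dict is a convenient choice for arg, since the callback returns nothing and has to report back through its argument.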

record(self, event=None)#

Records an event on the stream.

Parameters:

event (None or cupy.cuda.Event) – CUDA event. If None, then a new plain event is created and used.

Returns:

The recorded event.

Return type:

cupy.cuda.Event
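One common use of the returned event is timing a span of work on the stream. The sketch below records an event before and after some work and reads the elapsed time with cupy.cuda.get_elapsed_time(). The helper name time_on_stream and the matrix size are arbitrary choices for illustration. Running it requires CuPy and a CUDA device.

```python
def time_on_stream():
    # Requires CuPy and a CUDA device.
    import cupy as cp

    stream = cp.cuda.Stream(non_blocking=True)
    with stream:
        start = stream.record()          # event=None: a new event is created
        x = cp.random.random((512, 512))
        y = x @ x                        # the work being timed
        end = stream.record()
    end.synchronize()                    # wait until the second event has completed
    return cp.cuda.get_elapsed_time(start, end)  # elapsed time in milliseconds
```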

synchronize(self)#

Waits for the stream to complete all queued work.

use(self)#

Makes this stream current.

If you want to switch a stream temporarily, use the with statement.
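The difference between the two approaches can be sketched by checking the current stream at each step. The helper name current_stream_checks is hypothetical, and a CuPy-managed Stream is used for brevity. Running it requires CuPy and a CUDA device.

```python
def current_stream_checks():
    # Requires CuPy and a CUDA device.
    import cupy as cp

    stream = cp.cuda.Stream(non_blocking=True)

    with stream:
        # Inside the with block the stream is current.
        inside = cp.cuda.get_current_stream() == stream
    # On exit the previous stream is restored.
    after_with = cp.cuda.get_current_stream() == stream

    stream.use()
    # After use() the stream stays current until another stream is selected.
    after_use = cp.cuda.get_current_stream() == stream

    cp.cuda.Stream.null.use()  # switch back to the default stream
    return inside, after_with, after_use
```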

wait_event(self, event)#

Makes the stream wait for an event.

Work queued on this stream after the call runs only after the event has completed.

Parameters:

event (cupy.cuda.Event) – CUDA event.
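A typical pairing of record() and wait_event() is a producer/consumer dependency between two streams. In this sketch the helper name producer_consumer and the array size are illustrative assumptions. Running it requires CuPy and a CUDA device.

```python
def producer_consumer():
    # Requires CuPy and a CUDA device.
    import cupy as cp

    producer = cp.cuda.Stream(non_blocking=True)
    consumer = cp.cuda.Stream(non_blocking=True)

    with producer:
        data = cp.arange(1_000_000, dtype=cp.float32)
        data *= 2                 # queued on the producer stream
    ready = producer.record()     # marks the point where data is complete

    consumer.wait_event(ready)    # later consumer work waits for that point
    with consumer:
        total = data.sum()        # queued on the consumer stream
    consumer.synchronize()
    return float(total)
```

Only the consumer stream waits here. The host thread keeps running until consumer.synchronize() is called.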

__eq__(self, other)#
__ne__(value, /)#

Return self!=value.

__lt__(value, /)#

Return self<value.

__le__(value, /)#

Return self<=value.

__gt__(value, /)#

Return self>value.

__ge__(value, /)#

Return self>=value.

Attributes

done#

True if all work on this stream has been done.
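Because done can be read without blocking, it can be polled while the host does other things. The helper below is a hypothetical sketch, not part of CuPy. It relies only on the done attribute, so it accepts any stream-like object, and the interval and timeout defaults are arbitrary.

```python
import time


def poll_until_done(stream, interval=0.001, timeout=5.0):
    """Poll stream.done until it is True or the timeout (in seconds) expires."""
    deadline = time.monotonic() + timeout
    while not stream.done:
        if time.monotonic() >= deadline:
            return False          # gave up: work still pending
        time.sleep(interval)      # yield the CPU between polls
    return True                   # all queued work has finished
```

When the host has nothing else to do, synchronize() is the simpler choice, since it blocks until the queued work is finished.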

is_non_blocking#

True if the stream is non-blocking. False indicates that the stream was created with the default stream creation flag.

priority#

Query the priority of a stream.