cupy.cuda.memory_hooks.LineProfileHook#

class cupy.cuda.memory_hooks.LineProfileHook(max_depth=0)[source]#

Code line CuPy memory profiler.

This profiler shows line-by-line GPU memory consumption using the traceback module. Note that it can trace only Python-level (CPython) frames, not Cython-level ones (see cython/cython#1755).

Example

Code example:

import cupy
from cupy.cuda import memory_hooks

hook = memory_hooks.LineProfileHook()
with hook:
    a = cupy.arange(1024)  # any CuPy allocations made here are profiled
hook.print_report()

Output example:

_root (4.00KB, 4.00KB)
  lib/python3.6/unittest/__main__.py:18:<module> (4.00KB, 4.00KB)
    lib/python3.6/unittest/main.py:255:runTests (4.00KB, 4.00KB)
      tests/cupy_tests/test.py:37:test (1.00KB, 1.00KB)
      tests/cupy_tests/test.py:38:test (1.00KB, 1.00KB)
      tests/cupy_tests/test.py:39:test (2.00KB, 2.00KB)

Each line shows:

{filename}:{lineno}:{func_name} ({used_bytes}, {acquired_bytes})

where used_bytes is the number of bytes used from the CuPy memory pool, and acquired_bytes is the number of bytes the CuPy memory pool actually acquired from the GPU device. _root is the root node of the stack trace and shows the total memory usage.

Parameters

max_depth (int) – maximum depth to follow stack traces. Default is 0 (no limit).
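
For instance, limiting the traced depth keeps the report compact when call stacks are deep (a minimal sketch; the depth value is arbitrary):

from cupy.cuda import memory_hooks

# Record at most three stack frames per allocation site.
hook = memory_hooks.LineProfileHook(max_depth=3)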

Methods

__enter__(self)#
__exit__(self, *_)#
alloc_postprocess(self, **kwargs)#

Callback function invoked after allocating memory from GPU device.

Keyword Arguments
  • device_id (int) – CUDA device ID

  • mem_size (int) – Rounded memory bytesize allocated

  • mem_ptr (int) – Obtained memory pointer. 0 if an error occurred in allocation.

alloc_preprocess(self, **kwargs)[source]#

Callback function invoked before allocating memory from GPU device.

Keyword Arguments
  • device_id (int) – CUDA device ID

  • mem_size (int) – Rounded memory bytesize to be allocated
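
The alloc_* callbacks come from the cupy.cuda.memory_hook.MemoryHook base class that LineProfileHook builds on; a custom hook can override them to observe raw device allocations. A minimal, illustrative sketch (class name and message format are not part of CuPy):

import cupy
from cupy.cuda import memory_hook

class AllocLogger(memory_hook.MemoryHook):
    name = 'AllocLogger'

    def alloc_preprocess(self, device_id, mem_size):
        print('about to allocate %d bytes on device %d' % (mem_size, device_id))

    def alloc_postprocess(self, device_id, mem_size, mem_ptr):
        print('allocated %d bytes at 0x%x' % (mem_size, mem_ptr))

with AllocLogger():
    a = cupy.arange(1 << 20)  # likely a pool miss, so a fresh device allocation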

free_postprocess(self, **kwargs)#

Callback function invoked after releasing memory to memory pool.

Keyword Arguments
  • device_id (int) – CUDA device ID

  • mem_size (int) – Memory bytesize

  • mem_ptr (int) – Memory pointer to free

  • pmem_id (int) – Pooled memory object ID.

free_preprocess(self, **kwargs)#

Callback function invoked before releasing memory to memory pool.

Keyword Arguments
  • device_id (int) – CUDA device ID

  • mem_size (int) – Memory bytesize

  • mem_ptr (int) – Memory pointer to free

  • pmem_id (int) – Pooled memory object ID.
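
The free_* callbacks fire when a block is returned to the memory pool rather than to the device. A hook that tallies freed bytes might look like the following sketch (class and attribute names are illustrative):

import cupy
from cupy.cuda import memory_hook

class FreeCounter(memory_hook.MemoryHook):
    name = 'FreeCounter'

    def __init__(self):
        super().__init__()
        self.freed_bytes = 0

    def free_postprocess(self, device_id, mem_size, mem_ptr, pmem_id):
        self.freed_bytes += mem_size

counter = FreeCounter()
with counter:
    a = cupy.zeros(1024, dtype=cupy.float32)
    del a  # returning the block to the pool triggers the free callbacks
print(counter.freed_bytes)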

malloc_postprocess(self, **kwargs)#

Callback function invoked after retrieving memory from memory pool.

Keyword Arguments
  • device_id (int) – CUDA device ID

  • size (int) – Requested memory bytesize to allocate

  • mem_size (int) – Rounded memory bytesize allocated

  • mem_ptr (int) – Obtained memory pointer. 0 if an error occurred in malloc.

  • pmem_id (int) – Pooled memory object ID. 0 if an error occurred in malloc.

malloc_preprocess(self, **kwargs)[source]#

Callback function invoked before retrieving memory from memory pool.

Keyword Arguments
  • device_id (int) – CUDA device ID

  • size (int) – Requested memory bytesize to allocate

  • mem_size (int) – Rounded memory bytesize to be allocated
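
The malloc_* callbacks run for every request served through the memory pool, while the alloc_* callbacks run only when the pool must obtain new memory from the device, so comparing the two separates pool hits from device allocations. A rough sketch (counter names are illustrative, and the exact counts depend on prior pool state):

import cupy
from cupy.cuda import memory_hook

class PoolHitCounter(memory_hook.MemoryHook):
    name = 'PoolHitCounter'

    def __init__(self):
        super().__init__()
        self.mallocs = 0  # requests served through the pool
        self.allocs = 0   # requests that reached the device

    def malloc_postprocess(self, device_id, size, mem_size, mem_ptr, pmem_id):
        self.mallocs += 1

    def alloc_postprocess(self, device_id, mem_size, mem_ptr):
        self.allocs += 1

counter = PoolHitCounter()
with counter:
    for _ in range(3):
        a = cupy.empty(256, dtype=cupy.float32)
        del a  # the freed block is reused from the pool on the next iteration
print(counter.mallocs, counter.allocs)  # e.g. 3 mallocs but only 1 device alloc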

print_report(file=sys.stdout)[source]#

Prints a report of line memory profiling.
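
The report goes to sys.stdout by default; any writable text stream can be passed via file, for example to capture the report as a string (illustrative sketch):

import io
import cupy
from cupy.cuda import memory_hooks

hook = memory_hooks.LineProfileHook()
with hook:
    a = cupy.ones(4096)

buf = io.StringIO()
hook.print_report(file=buf)  # write the report into the buffer instead of stdout
report = buf.getvalue()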

__eq__(value, /)#

Return self==value.

__ne__(value, /)#

Return self!=value.

__lt__(value, /)#

Return self<value.

__le__(value, /)#

Return self<=value.

__gt__(value, /)#

Return self>value.

__ge__(value, /)#

Return self>=value.

Attributes

name = 'LineProfileHook'#