cupy.cuda.memory_hooks.LineProfileHook

class cupy.cuda.memory_hooks.LineProfileHook(max_depth=0)

Line-by-line CuPy memory profiler.

This profiler reports GPU memory consumption per source line, using the traceback module to identify call sites. Note that it can only trace Python-level frames, not Cython-level ones. See https://github.com/cython/cython/issues/1755

Example

Code example:

import cupy
from cupy.cuda import memory_hooks

hook = memory_hooks.LineProfileHook()
with hook:
    # Any CuPy code that allocates GPU memory, for example:
    x = cupy.zeros((256,), dtype=cupy.float32)
hook.print_report()

Output example:

_root (4.00KB, 4.00KB)
  lib/python3.6/unittest/__main__.py:18:<module> (4.00KB, 4.00KB)
    lib/python3.6/unittest/main.py:255:runTests (4.00KB, 4.00KB)
      tests/cupy_tests/test.py:37:test (1.00KB, 1.00KB)
      tests/cupy_tests/test.py:38:test (1.00KB, 1.00KB)
      tests/cupy_tests/test.py:39:test (2.00KB, 2.00KB)

Each line shows:

{filename}:{lineno}:{func_name} ({used_bytes}, {acquired_bytes})

where used_bytes is the number of bytes taken from the CuPy memory pool, and acquired_bytes is the number of bytes the CuPy memory pool actually acquired from the GPU device. _root is the root node of the stack trace and shows the total memory usage.
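The line format above is regular enough to post-process. The following is a minimal sketch of a parser for one report line; the names ReportLine and parse_report_line are hypothetical helpers, not part of CuPy, and the two-spaces-per-level indentation is an assumption read off the output example.

```python
import re
from typing import NamedTuple, Optional

# Matches '<indent><label> (<used>, <acquired>)',
# e.g. '  a.py:3:f (1.00KB, 2.00KB)'.
_LINE_RE = re.compile(
    r"^(?P<indent>\s*)(?P<label>.+?) \((?P<used>[^,]+), (?P<acquired>[^)]+)\)$"
)


class ReportLine(NamedTuple):
    depth: int      # nesting level, assuming two spaces per level
    label: str      # '_root' or 'filename:lineno:func_name'
    used: str       # size as printed, e.g. '1.00KB'
    acquired: str   # size as printed, e.g. '1.00KB'


def parse_report_line(line: str) -> Optional[ReportLine]:
    """Parse one report line; return None if it does not match the format."""
    m = _LINE_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    return ReportLine(
        depth=len(m.group("indent")) // 2,
        label=m.group("label"),
        used=m.group("used"),
        acquired=m.group("acquired"),
    )


# Sample text copied from the output example in this page.
sample = """\
_root (4.00KB, 4.00KB)
  lib/python3.6/unittest/__main__.py:18:<module> (4.00KB, 4.00KB)
    lib/python3.6/unittest/main.py:255:runTests (4.00KB, 4.00KB)
      tests/cupy_tests/test.py:37:test (1.00KB, 1.00KB)
"""
parsed = [parse_report_line(line) for line in sample.splitlines()]
for entry in parsed:
    print(entry)
```

The sizes are kept as strings because the report prints them in human-readable units; converting them back to bytes would lose precision.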

Parameters

max_depth (int) – Maximum depth of the stack trace to follow. Default is 0 (no limit).
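To illustrate what a depth limit on a stack trace means, here is a small sketch built only on the standard traceback module. It is not CuPy's implementation: call_site_labels is a hypothetical helper, and keeping the outermost frames when truncating is an assumption made for this example.

```python
import traceback


def call_site_labels(max_depth=0):
    """Return 'filename:lineno:func_name' labels for the current stack,
    outermost frame first, keeping at most max_depth entries.

    max_depth=0 means no limit, mirroring the parameter described above.
    """
    frames = traceback.extract_stack()[:-1]  # drop this helper's own frame
    if max_depth > 0:
        # Assumption for this sketch: truncate from the outermost frame down.
        frames = frames[:max_depth]
    return [f"{f.filename}:{f.lineno}:{f.name}" for f in frames]


def inner():
    # Both calls sit on one line so they see the same stack.
    return call_site_labels(), call_site_labels(max_depth=1)


def outer():
    return inner()


full, limited = outer()
print(len(full), len(limited))
```

With no limit the list ends at the innermost Python frame; with max_depth=1 only a single frame is kept, so deeper call sites are folded into their ancestor.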

Methods

__enter__(self)
__exit__(self, *_)
alloc_preprocess(self, **kwargs)

Callback function invoked before allocating memory from GPU device.

Keyword Arguments
  • device_id (int) – CUDA device ID

  • mem_size (int) – Rounded memory bytesize to be allocated

malloc_preprocess(self, **kwargs)

Callback function invoked before retrieving memory from memory pool.

Keyword Arguments
  • device_id (int) – CUDA device ID

  • size (int) – Requested memory bytesize to allocate

  • mem_size (int) – Rounded memory bytesize to be allocated
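The two pre-process callbacks receive their data purely as keyword arguments. The sketch below shows one way such callbacks could accumulate the used and acquired totals. ByteTally is a hypothetical stand-in class, not a CuPy class, and it is driven by hand with made-up values so that it runs without a GPU; in real use one would presumably subclass cupy.cuda.memory_hook.MemoryHook and let CuPy invoke the callbacks.

```python
class ByteTally:
    """Hypothetical stand-in that mimics the callbacks' keyword arguments."""

    def __init__(self):
        self.used_bytes = 0      # bytes requested from the memory pool
        self.acquired_bytes = 0  # bytes the pool requested from the device

    def malloc_preprocess(self, **kwargs):
        # Invoked before memory is retrieved from the memory pool.
        # kwargs: device_id, size (requested), mem_size (rounded).
        self.used_bytes += kwargs["mem_size"]

    def alloc_preprocess(self, **kwargs):
        # Invoked before memory is allocated from the GPU device.
        # kwargs: device_id, mem_size (rounded).
        self.acquired_bytes += kwargs["mem_size"]


tally = ByteTally()
# Made-up event sequence: two pool requests, of which only the first
# needs a fresh device allocation.
tally.malloc_preprocess(device_id=0, size=1000, mem_size=1024)
tally.alloc_preprocess(device_id=0, mem_size=1024)
tally.malloc_preprocess(device_id=0, size=512, mem_size=512)
print(tally.used_bytes, tally.acquired_bytes)
```

Counting mem_size rather than size means the totals reflect the rounded block sizes, which is what the pool actually hands out.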

print_report(file=sys.stdout)

Prints the line-by-line memory profiling report to file (standard output by default).

Attributes

alloc_postprocess

Callback function invoked after allocating memory from GPU device.

Keyword Arguments
  • device_id (int) – CUDA device ID

  • mem_size (int) – Rounded memory bytesize allocated

  • mem_ptr (int) – Obtained memory pointer. 0 if an error occurred in allocation.

free_postprocess

Callback function invoked after releasing memory to memory pool.

Keyword Arguments
  • device_id (int) – CUDA device ID

  • mem_size (int) – Memory bytesize

  • mem_ptr (int) – Memory pointer to free

  • pmem_id (int) – Pooled memory object ID.

free_preprocess

Callback function invoked before releasing memory to memory pool.

Keyword Arguments
  • device_id (int) – CUDA device ID

  • mem_size (int) – Memory bytesize

  • mem_ptr (int) – Memory pointer to free

  • pmem_id (int) – Pooled memory object ID.

malloc_postprocess

Callback function invoked after retrieving memory from memory pool.

Keyword Arguments
  • device_id (int) – CUDA device ID

  • size (int) – Requested memory bytesize to allocate

  • mem_size (int) – Rounded memory bytesize allocated

  • mem_ptr (int) – Obtained memory pointer. 0 if an error occurred in malloc.

  • pmem_id (int) – Pooled memory object ID. 0 if an error occurred in malloc.
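The post-process callbacks carry the resulting pointer, which makes it possible to pair each malloc with its free. The following sketch tracks blocks that are currently handed out. LiveBlocks is a hypothetical stand-in, not a CuPy class, and the pointer values are made up; the callbacks are called by hand so the example runs without a GPU.

```python
class LiveBlocks:
    """Hypothetical stand-in that tracks blocks by their memory pointer."""

    def __init__(self):
        self.live = {}  # mem_ptr -> mem_size

    def malloc_postprocess(self, **kwargs):
        # Invoked after memory is retrieved from the memory pool.
        if kwargs["mem_ptr"] == 0:
            return  # a pointer of 0 signals that the malloc failed
        self.live[kwargs["mem_ptr"]] = kwargs["mem_size"]

    def free_postprocess(self, **kwargs):
        # Invoked after memory is released back to the memory pool.
        self.live.pop(kwargs["mem_ptr"], None)

    @property
    def live_bytes(self):
        return sum(self.live.values())


blocks = LiveBlocks()
# Made-up events: two successful mallocs, one failed malloc, one free.
blocks.malloc_postprocess(device_id=0, size=1000, mem_size=1024,
                          mem_ptr=0x1000, pmem_id=1)
blocks.malloc_postprocess(device_id=0, size=2048, mem_size=2048,
                          mem_ptr=0x2000, pmem_id=2)
blocks.malloc_postprocess(device_id=0, size=4096, mem_size=4096,
                          mem_ptr=0, pmem_id=0)
blocks.free_postprocess(device_id=0, mem_size=1024,
                        mem_ptr=0x1000, pmem_id=1)
print(blocks.live_bytes)
```

Keying on mem_ptr is a choice made for this sketch; pmem_id could serve the same purpose.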

name = 'LineProfileHook'