madsbk
Member

@madsbk madsbk commented Sep 4, 2024

Depends on rapidsai/rmm#1665

Introduce the spill_oom_protection option that uses managed memory when spilling-on-demand would otherwise crash with an OOM error.

This targets our CUDF_SPILL users, whose workflows cudf-spilling can generally handle but which might encounter memory hotspots that sometimes trigger an OOM crash. With CUDF_SPILL_OOM_PROTECTION, these hotspots will now use managed memory. If there are no hotspots that CUDF_SPILL cannot handle, this option does nothing.

This option does not target heavily oversubscribed workflows; in such cases, using managed memory with prefetching is preferable.
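
To illustrate the intended behavior, here is a toy model of the fallback in plain Python. It is only a sketch and not the cuDF or RMM implementation: `ToyDevice`, `OutOfMemory`, and the `oom_protection` flag are hypothetical names made up for this example, and sizes are arbitrary units. The assumed policy is: spill on demand first, and only place an allocation in managed memory when nothing more can be spilled.

```python
class OutOfMemory(Exception):
    """Stand-in for a device out-of-memory error (hypothetical)."""


class ToyDevice:
    """Toy model of a device pool with spill-on-demand (not cuDF code)."""

    def __init__(self, capacity, oom_protection=False):
        self.capacity = capacity
        self.oom_protection = oom_protection
        self.device = {}   # name -> (size, spillable), resident on the device
        self.host = {}     # name -> size, buffers spilled to host memory
        self.managed = {}  # name -> size, allocations placed in managed memory

    def used(self):
        return sum(size for size, _ in self.device.values())

    def _spill_one(self):
        # Spill the oldest spillable buffer; report whether anything was spilled.
        for name, (size, spillable) in list(self.device.items()):
            if spillable:
                del self.device[name]
                self.host[name] = size
                return True
        return False

    def allocate(self, name, size, spillable=True):
        # Spill on demand until the request fits on the device.
        while self.used() + size > self.capacity:
            if not self._spill_one():
                # Nothing left to spill: this is the "hotspot" case.
                if self.oom_protection:
                    self.managed[name] = size
                    return "managed"
                raise OutOfMemory(name)
        self.device[name] = (size, spillable)
        return "device"


dev = ToyDevice(capacity=100, oom_protection=True)
dev.allocate("a", 60)
dev.allocate("b", 60)                        # fits after spilling "a"
dev.allocate("pinned", 30, spillable=False)  # cannot be spilled later
where = dev.allocate("hot", 80)              # spilling "b" is not enough
```

In this model the last allocation lands in managed memory only because spilling alone cannot make room; with `oom_protection=False` the same request raises `OutOfMemory`, and a workload that spilling can fully handle never touches managed memory.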

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • Update docstrings
  • The documentation is up to date with these changes.

@madsbk madsbk added the improvement (Improvement / enhancement to an existing function) and non-breaking (Non-breaking change) labels Sep 4, 2024
@github-actions github-actions bot added the Python (Affects Python cuDF API.) label Sep 4, 2024
@madsbk madsbk closed this Oct 10, 2024