[Core] Remote placement using gpu memory #26929
Comments
If this feature request is approved, I am happy to work on it and create a pull request.
Hi @fostiropoulos, sorry for the late reply. Could you elaborate on why the fractional GPU placement strategy is not suitable, and why it's not accelerator agnostic? cc @cadedaniel
+1. Also, assigning a fixed memory amount to a task or actor comes with user-experience problems: what happens if the task or actor consumes more than 20 megabytes of GPU memory? Ray currently defers management of GPU memory to the user code / application. If you give users the option to specify some number of megabytes, they'll be surprised when Ray does nothing to prevent their code from exceeding that budget. AFAIK this is a big part of why Ray has stuck to fractional placement for common accelerators like GPUs -- it is only a signal for scheduling tasks and actors and places no constraints on the application code.
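For reference, a minimal sketch of today's fractional placement, where the fraction is only a scheduling hint:

```python
import ray

ray.init()

# num_gpus=0.25 is purely a scheduling signal: Ray will co-locate up to
# four such tasks on one GPU, but nothing prevents any one of them from
# allocating the whole device's memory.
@ray.remote(num_gpus=0.25)
def which_gpu():
    # Ray communicates the assigned device via CUDA_VISIBLE_DEVICES.
    return ray.get_gpu_ids()

print(ray.get(which_gpu.remote()))
```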
Example 1: System 1 has an A100 GPU (40 GB of memory); System 2, as implied by the fraction below, has a GPU with 20 GB. For a remote that uses 10 GB of GPU memory, I would need to specify num_gpus=0.25 on System 1 and 0.5 on System 2, which would make my code not work out of the box on both systems. It would require either a user-configurable attribute or for the programmer to detect the memory available on a GPU and calculate the fractional GPU allocation on the fly (the feature I am suggesting).
Example 2:
@cadedaniel maybe my initial post was misunderstood. My clarification with examples above can help explain. GPU memory management and correct resource allocation remain the user's burden; a sketch of the workaround follows.
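As a concrete illustration of that burden, here is a sketch of the detect-and-divide boilerplate from Example 1. It assumes pynvml is installed; the parameter names are illustrative:

```python
import pynvml
import ray

def gpu_fraction(mem_bytes: int, device_index: int = 0) -> float:
    """Convert a fixed GPU-memory budget into a num_gpus fraction."""
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
    total = pynvml.nvmlDeviceGetMemoryInfo(handle).total
    pynvml.nvmlShutdown()
    return mem_bytes / total

# A 10 GB job resolves to num_gpus=0.25 on a 40 GB A100
# and num_gpus=0.5 on a 20 GB card.
frac = gpu_fraction(10 * 1024**3)

@ray.remote(num_gpus=frac)
def train():
    ...  # application code
```

Note the fragility: this probes the GPU visible to the driver process, which in a heterogeneous cluster may not match the GPU on the node Ray ultimately schedules the task onto.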
Thanks @fostiropoulos, your examples make sense to me and I feel it's something reasonable to be supported inside Ray. Let me add this to the backlog and chat with the team to see when and whether we want to do it.
Duplicate of #37574 |
Hi, a quick update on this: we have a REP and a prototype ready for review. Please try them out and leave feedback!
@fostiropoulos did you have a chance to check the REP and try the prototype?
Description
When running Ray on machines with different types of GPU accelerators, the fractional GPU placement strategy is not suitable. Instead, allow specifying GPU memory in megabytes, for example.
Additionally, the code is not accelerator agnostic and requires writing boilerplate code to determine the fractional GPU to use, even if all accelerators are the same on a given machine.
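Purely as an illustration of the request, the interface might look like the sketch below; the gpu_memory argument is hypothetical and not part of Ray's API here (the REP mentioned above describes the interface that was actually prototyped):

```python
import ray

# Hypothetical parameter: request 10 GB of GPU memory directly and let
# the scheduler translate it into a per-node device fraction.
@ray.remote(gpu_memory=10 * 1024**3)
def train():
    ...
```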
Use case
This applies both to making Ray remote code more portable and to improving GPU utilization for various applications when GPU types are inconsistent across a cluster.