Feature request
Hello TRL Team!
First, I want to express my appreciation for your excellent work on this library - TRL has become an indispensable tool in our LLM development pipeline.
Proposal
I'd like to suggest an enhancement to improve flexibility when using vLLM with online methods. Currently, the vLLM client constructs its endpoint URL by combining host (IP address) and port parameters. To better support dynamic environments and scalable inference, could we add direct support for URL specification during client initialization?
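To make the proposal concrete, here is a minimal sketch of how the client could be initialized with a URL. The import path and keyword names are illustrative assumptions based on the current host/port behavior described above; `base_url` in particular is the proposed addition, not an existing parameter.

```python
from trl.extras.vllm_client import VLLMClient

# Today (illustrative): the client targets a specific instance by host + port,
# so the address changes every time the server pod is rescheduled.
client = VLLMClient(host="10.0.42.17", port=8000)

# Proposed (sketch only): accept a full base URL so the client can point at a
# stable proxy or load balancer instead of an instance-specific IP.
client = VLLMClient(base_url="http://vllm-proxy.internal:8000")
```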
Benefits
1. Infrastructure Agnosticism: Supports proxy setups, load balancers, and service meshes
2. K8s Friendliness: Aligns with cloud-native practices using stable service endpoints
3. Large-Scale Model Support: Serving a large model across more than one node with DP+TP (#3310) would become possible
Use Case
Our RL data pipeline requires:
1. Developers to spin up ephemeral vLLM instances
2. Automatic registration with our proxy service
3. Client configuration using a fixed proxy URL rather than instance-specific IPs
Motivation
In our Kubernetes-based infrastructure:
- vLLM servers receive dynamic IP assignments on each deployment
- We use a stable proxy service that routes requests to the current vLLM endpoint
- The current IP+port binding requires manual updates whenever servers restart
- A URL-based configuration would provide permanent endpoint addressing (see the sketch below)
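For instance, continuing the sketch above with the hypothetical `base_url` parameter, the trainer-side client could address the stable in-cluster Service name rather than a pod IP (the hostname and namespace below are examples, not part of TRL):

```python
from trl.extras.vllm_client import VLLMClient

# Sketch: target the stable proxy Service DNS name instead of a pod IP.
# `base_url` is the proposed parameter; the hostname below is an example.
client = VLLMClient(base_url="http://vllm-proxy.ml-infra.svc.cluster.local:8000")
```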
Your contribution
Would this be a valuable addition to the library? I'm happy to contribute this feature 😄 #3324
Thank you for reading this request!