How "vllm-server" option is meant to work? #17079
Unanswered
About-to-break asked this question in Q&A
Replies: 1 comment
Hello, "vl_rec_backend" refers to the inference backend, which is essentially a method of performing inference. The "vl_rec_server_url" is the link to the started vLLM service. It can be configured to point to any vLLM service—as long as the model is integrated with vLLM and the service is launched, this link will be available. There is no scenario where an API key is missing. We recommend first familiarizing yourself with the specific process of deploying vLLM. To use a custom visual language model, simply launch a service based on that model and enter the service address into the vl_rec_server_url field. |
Sorry if this was asked before, or if I'm just being dumb.
When initializing the PaddleOCRVL class, there are options such as "vl_rec_backend": "vllm-server" and "vl_rec_server_url": "http://gpu_server.local:8000/v1". How is this supposed to be used if I cannot specify an API key for vLLM, or the model name? I want to use vLLM to run a custom compatible VL model and still have the benefits of Paddle's own ROI detectors. And how is this even supposed to work? Do all detected ROIs get recognized by the VL model one by one, with the results then assembled into the full output?
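Regarding the API key and model name: a vLLM service exposes an OpenAI-compatible API, so both can be checked directly against the running server before pointing PaddleOCRVL at it. A minimal sketch, assuming the openai client package and a server started without --api-key (in which case the conventional "EMPTY" placeholder key is enough); the URL is the one from the question:

```python
from openai import OpenAI  # the vLLM server speaks the OpenAI-compatible API

# A vLLM server started without --api-key does not check the key; the
# OpenAI client still requires a string, so "EMPTY" is the usual placeholder.
client = OpenAI(base_url="http://gpu_server.local:8000/v1", api_key="EMPTY")

# The model name is not lost either: the server reports what it is hosting.
for model in client.models.list().data:
    print(model.id)
```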