This issue tracks progress on running Bamba on vLLM.
Success for this issue implies the following:
- Running the model successfully from the HF checkpoint in vLLM (Add Bamba Model vllm-project/vllm#10909)
- Ensuring chunked prefill and tensor parallelism (TP) work in vLLM (see the sketch after this list)
- Closing the performance gap in vLLM with respect to Llama models of similar size
- Reporting the performance results in a blog post
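For concreteness, here is a minimal sketch of what the first two items might look like once vllm-project/vllm#10909 lands. The checkpoint id below is an assumption for illustration; `enable_chunked_prefill` and `tensor_parallel_size` are existing vLLM engine arguments, but whether they work for Bamba is exactly what this issue tracks:

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="ibm-ai-platform/Bamba-9B",  # assumed HF checkpoint id, for illustration
    tensor_parallel_size=2,            # exercises TP across 2 GPUs
    enable_chunked_prefill=True,       # exercises chunked prefill
)

params = SamplingParams(temperature=0.0, max_tokens=64)
outputs = llm.generate(["The capital of France is"], params)
print(outputs[0].outputs[0].text)
```

A run like this completing with sensible output, with both flags enabled, would cover the first two success criteria.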