Pull requests: vllm-project/vllm-gaudi
#400  [WIP] Unified Attention - partial batch persistence (Draft), opened Oct 13, 2025 by kzawora-intel
#399  Unified Attention - High Level Profiler Integration, opened Oct 13, 2025 by kzawora-intel
#392  Docs installation, quick start and build fixes (#384), opened Oct 13, 2025 by PatrykWo (labels: documentation, skip-gaudi-tests)
#385  Fix typo in installation.md: correct script name to install_nixl.py, opened Oct 10, 2025 by yafshar
#360  Fix issue with async_scheduling when dealing with chunked input, opened Oct 8, 2025 by tianmu-li
#359  Fix issue with async_scheduling when dealing with chunked input, opened Oct 8, 2025 by tianmu-li
#356  [v0.10.2] Update SHA; the v0.10.2rc3 SHA was being used because the v0.10.2 SHA was missing, opened Oct 8, 2025 by xuechendi
#352  Add missing prompt bucket to warmup when max_ctx is 0, opened Oct 8, 2025 by iboiko-habana
#348  Update VLLM_PROMPT_BS_BUCKET_MAX logic: real batch-size change, not only linear warmup, opened Oct 8, 2025 by iboiko-habana
#347  Cherry-pick CD Docker fixes/commits from v0.10.2 to v0.11.0, opened Oct 8, 2025 by nngokhale
#343  [0.11.0] [SW-241908] Omit all prompt buckets that exceed max_num_batched_tokens (#331) - cherry-pick, opened Oct 8, 2025 by skavulya
#342  [0.10.2] [SW-241908] Omit all prompt buckets that exceed max_num_batched_tokens (#331) - cherry-pick, opened Oct 8, 2025 by skavulya