@matthewygf matthewygf commented Aug 15, 2025

This PR adds initial support for the Ascend NPU 910b series to LMCache v0.3.3:

  • Support for multi_layer and single_layer transfer in the vLLM connector

Changes Made

  1. Added custom CMakeBuild system support based on Ascend CANN Toolkit 8.2
  2. Added kernel support for multi_layer and single_layer transfer
  3. Added a host register API to the pybind11 interface for Ascend, due to a limitation in the current Python API
  4. Extended the MixedMemory and PinnedMemory allocators to use the host register API
  5. Monkey-patched init_cache_engine in vllm_adapter to add a config check and to use the extended classes
  6. Updated the test_*.py files to use the necessary NPU APIs so that they run successfully. All unit tests pass on our platform except the skipped ones, which require kernels that are not yet supported.
  7. Included a Dockerfile for our A2 series
  8. Introduced a dynamic LMCacheAscendConnector that extends LMCacheConnector for the serve entry point and patches the necessary c_ops and functions
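
Items 4, 5 and 8 share one pattern: subclass an existing class to add an NPU-specific step, then swap the extended version in at import time. The sketch below illustrates only that pattern. Every name in it (`PinnedAllocator`, `HostRegisteredAllocator`, `fake_host_register`, `adapter`, `apply_patches`) is a hypothetical stand-in and not the actual LMCache, vLLM, or torch_npu API.

```python
import types

# Hypothetical stand-in for an upstream pinned-memory allocator.
class PinnedAllocator:
    def __init__(self, size: int):
        self.size = size
        self.buffer = bytearray(size)

# Hypothetical stand-in for a native host-register binding.
# It only records which buffers were "registered".
registered = []

def fake_host_register(buf: bytearray) -> None:
    registered.append(id(buf))

class HostRegisteredAllocator(PinnedAllocator):
    """Extends the base allocator by registering its buffer after allocation."""

    def __init__(self, size: int):
        super().__init__(size)
        fake_host_register(self.buffer)

# Hypothetical stand-in for an adapter module exposing an init function.
adapter = types.SimpleNamespace()

def init_cache_engine(size: int) -> PinnedAllocator:
    return PinnedAllocator(size)

adapter.init_cache_engine = init_cache_engine

def patched_init_cache_engine(size: int) -> PinnedAllocator:
    # Config check first, then build with the extended class.
    if size <= 0:
        raise ValueError("size must be positive")
    return HostRegisteredAllocator(size)

def apply_patches() -> None:
    # Monkey patch: rebind the attribute so later callers get the new function.
    adapter.init_cache_engine = patched_init_cache_engine

apply_patches()
alloc = adapter.init_cache_engine(1024)
```

Callers that look up `adapter.init_cache_engine` after `apply_patches()` runs receive the extended allocator without any change to their own code, which is why the patch has to be applied before the entry point is used.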

related issue
related pr

Prerequisites

NPU driver version >= 24.1

CANN Toolkit version >= 8.2

Compatible versions of vllm-ascend and torch-npu installed

We currently support the following vLLM-ascend versions:

  • v0.9.2
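
As a rough illustration of checking these minimums, here is a small sketch. It assumes version strings look like `8.2.RC1` or `v0.9.2` and compares only the leading numeric components, ignoring pre-release suffixes. How the installed versions are obtained depends on the environment and is left out. The helper names `parse_version` and `meets_minimum` are hypothetical and not part of this PR.

```python
import re

def parse_version(text: str) -> tuple:
    """Return the leading numeric components, e.g. '8.2.RC1' -> (8, 2)."""
    parts = []
    for piece in text.lstrip("v").split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)

def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare numeric components only; suffixes such as 'RC1' are ignored."""
    return parse_version(installed) >= parse_version(minimum)

# vLLM-ascend versions listed as supported in this PR description.
SUPPORTED_VLLM_ASCEND = {(0, 9, 2)}

def vllm_ascend_supported(installed: str) -> bool:
    return parse_version(installed) in SUPPORTED_VLLM_ASCEND

print(meets_minimum("8.2.RC1", "8.2"))   # True
print(meets_minimum("8.1.RC1", "8.2"))   # False
print(vllm_ascend_supported("v0.9.2"))   # True
```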

Build Instructions

pip install --no-build-isolation -v -e .

Test Coverage

We ran the unit tests successfully for the main patched classes.

Related PR

The following PR in vllm-ascend is related to supporting LMCache

Roadmap

We currently support only eager-mode execution. We plan to add the following features in the near future:

  • Cachegen
  • Cacheblend
  • Graph Mode with torch_npu torchair
  • P/D with NPU Transport

gfmyeung and others added 2 commits August 15, 2025 10:51
Signed-off-by: matthewygf <yyygggfff@hotmail.com>

Co-authored-by: Marco Barletta <barlettamarco8@gmail.com>

Co-authored-by: chloroethylene <jjysama@gmail.com>
Signed-off-by: matthewygf <yyygggfff@hotmail.com>
@matthewygf matthewygf assigned YaoJiayi and matthewygf and unassigned YaoJiayi Aug 15, 2025
@YaoJiayi YaoJiayi left a comment

lgtm
