shen-shanshan
Collaborator

@shen-shanshan shen-shanshan commented Aug 14, 2025

What this PR does / why we need it?

Refactor AscendAttentionMetadataBuilder for better extensibility and make the builder class of torchair extend from it.

Extract the _assemble_build_info() and _assemble_attn_metadata() methods from build() in AscendAttentionMetadataBuilder so that subclasses can customize them.

Workflow of build() method:

  • Prepare build info: the common logic of preparing build info.
  • _assemble_build_info(): custom logic that can be overridden in torchair_attention.py.
  • _assemble_attn_metadata(): custom logic that can be overridden in torchair_attention.py.

After this refactor, we can remove the build() method from AscendAttentionTorchairMetadataBuilder and only need to override two methods: _assemble_build_info() and _assemble_attn_metadata().
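The workflow above follows the template-method pattern. A minimal sketch of that structure is shown below. The class names, the AttnMetadata fields, and the method signatures are simplified, hypothetical stand-ins for illustration and are not the actual vllm-ascend code; only the hook names _assemble_build_info() and _assemble_attn_metadata() come from this PR description.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class AttnMetadata:
    """Hypothetical stand-in for the real attention metadata object."""
    num_actual_tokens: int
    seq_lens: List[int]
    extra: Dict[str, Any]


class BaseMetadataBuilder:
    """Simplified stand-in for AscendAttentionMetadataBuilder."""

    def build(self, num_actual_tokens: int, seq_lens: List[int]) -> AttnMetadata:
        # Step 1: common preparation shared by every builder.
        common = {"num_actual_tokens": num_actual_tokens,
                  "seq_lens": list(seq_lens)}
        # Step 2: hook for backend-specific build info.
        build_info = self._assemble_build_info(common)
        # Step 3: hook for constructing the final metadata object.
        return self._assemble_attn_metadata(build_info)

    def _assemble_build_info(self, common: Dict[str, Any]) -> Dict[str, Any]:
        # Default: pass the common info through unchanged.
        return dict(common)

    def _assemble_attn_metadata(self, build_info: Dict[str, Any]) -> AttnMetadata:
        info = dict(build_info)
        return AttnMetadata(
            num_actual_tokens=info.pop("num_actual_tokens"),
            seq_lens=info.pop("seq_lens"),
            extra=info,  # whatever the hooks added on top of the common info
        )


class TorchairMetadataBuilder(BaseMetadataBuilder):
    """Simplified stand-in for AscendAttentionTorchairMetadataBuilder.

    It does not define build(); it only overrides the two hooks.
    """

    def _assemble_build_info(self, common: Dict[str, Any]) -> Dict[str, Any]:
        info = super()._assemble_build_info(common)
        info["graph_mode"] = True  # hypothetical torchair-only field
        return info

    def _assemble_attn_metadata(self, build_info: Dict[str, Any]) -> AttnMetadata:
        metadata = super()._assemble_attn_metadata(build_info)
        metadata.extra["backend"] = "torchair"  # hypothetical marker
        return metadata


if __name__ == "__main__":
    print(BaseMetadataBuilder().build(3, [1, 2]))
    print(TorchairMetadataBuilder().build(3, [1, 2]))
```

In this sketch the subclass inherits build() unchanged, so the common preparation step lives in one place and only the backend-specific parts differ.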

Note

Do not merge this PR before #2017.

Does this PR introduce any user-facing change?

How was this patch tested?

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refactors the build method in AscendAttentionMetadataBuilder by extracting _prepare_build_info and _assemble_build_info. This is a good change that improves modularity and extensibility, as demonstrated by the new AscendAttentionTorchairMetadataBuilder. The implementation is solid, but I've found one area in the new torchair_attention.py file with some confusing code that could be clarified.


👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling in the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

@shen-shanshan shen-shanshan changed the title from "[4/N][Refactor] Extract _prepare_build_info() and _assemble_build_info() from build() in AscendAttentionMetadataBuilder" to "[4/N][Refactor] Refactor AscendAttentionMetadataBuilder for better extensibility and make the builder class of torchair extend from it" on Aug 15, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.


codecov bot commented Aug 22, 2025

Codecov Report

❌ Patch coverage is 70.27027% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.64%. Comparing base (2693196) to head (ea15107).
⚠️ Report is 2 commits behind head on main.

Files with missing lines                      Patch %   Lines
vllm_ascend/torchair/torchair_attention.py    25.00%    9 Missing ⚠️
vllm_ascend/attention/attention_v1.py         92.00%    2 Missing ⚠️

❌ Your patch check has failed because the patch coverage (70.27%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.
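The reported patch coverage can be reproduced from the per-file figures above. The sketch below assumes the displayed per-file percentages (25.00% and 92.00%) are exact rather than rounded; under that assumption the patch touches 37 coverable lines, 26 of which are covered.

```python
from fractions import Fraction

# (patch coverage in percent, missing lines) per file, copied from the
# Codecov table. Treating the percentages as exact is an assumption.
reported = {
    "vllm_ascend/torchair/torchair_attention.py": (Fraction(25), 9),
    "vllm_ascend/attention/attention_v1.py": (Fraction(92), 2),
}


def changed_lines(coverage_pct: Fraction, missing: int) -> Fraction:
    # missing = total * (1 - coverage), so total = missing / (1 - coverage)
    return missing / (1 - coverage_pct / 100)


total = sum(changed_lines(pct, miss) for pct, miss in reported.values())
missing = sum(miss for _, miss in reported.values())
patch_coverage = (total - missing) / total * 100

print(f"changed lines: {total}, missing: {missing}")  # 37 and 11
print(f"patch coverage: {float(patch_coverage):.5f}%")  # 70.27027%
```

With an 80.00% target, at least 30 of the 37 lines would need to be covered, so four more covered lines would be enough for the patch check to pass under these assumptions.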

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2375      +/-   ##
==========================================
+ Coverage   72.61%   72.64%   +0.03%     
==========================================
  Files         154      154              
  Lines       21319    21333      +14     
==========================================
+ Hits        15480    15497      +17     
+ Misses       5839     5836       -3     
Flag        Coverage Δ
unittests   72.64% <70.27%> (+0.03%) ⬆️

@shen-shanshan
Collaborator Author

@wangxiyuan This PR was peer reviewed in an earlier meeting, and all CI checks now pass.

…make the builder class of torchair extend from it

Signed-off-by: shen-shanshan <467638484@qq.com>
@shen-shanshan
Collaborator Author

test:

pytest -sv tests/e2e/multicard/test_torchair_graph_mode.py::test_e2e_qwen2_with_torchair

output:

Generated text: 'Hello, my name is Alex and I am a'
Generated text: 'The president of the United States is a very important person.'
Generated text: 'The capital of France is Paris. It is the'
Generated text: 'The future of AI is in the hands of the'
PASSED

@shen-shanshan shen-shanshan added the ready-for-test (start test by label for PR) label and removed the ready-for-test (start test by label for PR) and accuracy-test (enable all accuracy test for PR) labels Sep 16, 2025
@shen-shanshan shen-shanshan added the accuracy-test (enable all accuracy test for PR) label Sep 16, 2025
Labels
accuracy-test enable all accuracy test for PR merge-conflicts ready-for-test start test by label for PR