-
Notifications
You must be signed in to change notification settings - Fork 458
[4/N][Refactor] Refactor AscendAttentionMetadataBuilder
for better extensibility and make the builder class of torchair extend from it
#2375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request refactors the build
method in AscendAttentionMetadataBuilder
by extracting _prepare_build_info
and _assemble_build_info
. This is a good change that improves modularity and extensibility, as demonstrated by the new AscendAttentionTorchairMetadataBuilder
. The implementation is solid, but I've found one area in the new torchair_attention.py
file with some confusing code that could be clarified.
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according Contributing and Testing. |
_prepare_build_info()
and _assemble_build_info()
from build()
in AscendAttentionMetadataBuilder
AscendAttentionMetadataBuilder
for better extensibility and make the builder class of torchair extend from it
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
Codecov Report❌ Patch coverage is
❌ Your patch check has failed because the patch coverage (70.27%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #2375 +/- ##
==========================================
+ Coverage 72.61% 72.64% +0.03%
==========================================
Files 154 154
Lines 21319 21333 +14
==========================================
+ Hits 15480 15497 +17
+ Misses 5839 5836 -3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
badf9d1
to
0f6bbc8
Compare
@wangxiyuan This PR has been peer reviewed in the meeting before, and the CI has all passed finally. |
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
…make the builder class of torchair extend from it Signed-off-by: shen-shanshan <467638484@qq.com>
test: pytest -sv tests/e2e/multicard/test_torchair_graph_mode.py::test_e2e_qwen2_with_torchair output: Generated text: 'Hello, my name is Alex and I am a'
Generated text: 'The president of the United States is a very important person.'
Generated text: 'The capital of France is Paris. It is the'
Generated text: 'The future of AI is in the hands of the'
PASSED |
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
What this PR does / why we need it?
Refactor
AscendAttentionMetadataBuilder
for better extensibility and make the builder class of torchair extend from it.Extract
_assemble_build_info()
and_assemble_attn_metadata()
method frombuild()
inAscendAttentionMetadataBuilder
for better extensibility.Workflow of
build()
method:_assemble_build_info()
: the custom logic that can be overwritten intorchair_attention.py
._assemble_attn_metadata()
: the custom logic that can be overwritten intorchair_attention.py
.After this refactor, we can remove the
build()
method inAscendAttentionTorchairMetadataBuilder
, and just need to overwrite these two methods:_assemble_build_info()
and_assemble_attn_metadata()
.Note
Do not merge this PR before #2017.
Does this PR introduce any user-facing change?
How was this patch tested?