@AviralGoelAMD

  • Add AQ offset calculation in SplitKBatchOffset for ABQuant mode
  • Update mode restriction to support ABQuant alongside BQuant
  • Add AQ alignment validation for split-K batches
  • Apply AQ pointer offsetting based on tensor layout
  • Update block window creation to handle AQ group offsets
  • Requires non-preshuffle mode for both A and B quantization
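The offset and alignment steps listed above can be pictured with a small host-side sketch. This is not the kernel's code: `AQLayout`, `SplitKAQOffset`, `make_splitk_aq_offset`, and all parameter names are hypothetical. The sketch assumes that K divides evenly across the split-K batches and that AQ stores one scale per row per K-group, i.e. an M x (K / group_size_k) tensor.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names throughout; this is an illustration, not the CK Tile API.
enum class AQLayout { RowMajor, ColumnMajor };

struct SplitKAQOffset {
    std::int64_t k_offset;        // first K index handled by this split
    std::int64_t aq_group_offset; // first quantization group along K for this split
    std::int64_t aq_ptr_offset;   // element offset to add to the AQ base pointer
};

// Fills `out` and returns true when the split is valid.
// Returns false when a split would not start on a quant-group boundary.
bool make_splitk_aq_offset(std::int64_t M,
                           std::int64_t K,
                           std::int64_t k_batch,
                           std::int64_t group_size_k,
                           std::int64_t k_id,
                           AQLayout layout,
                           SplitKAQOffset& out)
{
    assert(M > 0 && K > 0 && k_batch > 0 && group_size_k > 0);
    assert(k_id >= 0 && k_id < k_batch);

    // Assumption of this sketch: K is divided evenly across splits.
    if (K % k_batch != 0) return false;
    const std::int64_t k_per_split = K / k_batch;

    // Alignment validation: every split must cover whole quant groups,
    // otherwise one scale would be shared by two splits.
    if (k_per_split % group_size_k != 0) return false;

    const std::int64_t k_offset     = k_id * k_per_split;
    const std::int64_t group_offset = k_offset / group_size_k;

    // AQ is M x (K / group_size_k).
    // Row-major: groups of one row are contiguous, so step by group_offset.
    // Column-major: each group column holds M scales, so step by group_offset * M.
    const std::int64_t ptr_offset =
        (layout == AQLayout::RowMajor) ? group_offset : group_offset * M;

    out = SplitKAQOffset{k_offset, group_offset, ptr_offset};
    return true;
}
```

With M = 16, K = 1024, k_batch = 4, and group_size_k = 128, split 2 starts at K index 512, which is group 4; the pointer offset is 4 elements for row-major AQ and 64 for column-major AQ. With k_batch = 16 each split would span only 64 K elements, less than one group, so the helper rejects it.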

- Create test_gemm_quant_abquant_splitk_decode.cpp for decode workloads
- Create test_gemm_quant_abquant_splitk_prefill.cpp for prefill workloads
- Add k_batch parameter to ABQuant test fixture run_test_with_validation
- Fix missing c_m_n_dev_buf.SetZero() in ABQuant fixture (critical for split-K atomic operations)
- Register new test targets in CMakeLists.txt
- Tests cover split_k values: 2, 3, 4, 5
- All 16 tests passing (8 FP8, 8 BF8)
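Why the missing `SetZero()` matters for split-K can be shown with a CPU-only sketch. It assumes, as the commit message suggests, that each K-slice adds its partial result into C rather than overwriting it; the function `splitk_gemm_accumulate` and its layout conventions are hypothetical and only mimic that accumulate-into-C behavior with a plain `+=` standing in for the atomic add.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU stand-in for a split-K GEMM whose epilogue accumulates into C.
// A is M x K, B is K x N, C is M x N, all row-major.
void splitk_gemm_accumulate(const std::vector<float>& A,
                            const std::vector<float>& B,
                            std::vector<float>& C,
                            std::size_t M,
                            std::size_t N,
                            std::size_t K,
                            std::size_t k_batch)
{
    assert(k_batch > 0 && K % k_batch == 0); // sketch assumes an even split
    const std::size_t k_per_split = K / k_batch;

    for (std::size_t k_id = 0; k_id < k_batch; ++k_id) {
        const std::size_t k_begin = k_id * k_per_split;
        const std::size_t k_end   = k_begin + k_per_split;
        for (std::size_t m = 0; m < M; ++m) {
            for (std::size_t n = 0; n < N; ++n) {
                float partial = 0.0f;
                for (std::size_t k = k_begin; k < k_end; ++k) {
                    partial += A[m * K + k] * B[k * N + n];
                }
                // Stands in for the atomic add: the slice adds to whatever
                // is already in C, so stale contents leak into the result.
                C[m * N + n] += partial;
            }
        }
    }
}
```

With all-ones 2 x 4 and 4 x 2 inputs and k_batch = 2, a zeroed C ends up holding 4 in every element, while a C that started with stale 1s ends up holding 5. That is the kind of mismatch a fixture without zero-initialization would report.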
@AviralGoelAMD marked this pull request as ready for review on February 9, 2026, 21:15
@AviralGoelAMD requested a review from a team as a code owner on February 9, 2026, 21:16
@AviralGoelAMD requested a review from Copilot on February 9, 2026, 21:19

Copilot AI left a comment

Pull request overview

Extends Split-K support in CK Tile quantized GEMM to cover ABQuantGrouped mode (in addition to existing BQuantGrouped), and adds Split-K ABQuant test coverage for decode/prefill shapes.

Changes:

  • Add Split-K AQ/BQ group offset + pointer offset handling for ABQuantGrouped in QuantGemmKernel (with added alignment validation).
  • Update ABQuant test fixture to accept k_batch and zero-initialize C for Split-K atomic accumulation.
  • Add new ABQuant Split-K gtests (decode + prefill) and wire them into the test CMake.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| projects/composablekernel/test/ck_tile/gemm_block_scale/test_gemm_quant_fixtures.hpp | Add k_batch parameter to ABQuant fixture and zero C buffer before Split-K runs |
| projects/composablekernel/test/ck_tile/gemm_block_scale/test_gemm_quant_abquant_splitk_prefill.cpp | New ABQuantGrouped Split-K tests for prefill shape |
| projects/composablekernel/test/ck_tile/gemm_block_scale/test_gemm_quant_abquant_splitk_decode.cpp | New ABQuantGrouped Split-K tests for decode shape |
| projects/composablekernel/test/ck_tile/gemm_block_scale/CMakeLists.txt | Build new ABQuant Split-K test executables |
| projects/composablekernel/include/ck_tile/ops/gemm_quant/kernel/gemm_quant_kernel.hpp | Implement Split-K support for ABQuantGrouped (AQ offsets, constraints, block windows) |

- Add missing aq_group_offset param to MakeAQBlockWindow call
- Use actual k_batch for tolerance calculation in split-K tests
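One way to see why the tolerance should depend on the real k_batch is a deliberately simple error model, sketched below. It is not the threshold computation used by the test fixture: `Tolerance`, `splitk_tolerance`, and the parameters `acc_eps`, `out_eps`, and `max_abs_value` are hypothetical. The model assumes rounding error grows linearly with the number of terms summed, first within a K-slice in accumulator precision and then across the k_batch partial results in output precision.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical result type for this sketch.
struct Tolerance {
    double rtol;
    double atol;
};

// Illustrative model only: error is assumed proportional to the number of
// accumulated terms times the unit roundoff of the precision used.
Tolerance splitk_tolerance(std::int64_t K,
                           std::int64_t k_batch,
                           double acc_eps,       // assumed roundoff of the accumulator type
                           double out_eps,       // assumed roundoff of the output type
                           double max_abs_value) // largest expected |C| element
{
    assert(K > 0 && k_batch > 0 && k_batch <= K);

    // Each split accumulates ceil(K / k_batch) products.
    const std::int64_t k_per_split = (K + k_batch - 1) / k_batch;
    const double rtol_inner = acc_eps * static_cast<double>(k_per_split);

    // The k_batch partial results are then summed in the output precision.
    const double rtol_outer = out_eps * static_cast<double>(k_batch);

    const double rtol = std::max(rtol_inner, rtol_outer);
    return Tolerance{rtol, rtol * max_abs_value};
}
```

Under this model, K = 1024 with k_batch = 4 gives 256 accumulations per split, whereas passing k_batch = 1 for the same run would model 1024 accumulations and yield a looser bound than the run warrants. The specific constants are placeholders; only the dependence on k_batch is the point.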
@ThomasNing

@AviralGoelAMD Could we add support for BQuant preshuffle?
