Add block ptr test for dot product with transpose #4510

alexbaden · 2025-06-16T14:19:01Z

Adds an end-to-end block ptr dot product test to the block load unit test (maybe the file should be renamed `test_block_ptr.py`?). The test covers additional shapes, A transpose, and B transpose. The cold runtime (no cache) is approximately 1 minute on PVC 1100 in my environment. I picked the block shapes somewhat randomly, trying to balance breadth and runtime.

This somewhat duplicates tutorial 10 but allows us to run many more combinations in shorter time. I added this because #4463 is passing CI but has a few bugs that are not being caught by existing unit tests, including tutorials.
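As a rough illustration of what an end-to-end dot product check with A and B transpose looks like, here is a minimal pure-Python sketch. It is not the test added in this PR and does not use Triton: `matmul_ref`, `transpose`, and `blocked_matmul` are hypothetical helper names, and the block sizes are arbitrary. It only shows the idea of computing the product tile by tile from operands that may be stored transposed, then comparing against a naive reference.

```python
import itertools


def matmul_ref(a, b):
    """Naive reference product: a is M x K, b is K x N (lists of rows)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a list-of-rows matrix."""
    return [list(r) for r in zip(*m)]


def blocked_matmul(a, b, block_m, block_n, block_k, a_transposed=False, b_transposed=False):
    """Tile-by-tile product. If a_transposed, a is stored as K x M; if b_transposed, b is stored as N x K."""
    M = len(a[0]) if a_transposed else len(a)
    K = len(a) if a_transposed else len(a[0])
    N = len(b) if b_transposed else len(b[0])
    # Element accessors hide the storage layout from the tile loop.
    load_a = (lambda i, k: a[k][i]) if a_transposed else (lambda i, k: a[i][k])
    load_b = (lambda k, j: b[j][k]) if b_transposed else (lambda k, j: b[k][j])
    c = [[0] * N for _ in range(M)]
    tiles = itertools.product(range(0, M, block_m), range(0, N, block_n), range(0, K, block_k))
    for m0, n0, k0 in tiles:
        for i in range(m0, min(m0 + block_m, M)):
            for j in range(n0, min(n0 + block_n, N)):
                # Accumulate the partial product contributed by this K tile.
                c[i][j] += sum(load_a(i, k) * load_b(k, j) for k in range(k0, min(k0 + block_k, K)))
    return c


# Hypothetical small problem: integer inputs keep the comparison exact.
A = [[i * 7 + k for k in range(7)] for i in range(5)]       # 5 x 7
B = [[k * 3 + j - 4 for j in range(3)] for k in range(7)]   # 7 x 3
EXPECTED = matmul_ref(A, B)
```

Every combination of the two transpose flags should reproduce `EXPECTED` when the corresponding operand is passed in its transposed storage layout.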

whitneywhtsang · 2025-06-16T19:24:40Z

> I added this because #4463 is passing CI but has a few bugs that are not being caught by existing unit tests, including tutorials.

Is it because some configs/shapes are missing from tutorial 10 or is it because autotune would hide failures?

alexbaden · 2025-06-16T19:27:37Z

> Is it because some configs/shapes are missing from tutorial 10 or is it because autotune would hide failures?

Both reasons. Tutorial 10 is also more expensive to run because its overall shape is bigger. Here we make the overall shape smaller, which dramatically improves runtime. That is fine because the shape only needs to be large enough to have different configurations in a single block, whereas in tutorial 10 we want to see performance and need bigger shapes to really get good throughput.
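To make the runtime argument concrete, here is a small arithmetic sketch. It assumes a launch grid with one program per output tile, which is how matmul kernels are commonly launched in Triton; the shapes and block sizes below are illustrative only and are not taken from this PR or from tutorial 10. `cdiv` and `num_programs` are hypothetical helper names.

```python
def cdiv(x, y):
    """Ceiling division for positive integers."""
    return (x + y - 1) // y


def num_programs(m, n, block_m, block_n):
    """Number of programs in a 2D grid with one program per output tile (assumed launch scheme)."""
    return cdiv(m, block_m) * cdiv(n, block_n)


# Illustrative shapes: a small test-sized problem vs. a benchmark-sized one.
small = num_programs(256, 256, 64, 64)
large = num_programs(4096, 4096, 64, 64)
print(small, large)
```

Under these assumptions the small shape still spans several tiles, so the block-level code paths are exercised, while doing far less total work than the benchmark-sized shape.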

whitneywhtsang

IMO we can remove all comments about the kernel, as they add maintenance cost; one can refer to tutorial 3 or 10 to understand how to write a GEMM in Triton.

whitneywhtsang · 2025-06-16T19:49:11Z

Another thought is we can add test_op in tutorial 10, similar to how it is done in tutorial 6.

alexbaden · 2025-06-16T19:52:05Z

Let's go with this approach for now. I can move it to tutorial 10 later (or enable the tensor descriptor version of tutorial 10 and then add test_op there, if upstream does not do that first). After all, theoretically block_ptr won't be around much longer...

whitneywhtsang · 2025-06-16T19:55:14Z

> Let's go with this approach for now. I can move it to tutorial 10 later (or enable the tensor descriptor version of tutorial 10 and then add test_op there, if upstream does not do that first). After all, theoretically block_ptr won't be around much longer...

Sure, I am fine to do this first.

whitneywhtsang · 2025-06-16T21:47:04Z

> IMO we can remove all comments about the kernel, as they add maintenance cost; one can refer to tutorial 3 or 10 to understand how to write a GEMM in Triton.

@alexbaden WDYT?

alexbaden · 2025-06-17T00:39:42Z

I think it's ok as is - long term we should either remove it, since block_ptr will be deprecated, or fold it into tutorial 10 (but probably remove it).
For now, though, this coverage is important because it tells us the block load lowering has no regressions - we will still rely on that lowering even if we move completely to tensor descriptors.

alexbaden · 2025-06-17T00:53:19Z

Synced with Whitney offline - going to merge this now and then either merge it into tutorial 10 (if block ptr sticks around) or probably delete it alongside tutorial 10 when block ptr goes away.

Add block ptr test for dot product with transpose

9a3322a

alexbaden requested review from anmyachev, whitneywhtsang and chengjunlu June 16, 2025 14:19

alexbaden mentioned this pull request Jun 16, 2025

Use the Subgroup 2D Block Encoding in LoadStoreOpToLLVM #4500

Draft

whitneywhtsang reviewed Jun 16, 2025

alexbaden added 2 commits June 16, 2025 19:39

address review comments

d852a35

remove device print

1e80f67

alexbaden requested a review from whitneywhtsang June 16, 2025 19:40

whitneywhtsang reviewed Jun 16, 2025

Update python/test/unit/intel/test_block_load.py

06f2a5a

Co-authored-by: Whitney Tsang <whitney.tsang@intel.com>

whitneywhtsang approved these changes Jun 16, 2025

alexbaden merged commit e279ab7 into main Jun 17, 2025
15 checks passed

alexbaden deleted the alex/add_dpas_block_tests branch June 17, 2025 00:53

chengjunlu reviewed Jun 17, 2025

david-hls pushed a commit to david-hls/intel-xpu-backend-for-triton that referenced this pull request Jun 18, 2025

[backend] NFC: Fix ptx st argument order (intel#4510)

fcd2be8

This avoids a `Vector Type not specified properly` warning from ptxas.
