[optimize-dot-operands]: Fuse load and trans operations - part 2 #4468
Signed-off-by: Tiotto, Ettore <ettore.tiotto@intel.com>
…tt.dot Signed-off-by: Tiotto, Ettore <ettore.tiotto@intel.com>
third_party/intel/lib/TritonIntelGPUTransforms/OptimizeDotOperands.cpp
If I understand the problem correctly, we are trying to rewrite the original
That being said, I wonder whether we could resolve the original problem without changing the tensor ptr type. That might not be possible now, since the only information carried about the 2D block encoding is in the string attribute, but it might become possible to convey this information with the Subgroup 2D Block encoding layout in the future.
This PR enhances the new transformation pass aimed at fusing tt.load and tt.trans operations. Specifically, it adds support for loop-carried arguments used (possibly transitively) by the candidate tt.load that should be fused with a tt.trans.

Example:

Here the load %20 is a candidate for fusion with the tt.trans operation. The pointer argument used by the candidate load (%19) is produced by a tt.advance operation, which uses the loop-carried pointer %arg6.
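The chain described above (load pointer %19, produced by tt.advance, rooted at the loop-carried pointer %arg6) can be illustrated with a small standalone model. This is only a sketch: the `Value` struct and the helpers `findChainRoot` and `dependsOnLoopCarriedArg` are hypothetical names invented for illustration; they are not the MLIR `Value` class and not code from OptimizeDotOperands.cpp.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for an SSA value. Hypothetical type for illustration only;
// the real pass operates on MLIR values and operations.
struct Value {
  enum class Kind { LoopCarriedArg, Advance, Other };
  Kind kind;
  std::string name;
  const Value *base; // pointer operand of an Advance, otherwise nullptr
};

// Walk backwards through a chain of advance-like values and return the value
// at which the chain starts.
const Value *findChainRoot(const Value *ptr) {
  assert(ptr && "expected a non-null pointer value");
  while (ptr->kind == Value::Kind::Advance && ptr->base)
    ptr = ptr->base;
  return ptr;
}

// True when the pointer is, possibly transitively, derived from a
// loop-carried argument.
bool dependsOnLoopCarriedArg(const Value *ptr) {
  return findChainRoot(ptr)->kind == Value::Kind::LoopCarriedArg;
}
```

With `%arg6` modeled as a `LoopCarriedArg` and `%19` as an `Advance` whose `base` is `%arg6`, `dependsOnLoopCarriedArg` returns true for `%19`, which mirrors the situation this PR adds support for. A pointer whose chain ends in any other kind of value yields false.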