[PTX] Enable migration of ldmatrix #2692
bd88679 to 4492ec9
LGTM
de49c1d to 7f533ae
@@ -2055,6 +2055,103 @@ class joint_matrix {
matrix_accessor x;
const size_t num_elements;
};

/// Loads 1 8x8 b16 matrix from shared memory to local memory (32-bits per wi)
from local memory to private memory of each work item?
}
}

/// Loads 2 8x8 b16 matrix from shared memory to local memory (32-bits per wi)
Please use SYCL terminology in the comments.
@@ -2055,6 +2055,98 @@ class joint_matrix {
matrix_accessor x;
const size_t num_elements;
};

/// Loads 1 8x8 b16 matrix from local memory to private memory (32-bits per wi)
- /// Loads 1 8x8 b16 matrix from local memory to private memory (32-bits per wi)
+ /// Loads 1 8x8 b16 matrix from local memory to private memory (32-bits per work item)
Please give more details on how the loaded data is distributed across the private memory of the work items. Is it a work-group function or a sub-group function? Make sure a user can use it after reading only the comments.
0463353 to 63ffbaf
22018c1 to df279d5
This PR adds support for migrating the ldmatrix PTX ASM API.
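For reviewers who want to picture how one 8x8 b16 tile ends up in private memory, the following is a minimal CPU-side sketch of the fragment layout that the PTX ISA documents for ldmatrix with the m8n8 shape: the instruction is executed by a whole warp (a sub-group of 32 work items), and each work item receives one 32-bit value holding two adjacent 16-bit elements from one row. The names Tile, fragment_for_lane and make_demo_tile are hypothetical and used for illustration only. They are not the helper functions added by this PR, and the sketch assumes the migrated helper preserves the PTX layout.

```cpp
#include <array>
#include <cstdint>

// Hypothetical CPU model of one 8x8 tile of 16-bit elements in local memory.
// tile[r][c] is the element at row r, column c. Not part of the PR's API.
using Tile = std::array<std::array<std::uint16_t, 8>, 8>;

// Returns the 32-bit value that work item `lane` (0..31) of the sub-group
// holds after the load, following the PTX m8n8 fragment layout:
//   row     = lane / 4
//   columns = 2 * (lane % 4) and 2 * (lane % 4) + 1
// The lower-column element sits in the low 16 bits (little-endian packing).
inline std::uint32_t fragment_for_lane(const Tile &tile, unsigned lane) {
  const unsigned row = lane / 4;
  const unsigned col = 2 * (lane % 4);
  const std::uint32_t lo = tile[row][col];
  const std::uint32_t hi = tile[row][col + 1];
  return lo | (hi << 16);
}

// Builds a demo tile whose element at (r, c) equals r * 8 + c, so the
// origin of every packed value is easy to read back.
inline Tile make_demo_tile() {
  Tile t{};
  for (unsigned r = 0; r < 8; ++r)
    for (unsigned c = 0; c < 8; ++c)
      t[r][c] = static_cast<std::uint16_t>(r * 8 + c);
  return t;
}
```

With the demo tile, fragment_for_lane(make_demo_tile(), 5) packs the elements at row 1, columns 2 and 3. A doc comment that states this row and column mapping, and that the call is a sub-group operation, would address the question raised in the review.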