Skip to content

Conversation

@ergawy
Copy link

@ergawy ergawy commented Dec 11, 2024

Extends host_eval support for the currently supported form of the generic loop directive.

@ergawy ergawy requested review from agozillon and skatrak December 11, 2024 09:24
Copy link
Member

@skatrak skatrak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you Kareem for working on this! I've got a couple of small comments, but I think it's almost there.

@ergawy ergawy force-pushed the loop_host_eval branch 2 times, most recently from 6e5cba3 to bd14f92 Compare December 16, 2024 08:43
@ergawy
Copy link
Author

ergawy commented Dec 16, 2024

Thanks for the review @skatrak. I handled your comments (hopefully) and added a small offloading test. But I wanted to ask: where in the spec do we extract the info that a certain config is generic, SPMD, or generic-SPMD?

Copy link
Member

@skatrak skatrak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you Kareem, and sorry for the delay getting back to this! LGTM, just minor nits.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Detect SPMD: target-teams-loop[-simd].
// Detect SPMD: target-teams-loop.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (!isa<DistributeOp>(innermostWrapper) && !isa<LoopOp>(innermostWrapper))
if (!isa<DistributeOp, LoopOp>(innermostWrapper))

@skatrak
Copy link
Member

skatrak commented Jan 7, 2025

Thanks for the review @skatrak. I handled your comments (hopefully) and added a small offloading test. But I wanted to ask: where in the spec do we extract the info that a certain config is generic, SPMD, or generic-SPMD?

That differentiation is not really in the spec, but it's instead part of the device runtime. It's an implementation detail that allows us to provide more efficient specializations for specific kinds of kernels. SPMD, for example, is used to identify kernels for which the number of teams and threads and the trip count of the single loop being run across all threads and teams remains constant during execution and is known in advance.

A generic kernel might have multiple loops, none at all, or violate any of the conditions above (which are not mandated by the spec), and for that case the generated code implements a state machine that reduces performance on GPUs. Generic-SPMD seems to fall somewhere in the middle, and I don't know much about it apart from the fact that it seems to represent target teams distribute, so something inside of a loop that is distributed across teams. I got that understanding some time ago looking at uses of the OMPTgtExecModeFlags, but I wouldn't be surprised if I didn't understand Generic-SPMD fully.

@ergawy
Copy link
Author

ergawy commented Jan 13, 2025

That differentiation is not really in the spec, but it's instead part of the device runtime. It's an implementation detail that allows us to provide more efficient specializations for specific kinds of kernels. SPMD, for example, is used to identify kernels for which the number of teams and threads and the trip count of the single loop being run across all threads and teams remains constant during execution and is known in advance.

A generic kernel might have multiple loops, none at all, or violate any of the conditions above (which are not mandated by the spec), and for that case the generated code implements a state machine that reduces performance on GPUs. Generic-SPMD seems to fall somewhere in the middle, and I don't know much about it apart from the fact that it seems to represent target teams distribute, so something inside of a loop that is distributed across teams. I got that understanding some time ago looking at uses of the OMPTgtExecModeFlags, but I wouldn't be surprised if I didn't understand Generic-SPMD fully.

That clarifies things quite a bit. Thanks!

Extends `host_eval` support for the currently supported form of the
generic `loop` directive.
@ergawy ergawy merged commit a1eead6 into ROCm:amd-trunk-dev Jan 13, 2025
2 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants