
Enable mixed precision dispatch in distributed matrix with ScalarCache, GenericDenseCache, and GenericVectorCache #1819


Merged
merged 10 commits into develop from mixed_distributed on May 21, 2025

Conversation

yhmtsai
Member

@yhmtsai yhmtsai commented Apr 2, 2025

This PR enables the mixed precision dispatch in distributed matrix.
Moreover, it adds ScalarCache, to handle a scalar with a user-specified value but a different type (mainly for the scalar one), and GenericDenseCache and GenericVectorCache, to reuse the workspace for Dense views with different value types for the communication buffer.
All of them prepare the data during get, not at initialization.

Although it is based on the distributed RowGatherer now, that is not a requirement.
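To illustrate the idea behind ScalarCache, here is a minimal standalone sketch (not the actual Ginkgo class or its API): the cache stores a user-specified value once and lazily creates a one-element buffer of the requested value type on the first get with that type, so no allocation happens at initialization.

```cpp
#include <map>
#include <memory>
#include <typeindex>

// Hypothetical stand-in for the PR's ScalarCache: stores one value and
// materializes a typed scalar buffer on first use with each type.
class scalar_cache {
public:
    explicit scalar_cache(double value) : value_(value) {}

    template <typename ValueType>
    const ValueType* get() const
    {
        auto key = std::type_index(typeid(ValueType));
        auto it = buffers_.find(key);
        if (it == buffers_.end()) {
            // first use with this type: allocate and fill the scalar
            auto buffer =
                std::make_shared<ValueType>(static_cast<ValueType>(value_));
            it = buffers_.emplace(key, buffer).first;
        }
        return static_cast<const ValueType*>(it->second.get());
    }

private:
    double value_;
    // type-erased storage keyed by the requested value type
    mutable std::map<std::type_index, std::shared_ptr<void>> buffers_;
};
```

Repeated calls with the same type return the same cached buffer, which matches the "prepare the data during get" behavior described above.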

@yhmtsai yhmtsai self-assigned this Apr 2, 2025
@yhmtsai yhmtsai added the 1:ST:ready-for-review This PR is ready for review label Apr 2, 2025
@ginkgo-bot ginkgo-bot added reg:testing This is related to testing. mod:core This is related to the core module. mod:cuda This is related to the CUDA module. type:matrix-format This is related to the Matrix formats mod:hip This is related to the HIP module. labels Apr 2, 2025
@yhmtsai yhmtsai requested a review from a team April 2, 2025 09:54
@MarcelKoch MarcelKoch self-requested a review April 2, 2025 11:29
Member

@MarcelKoch MarcelKoch left a comment


LGTM, just some minor remarks.

@yhmtsai yhmtsai requested a review from MarcelKoch April 24, 2025 17:05
@MarcelKoch MarcelKoch force-pushed the distributed-row-gatherer branch from b0a8db4 to d3bab40 Compare April 25, 2025 08:25
@MarcelKoch
Member

@yhmtsai can you rebase again?

@MarcelKoch MarcelKoch force-pushed the distributed-row-gatherer branch from d3bab40 to 77868e5 Compare April 25, 2025 12:23
@yhmtsai yhmtsai force-pushed the mixed_distributed branch from ad74383 to fb83400 Compare April 25, 2025 15:39
@yhmtsai
Member Author

yhmtsai commented Apr 25, 2025

@MarcelKoch sure, it is done

@yhmtsai yhmtsai added 1:ST:ready-to-merge This PR is ready to merge. and removed 1:ST:ready-for-review This PR is ready for review labels May 2, 2025
@MarcelKoch MarcelKoch added this to the Ginkgo 1.10.0 milestone May 6, 2025
@MarcelKoch MarcelKoch force-pushed the distributed-row-gatherer branch from dafb3e6 to e7178d5 Compare May 12, 2025 10:36
Base automatically changed from distributed-row-gatherer to develop May 13, 2025 10:31
@yhmtsai yhmtsai force-pushed the mixed_distributed branch from f6a3abb to 7e005ab Compare May 19, 2025 15:25
@yhmtsai yhmtsai added 1:ST:ready-for-review This PR is ready for review 1:ST:run-full-test and removed 1:ST:ready-to-merge This PR is ready to merge. labels May 19, 2025
@yhmtsai yhmtsai requested a review from MarcelKoch May 19, 2025 15:27
@yhmtsai
Member Author

yhmtsai commented May 19, 2025

@MarcelKoch Because the buffer is handled by distributed::Vector now, I added GenericVectorCache for that.
The last commit combines init and get, so init_recv_buffers now returns the vector directly.
Before the last commit, it followed the same pattern as the current develop: initialize the configuration in init_recv_buffers and get the vector later. I think that is a bit awkward because initializing in GenericVectorCache does nothing and it splits the configuration across two places.
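The "combine init and get" pattern can be sketched as follows. This is a simplified, hypothetical illustration (raw pointers instead of distributed::Vector views): a single get call both sizes the shared byte workspace and returns a typed view into it, instead of a separate init step that only records the configuration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch of a GenericVectorCache-like workspace: one
// grow-only byte buffer reused across different value types.
class generic_vector_cache {
public:
    template <typename ValueType>
    ValueType* get(std::size_t num_elems)
    {
        auto num_bytes = num_elems * sizeof(ValueType);
        if (workspace_.size() < num_bytes) {
            // grow only when the requested view does not fit
            workspace_.resize(num_bytes);
        }
        return reinterpret_cast<ValueType*>(workspace_.data());
    }

private:
    std::vector<unsigned char> workspace_;
};
```

A request for a differently typed view of the same byte size reuses the existing allocation rather than creating a second buffer.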

@yhmtsai yhmtsai changed the title Enable mixed precision dispatch in distributed matrix with ScalarCache and GenericDenseCache Enable mixed precision dispatch in distributed matrix with ScalarCache, GenericDenseCache, and GenericVectorCache May 19, 2025
Member

@MarcelKoch MarcelKoch left a comment


Are the changes to precision_dispatch.hpp intentional?
Otherwise, LGTM.

Member


These changes look like they are reverting some previous commits. Is that actually wanted?

Member Author


No, it was a mistake during the rebase.

@yhmtsai yhmtsai force-pushed the mixed_distributed branch 2 times, most recently from b88f705 to 0725db0 Compare May 20, 2025 08:38
@@ -62,8 +62,6 @@ DEBUG: end copy
DEBUG: begin copy
DEBUG: end copy
DEBUG: end copy(<typename>)
DEBUG: begin dense::fill
Member Author


Because ScalarCache allocates the memory when it is first used with the requested type.

"storage": 11476,
"storage": 11452,
Member Author


It comes from the change DenseCache -> ScalarCache; we now allocate the memory only when it is used.
The difference is 3 * sizeof(ValueType) because we use 3 processes for MPI.
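The quoted storage numbers are consistent with that explanation, assuming ValueType is double here: the drop of 11476 - 11452 = 24 bytes equals one deferred scalar per rank on 3 MPI ranks.

```cpp
#include <cstddef>

// Checking the arithmetic behind the storage difference above,
// assuming ValueType = double and 3 MPI ranks.
constexpr std::size_t storage_before = 11476;
constexpr std::size_t storage_after = 11452;
constexpr std::size_t num_ranks = 3;
constexpr std::size_t saved = storage_before - storage_after;  // 24 bytes
```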

@yhmtsai yhmtsai requested a review from MarcelKoch May 20, 2025 13:04
@yhmtsai yhmtsai added 1:ST:ready-to-merge This PR is ready to merge. and removed 1:ST:ready-for-review This PR is ready for review labels May 20, 2025
@yhmtsai yhmtsai force-pushed the mixed_distributed branch from 074b9b1 to a60e3da Compare May 21, 2025 08:01
@yhmtsai yhmtsai force-pushed the mixed_distributed branch from f4da7a9 to 71c3aea Compare May 21, 2025 13:57
@yhmtsai yhmtsai merged commit 3da19d1 into develop May 21, 2025
16 of 19 checks passed
@yhmtsai yhmtsai deleted the mixed_distributed branch May 21, 2025 19:50
Labels
1:ST:ready-to-merge This PR is ready to merge. 1:ST:run-full-test mod:core This is related to the core module. mod:cuda This is related to the CUDA module. mod:hip This is related to the HIP module. reg:testing This is related to testing. type:matrix-format This is related to the Matrix formats
3 participants