
Enable mixed precision dispatch in distributed matrix with ScalarCache, GenericDenseCache, and GenericVectorCache #1819


Merged
merged 10 commits into develop from mixed_distributed on May 21, 2025

Conversation

yhmtsai
Member

@yhmtsai yhmtsai commented Apr 2, 2025

This PR enables the mixed precision dispatch in distributed matrix.
Moreover, it adds ScalarCache, to handle a scalar with a user-specified value but a different type (mainly for the scalar one), and GenericDenseCache and GenericVectorCache, to reuse the workspace for Dense views with different value types for the communication buffer.
All of them prepare the data during get, not at initialization.

Although it is based on the distributed RowGatherer now, that is not a requirement.
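To illustrate the idea behind ScalarCache, here is a minimal standalone sketch (not the actual Ginkgo class or its API): the cache stores a user-specified value once and lazily creates a one-element buffer of the requested value type on the first get with that type, so no allocation happens at initialization.

```cpp
#include <map>
#include <memory>
#include <typeindex>

// Hypothetical stand-in for the PR's ScalarCache: stores one value and
// materializes a typed scalar buffer on first use with each type.
class scalar_cache {
public:
    explicit scalar_cache(double value) : value_(value) {}

    template <typename ValueType>
    const ValueType* get() const
    {
        auto key = std::type_index(typeid(ValueType));
        auto it = buffers_.find(key);
        if (it == buffers_.end()) {
            // first use with this type: allocate and fill the scalar
            auto buffer =
                std::make_shared<ValueType>(static_cast<ValueType>(value_));
            it = buffers_.emplace(key, buffer).first;
        }
        return static_cast<const ValueType*>(it->second.get());
    }

private:
    double value_;
    // type-erased storage keyed by the requested value type
    mutable std::map<std::type_index, std::shared_ptr<void>> buffers_;
};
```

Repeated calls with the same type return the same cached buffer, which matches the "prepare the data during get" behavior described above.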

@yhmtsai yhmtsai self-assigned this Apr 2, 2025
@yhmtsai yhmtsai added the 1:ST:ready-for-review This PR is ready for review label Apr 2, 2025
@ginkgo-bot ginkgo-bot added reg:testing This is related to testing. mod:core This is related to the core module. mod:cuda This is related to the CUDA module. type:matrix-format This is related to the Matrix formats mod:hip This is related to the HIP module. labels Apr 2, 2025
@yhmtsai yhmtsai requested a review from a team April 2, 2025 09:54
@MarcelKoch MarcelKoch self-requested a review April 2, 2025 11:29
Member

@MarcelKoch MarcelKoch left a comment


LGTM, just some minor remarks.

@yhmtsai yhmtsai requested a review from MarcelKoch April 24, 2025 17:05
@MarcelKoch MarcelKoch force-pushed the distributed-row-gatherer branch from b0a8db4 to d3bab40 Compare April 25, 2025 08:25
@MarcelKoch
Member

@yhmtsai can you rebase again?

@MarcelKoch MarcelKoch force-pushed the distributed-row-gatherer branch from d3bab40 to 77868e5 Compare April 25, 2025 12:23
@yhmtsai yhmtsai force-pushed the mixed_distributed branch from ad74383 to fb83400 Compare April 25, 2025 15:39
@yhmtsai
Member Author

yhmtsai commented Apr 25, 2025

@MarcelKoch sure, it is done

@yhmtsai yhmtsai added 1:ST:ready-to-merge This PR is ready to merge. and removed 1:ST:ready-for-review This PR is ready for review labels May 2, 2025
@MarcelKoch MarcelKoch added this to the Ginkgo 1.10.0 milestone May 6, 2025
@MarcelKoch MarcelKoch force-pushed the distributed-row-gatherer branch from dafb3e6 to e7178d5 Compare May 12, 2025 10:36
Base automatically changed from distributed-row-gatherer to develop May 13, 2025 10:31
@yhmtsai yhmtsai force-pushed the mixed_distributed branch from f6a3abb to 7e005ab Compare May 19, 2025 15:25
@yhmtsai yhmtsai added 1:ST:ready-for-review This PR is ready for review 1:ST:run-full-test and removed 1:ST:ready-to-merge This PR is ready to merge. labels May 19, 2025
@yhmtsai yhmtsai requested a review from MarcelKoch May 19, 2025 15:27
@yhmtsai
Member Author

yhmtsai commented May 19, 2025

@MarcelKoch Because the buffer is handled by distributed::Vector now, I added GenericVectorCache for that.
The last commit combines init and get, so init_recv_buffers now returns the vector directly.
Before the last commit, it followed the same pattern as the current develop: initialize the configuration in init_recv_buffers and get the vector later. I think that is a bit awkward because initializing in GenericVectorCache does nothing and it splits the configuration across two places.
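The "combine init and get" pattern can be sketched as follows. This is a simplified, hypothetical illustration (raw pointers instead of distributed::Vector views): a single get call both sizes the shared byte workspace and returns a typed view into it, instead of a separate init step that only records the configuration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch of a GenericVectorCache-like workspace: one
// grow-only byte buffer reused across different value types.
class generic_vector_cache {
public:
    template <typename ValueType>
    ValueType* get(std::size_t num_elems)
    {
        auto num_bytes = num_elems * sizeof(ValueType);
        if (workspace_.size() < num_bytes) {
            // grow only when the requested view does not fit
            workspace_.resize(num_bytes);
        }
        return reinterpret_cast<ValueType*>(workspace_.data());
    }

private:
    std::vector<unsigned char> workspace_;
};
```

A request for a differently typed view of the same byte size reuses the existing allocation rather than creating a second buffer.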

@yhmtsai yhmtsai changed the title Enable mixed precision dispatch in distributed matrix with ScalarCache and GenericDenseCache Enable mixed precision dispatch in distributed matrix with ScalarCache, GenericDenseCache, and GenericVectorCache May 19, 2025
Member

@MarcelKoch MarcelKoch left a comment


Are the changes to precision_dispatch.hpp intentional?
Otherwise, LGTM.

Member


These changes look like they are reverting some previous commits. Is that actually wanted?

Member Author


No, it was a mistake during the rebase.

@yhmtsai yhmtsai force-pushed the mixed_distributed branch 2 times, most recently from b88f705 to 0725db0 Compare May 20, 2025 08:38
@@ -62,8 +62,6 @@ DEBUG: end copy
DEBUG: begin copy
DEBUG: end copy
DEBUG: end copy(<typename>)
DEBUG: begin dense::fill
Member Author


Because ScalarCache allocates the memory when it is first used with the requested type.

"storage": 11476,
"storage": 11452,
Member Author


It comes from the change DenseCache -> ScalarCache; we now allocate the memory only when it is used.
The difference is 3 * sizeof(ValueType) because we use 3 processes for MPI.
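The quoted storage numbers are consistent with that explanation, assuming ValueType is double here: the drop of 11476 - 11452 = 24 bytes equals one deferred scalar per rank on 3 MPI ranks.

```cpp
#include <cstddef>

// Checking the arithmetic behind the storage difference above,
// assuming ValueType = double and 3 MPI ranks.
constexpr std::size_t storage_before = 11476;
constexpr std::size_t storage_after = 11452;
constexpr std::size_t num_ranks = 3;
constexpr std::size_t saved = storage_before - storage_after;  // 24 bytes
```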

@yhmtsai yhmtsai requested a review from MarcelKoch May 20, 2025 13:04
@yhmtsai yhmtsai added 1:ST:ready-to-merge This PR is ready to merge. and removed 1:ST:ready-for-review This PR is ready for review labels May 20, 2025
@yhmtsai yhmtsai force-pushed the mixed_distributed branch from 074b9b1 to a60e3da Compare May 21, 2025 08:01
@yhmtsai yhmtsai force-pushed the mixed_distributed branch from f4da7a9 to 71c3aea Compare May 21, 2025 13:57
@yhmtsai yhmtsai merged commit 3da19d1 into develop May 21, 2025
16 of 19 checks passed
@yhmtsai yhmtsai deleted the mixed_distributed branch May 21, 2025 19:50
Labels
1:ST:ready-to-merge This PR is ready to merge. 1:ST:run-full-test mod:core This is related to the core module. mod:cuda This is related to the CUDA module. mod:hip This is related to the HIP module. reg:testing This is related to testing. type:matrix-format This is related to the Matrix formats
3 participants