The draft of simulated double. #1808

yhmtsai · 2025-03-17T16:50:01Z

This draft provides a custom double type on the device side, which might reduce the work somewhat if a vendor removes the double-precision cores and does not provide a simulated double.

There is no need to review this PR unless we decide to investigate this direction further.

This can be compiled with CUDA 12.6.
It mainly focuses on the interface and forwards the operations internally to double precision.
This PR disables the conversion custom_double -> double to ensure that no double operation happens directly in the kernels.
Unfortunately, the other direction, double -> custom_double, is still allowed (via static_cast), although implicit conversion should be avoided.
The requirement comes from thrust::complex, which needs T(1.0)/s and therefore requires this constructor from double.
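The conversion rules above can be illustrated with a minimal host-side sketch. It is not the code from this PR: the class layout, the member name value_, and the helper reciprocal are assumptions made for illustration, and the wrapper simply forwards to double as the description says. The constructor from double is explicit, so T(1.0) and static_cast work while implicit conversion does not, and there is no conversion operator back to double.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for custom_double (illustrative only, not the PR code).
class custom_double {
public:
    custom_double() = default;

    // Explicit: custom_double(1.0) and static_cast<custom_double>(1.0) compile,
    // but `custom_double x = 1.0;` does not.
    explicit custom_double(double v) : value_(v) {}

    // Deliberately no `operator double()`: kernels cannot get a raw double back.

    // The operations forward internally to double precision.
    friend custom_double operator/(custom_double a, custom_double b)
    {
        return custom_double(a.value_ / b.value_);
    }
    friend bool operator==(custom_double a, custom_double b)
    {
        return a.value_ == b.value_;
    }

private:
    double value_{};
};

// Mirrors the T(1.0)/s expression that thrust::complex needs (helper name is
// an assumption).
template <typename T>
T reciprocal(T s)
{
    return T(1.0) / s;
}

static_assert(std::is_constructible<custom_double, double>::value,
              "explicit construction from double is allowed");
static_assert(!std::is_convertible<double, custom_double>::value,
              "implicit double -> custom_double is rejected");
static_assert(!std::is_convertible<custom_double, double>::value,
              "custom_double -> double is disabled");
```

With this shape, reciprocal(custom_double(4.0)) compiles because T(1.0) is an explicit construction, whereas passing a plain 1.0 where a custom_double is expected would be a compile error.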

If the vendor removes the double-precision cores without providing a simulation, some functions might be unavailable due to hardware limits, but the others can be simulated in software, either with more operations or by allowing reinterpretation.

Some things need to be implemented (simulated):

__shfl_xor_sync: it can be replaced by cooperative groups or by casting.
load/store: by reinterpreting as a 64-bit int?
math operations (+, -, *, /), comparison, sqrt
likely thrust::complex<custom_double> (if they can change it to T(1.0f), we might not have an issue) and the corresponding abs and sqrt
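The load/store and shuffle items can be sketched on the host as pure bit manipulation. This is an assumption about how the "reinterpret" and "casting" ideas might look, not code from the PR: the type custom_double_bits and the helpers load, store, split, and join are hypothetical names. The value is treated as an opaque 64-bit pattern, and splitting it into two 32-bit words is the form in which each half could be passed through a 32-bit shuffle on the device.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical storage type: only the 64-bit pattern, no double arithmetic.
struct custom_double_bits {
    std::uint64_t bits;
};

// Load through a 64-bit integer view; memcpy avoids aliasing problems.
inline custom_double_bits load(const void* ptr)
{
    custom_double_bits v;
    std::memcpy(&v.bits, ptr, sizeof(v.bits));
    return v;
}

// Store the 64-bit pattern back to memory.
inline void store(void* ptr, custom_double_bits v)
{
    std::memcpy(ptr, &v.bits, sizeof(v.bits));
}

// Split into low and high 32-bit words, e.g. so each word could be exchanged
// with a 32-bit shuffle instead of a 64-bit one.
inline void split(custom_double_bits v, std::uint32_t& lo, std::uint32_t& hi)
{
    lo = static_cast<std::uint32_t>(v.bits & 0xffffffffu);
    hi = static_cast<std::uint32_t>(v.bits >> 32);
}

// Recombine the two words into the original 64-bit pattern.
inline custom_double_bits join(std::uint32_t lo, std::uint32_t hi)
{
    return {(static_cast<std::uint64_t>(hi) << 32) | lo};
}
```

None of these helpers touch a double-precision instruction, which is the property the list asks for; arithmetic, comparison, and sqrt would still need a separate software implementation.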

If the application does not need double precision at all, we have already done the same thing for dpcpp on Intel GPUs without double-precision support. We would only need to apply it to the other backends, which mainly means taking care of accidental uses of double.

If the vendor also removes the corresponding instructions, some of them might be simulated with more operations (or by allowing reinterpretation), but others, such as 64-bit atomics or memory control, will require hardware support.

ginkgo-bot · 2025-03-17T16:59:49Z

Error: The following files need to be formatted:

core/config/property_tree.cpp
core/test/config/property_tree.cpp
include/ginkgo/core/config/property_tree.hpp

You can find a formatting patch under Artifacts here or run format! if you have write access to Ginkgo

yhmtsai added the 1:ST:do-not-merge label Mar 17, 2025

yhmtsai force-pushed the simulate_double branch from 467e70e to c9e1a34 March 17, 2025 16:59

MarcelKoch assigned yhmtsai Mar 24, 2025
