
The draft of simulated double. #1808


Draft
wants to merge 1 commit into
base: develop


yhmtsai
Member

@yhmtsai yhmtsai commented Mar 17, 2025

This draft provides a custom double type on the device side, which might save some work if a vendor removes the double precision cores without providing a simulated double.

There is no need to review this PR unless we decide to investigate this direction further.

This can be compiled with CUDA 12.6.
It mainly focuses on the interface and forwards the operations internally to double precision.
This PR disables the conversion custom_double -> double to ensure that no double operation is used directly in the kernels.
Unfortunately, the other direction, double -> custom_double, is still available (via static_cast), although the implicit conversion should be avoided.
This requirement comes from thrust::complex, which evaluates T(1.0)/s and therefore requires a constructor from double.
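
The conversion rules described above can be sketched roughly as follows. This is a minimal host-side illustration, not the code from this PR: the class layout, the member name `value_`, the helper `custom_double_demo`, and the choice of an `explicit` constructor are all assumptions, and a real version would also carry the device annotations.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical wrapper type; names and layout are assumptions for illustration.
class custom_double {
public:
    custom_double() = default;

    // Explicit constructor: custom_double(1.0) and static_cast<custom_double>(1.0)
    // compile (which is what an expression like T(1.0)/s needs), but an implicit
    // double -> custom_double conversion does not.
    explicit custom_double(double value) : value_(value) {}

    // Deliberately no `operator double()`: code using this type cannot fall
    // back to plain double arithmetic by accident.

    // The operations forward to double internally for now; a software
    // simulation could replace these bodies later.
    friend custom_double operator+(custom_double a, custom_double b)
    {
        return custom_double(a.value_ + b.value_);
    }
    friend custom_double operator-(custom_double a, custom_double b)
    {
        return custom_double(a.value_ - b.value_);
    }
    friend custom_double operator*(custom_double a, custom_double b)
    {
        return custom_double(a.value_ * b.value_);
    }
    friend custom_double operator/(custom_double a, custom_double b)
    {
        return custom_double(a.value_ / b.value_);
    }
    friend bool operator==(custom_double a, custom_double b)
    {
        return a.value_ == b.value_;
    }
    friend bool operator<(custom_double a, custom_double b)
    {
        return a.value_ < b.value_;
    }

private:
    double value_{};
};

// Compile-time checks of the intended conversion rules.
static_assert(std::is_constructible<custom_double, double>::value,
              "explicit construction from double must work");
static_assert(!std::is_convertible<double, custom_double>::value,
              "implicit double -> custom_double must be rejected");
static_assert(!std::is_convertible<custom_double, double>::value,
              "custom_double -> double must be rejected");

// Small usage example mirroring the T(1.0)/s pattern.
inline bool custom_double_demo()
{
    custom_double s(4.0);
    custom_double inv = custom_double(1.0) / s;
    return inv == custom_double(0.25);
}
```

With an explicit constructor, an accidental `custom_double x = 1.0;` fails to compile, while the `T(1.0)` form still works.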

If the vendor removes the double precision cores without offering a simulated alternative, some functions might become unavailable due to hardware limits, but the others can be simulated in software, either by using more operations or by allowing reinterpretation.

Some things that need to be implemented (simulated):

  • __shfl_xor_sync: it can be replaced by cooperative groups or by casting.
  • load/store: by reinterpreting as a 64-bit integer?
  • math operations (+-*/), comparison, sqrt
  • likely thrust::complex<custom_double> (if they can change it to T(1.0f), we might not have an issue) and the corresponding abs and sqrt
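
As an illustration of the "by casting" idea for shuffles and load/store, the sketch below moves a 64-bit value as two 32-bit halves. It is a host-side model under stated assumptions: `num_lanes`, `to_bits`, `from_bits`, `shuffle_xor_32`, and `shuffle_xor_64` are hypothetical names, and `shuffle_xor_32` only imitates the lane-exchange pattern of a 32-bit xor shuffle on an array instead of calling the device intrinsic.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t num_lanes = 8;  // small stand-in for a warp (assumption)

// Reinterpret the bit pattern of a double as a 64-bit integer and back.
// Only the bits are copied; no double arithmetic is involved.
inline std::uint64_t to_bits(double v)
{
    std::uint64_t b;
    std::memcpy(&b, &v, sizeof(b));
    return b;
}

inline double from_bits(std::uint64_t b)
{
    double v;
    std::memcpy(&v, &b, sizeof(v));
    return v;
}

// Host-side model of a 32-bit xor shuffle: lane i reads the value held by
// lane (i ^ mask). Requires mask < num_lanes so the index stays in range.
inline std::array<std::uint32_t, num_lanes> shuffle_xor_32(
    const std::array<std::uint32_t, num_lanes>& in, std::size_t mask)
{
    assert(mask < num_lanes);
    std::array<std::uint32_t, num_lanes> out{};
    for (std::size_t i = 0; i < num_lanes; ++i) {
        out[i] = in[i ^ mask];
    }
    return out;
}

// 64-bit exchange built from two 32-bit exchanges: split every lane into a
// low and a high word, shuffle each word separately, then join them again.
inline std::array<std::uint64_t, num_lanes> shuffle_xor_64(
    const std::array<std::uint64_t, num_lanes>& in, std::size_t mask)
{
    std::array<std::uint32_t, num_lanes> lo{};
    std::array<std::uint32_t, num_lanes> hi{};
    for (std::size_t i = 0; i < num_lanes; ++i) {
        lo[i] = static_cast<std::uint32_t>(in[i]);
        hi[i] = static_cast<std::uint32_t>(in[i] >> 32);
    }
    lo = shuffle_xor_32(lo, mask);
    hi = shuffle_xor_32(hi, mask);
    std::array<std::uint64_t, num_lanes> out{};
    for (std::size_t i = 0; i < num_lanes; ++i) {
        out[i] = (static_cast<std::uint64_t>(hi[i]) << 32) | lo[i];
    }
    return out;
}
```

The same split-and-join pattern would apply to loads and stores if only 32-bit or integer memory instructions remained available; whether that holds on a given device is a hardware question this sketch cannot answer.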

If the application does not need double precision at all, we have already handled this case for dpcpp on Intel GPUs without double precision support. We only need to apply the same approach to the other backends, which mainly means taking care of accidental uses of double.

If the corresponding instructions are also removed by the vendor, some of them might still be simulated with more operations (or by allowing reinterpretation), but others, such as 64-bit atomics or memory control, will require hardware support.
@yhmtsai yhmtsai added the 1:ST:do-not-merge Please do not merge PR this yet. label Mar 17, 2025
@ginkgo-bot ginkgo-bot added reg:testing This is related to testing. mod:core This is related to the core module. mod:cuda This is related to the CUDA module. mod:reference This is related to the reference module. type:solver This is related to the solvers type:preconditioner This is related to the preconditioners mod:hip This is related to the HIP module. reg:helper-scripts This issue/PR is related to the helper scripts mainly concerned with development of Ginkgo. mod:dpcpp This is related to the DPC++ module. labels Mar 17, 2025
@ginkgo-bot
Member

Error: The following files need to be formatted:

core/config/property_tree.cpp
core/test/config/property_tree.cpp
include/ginkgo/core/config/property_tree.hpp

You can find a formatting patch under Artifacts here or run format! if you have write access to Ginkgo
