[FEA] Implement the mathematical functions by following the algorithms from the `fdlibm` library #3337

**Is your feature request related to a problem? Please describe.**
Some GPU mathematical functions produce different values from their Spark counterparts for certain floating-point inputs, which leads to data differences in many customer queries.

Two related issues have been reported so far:
- `pow`: https://github.com/NVIDIA/spark-rapids/issues/12685
- `hypot`: https://github.com/NVIDIA/spark-rapids/issues/9744

More functions are likely affected, even though we have not looked into them yet.

**Describe the solution you'd like**
Implement these functions with the same algorithms as those in [the `fdlibm` library](https://www.netlib.org/fdlibm/) to try to make the results bit-for-bit identical to Spark's. This should work because the Java math implementations are based on `fdlibm`.
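To make "bit-for-bit identical" concrete, here is a minimal sketch of how a CPU reference value could be compared against a result copied back from the GPU. It assumes `java.lang.StrictMath` as the reference, since its documentation states that it follows the `fdlibm` algorithms. The class name `BitExactCheck`, its helper methods, and the `gpuResult` placeholder are hypothetical and not part of any existing code base.

```java
// Hypothetical helper for illustration only; not an existing spark-rapids API.
final class BitExactCheck {
    private BitExactCheck() {}

    // True only when both doubles share the same 64-bit pattern.
    // Note: this treats 0.0 and -0.0 as different, and also distinguishes
    // NaN values with different payloads. Use Double.doubleToLongBits
    // instead if all NaNs should compare as equal.
    static boolean sameBits(double expected, double actual) {
        return Double.doubleToRawLongBits(expected) == Double.doubleToRawLongBits(actual);
    }

    // Renders a value in hexadecimal floating-point form together with its
    // raw bits, so a last-bit difference is visible in a failure message.
    static String describe(double value) {
        return Double.toHexString(value)
            + " (0x" + Long.toHexString(Double.doubleToRawLongBits(value)) + ")";
    }

    public static void main(String[] args) {
        double x = 3.0;
        double y = 4.5;

        // CPU reference computed with the fdlibm-based StrictMath.
        double expected = StrictMath.pow(x, y);

        // Assumption: in a real test this value would be read back from the
        // GPU kernel under test. Math.pow is only a stand-in here.
        double gpuResult = Math.pow(x, y);

        if (sameBits(expected, gpuResult)) {
            System.out.println("match: " + describe(expected));
        } else {
            System.out.println("mismatch: expected " + describe(expected)
                + " but got " + describe(gpuResult));
        }
    }
}
```

Comparing raw bit patterns rather than using `==` or a tolerance is deliberate: a one-ULP difference passes a tolerance check but still shows up as a data diff in query results.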

**Describe alternatives you've considered**
Ask the CUDA team to follow the algorithms used in the `fdlibm` library. This is not easy to move forward, because the behavior is not a real bug in CUDA and the current implementations already conform to the ISO standard.

There are many functions, so we can split the work into sub-tasks, one task per function.

- [ ] 1. Figure out the legal license header (need to add a NOTICE similar to https://github.com/NVIDIA/spark-rapids/blob/branch-25.06/NOTICE)
- [ ] 2. pow
- [ ] 3. hypot
- [ ] 4. ... (add more if needed)

