This is a follow-up issue tracking improvements to the `map_zip_with` algorithm introduced in #3490.
Per @revans2's review of #3490, there is still scope for performance improvement. For very large columns (with wide maps), `thrust::cuda_cub::_uninitialized_fill()` was observed to take up too much of the query time. This likely stems from the calls to `cudf::detail::label_segments`, which runs over a sequence spanning every key-to-key comparison; that sequence can be huge when each row has a large number of keys.
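For intuition, here is a host-side sketch (a hypothetical simplification, not the actual libcudf kernel) of the cross-product matching the current approach implies for a single row. The nested loop is why the intermediate sequence grows with the product of the two map widths:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side sketch: for one row, every key on the left is
// compared against every key on the right, so the intermediate comparison
// sequence has lhs_keys.size() * rhs_keys.size() entries. A segment-labeling
// pass over that sequence grows quadratically with map width.
std::vector<int> match_indices_cross_product(std::vector<int> const& lhs_keys,
                                             std::vector<int> const& rhs_keys)
{
  // For each left key: the index of the matching right key, or -1 if absent.
  std::vector<int> result(lhs_keys.size(), -1);
  for (std::size_t i = 0; i < lhs_keys.size(); ++i) {
    for (std::size_t j = 0; j < rhs_keys.size(); ++j) {  // O(n*m) comparisons per row
      if (lhs_keys[i] == rhs_keys[j]) {
        result[i] = static_cast<int>(j);
        break;
      }
    }
  }
  return result;
}
```

This per-row O(n·m) cost is negligible for narrow maps but dominates once rows carry many keys, which matches the profile observation above.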
The current implementation is a fairly straightforward algorithm tuned for small-to-medium-sized maps. Other approaches might be worth exploring:
- When matching keys per row, one approach could involve using `cuco`'s hash maps. This might prove faster for key comparison.
- While there is no guarantee that the keys are sorted, perhaps both sides could be sorted and matched with a scan, much like a join. This might make it challenging to match the ordering one sees in Spark. Further, the preprocessing risks eating into the performance gained from scanning.
- Maybe there is a way to avoid materializing the cross-product vector entirely. A quick implementation (with `thrust::upper_bound`) indicated that the current approach is faster.
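As a rough illustration of the first alternative, hash-based matching could look like the following host-side sketch, with `std::unordered_map` standing in for a GPU hash table such as cuco's. The function name and the first-match rule are assumptions for illustration, not the actual design:

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical sketch of the hash-based alternative: build a table from the
// right-side keys, then probe it with the left-side keys. Building is O(m)
// and each probe is expected O(1), so per-row work drops from O(n*m) to
// roughly O(n + m), at the cost of constructing the table.
std::vector<int> match_indices_hash(std::vector<int> const& lhs_keys,
                                    std::vector<int> const& rhs_keys)
{
  std::unordered_map<int, int> table;
  table.reserve(rhs_keys.size());
  // Insert right-side keys; emplace keeps the first occurrence (assumed
  // "first match wins" rule for duplicate keys).
  for (std::size_t j = 0; j < rhs_keys.size(); ++j) {
    table.emplace(rhs_keys[j], static_cast<int>(j));
  }
  std::vector<int> result(lhs_keys.size(), -1);
  for (std::size_t i = 0; i < lhs_keys.size(); ++i) {
    auto const it = table.find(lhs_keys[i]);
    if (it != table.end()) { result[i] = it->second; }
  }
  return result;
}
```

On the GPU, the table build and probe would be bulk operations over all rows at once, but the asymptotic trade-off is the same: extra build cost in exchange for avoiding the quadratic comparison sequence.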
There might be a need to switch between algorithms based on a heuristic: e.g., for maps that are small on average, use the current implementation; for larger maps, switch to the hash-table approach. Some string algorithms in cuDF do exactly this.
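Such a dispatch might be sketched as follows; the enum, function name, and threshold are all hypothetical, purely to illustrate the shape of the heuristic:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical size-based dispatch between the two strategies discussed above.
enum class map_zip_algo { cross_product, hash_table };

map_zip_algo choose_algorithm(std::size_t total_keys, std::size_t num_rows)
{
  // Hypothetical tuning point; a real threshold would come from benchmarking.
  constexpr std::size_t avg_keys_threshold = 32;
  if (num_rows == 0) { return map_zip_algo::cross_product; }
  return (total_keys / num_rows <= avg_keys_threshold)
           ? map_zip_algo::cross_product
           : map_zip_algo::hash_table;
}
```

The average map width is cheap to compute from the offsets column (total keys divided by row count), so the dispatch itself adds negligible overhead.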