Skip to content

Conversation

@ahkcs
Copy link
Contributor

@ahkcs ahkcs commented Nov 4, 2025

Description

Fixes #4740 - Binning UDFs are now resolved through PPLFuncImpTable.resolve() instead of being instantiated directly via makeCall().

Implemented automatic type coercion to allow string fields with numeric values to work seamlessly with all binning operations (bins, span, minspan, range).

Related Issues

Type coercion support for binning UDFs #4356

@ahkcs ahkcs requested a review from penghuo November 6, 2025 01:04
@dai-chen
Copy link
Collaborator

dai-chen commented Nov 6, 2025

@ahkcs CI is failing. I retried but still failed. Could you double check?

ahkcs added 7 commits November 6, 2025 09:44
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
@ahkcs ahkcs force-pushed the fix-binning-udf-resolution-4740 branch from 3ae31ab to a93dcf8 Compare November 6, 2025 17:44
@ahkcs
Copy link
Contributor Author

ahkcs commented Nov 6, 2025

@ahkcs CI is failing. I retried but still failed. Could you double check?

Unit tests passed after rebase

@ahkcs ahkcs changed the title Fix binning udf resolution Fix binning udf resolution / Add type coercion support for binning UDFs Nov 6, 2025
Signed-off-by: Kai Huang <ahkcs@amazon.com>
@ahkcs ahkcs force-pushed the fix-binning-udf-resolution-4740 branch from c2d3dd0 to d0ddb19 Compare November 7, 2025 00:28
ahkcs added 2 commits November 6, 2025 16:38
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
penghuo
penghuo previously approved these changes Nov 7, 2025
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Example 20: Type coercion with string fields
==============================================

The ``bin`` command automatically converts string fields containing valid numeric values to numeric types, enabling binning operations on keyword/text fields. Only strings that represent valid numbers (e.g., "25", "30.5", "100") are supported. Non-numeric strings will cause an error.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to remove comments

bin <field> [span=<interval>] [minspan=<interval>] [bins=<count>] [aligntime=(earliest | latest | <time-specifier>)] [start=<value>] [end=<value>]

* field: mandatory. The numeric field to bin.
* field: mandatory. The field to bin. Accepts numeric fields, time-based fields, or string fields containing valid numeric values (e.g., "25", "30.5"). String fields with numeric values are automatically coerced to numeric type. Non-numeric strings will result in an error.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest remove this section.

Type conversion is a generic feature applied across all commands and functions, so it does not need to be explicitly mentioned in each one.

Additionally, enabling binning operations should target string-type fields, not keyword or text fields specifically.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to remove this section

Comment on lines 78 to 80
throw new IllegalArgumentException(
"Cannot apply binning: field contains non-numeric string values. "
+ "Only numeric types or string types with valid numeric values are supported.");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should return a null value. For example, if only one document contains an invalid field, the entire query should not fail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense, I have updated to remove the validation logic. I kept the validation logic in BinnableField to only verify if it's supported type(Reject truly unsupported types (e.g., BOOLEAN, ARRAY, MAP))

ahkcs added 2 commits November 7, 2025 13:40
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
@ahkcs ahkcs requested a review from penghuo November 7, 2025 22:04
Signed-off-by: Kai Huang <ahkcs@amazon.com>
@penghuo penghuo merged commit 20f2234 into opensearch-project:main Nov 11, 2025
35 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Nov 11, 2025
…Fs (#4742)

* Fix binning udf resolution

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* spotless fix

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* update tes

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* add comments

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* removal

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* update yaml test

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* rerun

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* update support for type coercion

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* update doc

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* enhance error handling and doc

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* fixes

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* fixes

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* fixes

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* Fix tests

Signed-off-by: Kai Huang <ahkcs@amazon.com>

---------

Signed-off-by: Kai Huang <ahkcs@amazon.com>
(cherry picked from commit 20f2234)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Swiddis pushed a commit that referenced this pull request Nov 11, 2025
…Fs (#4742) (#4780)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport 2.19-dev bug Something isn't working bugFix PPL Piped processing language

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Binning UDFs aren’t resolved via PPLFuncImpTable

4 participants