Description
After #1108 is merged, `min`/`max` aggregations will only support operations on values that are self-comparable. This includes Dates, specific numbers, strings, etc.
Other aggregations, like `mean` and `sum`, support calculating statistics with number unification, so calculating the sum of an `Int` column, a `Double` column, and a column containing both floats and integers is no problem whatsoever. The result will always be `Double`.
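As a rough illustration of what number unification means here, the following plain-Kotlin sketch converts every value to `Double` before aggregating. `sumUnified` and `meanUnified` are hypothetical helper names, not part of the DataFrame API, and the library's actual unification logic may differ.

```kotlin
// Hypothetical helpers for illustration only; not the library's implementation.
// Every value is converted to Double before it is aggregated.
fun sumUnified(values: Iterable<Number>): Double = values.sumOf { it.toDouble() }

fun meanUnified(values: Collection<Number>): Double =
    if (values.isEmpty()) Double.NaN else sumUnified(values) / values.size

fun main() {
    // A "column" holding an Int, a Double, and a Float.
    val mixed: List<Number> = listOf(1, 2.0, 3.0f)
    println(sumUnified(mixed))  // 6.0
    println(meanUnified(mixed)) // 2.0
}
```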
`min`/`max` currently throw an exception when applied to a column of mixed number types. This is (sort of) in line with the Kotlin stdlib, as you can only calculate the min/max of a self-comparable iterable there as well. However, from a user's perspective, when dealing with data of many types, it's obvious what `columnOf(1, 2.0, 3.0f).min()` would return (`1`!), and they might be surprised when it doesn't work. `describe()` actually has a workaround for this.
I previously thought it was impossible due to overload resolution ambiguity; however, it's possible to create 3 overloads for each function, like:
- `fun <T : Comparable<T & Any>?> DataColumn<T>.min(): T & Any` for normal comparables
- `fun <T> DataColumn<T>.min(): T & Any where T : Number?, T : Comparable<T & Any?>` for normal numbers
- `fun <T : Number?> DataColumn<T>.min(): T & Any` for mixed number types
We might also need two new aggregator handlers:
- an input handler that allows either self-comparables or numbers
- a selecting-like aggregation handler that functions like `aggregateBy` in the sense that it returns the item at `indexOfAggregationResultSingleSequence` by default, such that the original type is preserved while the aggregation result is decided by the unified numbers.
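The selecting idea can be sketched in plain Kotlin as follows. This is only an assumption about how such a handler might behave, not the library's code: `minPreservingType` is a hypothetical name, and unifying through `toDouble()` is a simplification (it can lose precision for large `Long` values, for example).

```kotlin
// Hypothetical helper, not part of the DataFrame API.
// It compares values after converting them to Double, remembers the index of
// the smallest one, and returns the ORIGINAL element at that index, so the
// element's runtime type is preserved.
fun minPreservingType(values: List<Number?>): Number? {
    var bestIndex = -1
    var bestValue = Double.NaN
    values.forEachIndexed { index, value ->
        if (value == null) return@forEachIndexed // nulls are skipped
        val unified = value.toDouble()
        if (bestIndex == -1 || unified < bestValue) {
            bestIndex = index
            bestValue = unified
        }
    }
    // No non-null value found: there is no minimum to return.
    return if (bestIndex == -1) null else values[bestIndex]
}

fun main() {
    val result = minPreservingType(listOf(2.0, null, 1, 3.0f))
    println(result)        // 1
    println(result is Int) // true
}
```

Returning the element by index rather than the unified value is what keeps the result an `Int` here instead of the `Double` used for comparison.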
The same will hold for `median` and `percentile`. These functions will already need to be split into overloads of type 1 and type 2 (normal comparables and normal numbers) because they have different return types. Adding a type 3 overload for mixed number types will not be much more difficult.