Description
Continuation of #558 which fixed the most annoying bugs related to describe.
See #558 for more information.
Our statistics functions need some more love. We used to have many missing types (mostly fixed by #937), but there are still some inconsistencies to resolve:
As mentioned in #543, some functions like `median(ints)` might return an unexpectedly rounded `Int`. It might be better to let all functions return `Double` and then handle `BigInteger`/`BigDecimal` separately for now, as they're Java-specific.
There are plenty of public overloads on `Iterable` and `Sequence`. It's fine to have them internally, but I feel like we're clogging the public scope here. `mean`, for instance, is already covered in the stdlib.
We'll need to hide public functions that are not on DataColumn as @AndreiKingsley will probably make a statistics library for that anyway.
We need to honor a conversion table (see below).
We won't support UByte, UShort, UInt, and ULong since they don't inherit Number.
We also drop support for `BigInteger` and `BigDecimal`, as they make generic typing and conversion very difficult and unpredictable.
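To make the `median(ints)` point concrete, here is a minimal sketch of one way the rounding can arise, and of what returning `Double` would look like instead. The helpers `medianAsInt` and `medianAsDouble` are hypothetical and written against the Kotlin stdlib only; they are not the DataFrame implementation.

```kotlin
// Hypothetical helpers for illustration only; not part of the DataFrame API.

// Median that stays in Int: for an even count, the mean of the two
// middle values is computed with integer division and gets truncated.
fun medianAsInt(values: List<Int>): Int? {
    if (values.isEmpty()) return null
    val sorted = values.sorted()
    val mid = sorted.size / 2
    return if (sorted.size % 2 == 1) {
        sorted[mid]
    } else {
        (sorted[mid - 1] + sorted[mid]) / 2
    }
}

// Median that always returns Double, so the midpoint is preserved.
fun medianAsDouble(values: List<Int>): Double? {
    if (values.isEmpty()) return null
    val sorted = values.sorted()
    val mid = sorted.size / 2
    return if (sorted.size % 2 == 1) {
        sorted[mid].toDouble()
    } else {
        (sorted[mid - 1].toDouble() + sorted[mid].toDouble()) / 2.0
    }
}

fun main() {
    val ints = listOf(1, 2, 3, 4)
    println(medianAsInt(ints))    // 2
    println(medianAsDouble(ints)) // 2.5
}
```

Both helpers return `null` for empty input, matching the "null if no elements" rule proposed for median in the table below.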
Progress:
- underlying fixes: Aggregator implementation rework #1078
- mean: Mean statistics fixes #1091
- sum: Sum statistics and aggregator improvements #1103
- min: Aggregator dependency injection, min/max, and skipNaN #1108
- max: Aggregator dependency injection, min/max, and skipNaN #1108
- std: Overhaul for std #1119
- median: Median overhaul #1122
- percentile: Percentile #1149
- cumSum: CumSum #1152
| Function | Conversion | Extra information | Nulls in input |
|---|---|---|---|
| mean | Int -> Double | For all: Double.NaN if no elements | All nulls are filtered out |
| | Short -> Double | | |
| | Byte -> Double | | |
| | Long -> Double | | |
| | Double -> Double | skipNaN option, false by default | |
| | Float -> Double | skipNaN option, false by default | |
| | Number -> Conversion(Common number type) -> Double | skipNaN option, false by default | |
| | Nothing / no values -> Double.NaN | | |
| sum | Int -> Int | All default to zero if no values | All nulls are filtered out |
| | Short -> Int | | |
| | Byte -> Int | | |
| | Long -> Long | | |
| | Double -> Double | skipNaN option, false by default | |
| | Float -> Float | skipNaN option, false by default | |
| | Number -> Conversion(Common number type) -> Number | skipNaN option, false by default | |
| | Nothing / no values -> Double (0.0) | | |
| cumSum | Int -> Int | All default to zero if no values | All can optionally skip nulls in input with skipNull option, true by default |
| | Short -> Int | important because order matters with cumSum | |
| | Byte -> Int | | |
| | Long -> Long | | |
| | Double -> Double | skipNaN option, true by default | |
| | Float -> Float | skipNaN option, true by default | |
| | Number -> Conversion(Common number type) -> Number | skipNaN option, true by default | |
| | Nothing / no values -> Double (0.0) | | |
| min/max | `T -> T? where T : Comparable<T>` | For all: null if no elements, has -OrNull overloads | All nulls are filtered out |
| | Int -> Int? | | |
| | Short -> Short? | | |
| | Byte -> Byte? | | |
| | Long -> Long? | | |
| | Double -> Double? | skipNaN option, false by default, returns NaN when in the input | |
| | Float -> Float? | skipNaN option, false by default, returns NaN when in the input | |
| | Would need more overloads and more work | | |
| | Nothing / no values -> Nothing? (null) | | |
| median/percentile | `T -> T? where T : Comparable<T>` | For all: median of even list will cause conversion to Double if possible, else lower middle | All nulls are filtered out |
| | Int -> Double? | null if no elements | |
| | Short -> Double? | | |
| | Byte -> Double? | | |
| | Long -> Double? | | |
| | Double -> Double? | | |
| | Float -> Double? | | |
| | Would need more overloads and more work | | |
| | Nothing / no values -> Nothing? (null) | | |
| std | Int -> Double | All have DDoF (Delta Degrees of Freedom) argument | All nulls are filtered out |
| | Short -> Double | and Double.NaN if no elements | |
| | Byte -> Double | | |
| | Long -> Double | | |
| | Double -> Double | skipNaN option, false by default | |
| | Float -> Double | skipNaN option, false by default | |
| | Number -> Conversion(Common number type) -> Double | skipNaN option, false by default | |
| | Nothing / no values -> Double.NaN | | |
| var (want to add?) | same as std | | |
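A few rows of the table can be sketched in plain Kotlin to show the intended semantics. The names `sumShorts`, `meanDoubles`, and `stdInts` are hypothetical, and the `ddof = 1` default is an assumption, since the table only says a DDoF argument exists. This is not the DataFrame API, just an illustration of the rules for nulls, empty input, and `skipNaN`.

```kotlin
import kotlin.math.sqrt

// Hypothetical helpers mirroring a few table rows; not the DataFrame API.

// sum: Short -> Int, nulls are filtered out, 0 if there are no values.
fun sumShorts(values: List<Short?>): Int =
    values.filterNotNull().sumOf { it.toInt() }

// mean: Double -> Double, nulls are filtered out, NaN if no elements.
// With skipNaN = false (the table's default) any NaN propagates to the result.
fun meanDoubles(values: List<Double?>, skipNaN: Boolean = false): Double {
    val present = values.filterNotNull()
    val used = if (skipNaN) present.filterNot { it.isNaN() } else present
    return if (used.isEmpty()) Double.NaN else used.sum() / used.size
}

// std: Int -> Double with a ddof argument, NaN if no elements.
// The default ddof = 1 is an assumption made for this sketch.
fun stdInts(values: List<Int?>, ddof: Int = 1): Double {
    val xs = values.filterNotNull().map { it.toDouble() }
    if (xs.isEmpty()) return Double.NaN
    val mean = xs.sum() / xs.size
    val squaredDeviations = xs.sumOf { (it - mean) * (it - mean) }
    return sqrt(squaredDeviations / (xs.size - ddof))
}

fun main() {
    println(sumShorts(listOf<Short?>(1.toShort(), null, 2.toShort()))) // 3
    println(sumShorts(emptyList()))                                    // 0
    println(meanDoubles(listOf(1.0, null, 3.0)))                       // 2.0
    println(meanDoubles(listOf(1.0, Double.NaN)))                      // NaN
    println(meanDoubles(listOf(1.0, Double.NaN), skipNaN = true))      // 1.0
    println(stdInts(listOf(2, 4, 4, 4, 5, 5, 7, 9), ddof = 0))         // 2.0
}
```

Note that widening `Short` to `Int` in `sumShorts` follows the `Short -> Int` row, while `meanDoubles` and `stdInts` always return `Double`, as the table proposes for mean and std.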