[Feature Request] weighted KNN imputation #318

@bvenn

Description

FSharp.Stats already supports KNN imputation via FSharp.Stats.ML.Impute.kNearestImpute. The current implementation takes the k nearest neighbors of an incomplete data point and computes their average at the index of interest. This average then replaces the missing value. I suggest the following changes and additions:

  • rename the module to Imputation to be consistent within the library
  • add the possibility to define how a missing value is encoded (e.g., 0.0 or nan)
  • add an optional converter function that processes the distance measure. Pearson's correlation coefficient, for example, measures similarity rather than distance, so you have to take the reciprocal.
  • add a weighted version in which each nearest neighbor's contribution to the average is weighted according to its distance
  • add proper documentation
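
To make the proposal concrete, here is a minimal sketch of how the suggested options could fit together. It is written in Python for brevity and is not the FSharp.Stats API: the names `weighted_knn_impute`, `is_missing`, `convert`, and `eps` are hypothetical, and inverse-distance weighting is an assumption, since the request does not fix a particular weighting scheme. Neighbors are drawn only from complete rows, and distances are computed over the positions observed in both vectors.

```python
import math


def euclidean(a, b, is_missing):
    """Euclidean distance over the positions observed in both vectors."""
    shared = [(x, y) for x, y in zip(a, b)
              if not is_missing(x) and not is_missing(y)]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared))


def weighted_knn_impute(data, k, is_missing=math.isnan,
                        distance=euclidean, convert=lambda d: d, eps=1e-12):
    """Fill missing values with a distance-weighted average of k neighbors.

    is_missing: predicate defining how a missing value is encoded.
    convert:    post-processes the raw measure, e.g. to turn a similarity
                into a distance (hypothetical parameter).
    eps:        guards against division by zero for identical neighbors.
    """
    complete = [row for row in data if not any(is_missing(v) for v in row)]
    result = []
    for row in data:
        gaps = [i for i, v in enumerate(row) if is_missing(v)]
        if not gaps or not complete:
            result.append(list(row))
            continue
        # Rank complete rows by converted distance and keep the k closest.
        scored = sorted(
            ((convert(distance(row, c, is_missing)), c) for c in complete),
            key=lambda pair: pair[0],
        )[:k]
        # Assumed weighting scheme: inverse distance.
        weights = [1.0 / (d + eps) for d, _ in scored]
        if sum(weights) == 0.0:
            # All neighbors infinitely far away: fall back to a plain mean.
            weights = [1.0] * len(scored)
        total = sum(weights)
        filled = list(row)
        for i in gaps:
            filled[i] = sum(w * c[i] for w, (_, c) in zip(weights, scored)) / total
        result.append(filled)
    return result


# Missing value encoded as nan (the default).
rows = [[1.0, 2.0], [3.0, 6.0], [1.0, float("nan")]]
imputed = weighted_knn_impute(rows, k=2)

# Missing value encoded as 0.0 instead.
rows_zero = [[1.0, 2.0], [3.0, 6.0], [2.0, 0.0]]
imputed_zero = weighted_knn_impute(rows_zero, k=2, is_missing=lambda v: v == 0.0)
```

In the first call the incomplete row coincides with `[1.0, 2.0]` on its observed position, so that neighbor dominates and the imputed value lands very close to 2.0, whereas an unweighted mean would give 4.0. In the second call both neighbors are equally distant, so the weighted result equals the plain average.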

Keywords

  • Local Least Squares
