
[FEATURE REQUEST]: NumericToCategoricalEncoding Input Transform #2879

Open
@jduerholt


Motivation

@TobyBoyne recently added support for categorical dimensions to optimize_mixed_alternating. This implementation assumes that the categorical dimensions are integer encoded. As a consequence, it is directly usable with MixedSingleTaskGP, but not with GPs that assume, for example, one-hot encoded categoricals. A solution for this case would be a NumericToOneHot input transform that encodes the categorical feature(s) within the model.
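To make the idea concrete, here is a minimal sketch of the mapping such a NumericToOneHot transform would perform. It uses plain PyTorch only; the helper name `numeric_to_one_hot` and its arguments `cat_dim` and `cardinality` are hypothetical and not part of BoTorch.

```python
import torch
import torch.nn.functional as F


def numeric_to_one_hot(X: torch.Tensor, cat_dim: int, cardinality: int) -> torch.Tensor:
    """Replace the integer-encoded column `cat_dim` of X by `cardinality` one-hot columns.

    Hypothetical helper for illustration only, not a BoTorch API.
    """
    # The optimizer hands over integer codes stored as floats; recover them.
    codes = X[..., cat_dim].round().long()
    one_hot = F.one_hot(codes, num_classes=cardinality).to(X.dtype)
    # Keep the columns before and after, and put the one-hot block in between.
    return torch.cat([X[..., :cat_dim], one_hot, X[..., cat_dim + 1:]], dim=-1)


# Two points with three features each; column 1 is a categorical with 3 levels.
X = torch.tensor([[0.1, 2.0, 0.7], [0.4, 0.0, 0.2]], dtype=torch.double)
X_enc = numeric_to_one_hot(X, cat_dim=1, cardinality=3)
print(X_enc.shape)  # the single categorical column has become three columns
```

The optimizer would keep working on the 3-column integer-encoded input, while the model would see the 5-column one-hot input.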

Currently, BoTorch features a OneHotToNumeric input transform, including untransform functionality. It should be relatively straightforward to come up with a NumericToOneHot transform based on this. This would solve the issue for models that expect one-hot encoded features, but not for other possible encodings. For example, in chemistry one often uses descriptor encodings, in which the categorical feature is mapped into a descriptor space based on chemical descriptors. From a software engineering point of view, these descriptor encodings of categoricals are very much the same as one-hot encodings: one transforms a vector into a matrix; only the mapping is different. For this reason, I would like to implement a generic NumericToCategoricalEncoding input transform that takes, upon instantiation, information about the dimensionality of the encoding space and a (non-differentiable) callable that performs the transformation. This would allow optimize_mixed_alternating to be used with any kind of categorical encoding.
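A rough sketch of what such a generic transform could look like, assuming a plain `torch.nn.Module` rather than BoTorch's actual input transform base class. The class name `NumericToEncodingSketch`, the arguments `categorical_features` and `encoders`, and the descriptor values are all made up for illustration.

```python
from typing import Callable

import torch
from torch import Tensor, nn


class NumericToEncodingSketch(nn.Module):
    """Hypothetical sketch: replace integer-encoded columns by arbitrary encodings.

    categorical_features: maps column index -> dimensionality of its encoding.
    encoders: maps column index -> callable turning integer codes into vectors.
    """

    def __init__(
        self,
        categorical_features: dict[int, int],
        encoders: dict[int, Callable[[Tensor], Tensor]],
    ) -> None:
        super().__init__()
        self.categorical_features = categorical_features
        self.encoders = encoders

    def forward(self, X: Tensor) -> Tensor:
        pieces = []
        for i in range(X.shape[-1]):
            col = X[..., i]
            if i in self.encoders:
                # Integer codes arrive as floats; the encoder need not be differentiable.
                enc = self.encoders[i](col.round().long()).to(X.dtype)
                if enc.shape[-1] != self.categorical_features[i]:
                    raise ValueError(f"Unexpected encoding size for column {i}.")
                pieces.append(enc)
            else:
                pieces.append(col.unsqueeze(-1))  # continuous column, passed through
        return torch.cat(pieces, dim=-1)


# Made-up descriptor table: 3 categories, 2 descriptors each.
descriptors = torch.tensor([[1.0, 0.5], [2.0, 1.5], [3.0, 2.5]], dtype=torch.double)
tf = NumericToEncodingSketch(
    categorical_features={1: 2},
    encoders={1: lambda codes: descriptors[codes]},
)
X = torch.tensor([[0.1, 2.0], [0.4, 0.0]], dtype=torch.double)
out = tf(X)
```

A one-hot encoding fits the same interface by passing `lambda codes: torch.nn.functional.one_hot(codes, num_classes=k)` as the callable and `k` as the dimensionality, which is the sense in which only the mapping differs.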

What do you think? Does this sound reasonable to you?

Best,

Johannes

Describe the solution you'd like to see implemented in BoTorch.

see above

Describe any alternatives you've considered to the above solution.

No response

Is this related to an existing issue in BoTorch or another repository? If so please include links to those Issues here.

No response

Pull Request

Yes

Code of Conduct

  • I agree to follow BoTorch's Code of Conduct


Labels

enhancement (New feature or request)
