Description
Hi,
I am attempting to build an equivariant Variational Encoder-Decoder framework.
For this I am using R2Conv() and R3Conv() layers in the encoder, with trivial-representation input and output and regular representations in between. For the decoder I would like to use equivariant MLPs. However, it is quite unclear to me how the examples map to a generic MLP.
For example, I do not understand how one specifies the input and output dimensions. It seems to me that the equivariant MLP expects (just like a CNN) a 2D or 3D input, and that the output dimension is determined by the harmonic decomposition of functions on that space. In contrast, an ordinary MLP accepts a flat input, and the (flat) output dimension is a hyperparameter chosen by the user.
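To make the question concrete: is something like the following the intended way to build an equivariant MLP? This is just my guess, sketched against escnn's no_base_space gspace and Linear module; the cyclic group and the multiplicities are placeholders, not my actual setup.

```python
import torch
from escnn import gspaces
from escnn import nn as enn
from escnn.group import cyclic_group

# Placeholder group: C8. In my real model the group and multiplicities differ.
G = cyclic_group(8)
gs = gspaces.no_base_space(G)  # no spatial base space, i.e. an "MLP" setting

# My guess: the "dimension" of a layer is the multiplicity of a representation.
in_type  = enn.FieldType(gs, 16 * [G.regular_representation])  # 16 * |C8| = 128
hid_type = enn.FieldType(gs,  8 * [G.regular_representation])  #  8 * |C8| =  64
out_type = enn.FieldType(gs,  2 * [G.regular_representation])  #  2 * |C8| =  16

mlp = enn.SequentialModule(
    enn.Linear(in_type, hid_type),
    enn.ReLU(hid_type),
    enn.Linear(hid_type, out_type),
)

x = enn.GeometricTensor(torch.randn(4, in_type.size), in_type)  # flat [B, 128]
y = mlp(x)                                                      # flat [B, 16]
```

If that is right, then the flat input and output sizes would be in_type.size and out_type.size, i.e. multiples of the representation size rather than freely chosen integers; is that the intended reading?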
During my learning process, I start with a rectangular input grid of shape [B,1,X,Y,Z] corresponding to a scalar field (trivial representation). I use R3Conv() with one hidden regular representation and a trivial-representation output to get [B,1,X,Y,1], store [B,1,Z_encoding_size] as the encoding of Z, and continue with [B,X,Y,1] and R2Conv() to obtain the encodings of X and Y in shape [B,1,X_encoding_size,Y_encoding_size].
A final linear layer maps the [B,1,X_encoding_size,Y_encoding_size,Z_encoding_size]-shaped encoding to a latent space that parametrizes the mean and variance of a distribution.
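For reference, the 3D stage of my encoder looks roughly like this (a minimal sketch: the octahedral group, the multiplicities, and the kernel sizes are placeholder choices, and I have left out the striding/pooling that collapses the Z dimension):

```python
import torch
from escnn import gspaces
from escnn import nn as enn

# Octahedral group as a stand-in finite group acting on R^3.
gs3 = gspaces.octaOnR3()
in_type  = enn.FieldType(gs3, [gs3.trivial_repr])       # scalar field in
hid_type = enn.FieldType(gs3, 4 * [gs3.regular_repr])   # regular features
out_type = enn.FieldType(gs3, [gs3.trivial_repr])       # scalar field out

stage3d = enn.SequentialModule(
    enn.R3Conv(in_type, hid_type, kernel_size=3, padding=1),
    enn.ReLU(hid_type),
    enn.R3Conv(hid_type, out_type, kernel_size=3, padding=1),
)

x = enn.GeometricTensor(torch.randn(2, 1, 16, 16, 16), in_type)  # [B,1,X,Y,Z]
z = stage3d(x)  # [B,1,X,Y,Z], still a scalar field before the reduction
```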
This part seems more or less clear to me; the decoder part much less so.
I am really hoping for some clarification. I only discovered equivariant learning a week ago, and it feels like opening Pandora's box, considering all the nice but extensive theory behind it.
Sadly, I do not have the time to study it in depth, nor is there anyone in my environment who knows this material.
Is it reasonable to expect to have a working model within a week?