Fix bugs in EncoderDecoderTrainer and EncoderDecoderLoss
I identified two bugs while using `EncoderDecoderTrainer` for pretraining:
Issue 1: Negative Loss Values in EncoderDecoderLoss
When calculating the loss, if a value in `x_true_stds` is zero, it's replaced with the corresponding value from `x_true_means`. If those mean values are negative, they cause the loss to become negative, which is problematic for optimization.

The problematic code:
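As a reference point, here is a minimal sketch of the relevant computation, reconstructed from the description above; the actual implementation in the library may differ in names and details:

```python
import torch

def encoder_decoder_loss(x_true, x_pred, mask, eps=1e-9):
    # Reconstruction errors on the masked entries, scaled per feature by
    # the variance of the true (embedded) inputs.
    errors = x_pred - x_true
    reconstruction_errors = torch.mul(errors, mask) ** 2
    x_true_means = torch.mean(x_true, dim=0)
    x_true_means[x_true_means == 0] = 1
    x_true_stds = torch.std(x_true, dim=0) ** 2
    # BUG: when a feature's std is zero it is replaced by the raw mean,
    # which can be negative and makes the scaled loss term negative.
    x_true_stds[x_true_stds == 0] = x_true_means[x_true_stds == 0]
    features_loss = torch.matmul(reconstruction_errors, 1 / x_true_stds)
    nb_reconstructed_variables = torch.sum(mask, dim=1)
    features_loss_norm = features_loss / (nb_reconstructed_variables + eps)
    return torch.mean(features_loss_norm)
```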
Issue 2: Inconsistent Return Order in _forward_tabnet
The `_forward_tabnet` method returns values in a different order than the other encoder-decoder models:

- Other models: `x_embed, x_embed_rec, mask`
- `_forward_tabnet`: `x_embed_rec, x_embed, mask`

This inconsistency causes incorrect inputs to the loss function when using TabNet with `EncoderDecoderTrainer`.
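To see why the order matters, here is a toy check using the loss sketch from Issue 1. The tensors are made-up stand-ins; the point is only that the loss is not symmetric in its first two arguments, because the scaling statistics come from `x_true`:

```python
import torch

torch.manual_seed(0)
x_embed = torch.randn(8, 4) * 5.0   # stand-in for the embedded "true" inputs
x_embed_rec = torch.randn(8, 4)     # stand-in for their reconstruction
mask = torch.ones(8, 4)

# encoder_decoder_loss is the sketch from Issue 1; swapping its first two
# arguments changes the per-feature scaling and hence the loss value.
print(encoder_decoder_loss(x_embed, x_embed_rec, mask))  # intended order
print(encoder_decoder_loss(x_embed_rec, x_embed, mask))  # order produced by the swapped return
```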
Solution
For Issue 1:
Modified the `EncoderDecoderLoss` to use the absolute value of the means when replacing zero standard deviations:
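A minimal sketch of the change, applied to the loss shown under Issue 1; the helper name here is just for illustration, and in the actual change these lines live inside `EncoderDecoderLoss`:

```python
import torch

def per_feature_scaling(x_true: torch.Tensor) -> torch.Tensor:
    # Hypothetical helper isolating the lines touched by the fix.
    x_true_means = torch.mean(x_true, dim=0)
    x_true_means[x_true_means == 0] = 1
    x_true_stds = torch.std(x_true, dim=0) ** 2
    # Fix: fall back to the absolute value of the mean, so the scaling
    # factor is always positive.
    x_true_stds[x_true_stds == 0] = torch.abs(x_true_means[x_true_stds == 0])
    return x_true_stds
```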
This ensures that the scaling factor remains positive, which is conceptually correct since standard deviations are always non-negative.
For Issue 2:
Changed the return order in `_forward_tabnet` to match the convention used by the other models:
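A sketch of the corrected method follows; the body is illustrative (how the embeddings, reconstruction, and mask are produced is an assumption), and only the return order reflects the actual change:

```python
import torch

def _forward_tabnet_sketch(encoder, decoder, X: torch.Tensor):
    # Hypothetical stand-in for the trainer's _forward_tabnet; only the
    # return order is the point of this sketch.
    mask = torch.bernoulli(torch.full_like(X, 0.2))
    x_embed = encoder(X * (1 - mask))
    x_embed_rec = decoder(x_embed)
    # Before the fix: return x_embed_rec, x_embed, mask
    # After the fix, matching the other encoder-decoder models:
    return x_embed, x_embed_rec, mask
```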