
negative self-training loss for EncoderDecoderTrainer #238

Closed

Description

@changbiHub

Hi, while doing pretraining with EncoderDecoderTrainer I noticed that my loss becomes negative. I think the problem originates in EncoderDecoderLoss, which is:

```python
errors = x_pred - x_true

reconstruction_errors = torch.mul(errors, mask) ** 2

x_true_means = torch.mean(x_true, dim=0)
x_true_means[x_true_means == 0] = 1

x_true_stds = torch.std(x_true, dim=0) ** 2
x_true_stds[x_true_stds == 0] = x_true_means[x_true_stds == 0]

features_loss = torch.matmul(reconstruction_errors, 1 / x_true_stds)
nb_reconstructed_variables = torch.sum(mask, dim=1)
features_loss_norm = features_loss / (nb_reconstructed_variables + self.eps)

loss = torch.mean(features_loss_norm)
```

When x_true_means is negative, the zero-variance substitution makes x_true_stds negative as well. Should it use the absolute value instead?

```python
x_true_stds[x_true_stds == 0] = torch.abs(x_true_means[x_true_stds == 0])
```

I think the motivation is to scale the loss by the variance of each feature in the batch, to account for the different ranges of the features; when that fails (zero variance), it falls back to scaling by the magnitude of the mean. That fallback doesn't make much sense when the mean is negative.
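To make the failure mode concrete, here is a small self-contained sketch (toy data made up for illustration; the loss logic is copied from the snippet above, with a local eps standing in for self.eps) where a constant, negative-valued feature pushes the loss below zero, while the abs() variant keeps it positive:

```python
import torch

# Toy batch: feature 0 is constant and negative (std == 0, mean < 0),
# feature 1 is an ordinary feature.
x_true = torch.tensor([[-0.5, 1.0],
                       [-0.5, 3.0],
                       [-0.5, 2.0]])
x_pred = x_true + 0.5           # imperfect reconstruction
mask = torch.ones_like(x_true)  # pretend every value was masked and reconstructed
eps = 1e-9                      # local stand-in for self.eps

errors = x_pred - x_true
reconstruction_errors = torch.mul(errors, mask) ** 2

x_true_means = torch.mean(x_true, dim=0)
x_true_means[x_true_means == 0] = 1

x_true_stds = torch.std(x_true, dim=0) ** 2
# current behaviour: the zero-variance column picks up its (negative) mean, -0.5 ...
x_true_stds[x_true_stds == 0] = x_true_means[x_true_stds == 0]
# ... whereas the proposed fix would use its magnitude instead:
# x_true_stds[x_true_stds == 0] = torch.abs(x_true_means[x_true_stds == 0])

features_loss = torch.matmul(reconstruction_errors, 1 / x_true_stds)
nb_reconstructed_variables = torch.sum(mask, dim=1)
features_loss_norm = features_loss / (nb_reconstructed_variables + eps)

print(torch.mean(features_loss_norm))  # ~ -0.125 as written; ~ 0.375 with the abs() fix
```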

Also, there is a separate problem with EncoderDecoderModel: _forward_tabnet returns x_embed_rec, x_embed, mask. This is inconsistent with the other _forward_* methods, which return x_embed, x_embed_rec, mask. This may cause a miscalculation when pretraining TabNet with EncoderDecoderTrainer, whose _train_step is:

```python
x_embed, x_embed_rec, mask = self.ed_model(X)
loss = self.loss_fn(x_embed, x_embed_rec, mask)
```
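To back up the miscalculation claim: the loss is not symmetric in its first two arguments, because the scaling statistics are computed from whichever tensor lands in the x_true slot, so the swapped return order silently changes the result. A minimal sketch, with the loss logic copied into a standalone function and the argument order assumed to be (x_true, x_pred, mask):

```python
import torch

def encoder_decoder_loss(x_true, x_pred, mask, eps=1e-9):
    # standalone copy of the loss logic quoted above
    errors = x_pred - x_true
    reconstruction_errors = torch.mul(errors, mask) ** 2
    x_true_means = torch.mean(x_true, dim=0)
    x_true_means[x_true_means == 0] = 1
    x_true_stds = torch.std(x_true, dim=0) ** 2
    x_true_stds[x_true_stds == 0] = x_true_means[x_true_stds == 0]
    features_loss = torch.matmul(reconstruction_errors, 1 / x_true_stds)
    nb_reconstructed_variables = torch.sum(mask, dim=1)
    return torch.mean(features_loss / (nb_reconstructed_variables + eps))

torch.manual_seed(0)
x_embed = torch.randn(8, 4)      # stand-in for the embedded inputs
x_embed_rec = torch.randn(8, 4)  # stand-in for the reconstruction
mask = torch.ones(8, 4)

# order used for the other models vs. the order _train_step effectively gets
# when _forward_tabnet returns (x_embed_rec, x_embed, mask):
print(encoder_decoder_loss(x_embed, x_embed_rec, mask))
print(encoder_decoder_loss(x_embed_rec, x_embed, mask))  # different value
```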
