
[ENH] implement LSTM model and migrate LSTMModel from notebook to main code base #1582

Open
@svnv-svsv-jm

Description


Just create this dataset (three groups of 10 time steps each, with two targets):

import numpy as np
import pandas as pd

multi_target_test_data = pd.DataFrame(
    dict(
        target1=np.random.rand(30),
        target2=np.random.rand(30),
        group=np.repeat(np.arange(3), 10),
        time_idx=np.tile(np.arange(10), 3),
    )
)

from pytorch_forecasting import TimeSeriesDataSet
from pytorch_forecasting.data.encoders import EncoderNormalizer, MultiNormalizer, TorchNormalizer

# create the dataset from the pandas dataframe
dataset = TimeSeriesDataSet(
    multi_target_test_data,
    group_ids=["group"],
    target=["target1", "target2"],  # USING two targets
    time_idx="time_idx",
    min_encoder_length=5,
    max_encoder_length=5,
    min_prediction_length=2,
    max_prediction_length=2,
    time_varying_unknown_reals=["target1", "target2"],
    target_normalizer=MultiNormalizer(
        [EncoderNormalizer(), TorchNormalizer()]
    ),  # one normalizer per target
)
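
A quick sanity check (the shapes shown are what I observe; they may vary by version) confirms that with two targets the encoder input carries two channels and the target becomes a list:

x, y = next(iter(dataset.to_dataloader(batch_size=4)))
print(x["encoder_cont"].shape)  # torch.Size([4, 5, 2]): batch x encoder steps x 2 targets
print(len(y[0]))  # 2: one decoder target tensor per target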

And feed it to the LSTMModel from the tutorials:

from pytorch_forecasting.metrics import MAE, MultiLoss

# LSTMModel is the custom model class defined in the tutorial notebook
model = LSTMModel.from_dataset(
    dataset,
    n_layers=2,
    hidden_size=10,
    loss=MultiLoss([MAE() for _ in range(2)]),
)

x, y = next(iter(dataset.to_dataloader()))

print(
    "prediction shape in training:", model(x)["prediction"].size()
)  # batch_size x decoder time steps x 1 (1 for one target dimension)
model.eval()  # set model into eval mode to use autoregressive prediction
print("prediction shape in inference:", model(x)["prediction"].size())  # should be the same as in training

And you'll get:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-13-037e20e20342> in <cell line: 3>()
      2 
      3 print(
----> 4     "prediction shape in training:", model(x)["prediction"].size()
      5 )  # batch_size x decoder time steps x 1 (1 for one target dimension)
      6 model.eval()  # set model into eval mode to use autoregressive prediction

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _wrapped_call_impl(self, *args, **kwargs)
   1530             return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1531         else:
-> 1532             return self._call_impl(*args, **kwargs)
   1533 
   1534     def _call_impl(self, *args, **kwargs):

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1539                 or _global_backward_pre_hooks or _global_backward_hooks
   1540                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1541             return forward_call(*args, **kwargs)
   1542 
   1543         try:

<ipython-input-11-af0ffbd16c05> in forward(self, x)
    105 
    106     def forward(self, x: Dict[str, torch.Tensor]) -> Dict[str, torch.Tensor]:
--> 107         hidden_state = self.encode(x)  # encode to hidden state
    108         output = self.decode(x, hidden_state)  # decode leveraging hidden state
    109 

<ipython-input-11-af0ffbd16c05> in encode(self, x)
     51         effective_encoder_lengths = x["encoder_lengths"] - 1
     52         # run through LSTM network
---> 53         _, hidden_state = self.lstm(
     54             input_vector, lengths=effective_encoder_lengths, enforce_sorted=False  # passing the lengths directly
     55         )  # second ouput is not needed (hidden state)

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _wrapped_call_impl(self, *args, **kwargs)
   1530             return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1531         else:
-> 1532             return self._call_impl(*args, **kwargs)
   1533 
   1534     def _call_impl(self, *args, **kwargs):

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1539                 or _global_backward_pre_hooks or _global_backward_hooks
   1540                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1541             return forward_call(*args, **kwargs)
   1542 
   1543         try:

/usr/local/lib/python3.10/dist-packages/pytorch_forecasting/models/nn/rnn.py in forward(self, x, hx, lengths, enforce_sorted)
    105             else:
    106                 pack_lengths = lengths.where(lengths > 0, torch.ones_like(lengths))
--> 107                 packed_out, hidden_state = super().forward(
    108                     rnn.pack_padded_sequence(
    109                         x, pack_lengths.cpu(), enforce_sorted=enforce_sorted, batch_first=self.batch_first

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/rnn.py in forward(self, input, hx)
    912                               self.dropout, self.training, self.bidirectional, self.batch_first)
    913         else:
--> 914             result = _VF.lstm(input, batch_sizes, hx, self._flat_weights, self.bias,
    915                               self.num_layers, self.dropout, self.training, self.bidirectional)
    916         output = result[0]

RuntimeError: mat1 and mat2 shapes cannot be multiplied (48x2 and 1x40)
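
The shapes in the error message pin down the bug: mat1 is the packed encoder input with 2 features (one column per target), while mat2 is the transposed input-to-hidden weight of an LSTM built with input_size=1 and hidden_size=10 (4 gates x 10 = 40 rows). In other words, the tutorial's LSTMModel sizes its LSTM for a single input feature and never accounts for multiple targets. Below is a minimal sketch of the kind of change needed, using the lengths-aware LSTM that pytorch_forecasting ships; the n_targets derivation is illustrative, not the actual tutorial code:

from pytorch_forecasting.models.nn import LSTM

n_targets = 2  # e.g. len(dataset.target_names) for target=["target1", "target2"]
lstm = LSTM(
    input_size=n_targets,  # the tutorial effectively hardcodes 1 here
    hidden_size=10,
    num_layers=2,
    batch_first=True,
)

# With input_size matching the two target channels, encoding no longer crashes:
_, hidden_state = lstm(
    x["encoder_cont"],  # (batch, 5, 2)
    lengths=x["encoder_lengths"],
    enforce_sorted=False,
)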

It's just weird that there is still no fix for this and no LSTM model available out of the box. I have already written a fix; there is an open PR.

Why does no one care about fixing this?

It is totally obscure how pytorch_forecasting handles single vs. multiple targets. I've also noticed that passing target=["target"] to TimeSeriesDataSet makes it behave very differently from passing target="target", as the snippet below shows.
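
For instance, here is a minimal way to see the asymmetry (the y structure shown is the behavior I observe and may differ across versions):

ds_single = TimeSeriesDataSet(
    multi_target_test_data,
    group_ids=["group"],
    target="target1",  # plain string
    time_idx="time_idx",
    max_encoder_length=5,
    max_prediction_length=2,
    time_varying_unknown_reals=["target1"],
)
ds_list = TimeSeriesDataSet(
    multi_target_test_data,
    group_ids=["group"],
    target=["target1"],  # one-element list
    time_idx="time_idx",
    max_encoder_length=5,
    max_prediction_length=2,
    time_varying_unknown_reals=["target1"],
)

_, (y_single, _) = next(iter(ds_single.to_dataloader(batch_size=4)))
_, (y_list, _) = next(iter(ds_list.to_dataloader(batch_size=4)))
print(type(y_single))  # torch.Tensor
print(type(y_list))  # list with one tensor: the dataset switched to multi-target mode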

Please just review that PR and merge it, or fix this some other way.

Metadata

Labels: documentation, good first issue
Status: Needs triage & validation