-
Notifications
You must be signed in to change notification settings - Fork 689
[ENH] [WIP] Standardize model output to 4d tensor #1895
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- If the input tensor is 2D, it is reshaped to (batch_size, timesteps, 1, 1). | ||
- If the input tensor is 3D, it is reshaped to (batch_size, timesteps, 1, 1) for a | ||
multivariate single-target forecast, or (batch_size, timesteps, 1, last_dim) for a univariate quantile forecast. | ||
- If the input tensor is 4D, it is assumed to be in the shape |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for 4D, you need to be explicit about how it is getting reshaped.
but any value can be explicitly provided. A fallback of 1 is used in case where | ||
no information is available on ``last_dim``. | ||
|
||
[2] This can currently handle situations where a single target is used |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you explain that and give an example? Can you give the simplest example that shows a 4D tensor is not possible in this case? Is this, for instance, that we use squared loss for variable 1 but parametric log-loss on mean/variance (of normal) on variable 2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe using nestedtensor
-s?
https://docs.pytorch.org/tutorials/prototype/nestedtensor.html
(is this a stable feature? Looks like it?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can try looking at this..
nice, I now get it. I left some minor remraks. We should add tests for all individual conversion pathways here. |
…atP/pytorch-forecasting into tslib-dlinear-model
…anavBhatP/pytorch-forecasting into standardise-output-format
My 2 cents after the discussion of today (both with @PranavBhatP and @phoeenniixx ) maybe is better to setup the output format as a list of tensors (one for each output channels) and each tensor with the shape |
Sorry, I didn't get you fully, are you suggesting that we make all outputs as a list of tensor? |
Overall though, I think we should move towards writing an enhancement proposal, i.e., a proper API design document. This episode has shown me that this is a difficult problem where we should map out all use case vignettes and features before committing to a specific design - we have already changed it twice, partly also due to me not catching the difficulty earlier and recommending to go to API design first. |
@fkiraly @agobbifbk @phoeenniixx The proposal by this PR for a standard 4d tensor, can be scrapped. A better approach is outlined in the design doc. (high-level). Why should this be scrapped?
Imo, working on the design document was a really useful step and it did cover lots of gaps. We can ignore this proposal PR for now. Maybe close this pr also? |
I would agree - for me the main argument would be the mixed loss case. Though I would not delete the branch yet, the code might still be useful as lookup later on, and we have not fully decided on a target design yet. |
Reference Issues/PRs
Fixes #1894 . Stacks on #1874.
Files changed:
-
_base_model_v2.py
What does this implement/fix? Explain your changes.
The PR implements a
standardize_model_output
function in the v2BaseModel
class. This will be used as a helper function, to enforce the proposed standard for model tensor outputs, in the format -(batch_size, timesteps, n_features, quantiles)
. The PR is still a draft and needs discussion before actually enforcing the output change at the model level, since it will lead to widespread test failure due to a major breaking change in the output format.Once the design is decided, we can proceed with trying this out with two models -
dlinear
andtimexer
.What should a reviewer concentrate their feedback on?
Did you add any tests for the change?
Any other comments?
PR checklist
pre-commit install
.To run hooks independent of commit, execute
pre-commit run --all-files