
Commit 1e31e11

karl-richter, alfonsogarciadecorral, ourownstory, and judussoari authored

[major] lagged regressor with interaction modeling (shared NN) (#903)

* initial version of model attribution
* consider quantiles in model interpretation
* simplified captum integration
* removed captum import error
* addressed PR comments
* initial shared covar net
* support model weights for shared covar net
* simplified covar net definition
* docs: added docstring to covar weights function
* reversed change on test file
* refactored compute components in shared covar net
* simplified weight calculation for shallow covar net
* added attribution-based component forecasting
* refactored the compute components method
* updated notebook with shared regressor net
* support custom calculation of model attributions
* isort + flake8
* removed todos
* fixed computing components
* refactored storage of covar_weights
* added docs
* fixed pytests
* added alternative attribution method
* removed alternative attribution method
* reduced pandas warning
* Update plot_model_parameters_matplotlib.py
* log scale on metrics plots
* covar_net and ar_net initialised through networks arrays
* (ar_net_layers_array, covar_net_layers_array) renamed to (ar_layers, lagged_reg_layers)
* documentation updated
* tutorials updated
* minor adaptations: docstring and typing

---------

Co-authored-by: alfonsogarciadecorral <alfonso.garcia.decorral@gmail.com>
Co-authored-by: Oskar Triebe <ourownstory@users.noreply.github.com>
Co-authored-by: julioare94 <julio@arend.mobi>
1 parent d26b9f6 commit 1e31e11

18 files changed, +1382 −332 lines

docs/source/guides/hyperparameter-selection.md

Lines changed: 7 additions & 12 deletions

@@ -31,17 +31,12 @@ The default loss function is the 'Huber' loss, which is considered to be robust
 However, you are free to choose the standard `MSE` or any other PyTorch `torch.nn.modules.loss` loss function.
 
 ## Increasing Depth of the Model
-`num_hidden_layers` defines the number of hidden layers of the FFNNs used in the overall model. This includes the
-AR-Net and the FFNN of the lagged regressors. The default is 0, meaning that the FFNNs will have only one final layer
-of size `n_forecasts`. Adding more layers results in increased complexity and, consequently, increased computational time.
-However, the added hidden layers can help build more complex relationships, which is especially useful for the lagged
-regressors. To trade off between computational complexity and improved accuracy, `num_hidden_layers` is recommended
-to be set between 1 and 2. Nevertheless, in most cases a good enough performance can be achieved with no hidden layers at all.
-
-`d_hidden` is the number of units in the hidden layers. It is only considered if `num_hidden_layers` is specified,
-and ignored otherwise. The default value for `d_hidden`, if not specified, is (`n_lags` + `n_forecasts`). If tuned manually, the recommended
-practice is to set a value between `n_lags` and `n_forecasts` for `d_hidden`. It is also important to note that with the current
-implementation, NeuralProphet sets the same `d_hidden` for all the hidden layers.
+`ar_layers` defines the number of hidden layers and their sizes for the AR-Net in the overall model. It is an array in which each element is the size of the corresponding hidden layer. The default is an empty array, meaning that the AR-Net will have only one final layer of size `n_forecasts`. Adding more layers results in increased complexity and, consequently, increased computational time. However, the added hidden layers can help build more complex relationships. To trade off between computational complexity and improved accuracy, `ar_layers` is recommended to be set as an array with 1-2 elements. Nevertheless, in most cases a good enough performance can be achieved with no hidden layers at all.
+
+`lagged_reg_layers` defines the number of hidden layers and their sizes for the lagged regressors' FFNN in the overall model. It is an array in which each element is the size of the corresponding hidden layer. The default is an empty array, meaning that the FFNN of the lagged regressors will have only one final layer of size `n_forecasts`. Adding more layers results in increased complexity and, consequently, increased computational time. However, the added hidden layers can help build more complex relationships, which is especially useful for the lagged regressors. To trade off between computational complexity and improved accuracy, `lagged_reg_layers` is recommended to be set as an array with 1-2 elements. Nevertheless, in most cases a good enough performance can be achieved with no hidden layers at all.
+
+Please note that the previous `num_hidden_layers` and `d_hidden` arguments are now deprecated. The ar_net and covar_net architectures are now configured through `ar_layers` and `lagged_reg_layers`. If tuned manually, the recommended practice is to set the hidden layer sizes to values between `n_lags` and `n_forecasts`. It is also important to note that with the current implementation, NeuralProphet allows you to specify different sizes for the hidden layers in both ar_net and covar_net.
 
 ## Data Preprocessing Related Parameters

@@ -83,7 +78,7 @@ distorted by such components, they can explicitly turn them off by setting the r
 `yearly_seasonality`, `weekly_seasonality` and `daily_seasonality` can also be set to the number of Fourier terms of the respective seasonalities.
 The defaults are 6 for yearly, 4 for weekly and 6 for daily. Users can set this to any number they want. If the number of terms is 6 for yearly, that
 effectively makes the total number of Fourier terms for the yearly seasonality 12 (6*2), to accommodate both sine and cosine terms.
-Increasing the number of Fourier terms can make the model capable of capturing quite complex seasonal patterns. However, similar to `num_hidden_layers`,
+Increasing the number of Fourier terms can make the model capable of capturing quite complex seasonal patterns. However, similar to `ar_layers`,
 this too results in added model complexity. Users can get some insights about the optimal number of Fourier terms by looking at the final component
 plots. The default `seasonality_mode` is additive. This means that no heteroscedasticity is expected in the series in terms of the seasonality.
 However, if the series contains clear variance, where the seasonal fluctuations become larger proportional to the trend, the `seasonality_mode`
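For reference, a minimal sketch of how the two new arguments are passed to the forecaster, consistent with the constructor change in neuralprophet/forecaster.py below; the layer sizes and lag counts are illustrative only, not recommendations:

```python
from neuralprophet import NeuralProphet

# Two hidden layers of size 32 for the AR-Net and one hidden layer of
# size 64 for the shared FFNN of the lagged regressors. Leaving both
# arguments at their default [] keeps a single final layer of size
# n_forecasts.
m = NeuralProphet(
    n_forecasts=3,
    n_lags=12,
    ar_layers=[32, 32],
    lagged_reg_layers=[64],
)
```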

docs/source/tutorials/auto-regression.md

Lines changed: 3 additions & 3 deletions

@@ -29,14 +29,14 @@ below.
 ![plot-param-1](../images/plot_param_ar_1.png){: style="height:600px"}
 
-You can see the relevance of each of the lags when modelling the autocorrelation. You can also specify the `num_hidden_layers`
+You can see the relevance of each of the lags when modelling the autocorrelation. You can also specify the `ar_layers`
 for the AR-Net, in order to increase the complexity of the AR-Net.
 
 ```python
 m = NeuralProphet(
     n_forecasts=3,
     n_lags=5,
-    num_hidden_layers=2,
+    ar_layers=[32, 32],
     yearly_seasonality=False,
     weekly_seasonality=False,
     daily_seasonality=False

@@ -53,7 +53,7 @@ like below. For more details on setting a value for `ar_sparsity`, refer to the
 m = NeuralProphet(
     n_forecasts=3,
     n_lags=5,
-    num_hidden_layers=2,
+    ar_layers=[32, 32],
     ar_sparsity=0.01,
     yearly_seasonality=False,
     weekly_seasonality=False,

docs/zh/自回归.md

Lines changed: 3 additions & 3 deletions

@@ -24,13 +24,13 @@ m = NeuralProphet(
 ![plot-param-1](http://neuralprophet.com/images/plot_param_ar_1.png)
 
-When modelling the autocorrelation, you can see the relevance of each of the lags. You can also specify `num_hidden_layers` for the AR-Net, to increase the complexity of the AR-Net.
+When modelling the autocorrelation, you can see the relevance of each of the lags. You can also specify `ar_layers` for the AR-Net, to increase the complexity of the AR-Net.
 
 ```python
 m = NeuralProphet(
     n_forecasts=3,
     n_lags=5,
-    num_hidden_layers=2,
+    ar_layers=[32, 32],
     yearly_seasonality=False,
     weekly_seasonality=False,
     daily_seasonality=False

@@ -45,7 +45,7 @@ m = NeuralProphet(
 m = NeuralProphet(
     n_forecasts=3,
     n_lags=5,
-    num_hidden_layers=2,
+    ar_layers=[32, 32],
     ar_sparsity=0.01,
     yearly_seasonality=False,
     weekly_seasonality=False,

docs/zh/超参数选取.md

Lines changed: 7 additions & 5 deletions

@@ -17,8 +17,7 @@ NeuralProphet has some hyperparameters that need to be specified by the user. If not specified, default values will be used.
 | `seasonality_reg` | None |
 | `n_forecasts` | 1 |
 | `n_lags` | 0 |
-| `num_hidden_layers` | 0 |
-| `d_hidden` | None |
+| `ar_layers` | [] |
 | `ar_sparsity` | None |
 | `learning_rate` | None |
 | `epochs` | None |

@@ -48,9 +47,12 @@ NeuralProphet is fit with stochastic gradient descent -- more precisely,
 
 ## Increasing Depth of the Model
 
-`num_hidden_layers` defines the number of hidden layers of the FFNNs used in the overall model. This includes the AR-Net and the FFNN of the lagged regressors. The default is 0, meaning that the FFNNs will have only one final layer of size `n_forecasts`. Adding more layers increases complexity and, consequently, computational time. However, additional hidden layers can help build more complex relationships, which is especially useful for the lagged regressors. To trade off between computational complexity and improved accuracy, it is recommended to set `num_hidden_layers` between 1 and 2. In most cases, however, a good enough performance can be achieved with no hidden layers at all.
+`ar_layers` defines the number of hidden layers and their sizes for the AR-Net in the overall model. It is an array in which each element is the size of the corresponding hidden layer. The default is an empty array, meaning that the AR-Net will have only one final layer of size `n_forecasts`. Adding more layers increases complexity and computational time. However, additional hidden layers can help build more complex relationships. To trade off between computational complexity and improved accuracy, it is recommended to set `ar_layers` as an array with 1-2 elements. In most cases, however, a good enough performance can be achieved with no hidden layers at all.
+
+`lagged_reg_layers` defines the number of hidden layers and their sizes for the lagged regressors' FFNN in the overall model. It is an array in which each element is the size of the corresponding hidden layer. The default is an empty array, meaning that the FFNN of the lagged regressors will have only one final layer of size `n_forecasts`. Adding more layers increases complexity and computational time. However, additional hidden layers can help build more complex relationships, which is especially useful for the lagged regressors. To trade off between computational complexity and improved accuracy, it is recommended to set `lagged_reg_layers` as an array with 1-2 elements. In most cases, however, a good enough performance can be achieved with no hidden layers at all.
+
+Please note that the previous `num_hidden_layers` and `d_hidden` arguments are now deprecated. The ar_net and covar_net architectures are now configured through `ar_layers` and `lagged_reg_layers`. If tuned manually, the recommended practice is to set the hidden layer sizes to values between `n_lags` and `n_forecasts`. It is also important to note that the current NeuralProphet implementation allows you to specify different sizes for the hidden layers in ar_net and covar_net.
 
-`d_hidden` is the number of units in the hidden layers. It is only considered if `num_hidden_layers` is specified, and ignored otherwise. If not specified, the default value of `d_hidden` is (`n_lags` + `n_forecasts`). If tuned manually, the recommended practice is to set a value between `n_lags` and `n_forecasts` for `d_hidden`. Note also that in the current implementation, NeuralProphet sets the same `d_hidden` for all hidden layers.
 
 ## Data Preprocessing Related Parameters

@@ -74,7 +76,7 @@ NeuralProphet is fit with stochastic gradient descent -- more precisely,
 
 ## Seasonality Related Parameters
 
-`yearly_seasonality`, `weekly_seasonality` and `daily_seasonality` concern the seasonal components to be modelled. For example, if you use temperature data, you can probably pick daily and yearly. Ridership on a subway, say, is more likely to show a weekly seasonality. Leaving these seasonalities in the default `auto` mode lets NeuralProphet decide which ones to include based on how much data is available: yearly seasonality is not considered with less than two years of data; likewise, weekly seasonality is not considered with less than two weeks of data, and so on. However, if users are certain that the series contains no yearly, weekly or daily seasonality, and the model should therefore not be distorted by such components, they can explicitly turn them off by setting the respective component to `False`. Beyond that, the parameters `yearly_seasonality`, `weekly_seasonality` and `daily_seasonality` can also be set to the number of Fourier terms of the respective seasonality. The defaults are 6 for yearly, 4 for weekly and 6 for daily. Users can set these to any number they want. If the number of terms for yearly is 6, the total number of Fourier terms for the yearly seasonality is effectively 12 (6*2), to accommodate both sine and cosine terms. Increasing the number of Fourier terms can make the model capable of capturing quite complex seasonal patterns. However, similar to `num_hidden_layers`, this also adds model complexity. Users can get some insight into the optimal number of Fourier terms by looking at the final component plots. The default `seasonality_mode` is additive, meaning that no heteroscedasticity is expected in the series in terms of the seasonality. However, if the series contains clear variance, where the seasonal fluctuations grow in proportion to the trend, `seasonality_mode` can be set to multiplicative.
+`yearly_seasonality`, `weekly_seasonality` and `daily_seasonality` concern the seasonal components to be modelled. For example, if you use temperature data, you can probably pick daily and yearly. Ridership on a subway, say, is more likely to show a weekly seasonality. Leaving these seasonalities in the default `auto` mode lets NeuralProphet decide which ones to include based on how much data is available: yearly seasonality is not considered with less than two years of data; likewise, weekly seasonality is not considered with less than two weeks of data, and so on. However, if users are certain that the series contains no yearly, weekly or daily seasonality, and the model should therefore not be distorted by such components, they can explicitly turn them off by setting the respective component to `False`. Beyond that, the parameters `yearly_seasonality`, `weekly_seasonality` and `daily_seasonality` can also be set to the number of Fourier terms of the respective seasonality. The defaults are 6 for yearly, 4 for weekly and 6 for daily. Users can set these to any number they want. If the number of terms for yearly is 6, the total number of Fourier terms for the yearly seasonality is effectively 12 (6*2), to accommodate both sine and cosine terms. Increasing the number of Fourier terms can make the model capable of capturing quite complex seasonal patterns. However, similar to `ar_layers`, this also adds model complexity. Users can get some insight into the optimal number of Fourier terms by looking at the final component plots. The default `seasonality_mode` is additive, meaning that no heteroscedasticity is expected in the series in terms of the seasonality. However, if the series contains clear variance, where the seasonal fluctuations grow in proportion to the trend, `seasonality_mode` can be set to multiplicative.
 
 ## Regularization Related Parameters

neuralprophet/configure.py

Lines changed: 3 additions & 4 deletions

@@ -21,8 +21,7 @@
 @dataclass
 class Model:
-    num_hidden_layers: int
-    d_hidden: Optional[int]
+    lagged_reg_layers: Optional[List[int]]
 
 
 @dataclass

@@ -345,6 +344,7 @@ def append(self, name, period, resolution, arg, condition_name):
 class AR:
     n_lags: int
     ar_reg: Optional[float] = None
+    ar_layers: Optional[List[int]] = None
 
     def __post_init__(self):
         if self.ar_reg is not None and self.ar_reg > 0:

@@ -383,8 +383,7 @@ class LaggedRegressor:
     as_scalar: bool
     normalize: Union[bool, str]
     n_lags: int
-    num_hidden_layers: Optional[int]
-    d_hidden: Optional[int]
+    lagged_reg_layers: Optional[List[int]]
 
     def __post_init__(self):
         if self.reg_lambda is not None:
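For orientation, a short sketch of how these dataclasses are populated after the change, mirroring the `__init__` hunks in neuralprophet/forecaster.py below. It assumes the `configure` module is importable from the package as in forecaster.py, and the concrete values are illustrative:

```python
from neuralprophet import configure

# ar_layers now lives on the AR config, while lagged_reg_layers moved to
# the Model config; both default to values that yield single-layer nets.
config_ar = configure.AR(n_lags=5, ar_reg=None, ar_layers=[16, 16])
config_model = configure.Model(lagged_reg_layers=[32])
```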

neuralprophet/forecaster.py

Lines changed: 23 additions & 30 deletions

@@ -168,16 +168,18 @@ class NeuralProphet:
             Large values (~1-100) will limit the number of nonzero coefficients dramatically.
             Small values (~0.001-1.0) will allow more non-zero coefficients.
             default: 0 no regularization of coefficients.
+        ar_layers : list of int, optional
+            array of hidden layer dimensions of the AR-Net. Specifies the number of hidden layers (number of entries)
+            and layer dimension (list entry).
 
         COMMENT
         Model Config
         COMMENT
         n_forecasts : int
             Number of steps ahead of prediction time step to forecast.
-        num_hidden_layers : int, optional
-            number of hidden layers to include in AR-Net (defaults to 0)
-        d_hidden : int, optional
-            dimension of hidden layers of the AR-Net. Ignored if ``num_hidden_layers`` == 0.
+        lagged_reg_layers : list of int, optional
+            array of hidden layer dimensions of the Covar-Net. Specifies the number of hidden layers (number of entries)
+            and layer dimension (list entry).
 
         COMMENT
         Train Config

@@ -344,9 +346,9 @@ def __init__(
         season_global_local: np_types.SeasonGlobalLocalMode = "global",
         n_forecasts: int = 1,
         n_lags: int = 0,
-        num_hidden_layers: int = 0,
-        d_hidden: Optional[int] = None,
+        ar_layers: Optional[list] = [],
         ar_reg: Optional[float] = None,
+        lagged_reg_layers: Optional[list] = [],
         learning_rate: Optional[float] = None,
         epochs: Optional[int] = None,
         batch_size: Optional[int] = None,

@@ -414,18 +416,12 @@ def __init__(
         self.metrics = utils_metrics.get_metrics(collect_metrics)
 
         # AR
-        self.config_ar = configure.AR(
-            n_lags=n_lags,
-            ar_reg=ar_reg,
-        )
+        self.config_ar = configure.AR(n_lags=n_lags, ar_reg=ar_reg, ar_layers=ar_layers)
         self.n_lags = self.config_ar.n_lags
         self.max_lags = self.n_lags
 
         # Model
-        self.config_model = configure.Model(
-            num_hidden_layers=num_hidden_layers,
-            d_hidden=d_hidden,
-        )
+        self.config_model = configure.Model(lagged_reg_layers=lagged_reg_layers)
 
         # Trend
         self.config_trend = configure.Trend(

@@ -480,8 +476,6 @@ def add_lagged_regressor(
         self,
         names: Union[str, List[str]],
         n_lags: Union[int, np_types.Literal["auto", "scalar"]] = "auto",
-        num_hidden_layers: Optional[int] = None,
-        d_hidden: Optional[int] = None,
         regularization: Optional[float] = None,
         normalize: Union[bool, str] = "auto",
     ):

@@ -497,21 +491,14 @@ def add_lagged_regressor(
             previous regressors time steps to use as input in the predictor (covar order)
             if ``auto``, time steps will be equivalent to the AR order (default)
             if ``scalar``, all the regressors will only use last known value as input
-        num_hidden_layers : int
-            number of hidden layers to include in Lagged-Regressor-Net (defaults to same configuration as AR-Net)
-        d_hidden : int
-            dimension of hidden layers of the Lagged-Regressor-Net. Ignored if ``num_hidden_layers`` == 0.
         regularization : float
             optional scale for regularization strength
         normalize : bool
             optional, specify whether this regressor will be normalized prior to fitting.
             if ``auto``, binary regressors will not be normalized.
         """
-        if num_hidden_layers is None:
-            num_hidden_layers = self.config_model.num_hidden_layers
+        lagged_reg_layers = self.config_model.lagged_reg_layers
 
-        if d_hidden is None:
-            d_hidden = self.config_model.d_hidden
         if n_lags == 0 or n_lags is None:
             n_lags = 0
             log.warning(

@@ -552,8 +539,7 @@ def add_lagged_regressor(
             normalize=normalize,
             as_scalar=only_last_value,
             n_lags=n_lags,
-            num_hidden_layers=num_hidden_layers,
-            d_hidden=d_hidden,
+            lagged_reg_layers=lagged_reg_layers,
         )
         return self

@@ -2462,8 +2448,8 @@ def _init_model(self):
             n_forecasts=self.n_forecasts,
             n_lags=self.n_lags,
             max_lags=self.max_lags,
-            num_hidden_layers=self.config_model.num_hidden_layers,
-            d_hidden=self.config_model.d_hidden,
+            ar_layers=self.config_ar.ar_layers,
+            lagged_reg_layers=self.config_model.lagged_reg_layers,
             metrics=self.metrics,
             id_list=self.id_list,
             num_trends_modelled=self.num_trends_modelled,

@@ -2519,7 +2505,12 @@ def _init_train_loader(self, df, num_workers=0):
         # Determine the max_number of epochs
         self.config_train.set_auto_batch_epoch(n_data=len(dataset))
 
-        loader = DataLoader(dataset, batch_size=self.config_train.batch_size, shuffle=True, num_workers=num_workers)
+        loader = DataLoader(
+            dataset,
+            batch_size=self.config_train.batch_size,
+            shuffle=True,
+            num_workers=num_workers,
+        )
 
         return loader

@@ -2748,7 +2739,9 @@ def _predict_raw(self, df, df_name, include_components=False, prediction_frequen
         dates = df["ds"].iloc[self.max_lags :]
 
         # Pass the include_components flag to the model
-        self.model.set_compute_components(include_components)
+        if include_components:
+            self.model.set_compute_components(include_components)
+            self.model.set_covar_weights(self.model.get_covar_weights())
         # Compute the predictions and components (if requested)
         result = self.trainer.predict(self.model, loader)
         # Extract the prediction and components
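Taken together, these forecaster changes move all hidden-layer configuration from `add_lagged_regressor` to the constructor. A before/after sketch of the migration, assuming the API exactly as shown in the hunks above; the regressor column name `temperature` is hypothetical:

```python
from neuralprophet import NeuralProphet

# Before this commit (arguments now removed):
# m = NeuralProphet(n_lags=5, num_hidden_layers=2, d_hidden=16)
# m.add_lagged_regressor(names="temperature", num_hidden_layers=2, d_hidden=16)

# After: layer sizes are set once on the constructor; add_lagged_regressor
# inherits lagged_reg_layers from the model config.
m = NeuralProphet(
    n_forecasts=3,
    n_lags=5,
    ar_layers=[16, 16],
    lagged_reg_layers=[16, 16],
)
m = m.add_lagged_regressor(names="temperature")
```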
