Skip to content

Commit 718dae7

Browse files
authored
Add duration parameter in TimeSeasonality (#509)
* Add num_steps_per_season parameter in TimeSeasonality * Use floor sign to write the seasonal component * Models Definition * Edit symbolic graph * Correct transition matrix * Correct documentation * Add second examples in the documentation * Fix bugs * Merge with main * Minor fixes * Handle initial states * Build symbolic Graph from pytensor operations * Extend tests for the case duration > 1 * Reduce number of tests for duration * Merge & add conditions for raising ValueErrors for season_length and duration * Fix bug
1 parent 0bbf0d6 commit 718dae7

File tree

2 files changed

+238
-90
lines changed

2 files changed

+238
-90
lines changed

pymc_extras/statespace/models/structural/components/seasonality.py

Lines changed: 162 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -14,24 +14,29 @@ class TimeSeasonality(Component):
1414
----------
1515
season_length: int
1616
The number of periods in a single seasonal cycle, e.g. 12 for monthly data with annual seasonal pattern, 7 for
17-
daily data with weekly seasonal pattern, etc.
17+
daily data with weekly seasonal pattern, etc. It must be greater than one.
18+
19+
duration: int, default 1
20+
Number of time steps for each seasonal period.
21+
This determines how long each seasonal period is held constant before moving to the next.
1822
1923
innovations: bool, default True
2024
Whether to include stochastic innovations in the strength of the seasonal effect
2125
2226
name: str, default None
2327
A name for this seasonal component. Used to label dimensions and coordinates. Useful when multiple seasonal
24-
components are included in the same model. Default is ``f"Seasonal[s={season_length}]"``
28+
components are included in the same model. Default is ``f"Seasonal[s={season_length}, d={duration}]"``
2529
2630
state_names: list of str, default None
27-
List of strings for seasonal effect labels. If provided, it must be of length ``season_length``. An example
28-
would be ``state_names = ['Mon', 'Tue', 'Wed', 'Thur', 'Fri', 'Sat', 'Sun']`` when data is daily with a weekly
31+
List of strings for seasonal effect labels. If provided, it must be of length ``season_length`` times ``duration``.
32+
An example would be ``state_names = ['Mon', 'Tue', 'Wed', 'Thur', 'Fri', 'Sat', 'Sun']`` when data is daily with a weekly
2933
seasonal pattern (``season_length = 7``).
3034
31-
If None, states will be numbered ``[State_0, ..., State_s]``
35+
If None and ``duration = 1``, states will be named as ``[State_0, ..., State_s-1]`` (here s is ``season_length``).
36+
If None and ``duration > 1``, states will be named as ``[State_0_0, ..., State_s-1_d-1]`` (here d is ``duration``).
3237
3338
remove_first_state: bool, default True
34-
If True, the first state will be removed from the model. This is done because there are only n-1 degrees of
39+
If True, the first state will be removed from the model. This is done because there are only ``season_length-1`` degrees of
3540
freedom in the seasonal component, and one state is not identified. If False, the first state will be
3641
included in the model, but it will not be identified -- you will need to handle this in the priors (e.g. with
3742
ZeroSumNormal).
@@ -41,17 +46,76 @@ class TimeSeasonality(Component):
4146
4247
Notes
4348
-----
44-
A seasonal effect is any pattern that repeats every fixed interval. Although there are many possible ways to
45-
model seasonal effects, the implementation used here is the one described by [1] as the "canonical" time domain
46-
representation. The seasonal component can be expressed:
49+
A seasonal effect is any pattern that repeats at fixed intervals. There are several ways to model such effects;
50+
here, we present two models that are straightforward extensions of those described in [1].
51+
52+
**First model** (``remove_first_state=True``)
53+
54+
In this model, the state vector is defined as:
55+
56+
.. math::
57+
\alpha_t :=(\gamma_t, \ldots, \gamma_{t-d(s-1)+1}), \quad t \ge 0.
58+
59+
This vector has length :math:`d(s-1)`, where:
60+
61+
- :math:`s` is the ``seasonal_length`` parameter, and
62+
- :math:`d` is the ``duration`` parameter.
63+
64+
The components of the initial vector :math:`\alpha_{0}` are given by
65+
66+
.. math::
67+
\gamma_{-l} := \tilde{\gamma}_{k_l}, \quad \text{where} \quad k_l := \left\lfloor \frac{l}{d} \right\rfloor \bmod s \quad \text{and} \quad l=0,\ldots, d(s-1)-1.
68+
69+
Here, the values
4770
4871
.. math::
49-
\gamma_t = -\sum_{i=1}^{s-1} \gamma_{t-i} + \omega_t, \quad \omega_t \sim N(0, \sigma_\gamma)
72+
\tilde{\gamma}_{0}, \ldots, \tilde{\gamma}_{s-2},
5073
51-
Where :math:`s` is the ``seasonal_length`` parameter and :math:`\omega_t` is the (optional) stochastic innovation.
52-
To give interpretation to the :math:`\gamma` terms, it is helpful to work through the algebra for a simple
53-
example. Let :math:`s=4`, and omit the shock term. Define initial conditions :math:`\gamma_0, \gamma_{-1},
54-
\gamma_{-2}`. The value of the seasonal component for the first 5 timesteps will be:
74+
represent the initial seasonal states. The transition matrix of this model is the :math:`d(s-1) \times d(s-1)` matrix
75+
76+
.. math::
77+
\begin{bmatrix}
78+
-\mathbf{1}_d & -\mathbf{1}_d & \cdots & -\mathbf{1}_d & -\mathbf{1}_d \\
79+
\mathbf{1}_d & \mathbf{0}_d & \cdots & \mathbf{0}_d & \mathbf{0}_d \\
80+
\mathbf{0}_d & \mathbf{1}_d & \cdots & \mathbf{0}_d & \mathbf{0}_d \\
81+
\vdots & \vdots & \ddots & \vdots \\
82+
\mathbf{0}_d & \mathbf{0}_d & \cdots & \mathbf{1}_d & \mathbf{0}_d
83+
\end{bmatrix}
84+
85+
where :math:`\mathbf{1}_d` and :math:`\mathbf{0}_d` denote the :math:`d \times d` identity and null matrices, respectively.
86+
87+
**Second model** (``remove_first_state=False``)
88+
89+
In contrast, the state vector in the second model is defined as:
90+
91+
.. math::
92+
\alpha_t=(\gamma_t, \ldots, \gamma_{t-ds+1}), \quad t \ge 0.
93+
94+
This vector has length :math:`ds`. The components of the initial state vector :math:`\alpha_{0}` are defined similarly:
95+
96+
.. math::
97+
\gamma_{-l} := \tilde{\gamma}_{k_l}, \quad \text{where} \quad k_l := \left\lfloor \frac{l}{d} \right\rfloor \bmod s \quad \text{and} \quad l=0,\ldots, ds-1.
98+
99+
In this case, the initial seasonal states :math:`\tilde{\gamma}_{0}, \ldots, \tilde{\gamma}_{s-1}` are required to satisfy the following condition:
100+
101+
.. math::
102+
\sum_{i=0}^{s-1} \tilde{\gamma}_{i} = 0.
103+
104+
The transition matrix of this model is the following :math:`ds \times ds` circulant matrix:
105+
106+
.. math::
107+
\begin{bmatrix}
108+
0 & 1 & 0 & \cdots & 0 \\
109+
0 & 0 & 1 & \cdots & 0 \\
110+
\vdots & \vdots & \ddots & \ddots & \vdots \\
111+
0 & 0 & \cdots & 0 & 1 \\
112+
1 & 0 & \cdots & 0 & 0
113+
\end{bmatrix}
114+
115+
To give interpretation to the :math:`\gamma` terms, it is helpful to work through the algebra for a simple
116+
example. Let :math:`s=4`, :math:`d=1`, ``remove_first_state=True``, and omit the shock term. Then, we have
117+
:math:`\gamma_{-i} = \tilde{\gamma}_{-i}`, for :math:`i=-2,\ldots, 0` and the value of the seasonal component
118+
for the first 5 timesteps will be:
55119
56120
.. math::
57121
\begin{align}
@@ -85,10 +149,38 @@ class TimeSeasonality(Component):
85149
And so on. So for interpretation, the ``season_length - 1`` initial states are, when reversed, the coefficients
86150
associated with ``state_names[1:]``.
87151
152+
In the next example, we set :math:`s=2`, :math:`d=2`, ``remove_first_state=True``, and omit the shock term.
153+
By definition, the initial vector :math:`\alpha_{0}` is
154+
155+
.. math::
156+
\alpha_0=(\tilde{\gamma}_{0}, \tilde{\gamma}_{0}, \tilde{\gamma}_{-1}, \tilde{\gamma}_{-1})
157+
158+
and the transition matrix is
159+
160+
.. math::
161+
\begin{bmatrix}
162+
-1 & 0 & -1 & 0 \\
163+
0 & -1 & 0 & -1 \\
164+
1 & 0 & 0 & 0 \\
165+
0 & 1 & 0 & 0 \\
166+
\end{bmatrix}
167+
168+
It is easy to verify that:
169+
170+
.. math::
171+
\begin{align}
172+
\gamma_1 &= -\tilde{\gamma}_0 - \tilde{\gamma}_{-1}\\
173+
\gamma_2 &= -(-\tilde{\gamma}_0 - \tilde{\gamma}_{-1})-\tilde{\gamma}_0\\
174+
&= \tilde{\gamma}_{-1}\\
175+
\gamma_3 &= -\tilde{\gamma}_{-1} +(\tilde{\gamma}_0 + \tilde{\gamma}_{-1})\\
176+
&= \tilde{\gamma}_{0}\\
177+
\gamma_4 &= -\tilde{\gamma}_0 - \tilde{\gamma}_{-1}.\\
178+
\end{align}
179+
88180
.. warning::
89-
Although the ``state_names`` argument expects a list of length ``season_length``, only ``state_names[1:]``
90-
will be saved as model dimensions, since the 1st coefficient is not identified (it is defined as
91-
:math:`-\sum_{i=1}^{s} \gamma_{t-i}`).
181+
Although the ``state_names`` argument expects a list of length ``season_length`` times ``duration``,
182+
only ``state_names[duration:]`` will be saved as model dimensions, since the first coefficient is not identified
183+
(it is defined as :math:`-\sum_{i=1}^{s-1} \tilde{\gamma}_{-i}`).
92184
93185
Examples
94186
--------
@@ -137,6 +229,7 @@ class TimeSeasonality(Component):
137229
def __init__(
138230
self,
139231
season_length: int,
232+
duration: int = 1,
140233
innovations: bool = True,
141234
name: str | None = None,
142235
state_names: list | None = None,
@@ -146,29 +239,42 @@ def __init__(
146239
if observed_state_names is None:
147240
observed_state_names = ["data"]
148241

242+
if season_length <= 1 or not isinstance(season_length, int):
243+
raise ValueError(
244+
f"season_length must be an integer greater than 1, got {season_length}"
245+
)
246+
if duration <= 0 or not isinstance(duration, int):
247+
raise ValueError(f"duration must be a positive integer, got {duration}")
149248
if name is None:
150-
name = f"Seasonal[s={season_length}]"
249+
name = f"Seasonal[s={season_length}, d={duration}]"
151250
if state_names is None:
152-
state_names = [f"{name}_{i}" for i in range(season_length)]
251+
if duration > 1:
252+
state_names = [
253+
f"{name}_{i}_{j}" for i in range(season_length) for j in range(duration)
254+
]
255+
else:
256+
state_names = [f"{name}_{i}" for i in range(season_length)]
153257
else:
154-
if len(state_names) != season_length:
258+
if len(state_names) != season_length * duration:
155259
raise ValueError(
156-
f"state_names must be a list of length season_length, got {len(state_names)}"
260+
f"state_names must be a list of length season_length*duration, got {len(state_names)}"
157261
)
158262
state_names = state_names.copy()
159263

160264
self.innovations = innovations
265+
self.duration = duration
161266
self.remove_first_state = remove_first_state
267+
self.season_length = season_length
162268

163269
if self.remove_first_state:
164270
# In traditional models, the first state isn't identified, so we can help out the user by automatically
165271
# discarding it.
166272
# TODO: Can this be stashed and reconstructed automatically somehow?
167-
state_names.pop(0)
273+
state_names = state_names[duration:]
168274

169275
self.provided_state_names = state_names
170276

171-
k_states = season_length - int(self.remove_first_state)
277+
k_states = (season_length - int(self.remove_first_state)) * duration
172278
k_endog = len(observed_state_names)
173279
k_posdef = int(innovations)
174280

@@ -230,29 +336,56 @@ def populate_component_properties(self):
230336

231337
def make_symbolic_graph(self) -> None:
232338
k_states = self.k_states // self.k_endog
339+
duration = self.duration
340+
k_unique_states = k_states // duration
233341
k_posdef = self.k_posdef // self.k_endog
234342
k_endog = self.k_endog
235343

236344
if self.remove_first_state:
237345
# In this case, parameters are normalized to sum to zero, so the current state is the negative sum of
238346
# all previous states.
239-
T = np.eye(k_states, k=-1)
240-
T[0, :] = -1
347+
zero_d = pt.zeros((self.duration, self.duration))
348+
id_d = pt.eye(self.duration)
349+
350+
row_blocks = []
351+
352+
# First row: all -1_d blocks
353+
first_row = [-id_d for _ in range(self.season_length - 1)]
354+
row_blocks.append(pt.concatenate(first_row, axis=1))
355+
356+
# Rows 2 to season_length-1: shifted identity blocks
357+
for i in range(self.season_length - 2):
358+
row = []
359+
for j in range(self.season_length - 1):
360+
if j == i:
361+
row.append(id_d)
362+
else:
363+
row.append(zero_d)
364+
row_blocks.append(pt.concatenate(row, axis=1))
365+
366+
# Stack blocks
367+
T = pt.concatenate(row_blocks, axis=0)
241368
else:
242369
# In this case we assume the user to be responsible for ensuring the states sum to zero, so T is just a
243370
# circulant matrix that cycles between the states.
244-
T = np.eye(k_states, k=1)
245-
T[-1, 0] = 1
371+
T = pt.eye(k_states, k=1)
372+
T = pt.set_subtensor(T[-1, 0], 1)
246373

247374
self.ssm["transition", :, :] = pt.linalg.block_diag(*[T for _ in range(k_endog)])
248375

249376
Z = pt.zeros((1, k_states))[0, 0].set(1)
250377
self.ssm["design", :, :] = pt.linalg.block_diag(*[Z for _ in range(k_endog)])
251378

252379
initial_states = self.make_and_register_variable(
253-
f"coefs_{self.name}", shape=(k_states,) if k_endog == 1 else (k_endog, k_states)
380+
f"coefs_{self.name}",
381+
shape=(k_unique_states,) if k_endog == 1 else (k_endog, k_unique_states),
254382
)
255-
self.ssm["initial_state", :] = initial_states.ravel()
383+
if k_endog == 1:
384+
self.ssm["initial_state", :] = pt.extra_ops.repeat(initial_states, duration, axis=0)
385+
else:
386+
self.ssm["initial_state", :] = pt.extra_ops.repeat(
387+
initial_states, duration, axis=1
388+
).ravel()
256389

257390
if self.innovations:
258391
R = pt.zeros((k_states, k_posdef))[0, 0].set(1.0)

0 commit comments

Comments
 (0)