-
Notifications
You must be signed in to change notification settings - Fork 76
Draft (new feature) : Model to estimate when a intervention had effect #480
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Draft (new feature) : Model to estimate when a intervention had effect #480
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #480 +/- ##
==========================================
+ Coverage 94.40% 94.42% +0.01%
==========================================
Files 29 29
Lines 2075 2241 +166
==========================================
+ Hits 1959 2116 +157
- Misses 116 125 +9 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…over intervention's distributions
b7b91de
to
ee701f2
Compare
NoteIn my last PR, I added functionality allowing users to specify priors for the effect of the intervention. This provides more flexibility for Bayesian modeling and allows users to incorporate domain knowledge directly into the inference process. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @JeanVanDyk. I'm very excited about the new addition, thanks for putting it together. Bear with me if I'm sometimes slow to reply - busy dad life!
At this point, could I request that you put together an ipynb
file that showcases the functionality. I'm assuming this PR will evolve and go through a couple of iterations, so this notebook will end up being a new docs page to help users understand the new functionality. If you edit the ./docs/source/notebooks/index.md
file, you can add the notebook name under the interrupted time series section. That way the notebook will render if you either build the docs locally, but it should also render in the remote docs preview build.
Did you still want feedback on the GitHub Discussion at this point?
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
Simplified
|
@JeanVanDyk I just did a quick update from Hoping to start taking a decent look at this now. Will try to provide some comments even though I know this is not 100% finished. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Can you run
make uml
? This will generate updated uml plots (e.g.classes.png
) - Looking at that generated class diagram (and not yet having looked at the
InterventionTimeEstimator
the hope would be to avoid overwriting the methodscalculate_impact
,predict
,score
andfit
methods. These are hopefully very generic and should hopefully be able to be dealt with in thePyMCModel
base class. Like I say, I've not yet taken a look at the code itself, but this would be the hope. - I cannot currently run the new notebook. When the first model fit is attempted I get this
Traceback
TypingError Traceback (most recent call last) Cell In[4], [line 10](vscode-notebook-cell:?execution_count=4&line=10) 2 from causalpy.pymc_models import InterventionTimeEstimator as ITE 4 model = ITE( 5 time_variable_name="t", 6 treatment_type_effect={"level": []}, 7 sample_kwargs={"sample_seed": seed}, 8 ) ---> [10](vscode-notebook-cell:?execution_count=4&line=10) result = ITS( 11 data=df, 12 treatment_time=None, 13 formula="y ~ 1 + t", 14 model=model, 15 )File ~/git/CausalPy/causalpy/experiments/interrupted_time_series.py:284, in InterruptedTimeSeries.init(self, data, treatment_time, formula, model, **kwargs)
281 raise ValueError("Model type not recognized")
283 # score the goodness of fit to the pre-intervention data
--> 284 self.score = self.model.score(X=self.pre_X, y=self.pre_y)
286 # Postprocessing with handler
287 self.datapre, self.datapost, self.pre_y, self.pre_X, self.treatment_time = (
288 self.handler.data_postprocessing(
289 self.model, data, idata, treatment_time, self.pre_y, self.pre_X
290 )
291 )
File ~/git/CausalPy/causalpy/pymc_models.py:773, in InterventionTimeEstimator.score(self, X, y)
771 mu_ts = az.extract(mu_ts, group="posterior_predictive", var_names="mu_ts").T
772 # Note: First argument must be a 1D array
--> 773 return r2_score(y.data, mu_ts)
File ~/mambaforge/envs/CausalPy/lib/python3.13/site-packages/arviz/stats/stats.py:1167, in r2_score(y_true, y_pred)
1134 def r2_score(y_true, y_pred):
1135 """R² for Bayesian regression models. Only valid for linear models.
1136
1137 Parameters
(...) 1165
1166 """
-> 1167 r_squared = r2_samples(y_true=y_true, y_pred=y_pred)
1168 return pd.Series([np.mean(r_squared), np.std(r_squared)], index=["r2", "r2_std"])
File ~/mambaforge/envs/CausalPy/lib/python3.13/site-packages/arviz/stats/stats.py:1127, in r2_samples(y_true, y_pred)
1125 var_e = _numba_var(svar, np.var, (y_true - y_pred))
1126 else:
-> 1127 var_y_est = _numba_var(svar, np.var, y_pred, axis=1)
1128 var_e = _numba_var(svar, np.var, (y_true - y_pred), axis=1)
1129 r_squared = var_y_est / (var_y_est + var_e)
File ~/mambaforge/envs/CausalPy/lib/python3.13/site-packages/arviz/utils.py:372, in _numba_var(numba_function, standard_numpy_func, data, axis, ddof)
350 """Replace the numpy methods used to calculate variance.
351
352 Parameters
(...) 369
370 """
371 if Numba.numba_flag:
--> 372 return numba_function(data, axis=axis, ddof=ddof)
373 else:
374 return standard_numpy_func(data, axis=axis, ddof=ddof)
File ~/mambaforge/envs/CausalPy/lib/python3.13/site-packages/arviz/stats/stats_utils.py:535, in stats_variance_2d(data, ddof, axis)
533 var = np.zeros(a_a)
534 for i in range(a_a):
--> 535 var[i] = stats_variance_1d(data[i], ddof=ddof)
536 else:
537 var = np.zeros(b_b)
File ~/mambaforge/envs/CausalPy/lib/python3.13/site-packages/arviz/utils.py:205, in maybe_numba_fn.call(self, *args, **kwargs)
203 """Call the jitted function or normal, depending on flag."""
204 if Numba.numba_flag:
--> 205 return self.numba_fn(*args, **kwargs)
206 else:
207 return self.function(*args, **kwargs)
File ~/mambaforge/envs/CausalPy/lib/python3.13/site-packages/numba/core/dispatcher.py:424, in _DispatcherBase._compile_for_args(self, *args, **kws)
420 msg = (f"{str(e).rstrip()} \n\nThis error may have been caused "
421 f"by the following argument(s):\n{args_str}\n")
422 e.patch_message(msg)
--> 424 error_rewrite(e, 'typing')
425 except errors.UnsupportedError as e:
426 # Something unsupported is present in the user code, add help info
427 error_rewrite(e, 'unsupported_error')
File ~/mambaforge/envs/CausalPy/lib/python3.13/site-packages/numba/core/dispatcher.py:365, in _DispatcherBase._compile_for_args..error_rewrite(e, issue_type)
363 raise e
364 else:
--> 365 raise e.with_traceback(None)
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
non-precise type pyobject
During: typing of argument at /Users/benjamv/mambaforge/envs/CausalPy/lib/python3.13/site-packages/arviz/stats/stats_utils.py (517)
File "../../../../../mambaforge/envs/CausalPy/lib/python3.13/site-packages/arviz/stats/stats_utils.py", line 517:
def copy(self, deep=True): # pylint:disable=overridden-final-method
@conditional_jit(nopython=True)
^
During: Pass nopython_type_inference
This error may have been caused by the following argument(s):
- argument 0: Cannot determine Numba type of <class 'xarray.core.dataarray.DataArray'>
- I think there should be no changes to
its_skl.ipynb
right? Can we revert those changes back to the state on main, so there's no change in that file in this PR. - Same for
its_covid.ipynb
?
@@ -40,6 +40,7 @@ did_pymc_banks.ipynb | |||
its_skl.ipynb | |||
its_pymc.ipynb | |||
its_covid.ipynb | |||
its_no_treatment_time.ipynb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is good. But in the notebook can you ensure you only have ONE top level markdown header. Otherwise it adds these as additional entries into the How To index page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, sure. I'll fix that!
As for the two notebooks you mentioned, I don't recall making any changes to them. I only ran them to check whether the modifications contained any small mistakes. I'll try reverting those changes.
Here is the graph @drbenvincent : |
Hi @drbenvincent, I've made some changes and will summarize them here, addressing the points you raised:
|
New Feature:
InterventionTimeEstimator
for Unknown Treatment TimingThis PR introduces a new model,
InterventionTimeEstimator
, designed to estimate when an intervention has an effect in a time series — especially in cases where the exact treatment time is unknown.Use Case
This addition gives users a flexible, Bayesian approach to model treatment timing uncertainty directly within the CausalPy framework.
Notes / Open Questions
Where should this model fit into the CausalPy workflow?
I’m unsure whether
InterventionTimeEstimator
should be integrated within theInterruptedTimeSeries
(ITS) feature, or used as a standalone tool.This affects how a user-defined model could be supported.
Depending on the intended usage, I can propose a solution to allow users to inject their own custom models.
Custom model usage — base vs. intervention
Should users be able to:
Covariates
I considered adding time-varying covariates to improve the fit. Would that be useful or out of scope?
Multivariate Time Series
It's relatively easy to extend the model for multivariate input. Let me know if this is something you'd like to see.
Model Summary
Inputs:
t
: 1D array of time pointsy
: 1D array of observed valuesspan
: restricts the window for switchpoint detectioncoords
: can includeseasons
for modeling periodic effectseffect
: list of components, e.g."trend"
,"level"
,"impulse"
grain_season
: number of time steps per seasonModel Components:
intercept + trend + seasonal
effect
componentsFeel free to share any feedback or suggestions! I'm happy to refine the model or explore extensions based on your input.
📚 Documentation preview 📚: https://causalpy--480.org.readthedocs.build/en/480/