@alexander-pv alexander-pv commented Sep 26, 2025

Proposed changes

Hi!
I'm happy to report that I have finished the solution for #775 😄

Types of changes

  • CausalTreeRegressor and CausalRandomForestRegressor now support a treatment vector with an arbitrary number of groups. The Cython code follows the scikit-learn implementation as closely as possible and uses the multi-output option to store outcomes for each group without additional tricks.
  • DepthFirstCausalTreeBuilder and BestFirstCausalTreeBuilder no longer use the GIL context manager. This may speed up tree growth on large datasets, but I have not measured the actual performance change.
  • The sample_weight argument of the fit() method now actually applies weights to observations.
  • New code example: causal_trees_with_synthetic_data_multiple_treatment_groups.ipynb
  • Plots now accept a matplotlib Axes object as an optional argument. This makes it possible to draw a subplot for each treatment group (see the new Jupyter notebook).

What types of changes does your code introduce to CausalML?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation Update (if none of the other choices apply)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • I have read the CONTRIBUTING doc
  • I have signed the CLA
  • Lint and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in downstream modules

Further comments

An example of how the outcome vector $y$ is now represented internally:

Suppose we have n observations and m groups. The first column, Group 0, always stores outcomes for the control group. Each remaining column stores the outcomes observed under one particular treatment. Every row holds exactly one value, in the column of the group that the observation belongs to, and NaN everywhere else.

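| Observation | Group 0 | Group 1 | Group 2 | ... | Group m-1 |
|---|---|---|---|---|---|
| 0 | value | NaN | NaN | ... | NaN |
| 1 | NaN | value | NaN | ... | NaN |
| 2 | NaN | NaN | value | ... | NaN |
| ... | ... | ... | ... | ... | ... |
| n-1 | NaN | ... | NaN | ... | value |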
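
To make the layout above concrete, here is a minimal pure-Python sketch of how a treatment vector and a 1-D outcome vector could be expanded into that NaN-padded matrix. The helper names `expand_outcomes` and `group_means` are hypothetical and are not part of CausalML; the actual conversion in this PR may differ. The per-group means and the difference against Group 0 are included only to show how the columns can be read, not as the splitting criterion used in the Cython code.

```python
import math


def expand_outcomes(treatment, y, n_groups):
    """Spread a 1-D outcome vector into an n x n_groups matrix padded with NaN.

    Hypothetical helper: group 0 is assumed to be the control group and
    groups 1..n_groups-1 are assumed to be treatments.
    """
    if len(treatment) != len(y):
        raise ValueError("treatment and y must have the same length")
    rows = []
    for group, outcome in zip(treatment, y):
        if not 0 <= group < n_groups:
            raise ValueError(f"group index {group} is outside [0, {n_groups})")
        row = [math.nan] * n_groups
        row[group] = float(outcome)  # the only non-NaN cell in this row
        rows.append(row)
    return rows


def group_means(matrix, n_groups):
    """Column means that skip NaN cells; NaN if a group has no observations."""
    means = []
    for g in range(n_groups):
        observed = [row[g] for row in matrix if not math.isnan(row[g])]
        means.append(sum(observed) / len(observed) if observed else math.nan)
    return means


# Toy data: five observations, one control group and two treatment groups.
treatment = [0, 1, 2, 0, 1]
y = [1.0, 3.0, 5.0, 2.0, 4.0]

y_matrix = expand_outcomes(treatment, y, n_groups=3)
means = group_means(y_matrix, n_groups=3)

# Illustrative only: difference of each treatment mean from the control mean.
effects = [m - means[0] for m in means[1:]]
```

The NaN padding means any per-group statistic must skip missing cells explicitly, which is why `group_means` filters with `math.isnan` rather than summing whole columns.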

@alexander-pv alexander-pv self-assigned this Sep 26, 2025
@alexander-pv alexander-pv added the enhancement, refactoring, and example labels Sep 26, 2025