DOC: Simple example and comprehensive how-to notebook #1054
This PR adds the namespace infrastructure to GETTSIM.

- [x] Write `policy_function` decorator (rename `policy_info` and change behavior so that a `PolicyFunction` instance is returned). ~Apply to all TT functions.~ (that should be part of renamings)
- [x] Check that functions in module with same simple_name have the correct start_date, end_date specs (this was removed from the policy_info decorator).
- [x] Remove doubled levels in the functions tree automatically (to avoid writing functions in `__init__.py`).
- [x] Go over type hints for aggregation functions.
- [x] Refactor interface module.
- [x] Implement some safety checks
  - [x] No function should have the same name as a module in the same directory
  - [x] No trailing underscores in module names (for [DAGS PR](OpenSourceEconomics/dags#17))

---------

Co-authored-by: Marvin Immesberger <immesberger@posteo.de>
Co-authored-by: Tim Mensinger <mensingertim@gmail.com>
Co-authored-by: Hans-Martin von Gaudecker <hmgaudecker@gmail.com>
The way we implemented the loading of namespaces in #780 does not quite work. We want to have them at the directory level to balance use of namespaces and reducing the amount of qualified names. Additionally, we had to change the order of the upsert operations in `combine_policy_functions_and_derived_functions`. Doesn't affect the happy path, but in case of conflicts the previous behaviour did not make sense. --------- Co-authored-by: Marvin Immesberger <immesberger@posteo.de>
This reverts commit fd2d696.
### What problem do you want to solve? Uses the qualified name instead of the leaf name to look for rounding specs in the params file. This is a temporary solution until we have tackled #823.
### What problem do you want to solve?

This PR provides the necessary renamings of taxes and transfers functions for #804.

ToDo:

- [x] Create new directory structure
- [x] Rename all function arguments
- [x] Set namespace of basic input variables
- [x] Update `pyproject.toml` to reflect new file structure
- [x] Make sure tests run (#841)
- [x] `kinderfreibetragempfänger` $\rightarrow$ `kinderfreibetragsempfänger`
- [x] Link issue #842 in relevant docstrings

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Hans-Martin von Gaudecker <hmgaudecker@gmail.com>
### What problem do you want to solve?

This PR implements the distinction between the TTSIM (basically the infrastructure) and DE (the German taxes and transfers) components of GETTSIM. This was discussed [here](#780 (comment)). In particular, I

- Move modules from `_gettsim` to `ttsim/` or leave them in `_gettsim`
- Remove the `taxes` and `transfers` subdirs
- Split up `config.py` into a TTSIM and a DE part
- Adjust the loader accordingly
- Also split up tests in TTSIM and DE parts.
- Introduce quarters

For tests, the distinction is not always super sharp. There are some tests that test a specific feature of the infrastructure (e.g. vectorization), but do this by loading the functions tree from the DE part. Still, I chose to label those tests as `ttsim`. Similarly, we don't test `aggregate_by_p_id` directly in the `ttsim` part, but do it by testing specific components of the TT system. I put them in the `de` dir.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Hans-Martin von Gaudecker <hmgaudecker@gmail.com>
Co-authored-by: Tim Mensinger <mensingertim@gmail.com>
### What problem do you want to solve? Will close #852. Adapts tests to match GETTSIM src structure.
### What problem do you want to solve? This PR makes a step towards separating TTSIM and GETTSIM by testing the TTSIM infrastructure with its own instance of a fictitious taxes and transfers system that makes use of all features. --------- Co-authored-by: Hans-Martin von Gaudecker <hmgaudecker@gmail.com> Co-authored-by: Tim Mensinger <mensingertim@gmail.com>
`fg_id` creation did not work correctly for some orderings of adults (#801). Now adds fg_id for both the einstandspartner and his children at the same time. - [x] Fix loop - [x] Add test case for special case mentioned in #801 --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
This is a huge PR, which started innocently as a fix to #833. In the end, it turned out to be very difficult to change things locally, so in the process of the intense sprint during the week 7-12 April 2025, this ended up including the following:

- Updated type hierarchy (`TTSIMObject` as basic building block, `PolicyInput` and `TTSIMFunction` inheriting from that, `TTSIMFunction` has further subclasses for policy, aggregation, ...).
- Further separation of tests in ttsim / _gettsim. Including the Middle Earth Taxes and Transfers SIMulator (METTSIM) as a tiny example for the ttsim-side of tests (#856) and a sensible structure for `_gettsim_tests` (#858)
- Sensible treatment of Einnahmen / Einkünfte (#862)
- Specify rounding in a dataclass to be provided in the decorators rather than referencing the yaml files from there (#859)
- Improve structure for AggregationSpecs, including an Enum for the type of Aggregations (#860)

---------

Co-authored-by: Hans-Martin von Gaudecker <hmgaudecker@gmail.com>
Co-authored-by: Marvin Immesberger <immesberger@posteo.de>
Co-authored-by: Marvin Immesberger <74215010+MImmesberger@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
We put some effort into trying to convert types. However, the code was a mess and it would be a pain to maintain it. What Python/Pandas/Numpy/Jax do is more than good enough for GETTSIM, too. Now that we have the explicitly annotated `policy_inputs`, it will be easy to check and throw errors if users want to be strict. This PR removes the code which has been stale for the last week, anyhow. --------- Co-authored-by: Marvin Immesberger <immesberger@posteo.de>
### What problem do you want to solve? Fix #870 and related things. In particular, defer some checks so that they are only done for variables that are present / set start/end dates of explicit aggregation functions so they are derived from source object. --------- Co-authored-by: Marvin Immesberger <74215010+MImmesberger@users.noreply.github.com> Co-authored-by: Tim Mensinger <mensingertim@gmail.com> Co-authored-by: Max Jahn <max.jahn45@gmail.com>
### What problem do you want to solve?

Tests in `test_jax_jit_kindergeld.py` were failing because policy functions were not jittable.

### Problems and Solutions

#### Non-Hashable Function in `jit`

The policy functions were non-jittable because the dataclasses were non-frozen and had the equality argument set to True. This implies that the dataclass gets an equality method which compares the fields. To not break the equality/hash contract (a == b implies hash(a) == hash(b)), a non-frozen dataclass with an equality method has its hash deactivated. This does not work with `jax.jit`, because JAX requires a hash of the object for caching. By freezing the dataclasses they get their hash back, and everything works nicely again with JAX.

> [!NOTE]
> Frozen dataclasses cannot have standard assignments in the post init method. For this I had to implement a `frozen_safe_update_wrapper`.

### Todo

- [x] Freeze ttsim_objects dataclasses and update post init of `TTSIMFunction` to be compatible
- [x] Understand why list `single_test` in `kindergeld_policy_test` fixture has only one entry, although the yaml file says there are two outputs
- [x] Fix `test_compute_taxes_and_transfers_kindergeld`

---------

Co-authored-by: Hans-Martin von Gaudecker <hmgaudecker@gmail.com>
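The hashing behaviour described in this commit message can be reproduced with plain dataclasses. The following is a minimal sketch: `Unfrozen`, `Frozen`, the `doubled` field, and `is_hashable` are hypothetical names chosen for illustration. The `object.__setattr__` call shows one common way to assign inside `__post_init__` of a frozen dataclass; it is an assumption that `frozen_safe_update_wrapper` does something comparable.

```python
from dataclasses import dataclass, field


@dataclass(eq=True)  # eq=True without frozen=True sets __hash__ to None
class Unfrozen:
    value: int


@dataclass(frozen=True)  # eq=True is the default; frozen=True restores __hash__
class Frozen:
    value: int
    doubled: int = field(init=False)

    def __post_init__(self) -> None:
        # A plain `self.doubled = ...` would raise FrozenInstanceError here,
        # so the assignment bypasses the frozen __setattr__.
        object.__setattr__(self, "doubled", 2 * self.value)


def is_hashable(obj: object) -> bool:
    """Return True if hash(obj) succeeds (hypothetical helper)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True
```

Anything that caches on a hash of its arguments, as the message says `jax.jit` does, would accept instances like `Frozen` and reject instances like `Unfrozen`.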
In a limited set of experiments, it produced exactly the same result. `ast.unparse` has been available since Python 3.9, so it's fine to use.
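For reference, a minimal sketch of what `ast.unparse` does. The source string is a made-up example and is not taken from GETTSIM.

```python
import ast

# Hypothetical input with irregular spacing.
source = "def f(x):\n    return x   +   1\n"

tree = ast.parse(source)
regenerated = ast.unparse(tree)  # requires Python 3.9 or later

# The regenerated code is normalised, but parses to an equivalent tree.
same_tree = ast.dump(ast.parse(regenerated)) == ast.dump(tree)
```

Formatting and comments in the original source are not preserved; only the syntax tree is.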
- [x] Add a json (yaml) schema based on GEP-03
- [x] Make sure manual validation of parameters passes
- [x] Make a pre-commit hook out of this
### What problem do you want to solve? Users can easily create a NestedDataDict by providing a mapper from the paths used in the TTSIM instance to a column in the DataFrame or a pandas Series or a single value. --------- Co-authored-by: Hans-Martin von Gaudecker <hmgaudecker@gmail.com>
### What problem do you want to solve?

Fix ruff complaints by

- Ignoring `synthetic.py` and its test file because those will be rewritten soon. Same is true for `test_docs.py`.
- Removing old tests:
  - We don't need to test for type conversions anymore.
  - We don't need to explicitly test for the handling of (qualified) source column names for aggregation functions because those are handled by `dags` now.
- Some minor adjustments to the rest of the code
### What problem do you want to solve?

Rough fix of the readthedocs build.

- Remove list of "typical outputs" (because `_gettsim.functions` does not exist anymore)
- Remove list of all policy functions (because `_gettsim.functions` does not exist anymore)
- Move outdated tutorials to a different folder (all of them rely on `create_synthetic_data` or the visualisation mechanic, none of them works anymore, and we have to rewrite them soon anyhow).

---------

Co-authored-by: Hans-Martin von Gaudecker <hmgaudecker@gmail.com>
for more information, see https://pre-commit.ci
### What problem do you want to solve?

Infer groupings from the objects tree. This needs to be done by looking for names in the top-level namespace that end with "_id". Filtering for `group_creation_functions` does not work because this would miss `hh_id`.

**Changes**

- Removed `SUPPORTED_GROUPINGS` global everywhere
- Removed explicit `groupings` argument from `compute_taxes_and_transfers`
- Added the `grouping_levels` property to the policy environment.
- Moved the `_fail_if_group_ids_are_outside_top_level_namespace` check to the policy environment.

---------

Co-authored-by: Hans-Martin von Gaudecker <hmgaudecker@gmail.com>
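The inference rule stated in this commit message (top-level names ending in "_id") can be sketched as follows. The function `infer_grouping_levels` and the example tree are hypothetical; the sketch applies the rule literally and does not cover any special cases the real `grouping_levels` property may handle.

```python
from typing import Any, Mapping


def infer_grouping_levels(objects_tree: Mapping[str, Any]) -> tuple[str, ...]:
    """Collect the prefixes of top-level names ending in "_id" (hypothetical helper)."""
    suffix = "_id"
    return tuple(
        sorted(
            name[: -len(suffix)]
            for name in objects_tree  # only top-level keys are inspected
            if name.endswith(suffix)
        )
    )


# Hypothetical objects tree: `hh_id` is a plain input, `fg_id` would be created
# by a function, and the nested `some_id` is ignored because it is not top-level.
example_tree = {
    "hh_id": None,
    "fg_id": None,
    "kindergeld": {"betrag_m": None, "some_id": None},
}

levels = infer_grouping_levels(example_tree)
```

Looking at names rather than object types is what lets an input column such as `hh_id` be picked up alongside derived group ids.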
Add `fail_if.backend_has_changed`.

Lessons learned:

- NumPy can handle JAX arrays (see test)
- JAX can handle NumPy arrays that are passed as the processed data (see test)
- The problematic case is parameters that are partialled into functions. Unfortunately, these are typically custom objects. We need to loop over them and check whether any of them happens to be a NumPy array
(#1048) Check whether the structure of the paths matches. E.g.:

- `input_data={"df_and_mapper": None}`: Fails because there needs to be a dict below "df_and_mapper"
- `input_data={"not_around": None}`: Fails because `not_around` is not a valid child of `input_data`
- `not_around=None`: Fails because `not_around` is not a valid root node (already taken care of by Python itself when calling `main`, but let's be pedantic...)
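A structure check of this kind can be sketched as a recursive walk over nested dicts. Everything below is an assumption made for illustration: the function name `fail_if_structure_is_invalid`, the `ALLOWED` tree, and the child names `df` and `mapper` are hypothetical and not taken from the actual implementation.

```python
from typing import Any, Mapping

# Hypothetical description of the allowed tree; None marks a leaf.
ALLOWED = {"input_data": {"df_and_mapper": {"df": None, "mapper": None}}}


def fail_if_structure_is_invalid(
    user_tree: Mapping[str, Any],
    allowed_tree: Mapping[str, Any],
    path: tuple[str, ...] = (),
) -> None:
    """Raise ValueError if `user_tree` does not fit into `allowed_tree`."""
    for key, value in user_tree.items():
        current = (*path, key)
        if key not in allowed_tree:
            raise ValueError(f"{'.'.join(current)} is not a valid node")
        expected = allowed_tree[key]
        if isinstance(expected, dict):
            # Inner nodes must be dicts so that the walk can continue below them.
            if not isinstance(value, dict):
                raise ValueError(f"{'.'.join(current)} needs a dict below it")
            fail_if_structure_is_invalid(value, expected, current)


def is_rejected(user_tree: Mapping[str, Any]) -> bool:
    """Return True if the check raises for `user_tree` (hypothetical helper)."""
    try:
        fail_if_structure_is_invalid(user_tree, ALLOWED)
    except ValueError:
        return True
    return False
```

In this sketch, the three failing cases listed in the commit message correspond to `{"input_data": {"df_and_mapper": None}}`, `{"input_data": {"not_around": None}}`, and `{"not_around": None}`.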
### What problem do you want to solve?

- [x] Add a GEP for the revamped interface
- [x] Update earlier GEPs to reflect the changes that have become necessary after GEP 6 (since our documentation is small, it does not make sense to keep outdated things around).
- [x] Add the finalised schema from #880 as an appendix to GEP 3

[Resolution on Zulip.](https://gettsim.zulipchat.com/#narrow/channel/309998-GEPs/topic/GEP.2007/near/530389224)

---------

Co-authored-by: Marvin Immesberger <immesberger@posteo.de>
In sync with [TTSIM PR 1](ttsim-dev/ttsim#1), this leaves just GETTSIM in here. Also includes the renamings in [TTSIM PR 3](ttsim-dev/ttsim#3), which are on PyPI as 1.0a1 Fixes #1003.
…g object by hand.
Wow! This is basically a tour de force of all GETTSIM features -- much more than what I had imagined here.
Maybe the title should be something like "Documentation of Taxes & Transfers Objects and how to modify a policy environment?".
Please include in the docs.
I just modified the ConsecutiveInt...
example to use the function for setting it up.
Ah, and TTSIM #5 reminds me why I wanted to make the same comment for the piecewise polynomial thing... Please rewrite the examples so they only use |
FYI I added this line in the piecewise part:
But should be a credible promise, right?
@MImmesberger, I added the |
Thanks! Sorry for always forgetting about the changelog recently
### What problem do you want to solve?
Closes #1034
Tutorial on how to upsert