Skip to content

What's the best way to do backwards compatibility for existing configs? #479

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
carmocca opened this issue Mar 26, 2024 · 13 comments
Open
Labels
question Further information is requested

Comments

@carmocca
Copy link
Contributor

If I have a CLI implementation (before.py) with a Foo.a argument

class Foo:
    def __init__(self, a=2):
        ...

def fn(foo: Foo = Foo()):
    ...

from jsonargparse import ArgumentParser, ActionConfigFile

parser = ArgumentParser()
parser.add_argument("-c", "--config", action=ActionConfigFile)
parser.add_function_arguments(fn)
args = parser.parse_args()

python before.py --print_config > a.yaml

Where a.yaml is:

foo:
  class_path: __main__.Foo
  init_args:
    a: 2

And then create an after.py file whose only difference is renaming a to b

$ diff before.py after.py 
2c2
<     def __init__(self, a=2):
---
>     def __init__(self, b=2):

What's the best or recommended way to support loading a.yaml with after.py, where a is remapped to b? It currently fails as expected with:

$ python after.py --config a.yaml
usage: after.py [-h] [-c CONFIG] [--print_config[=flags]] [--foo.help CLASS_PATH_OR_NAME] [--foo FOO]
error: Parser key "foo":
  Problem with given class_path '__main__.Foo':
    Validation failed: No action for key "a" to check its value.

Context

This is currently blocking a rename in Lightning-AI/litgpt#1156

@mauvilsa
Copy link
Member

My ideal solution would be to keep the feature of having a sync between the source code and the CLIs, and not make up something new specific for CLIs. That is, it would be nice to adopt some deprecated standard, so that deprecated decorators are added to the code, and based on that jsonargparse runs additional logic. For example if a parameter a is renamed to b, then jsonargparse would accept a, and if an input a is given, then when serializing, the a would be converted to b. The moment that the deprecated decorators are removed, then automatically the logic deprecation logic in the CLI is removed as well. Unfortunately, from what I know, there isn't such a deprecation standard to adopt. There is PEP 702, but that does not include deprecation of function parameters.

In projects that I have worked on, there are three approaches we have taken, though they haven't used specific features of jsonargparse.

  1. Have migration scripts which get run when the software is upgraded to a new version. This means, manually writing a script that converts config files from the old structure to the new one and written to disk.
  2. Have on the fly migrations, which get run every time, so old configs are supported but not persisted.
  3. The third approach which is specific to parameters is to add the new parameter name to the signature and also **kwargs. Then inside do old_name_value = kwargs.pop("old_name"). And probably implement some code so that if old_name is used, a deprecation warning is printed. The point of **kwargs is just to prevent old parameter names showing up in the docs. Probably this is not a good solution because when serializing a config, both the old an new names would be included.

What to recommend, I am not sure. But we can brainstorm further.

@carmocca
Copy link
Contributor Author

carmocca commented Mar 27, 2024

2) is what I would prefer. But how does one integrate it into the parser? Is there an extension point in ActionConfigPath so that an arbitrary transformation can be applied on the config file before it's validated against the parser types?

@mauvilsa
Copy link
Member

  1. is what I would prefer. But how does one integrate it into the parser? Is there an extension point in ActionConfigPath so that an arbitrary transformation can be applied on the config file before it's validated against the parser types?

Currently there isn't such a feature. But note that it isn't just ActionConfigFile. People could provide from the command line --old_name=value. Also old_name could be nested in a group which can be provided by the user in a subconfig, which is not handled by ActionConfigFile. The migration function would need to handle all cases: getting a full config, getting a subconfig, getting a single value. If there are subcommands, then it is one more case to handle.

@mauvilsa
Copy link
Member

mauvilsa commented Mar 27, 2024

Actually, there is a feature that could be used, though it wasn't intended for this: custom-loaders.

You can implement a custom loader. The default yaml_load has some logic which avoids weird behaviors of pyyaml. Your custom loader can simply call the default one, and then add logic on top. A caveat is that yaml_load is not part of the public API.

All parsed values go through the loader. So depending on what is loaded, the function would need to decide if a transformation is required.

@carmocca
Copy link
Contributor Author

We'll try this and report back. Only handling configs is acceptable at the moment since it's how we recommend that people interact with the program.

@mauvilsa
Copy link
Member

mauvilsa commented Jul 3, 2024

With #543 now it is possible to get the default yaml loader.

@carmocca
Copy link
Contributor Author

carmocca commented Jul 5, 2024

Thank you Mauricio. I won't be able to check this with my original script. Feel free to close the issue if the current implementation is good enough.

@mauvilsa
Copy link
Member

mauvilsa commented Jul 5, 2024

I didn't expose the loader particularly for this issue. But if you aren't going to look into this further, then yes I guess we can close the issue.

@mauvilsa mauvilsa closed this as completed Jul 5, 2024
@carmocca
Copy link
Contributor Author

carmocca commented Mar 3, 2025

@mauvilsa I'm trying this feature now (sorry! it's been a while). Can you re-open this?

Here's my updated code to use the loader feature

class Foo:
    def __init__(self, b=2):
        ...

def fn(foo: Foo = Foo()):
    ...

from jsonargparse import ArgumentParser, ActionConfigFile
from jsonargparse import get_loader, set_loader

def apply_backwards_compatibility_transformations(config):
    print("Transforming", config)
    if not isinstance(config, dict):
        # it is loading something else
        return config
    config = config.copy()
    config["foo"]["init_args"].setdefault("b", config["foo"]["init_args"].pop("a", None))
    return config

def enable_backwards_compatibility(parser: ArgumentParser) -> None:
    loader = get_loader(mode=parser.parser_mode)

    def custom(config_input: str):
        config = loader(config_input)
        return apply_backwards_compatibility_transformations(config)

    set_loader("backwards_compatibility_loader", custom)
    parser.parser_mode = "backwards_compatibility_loader"

parser = ArgumentParser(parser_mode="yaml")
enable_backwards_compatibility(parser)

parser.add_argument("-c", "--config", action=ActionConfigFile)
parser.add_function_arguments(fn)
args = parser.parse_args()
print(args)
# before.yaml
foo:
  class_path: __main__.Foo
  init_args:
    a: 2

Now, python after.py --config before.yaml works as expected

However, if I override an argument: python after.py --config before.yaml --foo.a 123

Then, my code doesnt work because 123 is passed separately without giving me an opportunity to tweak its key.

What if jsonargparse provided an extension point that allowed this sort of transformation on the final args? After all the inputs have been loaded but before any config validation or class instantiation.

Thanks!

@mauvilsa
Copy link
Member

mauvilsa commented Mar 7, 2025

What if jsonargparse provided an extension point that allowed this sort of transformation on the final args? After all the inputs have been loaded but before any config validation or class instantiation.

This is not simple. You could subclass ArgumentParser and override _parse_common. But _parse_common is called in many contexts, so it would be difficult for you to know what cases to consider, and might end up making assumptions about internal behavior, so could break in later releases. This would be similar if a hook were provided to officially do this. And then there is #427 which I would like to fix some day. Very likely this fix would mean an internal refactoring, which would validate arguments as they are found, since this is how argparse works.

A solution could be a decorator to mark parameter deprecations, even though there isn't a standard for this in python yet. Is there any parsing library that does something like this, be it python or other language? Would be good to know for inspiration. Still, not the most simple topic, so I would still go for something like #479 (comment) option 3.

@mauvilsa mauvilsa reopened this Mar 7, 2025
@carmocca
Copy link
Contributor Author

carmocca commented Mar 10, 2025

I wouldn't want option 3 for the follwing reasons:

  • the config compatibility leaks into the code, which so far jsonargparse has done a great job at avoiding. If the developer who maintains Foo is not the developer who maintains the CLI/config parsing, both need to understand these details.
  • **kwargs should be avoided whenever possible IMO. They are very prone to bugs down the line in my experience.
  • You might want to have a different class getting the old argument. In my snippet at the top, Foo changed it's variable name, but what if instead the change was one level up?
class Bar:
    def __init__(self, a=2):
        ...

def fn(bar: Bar = Bar()):
    ...

You could think of arbitrary transformations to the config that would be very difficult to support just by using **kwargs

Overall having a single global opportunity to transform the config would be ideal.

@mauvilsa
Copy link
Member

Overall having a single global opportunity to transform the config would be ideal.

I have thought about this carefully and I think it is not a good solution. There are several reasons:

  1. As I have mentioned, this is not simple to implement: it would rule out fixing other issues and restrict internal refactorings.
  2. The usability wouldn't be great. This hook would depend on a specific version of the code, which means that if the code is changed, then you need to remember to change this hook also. The idea has been to keep code and CLIs automatically in sync.
  3. Depending on how deeply nested (subcommands, subclasses) and how many deprecations there are, this hook could be rather complex.
  4. Deprecation of parameters is not something CLI specific. It can also relate to runtime deprecation warnings and type checkers. So in my opinion the correct location for this is the code.
  5. Deprecation of parameters is something relatively simple from the view of individual function signatures. So it wouldn't be a good motivation to introduce arbitrarily complex transformations of configs in jsonargparse.

I have some ideas I can propose something which wouldn't require custom **kwargs handling. But note that this implies being able to change the code.

@mauvilsa
Copy link
Member

I have created a package to illustrate my idea. You can do pip install "deprecated-parameters==0.1.0.dev2" to try it out. Then taking from the example above:

from deprecated_parameters import ParameterRename, deprecated_parameters

class Foo:
    @deprecated_parameters(
        ParameterRename(old_name="a", new_name="b"),
    )
    def __init__(self, b=2):
        self.b = b

foo = Foo(a=3)
print(foo.b)

During runtime, by default the decorator takes care of changing the argument from a=3 to b=3, without **kwargs. Since a is not in the signature, then Foo(a=3) would show up as an error by type checkers, so good to hint people about the need to fix the code. Furthermore, in deprecated-parameters I also tried to implement a mypy plugin so that mypy tells about the deprecation, making it clearer for users on what need to be changed. Though, for now I only managed to make the plugin work for functions. It doesn't work yet for the example above. I guess other plugins could be created, e.g. for pylance. Finally, in deprecated-parameters when using deprecated arguments, a warning is automatically printed. This could be made configurable.

Anyway, the point is that deprecated-parameters can be implemented in a way that jsonargparse can interpret the decorator and automatically do necessary transformations. Right now ParameterRename has by default transform="reassign". If set to None the runtime transformation can be disabled. One idea could be that optionally transform can be given a function to do a custom transformation, which would be used by the decorator during runtime, and also by jsonargparse.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants