
Try simple-minded call expression cache #19505


Merged
merged 6 commits into python:master from call-expr-cache
Jul 28, 2025

Conversation

ilevkivskyi
Member

This gives a modest 1% improvement on self-check (compiled), but almost 40% on mypy -c "import colour". Some comments:

  • I only cache CallExpr, ListExpr, and TupleExpr. This is not very principled; I found it to be the best balance between rare cases like colour and more common cases like self-check.
  • Caching is fragile within lambdas, so I simply disable it there; it rarely matters anyway.
  • I cache both messages and the type map. Surprisingly, the latter only affects a couple of test cases, but I still do it generally for peace of mind.
  • It looks like there are only three things that require cache invalidation: binder, partial types, and deferrals.

In general, this is a bit scary (as this is a major change), but the performance improvements for slow libraries are very tempting.
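The bullets above can be sketched roughly as follows. This is a minimal illustration of the caching idea, not mypy's actual implementation; all names here are hypothetical, and the key/value shapes are simplified (mypy caches the inferred type map and messages, keyed by expression and type context).

```python
from __future__ import annotations


class ExprCache:
    """Sketch: (expression identity, type context) -> (inferred type, messages)."""

    def __init__(self) -> None:
        self._cache: dict[tuple[int, str], tuple[str, list[str]]] = {}

    def invalidate(self) -> None:
        # Per the PR description, binder changes, partial types, and
        # deferrals are the three things that require invalidation.
        self._cache.clear()

    def get(self, expr_id: int, context: str) -> tuple[str, list[str]] | None:
        # A hit is only valid for the exact same type context.
        return self._cache.get((expr_id, context))

    def put(self, expr_id: int, context: str, result: tuple[str, list[str]]) -> None:
        self._cache[(expr_id, context)] = result
```

Keying on the type context is what makes it safe to reuse a result when the same expression is re-checked during repeated generic inference passes.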


@ilevkivskyi
Member Author

Nice! Interpreted colour in mypy_primer got 3x faster.

@ilevkivskyi
Member Author

Hm, both changes in the primer look suspicious, however; I will take a look at them later.

# Deeply nested generic calls can deteriorate performance dramatically.
# Although in most cases caching makes little difference, in worst case
# it avoids exponential complexity.
# TODO: figure out why caching within lambdas is fragile.
Collaborator


I also discovered that in #19408. Accepting a lambda has an explicit type-context dependency (infer_lambda_type_using_context), so generic calls really do require re-accepting the lambda every time; context from an outer generic may have propagated later.
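The hazard described above can be shown with a deliberately naive cache. This is an illustration only (the function and cache names are made up): if the cache key omits the type context, the first inference of a lambda "wins" even when an outer generic call later supplies a different context.

```python
# Deliberately buggy cache: the key ignores the type context.
_lambda_cache: dict[int, str] = {}


def infer_with_bad_cache(expr_id: int, context: str) -> str:
    # First inference is recorded; later calls with a *different* outer
    # context silently get the stale result back, which is why lambdas
    # must be re-accepted rather than cached.
    if expr_id not in _lambda_cache:
        _lambda_cache[expr_id] = f"inferred under {context}"
    return _lambda_cache[expr_id]
```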

Member Author


OK, thanks for the pointer. I will update the comment.

@@ -931,7 +932,8 @@ def prefer_simple_messages(self) -> bool:
         if self.file in self.ignored_files:
             # Errors ignored, so no point generating fancy messages
             return True
-        for _watcher in self._watchers:
+        if self._watchers:
Collaborator


Could you explain this change? Watchers used to be additive and that sounded reasonable to me...

Member Author


Previously, if any of the active watchers was ignoring errors, we could use simpler messages, but in the presence of caching this is no longer valid. For example, we can accept an expression while there is an enclosing ignoring watcher; the caching watcher will then record the simple message, and if next time we happen to accept the same expression in the same type context, but without the ignoring watcher, an incorrect (i.e. far too terse) error message will be pulled from the cache.

Without this change, 6 tests fail because terse/simplistic error messages are used.
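A rough model of that failure mode, with made-up names (not mypy's actual ErrorWatcher machinery): a terse message generated under an error-ignoring watcher gets cached, and a later replay without that watcher returns the terse message where a detailed one was expected.

```python
def make_message(simple: bool) -> str:
    # Stand-in for mypy's fancy vs. simple message generation.
    return "error" if simple else "error: detailed explanation"


def check_expr(active_watchers_ignore: list[bool], cache: dict[str, str], key: str) -> str:
    if key in cache:
        # Replay from cache, ignoring the *current* watcher stack.
        return cache[key]
    # Old logic: any error-ignoring watcher permitted a terse message.
    msg = make_message(simple=any(active_watchers_ignore))
    cache[key] = msg
    return msg
```

This is why the check was changed from "any watcher ignores errors" to "any watchers exist at all": a cached message must be valid in every context it can be replayed in.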

# Although in most cases caching makes little difference, in worst case
# it avoids exponential complexity.
# TODO: figure out why caching within lambdas is fragile.
elif isinstance(node, (CallExpr, ListExpr, TupleExpr)) and not (
Collaborator


Would it be difficult to allow dicts and sets here? Inline dictionaries are relatively common and even heavier than lists, and sets could be added just for consistency.

Also, operator expressions can be really heavy (#14978) and are fundamentally similar to CallExpr; are they worth considering?

Member Author


The problem with dicts/sets is that I see around a 0.3% regression on self-check when I add them (though maybe this is just noise). My reasoning is that most code has a bunch of shallow dictionaries, and for those caching is just busy-work that will never pay off (note that caching is not free, since mypyc is slow at creating local type maps and watchers).

Anyway, I am open to considering more expression kinds to cache, but let's put those in separate PR(s).

@ilevkivskyi
Member Author

OK, it looks like I found the problem with mitmproxy; it was a pre-existing bug in multiassign_from_union. I will push a fix in a minute.

Contributor

Diff from mypy_primer, showing the effect of this PR on open source code:

werkzeug (https://github.com/pallets/werkzeug)
+ src/werkzeug/datastructures/structures.py:193: error: Generator has incompatible item type "tuple[Union[K, Any], Union[list[V], list[Any]]]"; expected "tuple[K, V]"  [misc]

@ilevkivskyi
Member Author

OK, the two previous changes in mypy_primer are fixed, and the new error is correct; it was previously hidden by the bug in multiassign_from_union.

@sterliakov
Collaborator

LG! I'm still a bit scared to approve this now; I will take another look at it tomorrow. :)

@ilevkivskyi
Member Author

@sterliakov Did you have time for another look?

Btw, the comparison with current master went below noise level, probably because of #19515. But I also played with some "micro-benchmarks" involving deeply nested calls, and this PR gives performance improvements on the order of 100x (no exaggeration). If there are no concrete suggestions, I think we should give this a try.
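The exact micro-benchmarks are not in the PR, but the kind of input that triggers the worst case is easy to sketch: deeply nested generic calls, where each level of nesting can force re-inference of the whole subtree without caching. The helper below (illustrative, not from the PR) generates such an expression for feeding to mypy -c.

```python
from typing import TypeVar

T = TypeVar("T")


def wrap(x: T) -> T:
    # A trivial generic identity function; inferring T at every nesting
    # level is what makes deep nesting expensive for the type checker.
    return x


def nested_call_source(depth: int) -> str:
    # Produce `wrap(wrap(...wrap(0)...))`, e.g. for `mypy -c "<source>"`.
    return "wrap(" * depth + "0" + ")" * depth
```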

Collaborator

@sterliakov sterliakov left a comment


Yeah, sorry, I think this is good to go!

comparison with current master went below noise level

That sounds OK: mypy itself isn't generic-heavy, so it shouldn't demonstrate huge improvements. The colour changes speak for themselves. :)

@ilevkivskyi ilevkivskyi merged commit bd94bcb into python:master Jul 28, 2025
20 checks passed
@ilevkivskyi ilevkivskyi deleted the call-expr-cache branch July 28, 2025 17:18