On certain workloads, profiling shows a surprisingly high fraction of
inference time spent looking up bindings in modules. Bindings are stored
in a hash table, so lookups are expected to be fast, but their cost is
certainly not zero. A big contributor to the problem is that we basically
treat that cost as zero, looking up the binding for a `GlobalRef` multiple
times for each statement (e.g. in `isconst`, in `isdefined`, to get the
type, etc.). This PR attempts to improve the situation by adding an extra
field to `GlobalRef` that caches this lookup. This field is not serialized,
and if it is not set, we fall back to the previous behavior. I would expect
the memory overhead to be relatively small, since we intern `GlobalRef`s
in memory so that there is only one per binding (rather than one per use).
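
To make the caching idea concrete, here is a minimal sketch in plain Julia. It is not the actual change: `ToyBinding`, `ToyRef`, `TOY_TABLE`, `LOOKUPS`, `binding`, `toy_isconst` and `toy_value` are all hypothetical names invented for illustration, and the real `GlobalRef` and binding table are not modeled here. It only shows the pattern described above: the first query pays for the hash-table lookup, later queries reuse the cached binding, and an unset cache falls back to the slow path.

```julia
# Hypothetical stand-ins for illustration only; these are NOT the real
# GlobalRef or binding types.
mutable struct ToyBinding
    value::Any
    isconst::Bool
end

# A toy per-module binding table (the hash-table lookup we want to avoid repeating).
const TOY_TABLE = Dict{Symbol,ToyBinding}(:ONE => ToyBinding(1, true))
const LOOKUPS = Ref(0)  # counts how often the hash table is actually consulted

mutable struct ToyRef
    name::Symbol
    cache::Union{Nothing,ToyBinding}  # unset until the binding is first resolved
end
ToyRef(name::Symbol) = ToyRef(name, nothing)

function binding(r::ToyRef)
    b = r.cache
    b !== nothing && return b            # fast path: cached binding
    LOOKUPS[] += 1
    b = get(TOY_TABLE, r.name, nothing)  # slow path: hash-table lookup
    b === nothing && return nothing      # unresolved: leave the cache unset
    r.cache = b
    return b
end

# Several queries per "statement", all going through the same cached lookup.
toy_isconst(r::ToyRef) = (b = binding(r); b !== nothing && b.isconst)
toy_value(r::ToyRef) = binding(r).value

r = ToyRef(:ONE)
toy_isconst(r); toy_value(r); toy_isconst(r)
println("hash-table lookups: ", LOOKUPS[])
```

Because references are interned, a single cached field per reference is shared by every use of that binding, which is why the extra memory is expected to stay small.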
# Benchmarks
The benchmarks look quite promising. Consider this artificial example
(though it's actually not all that far-fetched, given some of the
generated code we see):
```
using Core.Intrinsics: add_int
const ONE = 1
@eval function f(x, y)
    z = 0
    $([:(z = add_int(x, add_int(z, ONE))) for _ = 1:10000]...)
    return add_int(z, y)
end
g(y) = f(ONE, y)
```
On master:
```
julia> @time @code_typed g(1)
1.427227 seconds (1.31 M allocations: 58.809 MiB, 5.73% gc time, 99.96% compilation time)
CodeInfo(
1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64
└── return %1
) => Int64
```
On this PR:
```
julia> @time @code_typed g(1)
0.503151 seconds (1.19 M allocations: 53.641 MiB, 5.10% gc time, 33.59% compilation time)
CodeInfo(
1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64
└── return %1
) => Int64
```
I don't expect the same speedup on other workloads, but there should still
be a speedup of a few percent on most workloads.
# Future extensions
The other motivation for this is that, with a view towards #40399,
we will want to define more clearly when bindings get resolved. The
idea would then be that binding resolution replaces generic
`GlobalRef`s with `GlobalRef`s that have their binding cache set, and
any unresolved bindings would be treated conservatively by inference
and optimization.