Fix sizing of Maps #3868

kilink · 2025-08-01T17:23:47Z

Fix several instances where Maps were sized incorrectly when the number of elements to insert is known; specifying the capacity without taking into account the load factor meant it would always result in a rehash / resize operation.

To this end, add a simple Maps utility class that handles the capacity calculation, much like the static HashMap.newHashMap helper that was added in JDK19.

Checklist

Before we can review and merge your changes, please go through the checklist below. If you're still working on some items, feel free to submit your pull request as a draft—our CI will help guide you through the remaining steps.

✅ Required checks

License: I confirm that my changes are submitted under the Apache License, Version 2.0.
Commit signatures: All commits are signed and verifiable. (See GitHub Docs on Commit Signature Verification).
Code formatting: The code is formatted according to the project’s style guide.
Build & Test: I verified that the project builds and all unit tests pass.

How to build the project

Run: ./mvnw verify

See the build instructions for details.

🧪 Tests (select one)

I have added or updated tests to cover my changes.
No additional tests are needed for this change.

📝 Changelog (select one)

I added a changelog entry in src/changelog/.2.x.x. (See Changelog Entry File Guide).
This is a trivial change and does not require a changelog entry.

Fix several instances where Maps were sized incorrectly when the number of elements to insert is known; specifying the capacity without taking into account the load factor meant it would always result in a rehash / resize operation. To this end, add a simple Maps utility class that handles the capacity calculation, much like the static HashMap.newHashMap helper that was added in JDK19.

vy · 2025-08-02T14:21:48Z

Thanks so much for the patch. Would you mind sharing some context, please? Did you observe a particular performance issue?

…

Message ID: ***@***.***>

kilink · 2025-08-05T17:35:33Z

Did you observe a particular performance issue?

I don't recall how I stumbled upon it or if it showed up in any performance profiles (it definitely has for other libraries). It's just a common thing I've noticed across various projects (e.g., spring and grpc-java) since the HashMap constructor is a foot gun and it's not easy to do correctly until Java 19. Since the intent here was to presize the map, and we know how many elements to insert, it just seemed to make sense to size it so we always avoid the resize / rehash operations.

github-actions · 2025-08-06T08:33:40Z

Job	Requested goals	Build Tool Version	Build Outcome
build-macos-latest	clean install	3.9.8	❌
build-ubuntu-latest	clean install	3.9.8	❌
build-windows-latest	clean install	3.9.8	❌

Generated by gradle/develocity-actions

ppkarwasz · 2025-08-06T09:40:36Z

Hi @kilink,

Adding a new utility class to log4j-api and then using it in log4j-core reinforces an unnatural dependency between the two modules, and would introduce a hard incompatibility between Log4j Core 2.26.0 and Log4j API 2.25.x. The Core module should implement the API, not depend on it for internal utilities. I realize this kind of coupling already exists in some places, but I’d prefer not to increase it.

Could you instead:

Move Maps to o.a.l.l.util.internal to make it clear it’s not intended for external use. While o.a.l.l.util is theoretically internal, it remains exported due to past API leaks (e.g., o.a.l.l.util.Supplier).
Create a separate o.a.l.l.core.util.internal.Maps helper for log4j-core, so it can be used independently without introducing additional cross-module dependencies.

I agree with @vy that, without clear benchmark results, the value of this enhancement is questionable. We avoid using HashMap in performance-critical paths (such as the logging context map) because its instantiation and operation costs are high for small maps. Instead, we use o.a.l.l.util.StringMap, which offers enhanced features like mutability control (freeze()) and a foreach method (introduced for Map only in JDK 8). However, StringMap represents a significant technical debt we would like to eliminate. If you are interested in map performance, could you explore the feasibility of replacing StringMap with Map in version 3.x?

vy · 2025-08-06T10:56:29Z

I'm against enhancements without convincing figures obtained from benchmarks mapping to real-world use cases. If this optimization yields a tangible benefit on the hot path, provide a reproducer please.

kilink · 2025-08-06T16:44:02Z

I personally don't think it's worthwhile to spend time writing a benchmark for what amounts to a bug fix / correctness issue. If you'd rather not accept the change, feel free to close the issue. I'll leave my rationale for fixing it however:

Without a benchmark, we can see that the incorrect supplied capacity will cause a resize / rehash, and result in a map that is over-sized for the number of items inserted. It's easy to avoid with a trivial calculation which takes into account the load factor.
It's true that not all instances of this I changed in the PR are likely to be on the hot path, but fixing it everywhere is more of a consistency thing; if the helper is there, why not, otherwise it would just be more confusing.
There are likely callers in the wild that at least would be hitting the toMap implementation of the ReadOnlyStringMap, e.g. likely through ThreadContextMap (I can confirm such instances internally, as well as in OSS projects such as Sentry).
Whoever wrote those lines of code was attempting to size the HashMaps precisely by passing in a capacity, it just so happens that the number of items to insert != the map capacity, so seems like an easy mistake to have made. I assume care was taken here though since log4j generally tries to be optimal in terms of allocations and performance.

I assume the main point of contention is the addition of a utility class, and I agree it is ugly, but it's a stopgap until (?) Log4j can use the static HashMap.newHashMap added in Java 19, I presume when the baseline JDK version log4j requires allows us to use that, we can just switch to it and remove any such shim utility class. Or another way to put it, I presume you would readily accept this change if it were simply delegating to the JDK helpers (assuming we could use it)?

vy · 2025-08-09T07:05:50Z

log4j-api/src/main/java/org/apache/logging/log4j/CloseableThreadContext.java

@@ -206,7 +207,7 @@ public void close() {
        }

        private void closeMap() {
-            final Map<String, String> valuesToReplace = new HashMap<>(originalValues.size());
+            final Map<String, String> valuesToReplace = Maps.newHashMap(originalValues.size());


I've reviewed all log4j-api and log4j-core changes, and found this to be only change that one might claim to be of importance. I prefer a PR that

Replaces all HashMap#new(int) calls with a HashMap#new() (Since they're all incorrectly implemented, it is better to just stick to defaults.)

Replaces only this line in CloseableThreadContext with new HashMap((int) Math.ceil(originalValues.size() / 0.75))

@ppkarwasz, WDYT?

github-project-automation bot added this to Log4j bug tracker Aug 1, 2025

github-project-automation bot moved this to To triage in Log4j bug tracker Aug 1, 2025

Merge branch '2.x' into map-sizing-helpers

6dc466a

vy reviewed Aug 9, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Fix sizing of Maps #3868

Fix sizing of Maps #3868

Uh oh!

kilink commented Aug 1, 2025 •

edited

Loading

Uh oh!

vy commented Aug 2, 2025 via email

Uh oh!

kilink commented Aug 5, 2025

Uh oh!

github-actions bot commented Aug 6, 2025

Uh oh!

ppkarwasz commented Aug 6, 2025

Uh oh!

vy commented Aug 6, 2025

Uh oh!

kilink commented Aug 6, 2025

Uh oh!

vy Aug 9, 2025

Uh oh!

Uh oh!

Uh oh!

Fix sizing of Maps #3868

Are you sure you want to change the base?

Fix sizing of Maps #3868

Uh oh!

Conversation

kilink commented Aug 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Checklist

✅ Required checks

🧪 Tests (select one)

📝 Changelog (select one)

Uh oh!

vy commented Aug 2, 2025 via email

Uh oh!

kilink commented Aug 5, 2025

Uh oh!

github-actions bot commented Aug 6, 2025

Generated by gradle/develocity-actions

Uh oh!

ppkarwasz commented Aug 6, 2025

Uh oh!

vy commented Aug 6, 2025

Uh oh!

kilink commented Aug 6, 2025

Uh oh!

vy Aug 9, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

kilink commented Aug 1, 2025 •

edited

Loading