feat: Add a ReusableMap that allows reusing the map allocation #38
Open
quodlibetor wants to merge 2 commits into PSeitz:main
This adds a pair of new public types: `ReusableMap` and `BorrowedMap`.
`ReusableMap` stores an `ObjectAsVec` and provides a `deserialize` method
that returns a `BorrowedMap`, which uses `unsafe` to tie its lifetime to both
the input JSON str reference and the deserializer.
When the `BorrowedMap` is dropped, it runs `clear()` on the internal
`ObjectAsVec`, freeing it to be reused.
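To illustrate the pattern described above, here is a minimal, self-contained sketch. It is not the code in this PR: the type names mirror the PR's, but the storage is a plain `Vec` of string pairs standing in for `ObjectAsVec`, and the toy `key=value` parser, `get`, `len`, and `capacity` are assumptions made for the example. The storage is also cleared at the start of `deserialize`, in case a previous `BorrowedMap` was leaked with `mem::forget` and its `Drop` never ran.

```rust
/// Owns the allocation between parses. It is always treated as empty
/// whenever no `BorrowedMap` is alive.
pub struct ReusableMap {
    // Stand-in for `ObjectAsVec`; the `'static` lifetime is a placeholder.
    entries: Vec<(&'static str, &'static str)>,
}

/// View of the entries parsed from one input. Clears the storage on drop.
pub struct BorrowedMap<'a> {
    entries: &'a mut Vec<(&'a str, &'a str)>,
}

impl ReusableMap {
    pub fn new() -> Self {
        Self { entries: Vec::new() }
    }

    /// Capacity of the retained allocation (hypothetical helper).
    pub fn capacity(&self) -> usize {
        self.entries.capacity()
    }

    /// Toy parser for "k=v, k=v" input (an assumption, not JSON).
    pub fn deserialize<'a>(&'a mut self, input: &'a str) -> BorrowedMap<'a> {
        // Guard against a leaked BorrowedMap whose Drop never ran.
        self.entries.clear();
        let ptr = &mut self.entries as *mut Vec<(&'static str, &'static str)>;
        // SAFETY: the Vec is empty, so no element is reinterpreted, and
        // lifetimes do not affect layout. The Vec is cleared again before
        // it is ever viewed with the placeholder lifetime.
        let entries: &'a mut Vec<(&'a str, &'a str)> =
            unsafe { &mut *(ptr as *mut Vec<(&'a str, &'a str)>) };
        for pair in input.split(',') {
            if let Some((k, v)) = pair.split_once('=') {
                entries.push((k.trim(), v.trim()));
            }
        }
        BorrowedMap { entries }
    }
}

impl<'a> BorrowedMap<'a> {
    /// Linear lookup; the returned str borrows from the input.
    pub fn get(&self, key: &str) -> Option<&'a str> {
        self.entries.iter().find(|(k, _)| *k == key).map(|(_, v)| *v)
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}

impl Drop for BorrowedMap<'_> {
    fn drop(&mut self) {
        // Remove all borrowed entries but keep the allocation for reuse.
        self.entries.clear();
    }
}
```

Because `deserialize` takes `&'a mut self`, the borrow checker prevents a second parse while a `BorrowedMap` is still alive, and `Vec::clear` keeps the capacity, so the next parse can reuse the same allocation.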
Benches show a 4-15% increase in average and a roughly 2-12% increase in median
throughput compared to `OwnedValue` on my machine:
    parse
    simple_json
        serde_json parse only                      Avg: 98.270 MB/s  Median: 97.457 MB/s  [79.288 MB/s .. 129.84 MB/s]
        serde_json_borrow::OwnedValue parse only   Avg: 126.92 MB/s  Median: 131.21 MB/s  [64.485 MB/s .. 277.54 MB/s]
        serde_json_borrow::ReusableMap parse only  Avg: 145.78 MB/s  Median: 147.55 MB/s  [86.206 MB/s .. 202.15 MB/s]
        SIMD_json_borrow parse only                Avg: 82.997 MB/s  Median: 84.742 MB/s  [58.033 MB/s .. 136.56 MB/s]
    hdfs
        serde_json parse only                      Avg: 258.02 MB/s  Median: 266.40 MB/s  [186.76 MB/s .. 302.37 MB/s]
        serde_json_borrow::OwnedValue parse only   Avg: 366.20 MB/s  Median: 375.88 MB/s  [231.04 MB/s .. 496.03 MB/s]
        serde_json_borrow::ReusableMap parse only  Avg: 407.99 MB/s  Median: 404.68 MB/s  [320.25 MB/s .. 559.69 MB/s]
        SIMD_json_borrow parse only                Avg: 258.31 MB/s  Median: 257.01 MB/s  [197.53 MB/s .. 301.19 MB/s]
    hdfs_with_array
        serde_json parse only                      Avg: 308.14 MB/s  Median: 314.23 MB/s  [275.76 MB/s .. 320.35 MB/s]
        serde_json_borrow::OwnedValue parse only   Avg: 505.74 MB/s  Median: 524.64 MB/s  [356.95 MB/s .. 548.81 MB/s]
        serde_json_borrow::ReusableMap parse only  Avg: 540.35 MB/s  Median: 544.24 MB/s  [466.89 MB/s .. 578.15 MB/s]
        SIMD_json_borrow parse only                Avg: 310.45 MB/s  Median: 312.34 MB/s  [300.17 MB/s .. 316.79 MB/s]
    wiki
        serde_json parse only                      Avg: 589.71 MB/s  Median: 614.37 MB/s  [369.78 MB/s .. 679.87 MB/s]
        serde_json_borrow::OwnedValue parse only   Avg: 627.86 MB/s  Median: 689.41 MB/s  [215.13 MB/s .. 782.33 MB/s]
        serde_json_borrow::ReusableMap parse only  Avg: 680.08 MB/s  Median: 706.46 MB/s  [519.18 MB/s .. 799.92 MB/s]
        SIMD_json_borrow parse only                Avg: 552.17 MB/s  Median: 613.51 MB/s  [313.16 MB/s .. 698.85 MB/s]
    gh-archive
        serde_json parse only                      Avg: 336.46 MB/s  Median: 337.61 MB/s  [321.00 MB/s .. 344.31 MB/s]
        serde_json_borrow::OwnedValue parse only   Avg: 618.58 MB/s  Median: 620.74 MB/s  [574.05 MB/s .. 645.41 MB/s]
        serde_json_borrow::ReusableMap parse only  Avg: 643.53 MB/s  Median: 648.20 MB/s  [556.04 MB/s .. 668.70 MB/s]
        SIMD_json_borrow parse only                Avg: 605.58 MB/s  Median: 619.44 MB/s  [502.05 MB/s .. 636.55 MB/s]
Actual improvement will depend heavily on the underlying shape of the
data, especially how many nested objects there are. Now that this exists,
there's also a pattern in place: I could imagine storing a freelist of
maps and vecs to reuse for nested objects, which could probably squeeze
out a bit more performance.
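As a rough illustration of the freelist idea, the sketch below pools cleared vectors so that later requests can reuse their allocations. It is hypothetical and not part of this PR: `VecPool`, `take`, `give_back`, and `spare` are invented names, and it uses an owned element type to stay in safe Rust. Pooling containers whose elements borrow from the input would presumably need lifetime handling similar to what `BorrowedMap` does.

```rust
/// Hypothetical free list of cleared vectors (not part of this PR).
pub struct VecPool<T> {
    free: Vec<Vec<T>>,
}

impl<T> VecPool<T> {
    pub fn new() -> Self {
        Self { free: Vec::new() }
    }

    /// Hand out a spare vector if one exists, otherwise a fresh empty one.
    pub fn take(&mut self) -> Vec<T> {
        self.free.pop().unwrap_or_default()
    }

    /// Clear the vector (keeping its capacity) and store it for reuse.
    pub fn give_back(&mut self, mut v: Vec<T>) {
        v.clear();
        self.free.push(v);
    }

    /// Number of vectors currently waiting to be reused.
    pub fn spare(&self) -> usize {
        self.free.len()
    }
}
```

A parser using such a pool would call `take` each time it starts a nested object or array and `give_back` once the containing value is dropped, so allocations for nested containers would be amortized across documents in the same way the top-level map is.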