Optimize JSON.dump argument parsing #616

Merged on Oct 17, 2024 (1 commit)

Conversation

casperisfine

`JSON.dump` looks terrible on micro-benchmarks because the way it
handles arguments is quite allocation heavy compared to the actual
JSON generation work.

Profiling the `small hash` benchmark shows 14% of time spent in `Array#compact`
and 34% spent in `JSON::Ext::GeneratorState.new`, with only 41% in the
actual `generate` function.
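
To illustrate why `Array#compact` can dominate on small payloads, here is a minimal sketch that counts object allocations for two ways of normalizing optional positional arguments. Both `parse_with_compact` and `parse_with_branches` are hypothetical stand-ins written for this example; neither is the actual `JSON.dump` code from before or after this change.

```ruby
# Hypothetical helpers for illustration only; not the real JSON.dump code.

# Style 1: gather the optional arguments into an array and compact it.
# Every call allocates the literal array and the compacted copy.
def parse_with_compact(io = nil, limit = nil, kwargs = nil)
  args = [io, limit, kwargs].compact
  kwargs = args.pop if args.last.is_a?(Hash)
  io, limit = args
  [io, limit, kwargs]
end

# Style 2: decide with plain branches, without intermediate arrays.
def parse_with_branches(io = nil, limit = nil, kwargs = nil)
  if kwargs.nil?
    if limit.is_a?(Hash)
      kwargs = limit
      limit = nil
    elsif io.is_a?(Hash)
      kwargs = io
      io = nil
    end
  end
  [io, limit, kwargs]
end

# Count objects allocated while the block runs.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

compact_allocs = allocations { 1_000.times { parse_with_compact(:io, 10) } }
branch_allocs  = allocations { 1_000.times { parse_with_branches(:io, 10) } }

puts "compact-based parsing: #{compact_allocs} objects"
puts "branch-based parsing:  #{branch_allocs} objects"
```

Both helpers return an array, so that allocation is common to the two styles; the difference in the counts comes from the temporary arrays built around `compact`.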

By micro-optimizing `JSON.dump`, it can look much better:

Before:

```
== Encoding small nested array (121 bytes)
ruby 3.4.0preview2 (2024-10-07 master 32c733f57b) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
                json    91.687k i/100ms
                  oj   205.309k i/100ms
           rapidjson   161.648k i/100ms
Calculating -------------------------------------
                json    941.965k (± 1.4%) i/s    (1.06 μs/i) -      4.768M in   5.062573s
                  oj      2.138M (± 1.2%) i/s  (467.82 ns/i) -     10.881M in   5.091254s
           rapidjson      1.678M (± 1.9%) i/s  (596.04 ns/i) -      8.406M in   5.011931s

Comparison:
                json:   941964.8 i/s
                  oj:  2137586.5 i/s - 2.27x  faster
           rapidjson:  1677737.1 i/s - 1.78x  faster

== Encoding small hash (65 bytes)
ruby 3.4.0preview2 (2024-10-07 master 32c733f57b) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
                json   141.737k i/100ms
                  oj   676.871k i/100ms
           rapidjson   373.266k i/100ms
Calculating -------------------------------------
                json      1.491M (± 1.0%) i/s  (670.78 ns/i) -      7.512M in   5.039463s
                  oj      7.226M (± 1.4%) i/s  (138.39 ns/i) -     36.551M in   5.059475s
           rapidjson      3.729M (± 2.2%) i/s  (268.15 ns/i) -     18.663M in   5.007182s

Comparison:
                json:  1490798.2 i/s
                  oj:  7225766.2 i/s - 4.85x  faster
           rapidjson:  3729192.2 i/s - 2.50x  faster
```

After:

```
== Encoding small nested array (121 bytes)
ruby 3.4.0preview2 (2024-10-07 master 32c733f57b) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
                json   156.832k i/100ms
                  oj   209.769k i/100ms
           rapidjson   162.922k i/100ms
Calculating -------------------------------------
                json      1.599M (± 2.5%) i/s  (625.34 ns/i) -      7.998M in   5.005110s
                  oj      2.137M (± 1.5%) i/s  (467.99 ns/i) -     10.698M in   5.007806s
           rapidjson      1.677M (± 3.5%) i/s  (596.31 ns/i) -      8.472M in   5.059515s

Comparison:
                json:  1599141.2 i/s
                  oj:  2136785.3 i/s - 1.34x  faster
           rapidjson:  1676977.2 i/s - same-ish: difference falls within error

== Encoding small hash (65 bytes)
ruby 3.4.0preview2 (2024-10-07 master 32c733f57b) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
                json   216.464k i/100ms
                  oj   661.328k i/100ms
           rapidjson   324.434k i/100ms
Calculating -------------------------------------
                json      2.301M (± 1.7%) i/s  (434.57 ns/i) -     11.689M in   5.081278s
                  oj      7.244M (± 1.2%) i/s  (138.05 ns/i) -     36.373M in   5.021985s
           rapidjson      3.323M (± 2.9%) i/s  (300.96 ns/i) -     16.871M in   5.081696s

Comparison:
                json:  2301142.2 i/s
                  oj:  7243770.3 i/s - 3.15x  faster
           rapidjson:  3322673.0 i/s - 1.44x  faster
```

Now profiles of the `small hash` benchmark show 44% in `generate` and
`45%` in `GeneratorState` allocation.
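
For a rough local comparison of the argument-handling overhead, the following sketch times `JSON.dump` against `JSON.generate` on a small hash using only the standard library's `Benchmark` module. The figures above come from a different harness, so this is not a reproduction of them; the `payload` contents and the iteration count are assumptions chosen for illustration.

```ruby
require "json"
require "benchmark"

# Hypothetical small payload; not the one used in the benchmarks above.
payload = { "id" => 1, "name" => "example", "active" => true }

ITERATIONS = 100_000

# JSON.dump goes through optional-argument handling before generating.
dump_time = Benchmark.realtime do
  ITERATIONS.times { JSON.dump(payload) }
end

# JSON.generate takes the object and an optional options argument directly.
generate_time = Benchmark.realtime do
  ITERATIONS.times { JSON.generate(payload) }
end

puts format("JSON.dump:     %.1f ns/call", dump_time / ITERATIONS * 1e9)
puts format("JSON.generate: %.1f ns/call", generate_time / ITERATIONS * 1e9)
```

Wall-clock timings like these are noisy, so treat any single run as indicative only.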
@byroot byroot merged commit 45ba1f8 into ruby:master Oct 17, 2024
73 checks passed
@casperisfine casperisfine deleted the opt-small-hash branch October 17, 2024 10:25
@@ -613,26 +613,42 @@ class << self
# Output:
# {"foo":[0,1],"bar":{"baz":2,"bat":3},"bam":"bad"}
def dump(obj, anIO = nil, limit = nil, kwargs = nil)
Member

One idea (which I learned from @nobu IIRC) to optimize the most common case, where no extra arguments are passed, is:

def dump(obj, anIO = (no_args_set = true; nil), limit = nil, kwargs = nil)
  unless no_args_set
    ...
  end

  opts = JSON.dump_default_options
  ...
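
The idiom in the snippet above relies on Ruby evaluating a default-argument expression only when the caller omits that argument. A self-contained sketch of the behavior follows; `describe_call` is a hypothetical method written for this example and has nothing to do with the real `JSON.dump` body.

```ruby
# `describe_call` is a hypothetical method used only to show the idiom.
def describe_call(obj, io = (no_args_set = true; nil), limit = nil)
  # The default expression runs only when the caller omits `io`.
  # In that case `no_args_set` is true; otherwise the local stays nil.
  if no_args_set
    "fast path: #{obj.inspect}"
  else
    "slow path: io=#{io.inspect}, limit=#{limit.inspect}"
  end
end

puts describe_call(1)          # only obj given, default expression runs
puts describe_call(1, nil)     # io passed explicitly, even though it is nil
puts describe_call(1, nil, 5)  # io and limit passed
```

Note that passing `nil` explicitly still takes the slow path, because the flag records whether the argument was supplied rather than what its value is.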

Member

Nice trick.
