Skip to content

Use a queue for execution #5389

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 37 commits into
base: master
Choose a base branch
from
Open

Use a queue for execution #5389

wants to merge 37 commits into from

Conversation

rmosolgo
Copy link
Owner

@rmosolgo rmosolgo commented Jun 20, 2025

I'm feeling really inspired by @gmac's proof-of-concept over at https://github.com/gmac/graphql-cardinal/, so I thought I'd try again at refactoring execution to use a queue instead of recursive method calls. I have tried before (#4967, #4968, somewhat #3998, #4935) and given up along the way, but I'm going to try again 😅

TODO:

  • Extract some refactors from this branch
  • Make Lazy and Dataloader work
  • Consider steps for continue_value, resolve_list_item
  • Use steps for call_method_on_directives
  • Merge ResolveTypeStep into ResultHash
  • Merge directive checks into steps too?
  • Merge authorized_new lazy usage into ResultHash
  • Look for other after_lazy usages and rebuild them using queue steps
  • Refactor ResultArray and ResultHash into state machines
  • Merge ResultHash into Schema::Object
  • Fix dataloaded arguments
  • RunQueue is shared by the multiplex -- can it be merged into one of the other objects with the same lifecycle? (Multiplex, Interpreter, Dataloader)
  • Tighter integration between Dataloader, Lazy, and Queue?
  • Merge Field extensions into FieldResolveStep to reduce overhead when used?
  • Tests that are still failing:
    • current_field, current_arguments in context
    • rescuing authorized errors
    • batching lazies correctly
    • supporting parallel I/O in dataloader
    • ...

@rmosolgo
Copy link
Owner Author

There's still a lot to do here for compatibility and performance, but the initial benchmark results are surprisingly good (even with the terrible FieldResolutionStep implementation which is hogging memory). This is using a modified version of the benchmark from graphql-cardinal:

  ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [x86_64-darwin22]
  Calculating -------------------------------------
  graphql-ruby: 142 resolvers
-                         399.549 (± 8.5%) i/s    (2.50 ms/i) -      2.000k in   5.048521s
+                         538.453 (± 3.9%) i/s    (1.86 ms/i) -      2.695k in   5.013324s

  graphql-ruby: 140002 resolvers
-                           0.517 (± 0.0%) i/s     (1.93 s/i) -      3.000 in   5.797333s
+                           0.761 (± 0.0%) i/s     (1.31 s/i) -      4.000 in   5.272811s

- Total allocated: 10149304 bytes (70353 objects)
+ Total allocated: 13431224 bytes (97389 objects)

So that's a 34% or 47% speedup. It's using 30% more memory, because it's currently the Simplest Thing That Could Possibly Work. Presumably refactoring it to avoid those repeated allocations will make it even faster!

@rmosolgo
Copy link
Owner Author

I haven't pushed a commit here for a while but I'm still planning on landing this somehow. Here's a quick brain-dump in case anyone is curious:

  • The perf win is not as big as it was initially because I added objects (FieldResolveStep) for each field resolution. Amazingly it's still faster than the previous implementation. But it's 20%-30% IIRC.
  • It might get slower because the shift to this kind of flow will also require queue objects for Argument resolution (basically anywhere that can return a Promise)
  • The time tradeoff comes at the expense of memory which I think is worth it. I have some ideas for reducing memory once this is really working:
    • Merge GraphQLResultHash into GraphQL::Schema::Object -- they have the same lifecycle at runtime so they could be merged.
    • Merge Field::ExtendedState into FieldResolveStep; this will save an allocation in possibly-tight loops when field extensions are added to repeatedly-used fields.
    • Introduce "optimized" field execution. Sometimes we can detect when a field could use a much simpler execution flow, eg "just call a method and insert the result into the hash", without preparing arguments, running extensions, etc. This could cover a lot of basic fields and speed things up, but we still have to be ready for Promises and Dataloader pauses.
  • Merging and releasing this will be tricky. I expect to have full compatibility when I'm done (or very-nearly-full 😅 ) but it's such a huge change that I want to make sure it's a gentle release so nobody gets nasty surprises. Some thoughts:
    • Many of the changes in this branch could be isolated and released separately: merging Dataloader and Lazy resolution; merging ResultHash and Schema::Object; moving execution code into named methods on Step classes;
    • After that, you could switch from stack-based to queue-based using code that was mostly in place. Ideally both implementations would be available (and runtime-swappable) but I think the implementation will have to reveal itself once the other changes have been merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant