Skip to content

Add fiber-based batch loading API #3264

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 28 commits into from
Jan 6, 2021
Merged

Add fiber-based batch loading API #3264

merged 28 commits into from
Jan 6, 2021

Conversation

rmosolgo
Copy link
Owner

@rmosolgo rmosolgo commented Dec 27, 2020

This could be pretty awesome. @bessey first suggested this on Twitter, and combined with a trampolining-like refactor, it might just work!

TL;DR: Use Fiber.yield to halt GraphQL execution in place; allow GraphQL fields to Fiber.yield and then they're resumed once every branch has reached a halt.

The coolest thing is, if we can make Interpreter Fiber-aware, then we lay the groundwork for Ruby 3's Fiber scheduler API, and we'd get parallel IO "for free" (we have to implement a scheduler, and somehow implement that baton-passing).


Goals:

  • User-transparent API to support batch loading (no promises !? 😱 )
  • Total compatibility for everything else (including existing lazy_resove) etc

If this works, I'll drop #2483

TODO:

  • Does it need a fiber pool? I don't think so -- I think Ruby is pretty thrifty with Fibers. If we need this later, we can replace waiting_fibers = [] with a more sophisticated blocking queue.
  • Avoid perf overhead in cases where Fiber.yield isn't used
  • The implementation of no-overhead execution handoffs is a nightmare. How can this be improved?
    • I think I've done what I can here. The fact is, there has to be some object that can produce a continuation fiber, and for performance reasons, it might as well be the Runtime instance itself. I've simplified the code there a lot, and used some simple names, but I can't think of a really better design for it.
  • The bummer of running things inside fibers is that stackprof measures fibers differently. Any way to mitigate this complexity? Support a way to disable fiber-based concurrency?
    • I abstracted into NullDataloader, which runs eagerly -- it's the default
  • What about enqueuing multiple requirements from one resolver? One advantage of graphql-batch is that you can queue up requests for different kinds of data, then graphql-batch runs them as it sees fit. With Fiber.yield, it can't queue up more than one kind of load at a time.
  • More test cases:
    • Multi-level yielding
    • Joining batch-load results in a resolver
    • Multiplexing
    • Load from within a source
    • Batching params
  • Docs -- a sources framework?
    • Some multiplex-level loader pool
    • API for custom sources
    • Update code docs to reflect implementation changes
    • Document Dataloader class?
    • Document comparison to GraphQL-Batch
  • Thread.current[...] is fiber-local - update backtrace, anything else?
    • GraphQL execution may happen on any fiber. Move any Thread.current[...] assignments into context.
  • Benchmark graphql-batch vs fiber
  • Try out Ruby 3's scheduler? I'm going to put this off.
  • Hunt TODOs in the code
  • Support ad-hoc dataloader.yield for manual parallelism

@rmosolgo rmosolgo added this to the 1.12.0 milestone Dec 27, 2020
@rmosolgo rmosolgo self-assigned this Dec 27, 2020
@rmosolgo rmosolgo changed the title Add fiber-based defer API Add fiber-based batch loading API Dec 27, 2020
@rmosolgo
Copy link
Owner Author

rmosolgo commented Jan 5, 2021

I added a really simple benchmark for comparing no batching / graphql-batch / graphql-dataloader. (Any suggestions for improving it?)

It looks like GraphQL-Dataloader has about half the runtime overhead of GraphQL-Batch.

~/code/graphql-ruby % be rake bench:profile_batch_loaders 
Warming up --------------------------------------
      GraphQL::Batch    64.000  i/100ms
 GraphQL::Dataloader    71.000  i/100ms
         No Batching    82.000  i/100ms
Calculating -------------------------------------
      GraphQL::Batch    644.750  (± 2.6%) i/s -      3.264k in   5.066302s
 GraphQL::Dataloader    711.424  (± 1.8%) i/s -      3.621k in   5.091594s
         No Batching    834.033  (± 1.8%) i/s -      4.182k in   5.015796s

Comparison:
         No Batching:      834.0 i/s
 GraphQL::Dataloader:      711.4 i/s - 1.17x  (± 0.00) slower
      GraphQL::Batch:      644.8 i/s - 1.29x  (± 0.00) slower

As for memory, GraphQL::Dataloader uses more memory, but fewer objects (presumably because Fiber objects are heavier, but they track state in a single Ruby object).

  • No batching: 69976 bytes (756 objects)
  • GraphQL-Batch: 90184 bytes (1008 objects)
  • GraphQL-Dataloader: 100960 bytes (899 objects)

The biggest impacts are:

allocated memory by location
-----------------------------------
     12888  /Users/rmosolgo/code/graphql-ruby/lib/graphql/dataloader.rb:54
      5544  /Users/rmosolgo/code/graphql-ruby/lib/graphql/execution/interpreter/runtime.rb:156
      5376  /Users/rmosolgo/code/graphql-ruby/lib/graphql/execution/interpreter/runtime.rb:158
      4704  /Users/rmosolgo/code/graphql-ruby/lib/graphql/execution/interpreter/runtime.rb:294
      4536  /Users/rmosolgo/code/graphql-ruby/lib/graphql/execution/interpreter/runtime.rb:651
      4368  /Users/rmosolgo/code/graphql-ruby/lib/graphql/execution/interpreter/runtime.rb:403

Those are:

  • Creating fibers (dataloader.rb)
  • Resuming execute_selections (runtime.rb:156, runtime.rb:158) -- it has added some overhead. How to improve that?

It looks to me like those classes (Fiber, Hash) make up most of the overhead:

  • No batching:

    allocated memory by class
    -----------------------------------
         41536  Hash
         18024  Array
          4560  Proc
           952  Enumerator
           560  BatchLoading::GraphQLNoBatchingSchema::Team
    
  • Dataloader:

    allocated memory by class
    -----------------------------------
         53688  Hash
         19968  Array
         15264  Fiber
          6000  Proc
           952  Enumerator
    

    (funny thing is, it's only 12 fibers!)

cc @swalkinshaw who expressed interest in seeing a benchmark

@rmosolgo
Copy link
Owner Author

rmosolgo commented Jan 5, 2021

I was able to reduce the overhead a bit, now dataloader's memory footprint is smaller than graphql-batch for that benchmark:

========== No Batch Memory ==============
Total allocated: 70096 bytes (758 objects)
...
allocated memory by location
-----------------------------------
      4704  /Users/rmosolgo/code/graphql-ruby/lib/graphql/execution/interpreter/runtime.rb:306
      4536  /Users/rmosolgo/code/graphql-ruby/lib/graphql/execution/interpreter/runtime.rb:167
      4536  /Users/rmosolgo/code/graphql-ruby/lib/graphql/execution/interpreter/runtime.rb:663
      4368  /Users/rmosolgo/code/graphql-ruby/lib/graphql/execution/interpreter/runtime.rb:415
      3392  /Users/rmosolgo/code/graphql-ruby/lib/graphql/execution/interpreter/runtime.rb:230

========== Dataloader Memory =================
Total allocated: 89480 bytes (851 objects)
...
allocated memory by location
-----------------------------------
      7160  /Users/rmosolgo/code/graphql-ruby/lib/graphql/dataloader.rb:56
      4704  /Users/rmosolgo/code/graphql-ruby/lib/graphql/execution/interpreter/runtime.rb:306
      4536  /Users/rmosolgo/code/graphql-ruby/lib/graphql/execution/interpreter/runtime.rb:167
      4536  /Users/rmosolgo/code/graphql-ruby/lib/graphql/execution/interpreter/runtime.rb:663
      4368  /Users/rmosolgo/code/graphql-ruby/lib/graphql/execution/interpreter/runtime.rb:415
      4056  /Users/rmosolgo/code/graphql-ruby/lib/graphql/dataloader.rb:162
      3392  /Users/rmosolgo/code/graphql-ruby/lib/graphql/execution/interpreter/runtime.rb:230


========== GraphQL-Batch Memory ==============
Total allocated: 90304 bytes (1010 objects)
...
allocated memory by location
-----------------------------------
      4704  /Users/rmosolgo/code/graphql-ruby/lib/graphql/execution/interpreter/runtime.rb:306
      4640  /Users/rmosolgo/code/graphql-ruby/lib/graphql/execution/lazy.rb:30
      4536  /Users/rmosolgo/code/graphql-ruby/lib/graphql/execution/interpreter/runtime.rb:167
      4536  /Users/rmosolgo/code/graphql-ruby/lib/graphql/execution/interpreter/runtime.rb:663
      4368  /Users/rmosolgo/code/graphql-ruby/lib/graphql/execution/interpreter/runtime.rb:415
      3392  /Users/rmosolgo/code/graphql-ruby/lib/graphql/execution/interpreter/runtime.rb:230
      3384  /Users/rmosolgo/code/graphql-ruby/lib/graphql/schema/warden.rb:270

@rmosolgo rmosolgo changed the base branch from master to 1.12-dev January 6, 2021 14:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant