Lazy load breadth-first instead of depth-first #3694
Hi! Thanks for the great question/suggestion. It reminds me of #1981, which recommended a similar approach for a similar goal. I ended up taking a different approach, but it does support parallel IO, in its way: https://graphql-ruby.org/dataloader/sources.html#example-loading-in-a-background-thread Did you try an approach like that? Here's how it might be done for the example above:

`batch_test.rb`:

```ruby
require "bundler/inline"

gemfile do
  gem "graphql", "1.12.18"
  gem "concurrent-ruby", require: "concurrent"
end

class StringSource < GraphQL::Dataloader::Source
  def initialize(word, letter)
    @word = word
    @letter = letter
  end

  def load(keys)
    future = Concurrent::Promises.future do
      sleep 1
      puts "#{@word} #{@letter}"
      keys
    end
    dataloader.yield
    future.value
  end
end

class Query < GraphQL::Schema::Object
  field :field_a, String, null: false

  def field_a
    _hello = context.dataloader.with(StringSource, "Hello", "A").load("A")
    _world = context.dataloader.with(StringSource, "World", "A").load("A")
    "Hello"
  end

  field :field_b, String, null: false

  def field_b
    _hello = context.dataloader.with(StringSource, "Hello", "B").load("B")
    _world = context.dataloader.with(StringSource, "World", "B").load("B")
    "World"
  end
end

class Schema < GraphQL::Schema
  query(Query)
  use GraphQL::Dataloader
end

puts "Start query: #{Time.now.to_f}"
pp Schema.execute("{ fieldA fieldB }")
puts "End query: #{Time.now.to_f}"
```

Of course, it's a slightly different flow, since the work is gathered by GraphQL::Dataloader, then handed off to Concurrent-Ruby.
Besides that, I'm working on leveraging Ruby 3's Fiber scheduler for GraphQL execution, too, which would support real parallelism for I/O without the need for Concurrent-Ruby. (#3482 -- stuck on CI right now 😖 ) Here's an example of that, using the async-dataloader branch:

`batch_test.rb`:

```ruby
require "bundler/inline"

gemfile do
  gem "graphql", github: "rmosolgo/graphql-ruby", ref: "async-dataloader"
  gem "libev_scheduler"
end

require "fiber"
Fiber.set_scheduler Libev::Scheduler.new

class StringSource < GraphQL::Dataloader::Source
  def initialize(word, letter)
    @word = word
    @letter = letter
  end

  def load(keys)
    sleep 1
    puts "#{@word} #{@letter}"
    keys
  end
end

class Query < GraphQL::Schema::Object
  field :field_a, String, null: false

  def field_a
    _hello = context.dataloader.with(StringSource, "Hello", "A").load("A")
    _world = context.dataloader.with(StringSource, "World", "A").load("A")
    "Hello"
  end

  field :field_b, String, null: false

  def field_b
    _hello = context.dataloader.with(StringSource, "Hello", "B").load("B")
    _world = context.dataloader.with(StringSource, "World", "B").load("B")
    "World"
  end
end

class Schema < GraphQL::Schema
  query(Query)
  use GraphQL::Dataloader, nonblocking: true
end

puts "Start query: #{Time.now.to_f}"
pp Schema.execute("{ fieldA fieldB }")
puts "End query: #{Time.now.to_f}"
```
All that to say, I'm definitely interested in improving the parallel I/O story for GraphQL-Ruby, but I'm currently exploring another approach. What do you think, would that do the job for you? If not, I'd love to hear why -- maybe there's another design that would work better for your case.
Thanks for the response! I wanted to be able to easily compare this with my existing graphql-batch loaders, so I opted to build a small shim layer which acts like GraphQL::Batch on top of Dataloader, and it seems to work like a charm. It takes a bit of getting used to, but it does seem to have significant advantages (at least on CRuby; it looks like TruffleRuby might struggle with this many Fibers unless the Loom work silently landed already…). I'd say this more than solves my issue, so I'm gonna close this!
**Is your feature request related to a problem? Please describe.**
I want to parallelize batched lazy-loading using `Concurrent::Promises` while maintaining (most) compatibility with `graphql-batch`. To achieve this, I was attempting to create a Promise chain, the hope being that the inner `Concurrent::Promises.future` block would run in parallel with sibling fields, releasing the GVL while the database works. In reality, however, resolution is done depth-first, so graphql-ruby waits for the inner block to resolve before resolving sibling fields.
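The cost of depth-first forcing can be reproduced with plain threads standing in for `Concurrent::Promises.future` (this is a rough sketch with hypothetical names, not graphql-ruby's API):

```ruby
# Contrast depth-first vs breadth-first forcing of two lazy values,
# each backed by a 0.2s background task. Plain Threads stand in for
# Concurrent::Promises.future here.

def make_lazy
  thread = Thread.new { sleep 0.2; :done }
  -> { thread.value } # forcing the lazy blocks until the task finishes
end

clock = -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) }

# Depth-first: force each lazy before the next sibling even starts.
t0 = clock.call
make_lazy.call
make_lazy.call
depth_first = clock.call - t0

# Breadth-first: start all siblings first, then force them.
t0 = clock.call
lazies = [make_lazy, make_lazy]
lazies.each(&:call)
breadth_first = clock.call - t0

puts format("depth-first: %.2fs, breadth-first: %.2fs", depth_first, breadth_first)
```

Forced one at a time, the two 0.2s tasks take about 0.4s; started together and then forced, they overlap and take about 0.2s.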
**Describe the solution you'd like**
I would like to propose a change from depth-first to breadth-first resolution of lazy values. That is, resolve siblings before you resolve a nested lazy value. This would enable this use case, as well as improve batching in cases where you have a second layer of resolvers. For example, you might have multiple fields that each chain into a second loader. Currently, the `OtherLoader` won't batch across fields, but with a breadth-first approach it would.
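The batching difference can be sketched with a toy loader (a hypothetical `ToyLoader`, not the graphql-batch or Dataloader API): under depth-first resolution each sibling forces its lazy before the next sibling enqueues a key, so a second-layer loader flushes once per field; under breadth-first, all siblings enqueue before anything is forced, so one batch is flushed.

```ruby
# Toy second-layer batch loader. Forcing a lazy flushes whatever keys
# are pending at that moment; each flush models one database query.
class ToyLoader
  attr_reader :batches

  def initialize
    @pending = []
    @batches = []
  end

  def load(key)
    @pending << key
    -> { flush; key } # the lazy value: forcing it flushes pending keys
  end

  def flush
    return if @pending.empty?
    @batches << @pending.dup # one "database query" per flush
    @pending.clear
  end
end

# Depth-first: each sibling's lazy is forced before the next enqueues.
df = ToyLoader.new
df.load(:a).call
df.load(:b).call

# Breadth-first: all siblings enqueue, then the lazies are forced.
bf = ToyLoader.new
[bf.load(:a), bf.load(:b)].each(&:call)

puts "depth-first batches:   #{df.batches.inspect}"   # [[:a], [:b]]
puts "breadth-first batches: #{bf.batches.inspect}"   # [[:a, :b]]
```

Two queries under depth-first, one under breadth-first, exactly the cross-field batching the proposal is after.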
**Describe alternatives you've considered**
One alternative might be to create a resolution queue. As fields return lazy values, push them onto the queue, and resolve them in a first-in-first-out manner similar to the JS event loop until there's nothing left.
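That queue idea can be sketched in a few lines (hypothetical names, not graphql-ruby internals): lazy values are pushed as fields return them and drained first-in-first-out, so both siblings resolve before any nested lazy does.

```ruby
# FIFO resolution queue sketch, in the style of the JS event loop.
# Each field's lazy enqueues a nested lazy when it resolves; draining
# the queue FIFO resolves all siblings before any nested work.
queue = []
order = []

# Two sibling fields, each returning a lazy with a nested lazy inside.
2.times do |i|
  queue << lambda do
    order << "field_#{i}"
    queue << -> { order << "nested_#{i}" } # nested lazy goes to the back
  end
end

queue.shift.call until queue.empty?

puts order.inspect # => ["field_0", "field_1", "nested_0", "nested_1"]
```

With depth-first resolution the order would instead be `field_0, nested_0, field_1, nested_1`, which is what defeats cross-field batching.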
**Example**
Assuming `lazy_resolve(::Concurrent::Promises::Future, :value!)`, and given the example type, for the query `{ fieldA fieldB }` what we want is the output arriving across 5 seconds, but what we see instead is the output spread across 10 seconds.
This is obviously a bit contrived, but this same parallelism would be beneficial to real applications as well, since it would interleave database queries.