-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Description
Is your feature request related to a problem? Please describe.
If an error is raised while resolving the field of an object in a list, the entire response will error out.
We would like to contribute new API that lets you say "remove this whole object from the list", which will return the other successful items. It would be opt-in on a per-field basis, to preserve backwards compatibility.
This is similar to the existing nil propagation feature built into the interpreter (and it's implemented in much the same way), but for list types instead of nullable types.
Describe the solution you'd like
Everything here is work-in-progress, all API spellings are tentative, but the overall behaviour is tested and working. This prototype adds new skip_nodes_on_raise
and on_raise
arguments to the field
API, like so:
field :my_varied_content, [VariendContent],
skip_nodes_on_raise: true,
on_raise: ->(err, current_object, args, context, current_field, nearest_skippable_list_item) {
# Log, emit metrics, notify Bugsnag, etc.
log("Had to skip over #{nearest_skippable_list_item} because it raised #{err}")
# this instructs the interpreter to skip `nearest_skippable_list_item`
context.skip_from_parent_list
}
PoC Pull request: Shopify#6
I would love to get your feedback!
Advantages:
- Doesn't clutter the schema-level
rescue_from
callback - Since it's specific to the list field with the errored item, it's clear where the error happened
Disadvantages:
- If a GraphQL API has lots of these skippable lists, then these callbacks could get pretty repetitive.
- Users can extract the shared callback into a module and pass the method as the callback
on_raise: method(:the_shared_callback)
- Users can extract the shared callback into a module and pass the method as the callback
Describe alternatives you've considered
This feature will introduce new public API to the library that lets library users opt-in and configure this new behaviour.
We've given this a fair bit of thought, so sharing our research might be valuable. I put these into collapsible sections so it doesn't spam everything out.
Extend the existing schema-level rescue_from
callback
rescue_from
callbackYou can intercept errors with the schema-level rescue_from
callback, but your only options today are:
- Return a value which will be used as a substitute value for the field.
- This seems pretty circumstantial. Not many fields have default values that would make sense. E.g. what's a default value for a
Person
'sfirstName: String!
?
- This seems pretty circumstantial. Not many fields have default values that would make sense. E.g. what's a default value for a
- Return
context.skip
, causing that field to be omitted from that object.- It's easy to confuse this with the list item skipping behaviour we want, but it's quite different. When you
context.skip
, that immediate field will be removed from that object. If the field was not nullable, this would be really weird for the client. There's no propagation behaviour to say "remove this whole object from the list", which is what we want to build.
- It's easy to confuse this with the list item skipping behaviour we want, but it's quite different. When you
raise
, to propagate the error- The
schema.execute
call will end up raising this error up to the caller, e.g. your Rails controller, which then trigger's Rails' error-handling mechanisms.
- The
There is no way to say "just skip the whole list item". We can add something like context.skip_from_parent_list
, but it's still quite clunky:
-
rescue_from
already has 5 parameters, and we probably don't want to add more, especially if they're not going to be used most of the time.- We need to inform the user that skipping a list item is an option for the particular error. Perhaps it could be part of the
context
? - We need to tell the user which object would be skipped (e.g. so they can report to Bugsnag which kind of item has issues). Perhaps this would also be part of the context?
- We need to inform the user that skipping a list item is an option for the particular error. Perhaps it could be part of the
-
To make skipping work, every schema would need boilerplate like this, even if they have no other error logic:
rescue_from(StandardError) do |err, obj, args, ctx, field| if ctx.can_skip_list_item? next ctx.skip_from_parent_list end raise end
If library users didn't add this, then their errorred items will always raise, and never be skipped, which would be a bit surprising.
Upsides:
- Reuses and API surface that already exists, and is well-known and discoverable.
Add a new schema-level on_skip
callback
on_skip
callback(on_skip
naming is tentative)
Similar to rescue_from
. Since it's specialized to this item-skipping use case, we can give its own set of parameters that has all the data needed for this use case.
Not ideal however, because it would require repeating a lot of rescue_from
boilerplate, like https://github.com/rmosolgo/graphql-ruby/blob/master/lib/graphql/execution/errors.rb
Add a new callback to the Field Extension API
Doc: https://rubydoc.info/gems/graphql/2.0.22/GraphQL/Schema/FieldExtension
- Something like
#on_list_item_raise
- This is probably the best bet, but I'm not as familiar with Field Extensions.
- This would declutter the arguments to
Field#initailize
. - The interpreter would still need to be tweaked to support the correct propagation behaviour, so this field extension will be coupled and not truly self-contained/modular.
Additional context
Motivation
Shopify has use cases that involve returning lists/connections of highly varied content, sourced from different data sources and teams. As the number of these grows, the failure surface gets bigger, and there's a risk that any one piece of content can bring down the entire request.
To increase resilience, it's imperative that failures in one piece of content can be contained, and don't take down the entire request. If resolving a list item's fields leads to an exception being raised, we'd like to skip that item entirely, but keep the other successful list items. "The show must go on."
Other considerations
This can't be solved in user code
Solving this problem is not as simple as just making users wrap up all their resolvers into a large rescue
block; most of the failures don't come from the resolver logic that constructs the list items, but rather the resolution of the child fields from those list items, which can be deeply nested.
This is a process users fundamentally don't control, it's the GraphQL interpreter that's going over the query, deciding which fields need to be selected, and then resolving those fields on our objects. There's no way for "userland" code to rescue
this; this requires modifications to the GraphQL interpreter, to wrap each field resolution in a rescue
block, and provide an API for configuring how/when list items should be skipped.