Skip to content

Add a way to skip items of a list when resolving their fields raises an error. #4496

@amomchilov

Description

@amomchilov

Is your feature request related to a problem? Please describe.

If an error is raised while resolving the field of an object in a list, the entire response will error out.

We would like to contribute new API that lets you say "remove this whole object from the list", which will return the other successful items. It would be opt-in on a per-field basis, to preserve backwards compatibility.

This is similar to the existing nil propagation feature built into the interpreter (and it's implemented in much the same way), but for list types instead of nullable types.

Describe the solution you'd like

Everything here is work-in-progress, all API spellings are tentative, but the overall behaviour is tested and working. This prototype adds new skip_nodes_on_raise and on_raise arguments to the field API, like so:

field :my_varied_content,  [VariendContent], 
  skip_nodes_on_raise: true, 
  on_raise: ->(err, current_object, args, context, current_field, nearest_skippable_list_item) { 
    # Log, emit metrics, notify Bugsnag, etc.
    log("Had to skip over #{nearest_skippable_list_item} because it raised #{err}")
    
    # this instructs the interpreter to skip `nearest_skippable_list_item`
    context.skip_from_parent_list 
  } 

PoC Pull request: Shopify#6

I would love to get your feedback!

Advantages:

  1. Doesn't clutter the schema-level rescue_from callback
  2. Since it's specific to the list field with the errored item, it's clear where the error happened

Disadvantages:

  1. If a GraphQL API has lots of these skippable lists, then these callbacks could get pretty repetitive.
    • Users can extract the shared callback into a module and pass the method as the callback on_raise: method(:the_shared_callback)

Describe alternatives you've considered

This feature will introduce new public API to the library that lets library users opt-in and configure this new behaviour.

We've given this a fair bit of thought, so sharing our research might be valuable. I put these into collapsible sections so it doesn't spam everything out.

Extend the existing schema-level rescue_from callback

You can intercept errors with the schema-level rescue_from callback, but your only options today are:

  1. Return a value which will be used as a substitute value for the field.
    • This seems pretty circumstantial. Not many fields have default values that would make sense. E.g. what's a default value for a Person's firstName: String!?
  2. Return context.skip, causing that field to be omitted from that object.
    • It's easy to confuse this with the list item skipping behaviour we want, but it's quite different. When you context.skip, that immediate field will be removed from that object. If the field was not nullable, this would be really weird for the client. There's no propagation behaviour to say "remove this whole object from the list", which is what we want to build.
  3. raise, to propagate the error
    • The schema.execute call will end up raising this error up to the caller, e.g. your Rails controller, which then trigger's Rails' error-handling mechanisms.

There is no way to say "just skip the whole list item". We can add something like context.skip_from_parent_list, but it's still quite clunky:

  • rescue_from already has 5 parameters, and we probably don't want to add more, especially if they're not going to be used most of the time.

    • We need to inform the user that skipping a list item is an option for the particular error. Perhaps it could be part of the context?
    • We need to tell the user which object would be skipped (e.g. so they can report to Bugsnag which kind of item has issues). Perhaps this would also be part of the context?
  • To make skipping work, every schema would need boilerplate like this, even if they have no other error logic:

    rescue_from(StandardError) do |err, obj, args, ctx, field|
      if ctx.can_skip_list_item?
        next ctx.skip_from_parent_list 
      end
      
      raise
    end

    If library users didn't add this, then their errorred items will always raise, and never be skipped, which would be a bit surprising.

Upsides:

  • Reuses and API surface that already exists, and is well-known and discoverable.

Add a new schema-level on_skip callback

(on_skip naming is tentative)

Similar to rescue_from. Since it's specialized to this item-skipping use case, we can give its own set of parameters that has all the data needed for this use case.

Not ideal however, because it would require repeating a lot of rescue_from boilerplate, like https://github.com/rmosolgo/graphql-ruby/blob/master/lib/graphql/execution/errors.rb

Add a new callback to the Field Extension API

Doc: https://rubydoc.info/gems/graphql/2.0.22/GraphQL/Schema/FieldExtension

  • Something like #on_list_item_raise
  • This is probably the best bet, but I'm not as familiar with Field Extensions.
  • This would declutter the arguments to Field#initailize.
  • The interpreter would still need to be tweaked to support the correct propagation behaviour, so this field extension will be coupled and not truly self-contained/modular.

Additional context

Motivation

Shopify has use cases that involve returning lists/connections of highly varied content, sourced from different data sources and teams. As the number of these grows, the failure surface gets bigger, and there's a risk that any one piece of content can bring down the entire request.

To increase resilience, it's imperative that failures in one piece of content can be contained, and don't take down the entire request. If resolving a list item's fields leads to an exception being raised, we'd like to skip that item entirely, but keep the other successful list items. "The show must go on."

Other considerations

This can't be solved in user code

Solving this problem is not as simple as just making users wrap up all their resolvers into a large rescue block; most of the failures don't come from the resolver logic that constructs the list items, but rather the resolution of the child fields from those list items, which can be deeply nested.

This is a process users fundamentally don't control, it's the GraphQL interpreter that's going over the query, deciding which fields need to be selected, and then resolving those fields on our objects. There's no way for "userland" code to rescue this; this requires modifications to the GraphQL interpreter, to wrap each field resolution in a rescue block, and provide an API for configuring how/when list items should be skipped.

Related

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions