Instrumentation #354

Merged: 6 commits merged into master on Nov 1, 2016

Conversation

@rmosolgo (Owner) commented Oct 26, 2016

A hot take on #186

  • Apply field instrumentation at build time
  • Apply query instrumentation at runtime
  • Very specific implementation of instrumentation hooks

Buyer beware: if you modify the passed-in type or field, it will be changed everywhere it's used (since it's the same object). I want to provide a shallow-copy API for these objects so that wrapping them is easier.
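For concreteness, wiring up the two kinds of instrumenters would look roughly like this (a sketch assuming the :field and :query keys discussed below; MyFieldInstrumenter and MyQueryInstrumenter are placeholder classes):

MySchema = GraphQL::Schema.define do
  query QueryType
  # Field instrumenters are applied once, when the schema is built:
  instrument(:field, MyFieldInstrumenter.new)
  # Query instrumenters are called around each query at runtime:
  instrument(:query, MyQueryInstrumenter.new)
end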

If there just happened to be any widely used extensions to this gem, they could accept instrumentation of their own:

MySchema = GraphQL::Schema.define do 
  instrument(:batch, MyBatchInstrumentation) 
end 

And fetch it out & apply it as they see fit:

MySchema.instrumenters[:batch] # => array of instrumenters (empty if `instrument` was not called)
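A hypothetical extension could consume those instrumenters something like this (illustrative only; before_batch/after_batch are made-up methods for the sake of the sketch):

module MyBatchExtension
  # Wrap one unit of batched work with any instrumenters registered under :batch
  def self.with_instrumentation(schema)
    instrumenters = schema.instrumenters[:batch]
    instrumenters.each { |i| i.before_batch if i.respond_to?(:before_batch) }
    result = yield
    instrumenters.each { |i| i.after_batch if i.respond_to?(:after_batch) }
    result
  end
end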

Todo

  • Benchmark: what's the overhead if you don't use it?
  • Shallow-copy API for schema members

@rmosolgo (Owner Author)

Looks like no overhead if you don't use it; that much is good:

~/code/graphql $ ruby -Ilib:spec/support introspection_benchmark.rb
Warming up --------------------------------------
           run query     3.000  i/100ms
Calculating -------------------------------------
           run query     33.123  (± 3.0%) i/s -    168.000  in   5.077633s
~/code/graphql $ ruby -Ilib:spec/support introspection_benchmark.rb
Warming up --------------------------------------
           run query     3.000  i/100ms
Calculating -------------------------------------
           run query     33.429  (± 3.0%) i/s -    168.000  in   5.030551s
~/code/graphql $ git co master
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'.
~/code/graphql $ ruby -Ilib:spec/support introspection_benchmark.rb
Warming up --------------------------------------
           run query     3.000  i/100ms
Calculating -------------------------------------
           run query     33.007  (± 3.0%) i/s -    165.000  in   5.000966s
~/code/graphql $ ruby -Ilib:spec/support introspection_benchmark.rb
Warming up --------------------------------------
           run query     3.000  i/100ms
Calculating -------------------------------------
           run query     33.365  (± 3.0%) i/s -    168.000  in   5.040369s
~/code/graphql $ ruby -Ilib:spec/support introspection_benchmark.rb
Warming up --------------------------------------
           run query     3.000  i/100ms
Calculating -------------------------------------
           run query     33.162  (± 3.0%) i/s -    168.000  in   5.068237s

def instrument(type, field)
  if field.metadata[:timed]
    old_resolve_proc = field.resolve_proc
    new_resolve_proc = ->(obj, args, ctx) {

Contributor:

I was slightly confused by this.

The instrumenter is an object which responds to #instrument(type, field). #instrument should return a GraphQL::Field instance which will be used during query execution.

It seems like this method will either return nil or a proc, no?

Owner Author:

Derp, documentation fail. Let me rebase and fix this example :S

The instrumenter must implement `#before_query(query)` and `#after_query(query)`. The return values of these methods are not used. They receive the `GraphQL::Query` instance.

Contributor:

Other than calculating query time, can you think of a use case for providing this before/after granularity? It made me wonder whether we should do something like http://guides.rubyonrails.org/active_support_instrumentation.html#process-action-action-controller and just provide `def query`, with the query time already calculated.

How would one be able to get the start time when after_query gets triggered? Would this involve keeping a hash like `@queries[query] = Time.now`, and in `after_query` doing `Time.now - @queries[query]` and `@queries.delete(query)`?

Owner Author:

One thing I can imagine for before/after is some setup & teardown for logging. Going along with the second question, you could prepare some logging state in `before_query` and do something with it in `after_query`, e.g.

def before_query(query) 
  query.context[:query_logger] = QueryLogger.new
end 

def after_query(query) 
  query.context[:query_logger].flush 
end 

In the meantime, other instrumentation is using `context[:query_logger]` to track stuff.
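A timing instrumenter along the lines of the second question might look roughly like this (a sketch, not code from this PR; it assumes Rails for the logger and a real one would need to be thread-safe):

class QueryTimerInstrumentation
  def initialize
    @started_at = {}
  end

  def before_query(query)
    # Remember when each query started
    @started_at[query] = Time.now
  end

  def after_query(query)
    elapsed = Time.now - @started_at.delete(query)
    Rails.logger.info("GraphQL query took #{(elapsed * 1000).round(1)}ms")
  end
end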


We would be considering using hrtimes and may end up using a native library to get them, so I don't think just having the query time in ms would be ideal for us, as an aside.

@rmosolgo (Owner Author)

There are some ways this API could be extended:

# calls `after_query` only
instrument(:after_query) { |query| ... }

# Used as a new `resolve_proc`
instrument(:resolve, if: ->(type, field) { ... }) { |inner_resolve| do_stuff(); inner_resolve.call() }

These are sugar on top of the current implementation; we could add them later, too.

@tmeasday

A few questions:

  1. Is it possible to attach instrumenters to a schema after creation? This would make for a slightly easier integration for users (OpticsAgent.instrument_schema(MySchema))
  2. I wonder why not `instrument(:before_field, ...)` / `instrument(:after_field, ...)` also? I guess that's just sugar on `instrument(:field, ...)`.

@tmeasday

Also, I guess it depends on what the instrument(:batch, ...) API looks like, but I wonder how easy it will be to associate a field with the batch it runs in. Right now I have difficulty determining exactly when the loader that a field utilizes starts its perform step (as this can be backed up by other loaders that execute in the same execution "cycle").

@rmosolgo (Owner Author)

attach instrumenters to a schema after creation?

Oops, I was about to say:

Yes, if you take a look at the instrument hook, you can basically make the same call by hand:

schema.instrumenters[type] << instrumenter

But I see that it wouldn't bust the cache of instrumented fields.

I can think of a couple options:

  • Monkey-patch 🐒 one of the involved methods (#initialize, #define) to add an instrumenter
  • Provide an API for adding instrumentation after the hook (I'm not crazy about it -- I'd like to lean on the assumption that schemas are immutable after .define)
  • Provide an API for "default instrumenters", where we reach someplace and check whether instrumenters have been put there by someone else
  • Don't do any of the above, but force people to "manually" attach instrumentation. This is 👍 from a simplicity point of view, but 👎 from a "just works" point of view.

Hmm ... anything else?

instrument(:before/:after_field) ... just sugar

Yes, we could add it if the basic framework seems ok!

@dylanahsmith (Contributor)

Also, I guess it depends on what the instrument(:batch, ...) API looks like

YAGNI

For graphql-batch I would just recommend using Promise.map_value to get the end of the field resolution:

class MyFieldInstrumentation
  def instrument(type, field)
    old_resolve_proc = field.resolve_proc
    new_resolve_proc = ->(obj, args, ctx) {
      Rails.logger.info("#{type.name}.#{field.name} START: #{Time.now.to_i}")
      result = old_resolve_proc.call(obj, args, ctx)
      Promise.map_value(result) do |value|
        Rails.logger.info("#{type.name}.#{field.name} END: #{Time.now.to_i}")
        value
      end
    }

    field.redefine do
      resolve(new_resolve_proc)
    end
  end
end

I wonder how easy it will be to associate a field with the batch it runs in.

The batch loader isn't associated with a field, because it can be used from multiple different fields. Essentially I would visualize it as executing for all those fields in parallel.

Right now I have difficulty determining exactly when the loader that a field utilizes starts its perform step (as this can be backed up by other loaders that execute in the same execution "cycle").

Actually, there could be multiple perform steps. For instance, I might need to preload two different associations in order to actually calculate the result.

I'm not sure how exactly the batch executor should be instrumented, but this seems like an issue to open on the graphql-batch repo, rather than something that should be further discussed here.

@tmeasday

@rmosolgo I'd defer to those with a bit more of a feel for what people are used to in the ruby community in terms of the scale between zero-config magic 🐒-patching and having to manually wire everything up as a user. Happy to work with whatever you give me here ;)

@dylanahsmith I created a ticket on graphql-batch to take the discussion further there.

@rmosolgo (Owner Author) commented Nov 1, 2016

I think this'll do the job for now; I'll put it in the next release and work towards phasing out middleware over the next little while.

As described over at graphql-batch, hopefully an upcoming release will offer built-in support for batched loading, and I'll make sure the instrumentation picture is good for that feature!

@rmosolgo merged commit 0342949 into master on Nov 1, 2016
@rmosolgo deleted the instrumentation branch on November 1, 2016 at 02:21
@rmosolgo mentioned this pull request on Nov 1, 2016
@rmosolgo modified the milestone: 1.1.0 on Nov 1, 2016

@tmeasday commented Nov 8, 2016

@rmosolgo thinking about it a bit more now that I've come to look at it, I'm not sure the solutions involving monkey patching or default instrumenters are great, as it seems possible that people could have multiple schemas in their app and not want to instrument all of them (am I overthinking this?).

That leaves us with

a) Make it possible to redefine schemas
b) Make people add (a series of) instrument(X, Y) calls when defining their schemas.

From my perspective, (b) is not great, as the more steps we have to ask people to follow in setting up the agent, the more scope there is for them to go wrong. So I'd advocate for (a). I wonder what your latest thoughts are?

@tmeasday commented Nov 8, 2016

@rmosolgo you can see the agent setup is starting to get complex (https://github.com/apollostack/optics-agent-ruby/pull/32):

https://github.com/apollostack/optics-agent-ruby/blob/00ddc36a48026390fe058e25a403c0e87eb85777/README.md#rails

Specifically

agent = Rails.application.config.optics_agent
YourSchema = GraphQL::Schema.define do
  query QueryType
  mutation MutationType
  # ... etc

  agent.instrument self
end
agent.instrument_schema(YourSchema, no_middleware: true)

(I can get rid of the no_middleware if there's some way to detect the schema is instrumented; or drop the entire instrument_schema line if there's some way to get full access to the schema from the proxy object inside the define block?)

I would like to get rid of that whole section and just ask people to do

config.optics_agent = OpticsAgent::Agent.new
config.optics_agent.instrument(YourSchema)

The fewer touchpoints I have, the better.

@rmosolgo (Owner Author) commented Nov 8, 2016

Thanks for taking a look! A couple of thoughts:

not want to instrument all of them

I think you're right, and I wouldn't be surprised if there were places where the cost of measurement exceeded the cost of calculation 😆

What kind of API can you imagine for people opting in/out of metrics? E.g.

PersonType = GraphQL::ObjectType.define do 
  name "Person" 
  field :name, optics: false 
end 

?

get rid of that whole section

It sounds like supporting Schema#instrument(instrumenter) outside of the define block would do the trick, right? (It could re-build the field map, including the new instrumenter).
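From the user's side, that could look something like this (hypothetical: `Schema#instrument` outside of `.define` doesn't exist yet, and the `agent.field_instrumenter` / `agent.query_instrumenter` readers are placeholders):

YourSchema = GraphQL::Schema.define do
  query QueryType
  mutation MutationType
end

agent = OpticsAgent::Agent.new
# Would re-build the field map with the new field instrumenter and register the query hooks:
YourSchema.instrument(:field, agent.field_instrumenter)
YourSchema.instrument(:query, agent.query_instrumenter)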

@tmeasday commented Nov 9, 2016

What kind of API can you imagine for people opting in/out of metrics? eg

I was actually meaning opting out of instrumenting the entire schema if you had two schemas in your app (not sure we've seen anyone actually do this; seems like two microservices would be better, but who knows?).

Opting out of fields could look like what you had above, although we don't support it yet.

It sounds like supporting Schema#instrument(instrumenter) outside of the define block would do the trick, right? (It could re-build the field map, including the new instrumenter).

That would be great! However, I did think of an alternative: we could add a default instrumenter (or monkey patch Schema.define) and allow you to opt out with:

X = Schema.define do
  optics false

  ...
end

If you'd prefer we went that way, I'll start on the patching / we can discuss defaults somewhere.

@tmeasday

So I implemented the monkey patch via a kind of convoluted mechanism:

https://github.com/apollostack/optics-agent-ruby/pull/32/files#diff-b6521b21d80f12062f431849ba6118f3

In short:

  1. I create a default field instrumenter that waits to be told about the agent it belongs to.
  2. When you pass your schema to the agent, it reaches in and attaches itself to the field instrumenter (sketched below).

This is all pretty wacky and would be much better if I could just attach the instrumenter after the fact, of course ;)
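Roughly, the workaround looks like this (an illustrative sketch, not the actual optics-agent-ruby code; `record_field` is a made-up agent method):

class LazyFieldInstrumenter
  attr_accessor :agent # attached later, when the schema is handed to the agent

  def instrument(type, field)
    instrumenter = self
    old_resolve = field.resolve_proc
    wrapped_resolve = ->(obj, args, ctx) {
      # Fields are wrapped at schema build time, but only report timings
      # if an agent has attached itself by resolve time.
      if instrumenter.agent
        started = Time.now
        result = old_resolve.call(obj, args, ctx)
        instrumenter.agent.record_field(type, field, Time.now - started)
        result
      else
        old_resolve.call(obj, args, ctx)
      end
    }
    field.redefine { resolve(wrapped_resolve) }
  end
end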

@rmosolgo (Owner Author)

Sorry I haven't been too attentive to this issue; I'm on the road this week! I'll take a look at after-the-fact instrumentation soon. I want to take a close look to make sure I don't end up with inconsistent memoized values!
