Metrics feedback #1662
Hi there 👋 I've started a prototype implementation of metrics in a Rails app, and I was able to get some metrics exported from the SDK example. However, from the README it isn't clear where and when I should pull the metrics and shut down the provider:

```ruby
OpenTelemetry.meter_provider.metric_readers.each(&:pull)
OpenTelemetry.meter_provider.shutdown
```

What we'd like to do is add a counter metric each time a specific controller action is called. I can do this with a standard …
Hi @chloe-meister! Thanks for reaching out! The README provides an example for a script that runs once, pulls the metric, and exits the program. If you're open to it, I think a periodic metric reader is a better fit for a Rails application. It collects the metrics and passes them to the exporter on a set interval, and the process lives as long as the Rails application does, or until …

Assuming you're using the OTLP exporter, rather than calling the pull-and-shutdown snippet from the example (`examples/metrics_sdk/metrics_collect_otlp.rb`, lines 23 to 25 at `2f87a1d`), you would call something like:

```ruby
# config/initializers/opentelemetry.rb
reader = OpenTelemetry::SDK::Metrics::Export::PeriodicMetricReader.new(
  exporter: OpenTelemetry::Exporter::OTLP::Metrics::MetricsExporter.new,
  export_interval_millis: 3000,
  export_timeout_millis: 10000
)

OpenTelemetry.meter_provider.add_metric_reader(reader)
```

You can create the meter and the counter the same way as in the example. With the arguments above, your metrics would be exported every three seconds with a 10-second timeout. Keep in mind that if you're using the OTLP exporter, you also need to install the …
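For reference, a minimal sketch of that meter-and-counter setup, modeled on the README example (the meter name, counter name, and attribute key below are placeholders, not names used in this thread):

```ruby
# config/initializers/opentelemetry.rb (continued)
# Hypothetical names: create a meter from the configured provider and a counter on it.
meter = OpenTelemetry.meter_provider.meter('my_app')
REQUEST_COUNTER = meter.create_counter(
  'requests',
  unit: 'request',
  description: 'Number of times a specific controller action was called'
)

# Later, inside the controller action you want to count:
REQUEST_COUNTER.add(1, attributes: { 'http.route' => '/example' })
```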
Hi @kaylareopelle! Thank you very much for the detailed answer. This helps a lot 🙏
Hi @kaylareopelle! Thanks to your guidance I managed to get something exported. We use Grafana and I can see something matching my meter provider's attributes. However, there doesn't seem to be any metric attached: no matter how the … I wonder if you have an idea of what could be wrong here? Below you can find my setup:

```ruby
# config/initializers/open_telemetry.rb
OpenTelemetry::SDK.configure do |config|
  config.service_name = ENV.fetch('SERVICE_NAME', 'unknown')
  group = { 'service.group' => ENV.fetch('SERVICE_GROUP', 'unknown') }
  config.resource = OpenTelemetry::SDK::Resources::Resource.create(group)
end

otlp_metric_exporter = OpenTelemetry::Exporter::OTLP::Metrics::MetricsExporter.new

reader = OpenTelemetry::SDK::Metrics::Export::PeriodicMetricReader.new(
  exporter: otlp_metric_exporter,
  export_interval_millis: 15_000,
  export_timeout_millis: 30_000
)

OpenTelemetry.meter_provider.add_metric_reader(reader)

queue_latency_meter = OpenTelemetry.meter_provider.meter('QUEUE_LATENCY')
QueueLatency = queue_latency_meter.create_histogram(
  'queue_latency',
  unit: 'job_latency',
  description: 'cumulative latency for all queues'
)
```

I use this in a background job, collecting this information:

```ruby
QueueLatency.record(queue.latency, attributes: { queue_name: queue.name })
```
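For completeness, a hypothetical sketch of what that background job might look like (the job class and the Sidekiq queue iteration are assumptions for illustration, not code shared in this thread):

```ruby
# app/jobs/queue_latency_job.rb
require 'sidekiq/api'

# Hypothetical example: record the latency of each Sidekiq queue
# using the QueueLatency histogram defined in the initializer above.
class QueueLatencyJob
  include Sidekiq::Job

  def perform
    Sidekiq::Queue.all.each do |queue|
      QueueLatency.record(queue.latency, attributes: { 'queue_name' => queue.name })
    end
  end
end
```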
Hi @chloe-meister! Thanks for sharing these details! Sorry things aren't working as expected. Everything you've shared looks good to me. I copied your setup and created a simple Rails app using Sidekiq for the background jobs, and I was able to record data points to the … I was surprised that two resources were created: one for the Rails process and one for the Sidekiq process. The …

I haven't worked with Grafana before, but I can do some testing with that observability backend next week. Perhaps there's something we're missing that environment needs.

Earlier this week, we released a fix for a bug in the metrics SDK that prevented users who set the …

It might also be helpful to see the raw data for the metric you're sending, either by creating a periodic metric reader with a console exporter, or by setting up a collector with the debug exporter set to … Here's how to set up the console exporter:

```ruby
# config/initializers/opentelemetry.rb
# ...
console_metric_exporter = OpenTelemetry::SDK::Metrics::Export::ConsoleMetricPullExporter.new

console_reader = OpenTelemetry::SDK::Metrics::Export::PeriodicMetricReader.new(
  exporter: console_metric_exporter,
  export_interval_millis: 15_000,
  export_timeout_millis: 30_000
)

OpenTelemetry.meter_provider.add_metric_reader(console_reader)
```

Either of these options should provide a plain-text output of the data being sent over the wire. Would you be able to update your application to generate this output and paste it (with sensitive information removed) on this issue? Seeing the structure of your data might help us pinpoint any malformed or missing elements that could be preventing the data points from being ingested.
Hello @kaylareopelle! I am a co-worker of @chloe-meister. We did not set the … Followed your advice and used the … Right after that I also get a line saying …

Just a bit of extra information: there is some data that does seem to reach Grafana and that we can see there. The attributes are shown here: … Thank you for all the help provided!
Hi @eduAntequera, thank you for these screenshots! Everything looks good in this context, so as of yet it does not seem like the problem is in the metrics SDK. This is puzzling!

For the next phase of debugging, I think we need to focus on the OTLP exporter specifically. To do that, I recommend we move to the collector setup with the debug exporter set to a detailed verbosity. Could you give that a try and send me the output for a payload that isn't showing up in Grafana as expected? Here's some example output from my reproduction app: …
Hi @kaylareopelle, sorry for the delay and thanks for the guidance. I hope they help.
Hi @eduAntequera! Thanks for the screenshots! I'm about to take some time off through next Monday, so perhaps someone else can jump in here to help in the meantime. These screenshots helped me realize that both exporters are recording the metric only once, and it seems to have a value of …
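One way to check how values are aggregated, and the kind of test the later "randomly generated values" screenshots suggest (a hypothetical sketch only, reusing the `QueueLatency` histogram from the setup above), is to record a burst of synthetic measurements and watch what the console or debug exporter reports:

```ruby
# Hypothetical test snippet: record several synthetic data points so the
# exported histogram has more than a single measurement to aggregate.
10.times do
  QueueLatency.record(rand(0.0..5.0), attributes: { 'queue_name' => 'test' })
end
```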
Hi @kaylareopelle, thanks for the reply! Here are some of the results when sending randomly generated values: … Thanks for all the help, and I hope you enjoy your time off 😄
Hi @eduAntequera! Thank you for this test and for your patience. I took another look and unfortunately I'm not seeing any issues with the output. This might be an issue related to Grafana's ingest process rather than the OpenTelemetry Ruby metrics gems. I reached out on the CNCF Slack to someone who works for Grafana and they're going to take a look at the issue. If you're able to open a support ticket with a representative from the company, that may also be a good next step. If they have anything they'd like us to test or change with the metrics code to help with troubleshooting, let us know!
@kaylareopelle Thank you so much for looking into it and contacting them 🙇
For those having issues sending metrics to Grafana/Prometheus with the Ruby SDK: try setting the …
I'm debugging an issue where metrics produced by my app aren't visible in Grafana (using Alloy as the OpenTelemetry Collector). I'm now using the debug collector in development, as recommended here, to verify that my setup is correct. In a Rails initializer I do this:

```ruby
exporter = OpenTelemetry::Exporter::OTLP::Metrics::MetricsExporter.new
reader = OpenTelemetry::SDK::Metrics::Export::PeriodicMetricReader.new(exporter: exporter)
OpenTelemetry.meter_provider.add_metric_reader(reader)
```

Then, in a controller action I do this:

```ruby
meter = OpenTelemetry.meter_provider.meter('my_meter')
scores_posted = meter.create_counter('my_counter')
scores_posted.add(1, attributes: { attributes: 'my_attributes' })
```

In the debug collector logs I see this: …

Is this expected, for …
@kaylareopelle, any chance you could take a look at this? Any suggestions for what to investigate?
Hi @akahn, thanks for the ping! I'm sorry about the delay here! In this scenario, I believe …

For the Grafana ingest issue, I wonder if multiple exporters may be causing some sort of block. It should be possible to have multiple exporters, but perhaps we have a bug. The Metrics SDK gem now creates and configures an OTLP exporter by default. This can be controlled using …

For debugging purposes, could you run some tests using the following scenarios: …

Can you let me know if either of those changes allows your data to reach Grafana?
I'm not sure what changed, but I am now seeing my metrics in Grafana, thank you for looking at this!

Some unrelated feedback: we're noticing that when shutting down our app, we have to wait the full metrics export interval before the app exits. That's 60 seconds by default, which slows down our deploys considerably, since they run a few rake tasks. I would expect that on app shutdown we'd wait up to 60 seconds for any pending metrics to be exported, but it seems like the delay is always 60 seconds (due to this sleep call). As a workaround, we've set our export interval to 3 seconds to minimize the slowdown of running these tasks, but that seems quite a bit more frequent than what we'd want once we're up and running with many custom metrics.
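A sketch of the workaround described above (the 3-second interval only shortens the shutdown wait; the exporter and reader mirror the earlier initializer examples rather than this poster's exact code):

```ruby
# config/initializers/opentelemetry.rb
# Workaround sketch: a short export interval so shutdown waits at most ~3 seconds.
reader = OpenTelemetry::SDK::Metrics::Export::PeriodicMetricReader.new(
  exporter: OpenTelemetry::Exporter::OTLP::Metrics::MetricsExporter.new,
  export_interval_millis: 3_000,
  export_timeout_millis: 10_000
)
OpenTelemetry.meter_provider.add_metric_reader(reader)
```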
@akahn - Hooray! 🎉 Happy to hear the ingest is working now! As far as the shutdown goes, thanks for letting us know! That's not how we'd like it to behave. We'll take a closer look. In the meantime, could you try adding the following to your …

I haven't tested this, but I believe that should call …
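The snippet itself wasn't captured here; a plausible sketch of that kind of suggestion (an assumption, not the maintainer's exact code) is an `at_exit` hook that shuts the meter provider down so pending metrics are flushed when the process exits:

```ruby
# config/initializers/opentelemetry.rb
# Assumed sketch: flush and stop the periodic metric reader when the process exits.
at_exit { OpenTelemetry.meter_provider.shutdown }
```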
I think it's caused by calling …
That would be awesome, @xuan-cao-swi. We are hesitant to start using this in production because of the shutdown delay.
@kaylareopelle we already have that block running in our app, so it doesn't seem to help.
@kaylareopelle @xuan-cao-swi here is an attempt at faster shutdown, using the …
Hi @akahn, thank you for the PR. During the SIG meeting, we discussed using …
We are starting to use histogram metrics more in our Rails app and are seeing confusing issues where metrics do not report accurately. We see requests going over the wire on metrics export, but it seems the data is empty, as no data points end up visible in Prometheus. We will continue investigating on our end, but at this point we're wondering if our Puma web server is part of the issue. We use clustered mode, so there are multiple processes, each of which spawns multiple threads. Can you provide any guidance on the correct way to set up metrics reporting in a Rails app that uses Puma in this way?
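This question wasn't answered in the thread. One pattern sometimes suggested for pre-forking servers (a sketch only, under the assumption that a reader created before the fork doesn't export correctly from the worker processes, which is not confirmed here) is to register the periodic reader in Puma's `on_worker_boot` hook so each worker sets up its own exporter:

```ruby
# config/puma.rb
# Hypothetical sketch: set up the metric reader per worker process after fork.
on_worker_boot do
  exporter = OpenTelemetry::Exporter::OTLP::Metrics::MetricsExporter.new
  reader = OpenTelemetry::SDK::Metrics::Export::PeriodicMetricReader.new(
    exporter: exporter,
    export_interval_millis: 15_000,
    export_timeout_millis: 30_000
  )
  OpenTelemetry.meter_provider.add_metric_reader(reader)
end
```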
We want your feedback on the opentelemetry-metrics-sdk and opentelemetry-metrics-api gems! Understanding how these libraries are used by the community will help us prioritize future work on Metrics.