itest: use notification profiler API instead of main profiler slot#232
itest: use notification profiler API instead of main profiler slot#232korniltsev-grafanista merged 10 commits intomainfrom
Conversation
Replace CORECLR_ENABLE_PROFILING / CORECLR_PROFILER / CORECLR_PROFILER_PATH
with CORECLR_ENABLE_NOTIFICATION_PROFILERS / CORECLR_NOTIFICATION_PROFILERS
(path={GUID} format) in both itest.Dockerfile and IntegrationTest/Makefile.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 71cb235968
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…ofiler loading The notification profiler API requires the profiler to implement ICorProfilerCallback11 and return TRUE from LoadAsNotificationOnly. Without this the runtime refuses to load the profiler via CORECLR_NOTIFICATION_PROFILERS. - Inherit CorProfilerCallback from ICorProfilerCallback11 (superset of 10) - Add ICorProfilerCallback11 to QueryInterface - Implement LoadAsNotificationOnly returning TRUE Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6a89bade85
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
| HRESULT STDMETHODCALLTYPE CorProfilerCallback::LoadAsNotificationOnly(BOOL* pbNotificationOnly) | ||
| { | ||
| *pbNotificationOnly = TRUE; |
There was a problem hiding this comment.
Return notification mode only when explicitly needed
This now marks every profiler load as notification-only, not just the new integration-test path. Existing deployment scripts still load the profiler as the main profiler via CORECLR_PROFILER (profiler/src/ProfilerEngine/DeployResources/DDProf-SetEnv.sh:15-17), and those runs can request capabilities like COR_PRF_ENABLE_REJIT when managed code cache is enabled (CorProfilerCallback.cpp:1517-1520); however COR_PRF_ENABLE_REJIT is not included in COR_PRF_ALLOWABLE_NOTIFICATION_PROFILER (shared/src/native-lib/coreclr/src/inc/corprof.idl:597-617). Forcing notification mode here can therefore make event-mask setup fail and stop profiling in non-itest configurations.
Useful? React with 👍 / 👎.
Add an integration test that verifies Pyroscope works as a notification profiler when OTEL .NET auto-instrumentation occupies the classic profiler slot. This proves the coexistence story for PR #232. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
I tested this with OTEL auto-instrumentation running as the primary profiler and Pyroscope through the notification system. It worked, here it the testing code if you wanted to add it: |
* itest: add dual-profiler integration test (OTEL + Pyroscope) Add an integration test that verifies Pyroscope works as a notification profiler when OTEL .NET auto-instrumentation occupies the classic profiler slot. This proves the coexistence story for PR #232. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Adapt OTEL integration test to Go testcontainers pattern - Add OTEL support to itest/integration_test.go: - Add isOTEL() helper to check OTEL env var - Update rideshareImage() and serviceName() to append -otel suffix when OTEL=true - Add TestRideshareProfilesWithOTEL() function - Rewrite itest/Makefile with explicit targets: - Remove flexible FLAVOUR/DOTNET_VERSION variables - Add explicit targets for all 8 combinations: - 4 main-profiler targets (glibc/musl × 6.0/8.0) - 4 notification-profiler targets (glibc/musl × 6.0/8.0) - Each target builds appropriate image and runs specific test - Update .github/workflows/integration_test.yml: - Keep main-profiler matrix job (4 combinations) - Add notification-profiler matrix job (4 combinations) - Both use explicit make targets - Remove deprecated bash-based infrastructure: - Delete Dockerfile.load-generator and load-generator.py - Mark docker-compose-itest.yml as manual-only Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Restore load generator files (used by docker-compose) The load-generator.py and Dockerfile.load-generator files are still referenced in docker-compose-itest.yml for manual testing scenarios. Keep them available even though the Go tests use an in-process goroutine instead of a Docker container for load generation. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * profiler: support notification profiler mode (OTEL coexistence) (#238) * itest: upload pprofs per vehicle type as GitHub Actions artifacts After label verification, download a merged pprof for each vehicle (bike, car, scooter) via profilecli query merge and save them to a per-matrix directory. The integration_test workflow then uploads that directory as a named artifact (pprofs-{flavour}-net{version}) so profiles are available for inspection after each run. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * itest: rewrite integration tests in Go with stacktrace assertions (#235) * itest: rewrite integration tests in Go with stacktrace assertions Replace the bash-based verify_profiles.sh with a Go test suite that: - Queries Pyroscope via the Connect API (SelectMergeStacktraces) - Deserializes tree-format profiles into collapsed stacks - Asserts expected function names appear in stacktraces per vehicle label (OrderService.FindNearestVehicle for all, CheckDriverAvailability for car) - Manages docker-compose lifecycle (up/down) from the test itself - Provides a self-contained Makefile (build-profiler, build-app, itest) The CI workflow is simplified to: checkout, setup Go, make itest. Tree/minheap helpers are copied from grafana/otel-profiling-java. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * itest: switch to testcontainers-go, fix stacktrace assertions Replace docker-compose orchestration with testcontainers-go so the test manages container lifecycle directly (same pattern as otel-profiling-java). This eliminates the need for docker-compose-itest.yml at test time. Also fix stacktrace assertions: the .NET profiler emits frames in the format |ct:ClassName |cg:... |fn:MethodName |fg:... |sg:(...) not as simple "ClassName.MethodName" strings. Add frameContains() helper that finds a frame matching both class and method name within the same semicolon-delimited entry. Increase polling timeout to 3 minutes and go test timeout to 15 minutes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * itest: address review comments - Replace load generator container with Go goroutine that cycles through /bike, /scooter, /car HTTP requests directly from the test process. No Docker build needed for the load generator. - Simplify frameContains: split collapsed stacks by ";" and use a regex to match frames containing both |ct:ClassName and |fn:MethodName. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * itest: increase app startup timeout for .NET 6.0 Switch from HTTP health check on /bike to port-listening check with 120s timeout. The /bike endpoint is CPU-intensive and .NET 6.0 with the profiler attached can take over 60s to become ready. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * Adapt OTEL integration test to Go testcontainers pattern - Add OTEL support to itest/integration_test.go: - Add isOTEL() helper to check OTEL env var - Update rideshareImage() and serviceName() to append -otel suffix when OTEL=true - Add TestRideshareProfilesWithOTEL() function - Rewrite itest/Makefile with explicit targets: - Remove flexible FLAVOUR/DOTNET_VERSION variables - Add explicit targets for all 8 combinations: - 4 main-profiler targets (glibc/musl × 6.0/8.0) - 4 notification-profiler targets (glibc/musl × 6.0/8.0) - Each target builds appropriate image and runs specific test - Update .github/workflows/integration_test.yml: - Keep main-profiler matrix job (4 combinations) - Add notification-profiler matrix job (4 combinations) - Both use explicit make targets - Remove deprecated bash-based infrastructure: - Delete Dockerfile.load-generator and load-generator.py - Mark docker-compose-itest.yml as manual-only Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Restore load generator files (used by docker-compose) The load-generator.py and Dockerfile.load-generator files are still referenced in docker-compose-itest.yml for manual testing scenarios. Keep them available even though the Go tests use an in-process goroutine instead of a Docker container for load generation. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> --------- Co-authored-by: Tolya Korniltsev <anatoly.korniltsev@grafana.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * Revert "profiler: support notification profiler mode (OTEL coexistence) (#238)" This reverts commit 94ee37d. --------- Co-authored-by: Aleksandar Petrov <8142643+aleks-p@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Tolya Korniltsev YOLO vibecoder <anatoly.korniltsev+githubvibecoder@grafana.com>
* Remove unused COR_PRF_ENABLE_REJIT from event mask COR_PRF_ENABLE_REJIT is not allowed for notification profilers (not in COR_PRF_ALLOWABLE_NOTIFICATION_PROFILER). The profiler never calls RequestReJIT/RequestRevert — the managed code cache only passively observes JITCompilationFinished callbacks, which are covered by COR_PRF_MONITOR_JIT_COMPILATION alone. Without this fix, SetEventMask2 returns E_INVALIDARG when running as a notification profiler with UseManagedCodeCache enabled (default on ARM64), causing the profiler to fail to initialize. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add comment explaining COR_PRF_ENABLE_REJIT removal Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
|
I've reviewed the changes in the context of the CoreCLR runtime source code. The implementation of However, there is a potential blocking issue on ARM64. 1. ARM64 IncompatibilityOn ARM64, the profiler forces // Configuration.cpp
bool defaultUseManagedCodeCache =
#if ARM64
true;
#else
false;
#endifWhen // CorProfilerCallback.cpp
if (_pConfiguration->UseManagedCodeCache())
{
eventMask |= COR_PRF_MONITOR_JIT_COMPILATION | COR_PRF_ENABLE_REJIT;
}The Problem: When 2. x64/x86 SafetyOn x64/x86, RecommendationYou should verify the behavior on ARM64. If You might need to add logic to |
This reverts commit 70ac340.
|
@codex review |
|
Codex Review: Didn't find any major issues. Swish! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
What
Switch the integration test setup from registering Pyroscope as the main CLR profiler to registering it as a notification profiler.
Changed files:
itest.Dockerfile— Docker-based integration test environmentIntegrationTest/Makefile— localmake runtargetBefore
After
Why
The .NET runtime only allows a single main profiler (
CORECLR_PROFILER). Using the notification profiler API (CORECLR_NOTIFICATION_PROFILERS) allows Pyroscope to coexist with other profilers or tracers that already occupy the main profiler slot (e.g. OpenTelemetry, Datadog tracer). This makes the integration tests reflect real-world deployment scenarios more accurately.LD_PRELOADforPyroscope.Linux.ApiWrapper.x64.sois unchanged — it is still required for the CPU/wall-time sampling to work correctly on Linux.