feat(compute_ctl): CPU Continuous Profiling support #12307
base: main
No tests were run or test report is not available. Test coverage report is not available. The comment gets automatically updated with the latest test results.
8dbf5a8 at 2025-07-15T11:00:25.949Z
What is with the licenses? https://github.com/neondatabase/neon/actions/runs/15777664502/job/44475727628?pr=12307
LGTM so far, but I'd like one more review
Need to figure out the path to |
Exposes an endpoint "/profile/cpu" for profiling the postgres processes (currently spawned and the new ones) using "perf".
Adds the corresponding python test to test the added endpoint and confirm the output expected is the profiling data in the expected format.
Add "perf" binary to the sudo list.
Fix python poetry ruff
Address the clippy lints
Document the code
Format python code
Address code review
Prettify
Embed profile_pb2.py and small code/test fixes.
Make the code slightly better.
  1. Makes optional the sampling_frequency parameter for profiling.
  2. Avoids using unsafe code when killing a child.
Better code, better tests
More tests
Separate start and stop of profiling
Correctly check for the exceptions
Address clippy lint
Final fixes.
  1. Allows the perf to be found in $PATH instead of having the path hardcoded.
  2. Changes the path to perf in the sudoers file so that the compute can run it properly.
  3. Changes the way perf is invoked, now it is with sudo and the path from $PATH.
  4. Removes the authentication requirement from the /profile/cpu/ endpoint.
hakari thing
Python fixes
Fix python formatting
More python fixes
Update poetry lock
Fix ruff
Address the review comments
Fix the tests
Try fixing the flaky test for pg17?
Try fixing the flaky test for pg17?
PYTHON
Fix the tests
Remove the PROGRESS parameter
Remove unused
Increase the timeout due to concurrency
Increase the timeout to 60
Increase the profiling window timeout
Try this
Lets see the error
Just log all the errors
Add perf into the build environment
uijdfghjdf
Update tempfile to 3.20
Snapshot
Use bbc-profile
Update tempfile to 3.20
Provide bpfcc-tools in debian
Properly respond with status
Python check
Fix build-tools dockerfile
Add path probation for the bcc profile
Try err printing
Refactor
Add bpfcc-tools to the final image
Add error context
sudo not found?
Print more errors for verbosity
Remove procfs and use libproc
Update hakari
Debug sudo in CI
Rebase and adjust hakari
remove leftover
Add archiving support
Correct the paths to the perf binary
Try hardcoded sudo path
Add sudo into build-tools dockerfile
Minor cleanup
Print out the sudoers file from github
Stop the tests earlier
Add the sudoers entry for nonroot, install kmod for modprobe for bcc-profile
Try hacking the kernel headers for bcc-profile
Redeclare the kernel version argument
Try using the kernel of the runner
Try another way
Check bpfcc-tools
Squashed the 71 commits into one to make the rebases easier the next time.
I am about to leave for a couple of weeks, so this is where I stopped.
The code is almost completely working, apart from the autoscaling issue I mention in this PR: neondatabase/autoscaling#1401, which prevented me from testing further before I ran out of time. However, I do not expect any further compute-related code changes.
This provides the binaries and libraries required for the continuous profiling at runtime.
Description
Adds the profiling module for the compute code and exposes a compute endpoint "/profile/cpu" with three methods: GET, POST and DELETE, for checking the status of, starting and stopping the profiling.
The added profilers supported are perf and bcc-profile. perf is less stable than bcc-profile.
Problem
We need to allow for continuous profiling of compute (postgres).
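For orientation, here is a minimal sketch of how a client might address the "/profile/cpu" endpoint, reading the description as GET for status, POST for start and DELETE for stop. The base URL, the action names and the helper `build_profile_request` are hypothetical and not taken from this PR; request bodies and parameters are deliberately left out because their format is not shown here.

```python
from urllib.request import Request

# Hypothetical base URL: the real compute host and port are not given in this PR text.
BASE_URL = "http://compute.example:8080"

# Assumed mapping, following the order in the description:
# GET checks the status, POST starts profiling, DELETE stops it.
ACTION_TO_METHOD = {"status": "GET", "start": "POST", "stop": "DELETE"}


def build_profile_request(action, base_url=BASE_URL):
    """Build (but do not send) a request for the given profiling action."""
    method = ACTION_TO_METHOD[action]  # raises KeyError for unknown actions
    return Request(f"{base_url}/profile/cpu", method=method)


if __name__ == "__main__":
    # Print the method and URL for each action without touching the network.
    for action in ACTION_TO_METHOD:
        req = build_profile_request(action)
        print(action, req.get_method(), req.full_url)
```

Sending a request built this way would be a matter of passing it to `urllib.request.urlopen`; the sketch stops short of that because the response format is not described in this text.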
Summary of changes