@bajtos bajtos commented Nov 21, 2024

Set up Sentry Performance Profiling for spark-stats so that we can learn why
we are hitting Fly's CPU throttling limits. Depending on the outcome of this
experiment, we can roll out profiling to other services, e.g. spark-observer.

Sentry docs:

@bajtos bajtos requested a review from juliangruber November 21, 2024 16:14
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
@juliangruber

Let's watch CPU usage after deploying

@bajtos bajtos enabled auto-merge (squash) November 27, 2024 13:10
@bajtos bajtos merged commit 026a7c6 into main Nov 27, 2024
9 checks passed
@bajtos bajtos deleted the performance-profiling branch November 27, 2024 13:11

bajtos commented Nov 27, 2024

The CPU usage seems to be fine.

However, it seems that spark-stats is not reporting any performance information to Sentry 😢

bajtos commented Nov 27, 2024

https://docs.sentry.io/platforms/javascript/guides/node/profiling/#enable-continuous-profiling

The current profiling implementation stops the profiler automatically after 30 seconds (unless you manually stop it earlier). Naturally, this limitation makes it difficult to get full coverage of your app's execution. We now offer an experimental continuous mode, where profiling data is periodically uploaded while running, with no limit on how long the profiler may run.

These new APIs do not offer any sampling functionality—every call to start the profiler will run and start sending profiling data. If you are interested in reducing the amount of profiles that run, you must take care to do it at the callsites.

Continuous profiling has implications for your org's billing structure.

I think we need to research the above ☝🏻
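
As a starting point for that research, here is a minimal sketch of what sampling "at the callsites" could look like. It is not code from this PR. The helper name `startSampledProfiler` and its parameters are hypothetical, and it assumes the profiler is an object exposing `startProfiler()` and `stopProfiler()`, which is how the quoted Sentry docs section appears to expose continuous profiling (to be verified against the SDK version we use).

```javascript
/**
 * Start the profiler for only a fraction of calls.
 *
 * Hypothetical helper, not part of Sentry or spark-stats.
 * `profiler` is assumed to expose startProfiler() and stopProfiler().
 * `random` is injectable so the sampling decision can be tested.
 *
 * Returns a function that stops the profiler if it was started,
 * and does nothing otherwise.
 */
function startSampledProfiler (profiler, sampleRate, random = Math.random) {
  if (!(sampleRate >= 0 && sampleRate <= 1)) {
    throw new RangeError('sampleRate must be a number between 0 and 1')
  }

  // Not sampled: leave the profiler alone and return a no-op stop function.
  if (random() >= sampleRate) {
    return () => {}
  }

  profiler.startProfiler()
  return () => profiler.stopProfiler()
}

// Possible usage (assumption: Sentry.profiler comes from @sentry/node with
// the profiling integration enabled):
//
//   const stopProfiler = startSampledProfiler(Sentry.profiler, 0.1)
//   // ... run the workload ...
//   stopProfiler()
```

Because the sampling decision is made once per call, the same helper could be used at process startup (profile roughly 10% of instances) or around individual jobs. Whether either fits our billing constraints is still an open question.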
