Structured Logs: Buffering & flushing of logs #5628
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files:
@@ Coverage Diff @@
## main #5628 +/- ##
=============================================
+ Coverage 86.292% 86.568% +0.275%
=============================================
Files 417 418 +1
Lines 35441 35670 +229
Branches 16624 16822 +198
=============================================
+ Hits 30583 30879 +296
+ Misses 4810 4745 -65
+ Partials 48 46 -2
... and 62 files with indirect coverage changes.
Performance metrics 🚀
Revision | Plain | With Sentry | Diff |
---|---|---|---|
c585b1e | 1232.88 ms | 1249.45 ms | 16.58 ms |
0759f32 | 1237.10 ms | 1257.35 ms | 20.25 ms |
2691350 | 1224.92 ms | 1255.82 ms | 30.90 ms |
bce9765 | 1229.42 ms | 1243.49 ms | 14.07 ms |
fd5961e | 1210.59 ms | 1235.57 ms | 24.98 ms |
884b224 | 1233.41 ms | 1259.50 ms | 26.09 ms |
035974f | 1225.89 ms | 1251.23 ms | 25.34 ms |
5ec90e0 | 1235.57 ms | 1258.45 ms | 22.88 ms |
a2a3bfb | 1227.94 ms | 1261.26 ms | 33.32 ms |
d637379 | 1226.43 ms | 1250.77 ms | 24.34 ms |
App size
Revision | Plain | With Sentry | Diff |
---|---|---|---|
c585b1e | 23.75 KiB | 879.95 KiB | 856.21 KiB |
0759f32 | 23.75 KiB | 880.20 KiB | 856.46 KiB |
2691350 | 23.75 KiB | 850.73 KiB | 826.98 KiB |
bce9765 | 23.74 KiB | 874.06 KiB | 850.32 KiB |
fd5961e | 23.74 KiB | 874.07 KiB | 850.32 KiB |
884b224 | 23.75 KiB | 879.60 KiB | 855.86 KiB |
035974f | 23.74 KiB | 874.07 KiB | 850.33 KiB |
5ec90e0 | 23.74 KiB | 872.67 KiB | 848.92 KiB |
a2a3bfb | 23.75 KiB | 872.67 KiB | 848.92 KiB |
d637379 | 23.75 KiB | 855.38 KiB | 831.63 KiB |
@@ -6,14 +6,92 @@ import Foundation
@_spi(Private) public class SentryLogBatcher: NSObject {

    private let client: SentryClient
    private let flushTimeout: TimeInterval
    private let maxBufferSize: Int
Not a blocker, but do we have a limit on the length of a single log? It isn't the same to have 100 logs of 100 bytes each as 100 logs of a couple of KB each (I imagine someone logging a Codable model by mistake). This buffer size seems to only limit the number of logs.
Just checked the developer docs for logs. While they don't specifically mention a limit, they reference the Batch Processor Spec, where various approaches to limiting size are mentioned, also with regard to limits on the envelopes that can be ingested. Knowing the Cocoa SDK better than me, @philipphofmann, what do you think would be a reasonable approach here? Is there already handling for this on the envelope level?
We already call `encodeToJSONData` in `dispatch` here. So what you can do is call `encodeToJSONData` already in `add` and keep track of the serialized logs, which are basically data. Instead of checking the number of logs, you check the summed size of the logs' data. We should implement this in a way that doesn't require iterating over all serialized logs every time we check them. We should call `encodeToJSONData` on a BG thread, because serialization can be CPU-intensive and could block the main thread.
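The size-tracking idea could be sketched roughly like this — a minimal sketch only, not the SDK's implementation: `LogRecord`, `LogBuffer`, and the byte cap are illustrative assumptions, `JSONEncoder` stands in for `encodeToJSONData`, and the BG-thread dispatch is left out:

```swift
import Foundation

// Sketch: encode each log on add and keep a running byte count, so the
// size check is O(1) instead of re-iterating the buffer on every add.
struct LogRecord: Encodable {
    let level: String
    let body: String
}

final class LogBuffer {
    private var encodedLogs: [Data] = []
    private var encodedSize = 0           // running sum, updated on each add
    private let maxBytes = 1_024 * 1_024  // assumed cap, not the SDK's value
    private let encoder = JSONEncoder()

    /// Encodes the record eagerly and returns true when the buffer
    /// exceeded the cap and should be flushed.
    func add(_ record: LogRecord) throws -> Bool {
        let data = try encoder.encode(record)
        encodedLogs.append(data)
        encodedSize += data.count
        return encodedSize >= maxBytes
    }

    /// Hands out the encoded payloads and resets the buffer.
    func drain() -> [Data] {
        defer {
            encodedLogs.removeAll()
            encodedSize = 0
        }
        return encodedLogs
    }
}
```

Keeping the sum alongside the array is what avoids the per-check iteration the comment warns about.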
Yeah, that sounds like a great approach. Will also change to flushing either after the 5 s timeout (started when the first log is added) or when the 1 MB size limit of the data is exceeded.
LGTM, but I added a question we might want to answer.
Thanks for doing this @denrase, but I found some issues. It could be that the develop docs at https://develop.sentry.dev/sdk/telemetry/spans/batch-processor/#specification need some clarification. Please let me know if they aren't clear enough and I will update them.
private func scheduleFlush() {
    let flushId = UUID()

    logBufferLock.synchronized {
        currentFlushId = flushId
    }

    dispatchQueue.dispatch(after: flushTimeout) { [weak self] in
        self?.executeFlushIfMatching(flushId: flushId)
    }
}
h: I'm afraid this approach won't work. As we only flush in `executeFlushIfMatching` when `currentFlushId == flushId`, we only flush if the user didn't add any logs for the duration of `flushTimeout` — we're effectively resetting the timer every time a new log is added. So if users continuously add logs, we always wait until the buffer is full.
This approach also has the downside of dispatching every time we add a log. When users emit plenty of logs, we dispatch many items to a BG thread, which causes unnecessary overhead.
Instead, we should start a timer the first time we receive a log and flush when it expires. This has the advantages of fewer dispatch-after calls and of sending logs sooner.
I think I need to update the develop docs to make this clearer: https://develop.sentry.dev/sdk/telemetry/spans/batch-processor/#specification.
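The suggested fix — arm the timer only on the first log instead of rescheduling on every add — could look something like this. A minimal sketch under assumed names (`FirstLogTimer`, `logAdded`, `flushNow` are all illustrative, not the SDK's API):

```swift
import Foundation

// Sketch: dispatch the timed flush at most once per flush cycle.
// Later logs piggyback on the already-armed timer instead of
// scheduling (and effectively resetting) a new one.
final class FirstLogTimer {
    private let flushTimeout: TimeInterval
    private let queue: DispatchQueue
    private var timerActive = false
    private let lock = NSLock()

    init(flushTimeout: TimeInterval, queue: DispatchQueue) {
        self.flushTimeout = flushTimeout
        self.queue = queue
    }

    /// Call from `add`. Only the first log of a cycle schedules work.
    func logAdded(flushNow: @escaping () -> Void) {
        lock.lock()
        let shouldSchedule = !timerActive
        timerActive = true
        lock.unlock()
        guard shouldSchedule else { return }

        queue.asyncAfter(deadline: .now() + flushTimeout) { [weak self] in
            self?.lock.lock()
            self?.timerActive = false
            self?.lock.unlock()
            flushNow()
        }
    }
}
```

This gives both properties the review asks for: one dispatch per cycle regardless of log volume, and a flush that fires `flushTimeout` after the first log rather than after the last one.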
Updated the implementation to only dispatch on the first log. I'm also using a `WorkItem`, which is cancelled when flush is called. So in the case where we exceed the buffer's max size (now 1 MB of logs) before the timer fires, we flush immediately and also do NOT execute the dispatched timer afterwards.
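The described cancellation pattern could be sketched with Foundation's `DispatchWorkItem` like this — illustrative names only (`CancellableFlushTimer`, `scheduleFlush`, `flushEarly`), not the actual SDK code:

```swift
import Foundation

// Sketch: the timed flush is a cancellable DispatchWorkItem, so an early
// size-triggered flush can cancel the pending timer and avoid a second,
// empty flush when the timer would have fired.
final class CancellableFlushTimer {
    private var pending: DispatchWorkItem?  // sketch: not thread-safe as-is
    private let queue = DispatchQueue(label: "log.batcher")

    /// Arms the timed flush (e.g. on the first log of a cycle).
    func scheduleFlush(after timeout: TimeInterval,
                       _ flush: @escaping () -> Void) {
        let item = DispatchWorkItem { [weak self] in
            self?.pending = nil
            flush()
        }
        pending = item
        queue.asyncAfter(deadline: .now() + timeout, execute: item)
    }

    /// Called when the buffer exceeds its size limit: flush now and make
    /// sure the timer does not fire again afterwards.
    func flushEarly(_ flush: @escaping () -> Void) {
        pending?.cancel()
        pending = nil
        flush()
    }
}
```

Note `DispatchWorkItem.cancel()` only prevents a not-yet-started item from running; in the real batcher, access to the pending item would also need to be synchronized with the buffer lock.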
#skip-changelog
📜 Description
Implements batching of logs.
💡 Motivation and Context
Closes #5580
💚 How did you test it?
Unit tests
📝 Checklist
You have to check all boxes before merging:
`sendDefaultPII` is enabled.