Add daily retrieval result codes #244
Conversation
```flux
|> range(start: 0)
|> filter(fn: (r) => r["_measurement"] == "retrieval_stats_honest")
|> filter(fn: (r) => strings.hasPrefix(v: r._field, prefix: "result_rate_"))
|> aggregateWindow(every: 1d, fn: mean, createEmpty: false)
```
I think this will likely produce slightly incorrect data.
My understanding:
- spark-evaluate creates a new set of `result_rate_*` datapoints every round. Each `result_rate_*` value is the fraction of measurements that reported that result code.
- In your query, you are aggregating these per-round rates into per-day rates by calculating the mean average of the fractions.
Consider the following case (simplified):
- On round 1, we collect 400k measurements, of which 40k report OK. `result_rate_OK` is 10%.
- On round 2, we collect 600k measurements, of which 50k report OK. `result_rate_OK` is 8.33%.

Mean average: (10% + 8.33%) / 2 = 9.165%
Precise value: (40k + 50k) / (400k + 600k) = 9%
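The discrepancy is easy to reproduce numerically. A minimal sketch using the round numbers from the example above (the data layout is illustrative, not spark-evaluate's actual schema):

```python
# Per-round totals and OK counts from the example above.
rounds = [
    {"measurements": 400_000, "ok": 40_000},  # round 1: result_rate_OK = 10%
    {"measurements": 600_000, "ok": 50_000},  # round 2: result_rate_OK ~ 8.33%
]

# Per-round rates, as spark-evaluate stores them in result_rate_OK.
rates = [r["ok"] / r["measurements"] for r in rounds]

# What the Flux query computes: an unweighted mean of the per-round rates.
naive_mean = sum(rates) / len(rates)

# The precise daily rate: total OK count divided by total measurement count.
precise = sum(r["ok"] for r in rounds) / sum(r["measurements"] for r in rounds)

print(f"{naive_mean:.4%}")  # 9.1667%
print(f"{precise:.4%}")     # 9.0000%
```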
What we are doing in other places:
- In spark-evaluate, we keep track of both per-result count and total count
- In spark-stats, we calculate the % rate from those numbers
For example: per-miner OK results:
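(The actual spark-stats query isn't reproduced here.) A hypothetical sketch of that count-based approach, with made-up field names, to show why it avoids the averaging problem:

```python
# Hypothetical per-round, per-miner rows as spark-evaluate might record
# them: an OK count plus a total count, instead of a precomputed rate.
rows = [
    {"miner": "f01234", "ok_count": 40_000, "total_count": 400_000},
    {"miner": "f01234", "ok_count": 50_000, "total_count": 600_000},
]

# spark-stats style: sum the counts first, then derive the rate, so
# rounds with more measurements carry proportionally more weight.
ok_total = sum(r["ok_count"] for r in rows)
total = sum(r["total_count"] for r in rows)
ok_rate = ok_total / total

print(f"{ok_rate:.2%}")  # 9.00%
```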
Good point, the mean over relative values will be off when the measurement count varies between rounds. Since the measurement count is roughly constant now, I think this doesn't matter a lot. But it should still be improved.
Do you have a suggestion for how to solve this with Influx queries? I assume we'd need to join over the measurement count.
I have added a comment in 818285d, are you ok shipping like this?
> Do you have a suggestion how to solve this with Influx queries? I assume we'd need to join over the measurement count.
I don't think you can solve this at the query level. Maybe you could write a query that multiplies each `result_rate_XXX` value with `measurements` (total count) to reconstruct the number of measurements that reported the result code `XXX`.
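That reconstruction idea could look roughly like this. A sketch in Python rather than Flux, where the field names `result_rate_OK` and `measurements` follow the comment above and are assumptions about the stored schema:

```python
# Per-round datapoints as a query might return them: the stored rate
# plus the total measurement count for the same round.
points = [
    {"result_rate_OK": 0.10, "measurements": 400_000},
    {"result_rate_OK": 1 / 12, "measurements": 600_000},  # ~ 8.33%
]

# Reconstruct the absolute OK count per round from rate * total count...
ok_counts = [p["result_rate_OK"] * p["measurements"] for p in points]

# ...then aggregate the counts and derive the daily rate at the end.
daily_rate = sum(ok_counts) / sum(p["measurements"] for p in points)

print(f"{daily_rate:.2%}")  # 9.00%
```

In Flux this would require joining the rate fields with the `measurements` field for each round before aggregating.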
IMO, `result_rate_XXX` points are not the right data source for this feature. We can keep pushing in this direction to see if we can find a fix. It may be better to step back and add a new stats writer to spark-evaluate, as we already do for other stats.
> I have added a comment in 818285d, are you ok shipping like this?
What are the ramifications of shipping like this:
- The chart will show incorrect data. (Hopefully, the inaccuracy is too small for anybody to notice. But we don't know how bad it will be, do we?)
- We are introducing technical debt. In the worst-case scenario, we will need to remove the InfluxDB observer and implement something else. Since we are talking about ~200 LOC, I think this is not a big deal.
While I am not excited about shipping it like this, I think it's an acceptable trade-off to make. 👍🏻
> I don't think you can solve this at the query level. Maybe you could write a query that multiplies each `result_rate_XXX` value with `measurements` (total count) to reconstruct the number of measurements that reported the result code `XXX`.

Yes, this is what I was thinking about. But this join is going to be expensive.

> IMO, `result_rate_XXX` points are not the right data source for this feature. We can keep pushing in this direction to see if we can find a fix. It may be better to step back and add a new stats writer to spark-evaluate, as we already do for other stats.

Yeah, post-LabWeek we can work on the proper solution for these.
> What are the ramifications of shipping like this:
>
> - The chart will show incorrect data. (Hopefully, the inaccuracy is too small for anybody to notice. But we don't know how bad it will be, do we?)

I propose we first see how this looks in the dashboard and then decide whether it's too wrong to ship.

> - We are introducing technical debt. In the worst-case scenario, we will need to remove the InfluxDB observer and implement something else. Since we are talking about ~200 LOC, I think this is not a big deal.

The worst case is one revert away, so I think this is ok. Also, from a business perspective, if we're asked why we didn't add this chart that was requested last LabWeek, technical debt is not a good answer. Do you have a realistic alternative proposal?

> While I am not excited about shipping it like this, I think it's an acceptable trade-off to make. 👍🏻

I'm not operating at the level of excitement at the moment :D I just want to ship (obviously without shipping something too bad) so that we have more useful things to show.