-
Notifications
You must be signed in to change notification settings - Fork 51
Simplify adapting pystats #708
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I wouldn't worry about breakage. The main use of stats is to guide improvements and tell us about individual changes. We don't care about long term trends (at least, I don't). Using the same dot notation everywhere is fine, but some stats are tables, like execution counts, and some have tables within fields and/or fields within tables. Instead of just |
Most of stats are simple counts, with a few arrays of counts, a few histograms, and a few nested structs (SpecializationStats) or arrays of structs (UOpStats). As a first step, would it make sense to simplify/automate the stats which are simple counts, and leave the historgrams/arrays/nested stats as a manual/special case? My thought would be to make the structs in
I've been poking a this a little, though as I'm going on paternity leave any day now, I can't promise I'll make that much more progress. But happy to try if this seems like a decent first-step. |
Having an easy way to add statistics would be great! Recently I had the need myself and created a structure to add statistics with a single line in the code like
or
(the branch is python/cpython@main...eendebakpt:cpython:wip_stats, although I would not advice that as a starting point for any implementation) My take-aways:
|
@markshannon Currently, Similarly, the Jumps are special-cased to not be marked as specializable. Is it still worth preserving that special-casing as well? |
Adding a new pystat currently involves:
PyStats
struct (or one of its substructs)print_stats
(or one of its subfunctions)STAT_INC
(or friend)summarize_stats.py
(3) is just sort of required, but the other steps could probably be merged into one through use of macros and/or codegen, thereby making it much easier to add new stats and understand what we have, and how they flow through the system etc.
The main thing that will be hard to solve will be to maintain the use of sort of "English phrases" in the output file. For example, the
stats.rare_event.set_class
statistic is written to the file asRare event (set_class):
. I think it would be better if we just used the same dot notation everywhere, only converting to friendly English phrasing at the very last step insummarize_stats.py
.I think it's fair to say that
pystats
is an implementation detail that only interpreter hackers care about, and we are free to change the file format at a whim. This would create a "hard break" before and after such a change, but comparing stats across long time periods is cumbersome anyway. But perhaps comparingmain
against 3.13.0 is something we still care about.Thoughts, @brandtbucher, @markshannon (others...)?
The text was updated successfully, but these errors were encountered: