Adding Benchmark instrument #157
Conversation
@tharittk I’ll take a look at this soon. Regarding the 2 points of ongoing work, I think these kinds of extensive benchmark tests do not belong in the main repo; they are better suited for the one we created to put test codes… here we should add some minimal tests to verify the correctness of the benchmark decorator and mark functionality.
@tharittk good start! I think we will probably need a few iterations here to get something that is functional and looks nice, and I am playing with it myself to get a better feeling 😄
Here are some initial comments as they come:
- definitely we should not have benchmark scripts in the main repo, so `Kirchhoff_bench` has to be ultimately removed from here and put in a separate repo
- we should try to add a new page to the documentation called `Benchmarking` where we can provide an explanation of this decorator, how to use it, and show a small example (a bit like we have in the `GPU Support` and `Adding New Operators` pages)
- I tried to add a benchmark decorator to `nccl_allreduce` but what I see is that the messages of benchmark come all before those of mark.. is this expected/intended?
- In general I am never in favour of re-inventing the wheel.. so I have been wondering if we really need to write code to print/write on file, and after a few tests I think going for the route of using `logging` is probably better in the long run (we get lots of things for free, including writing to file). A little example:
In `kirchhoff_bench`:

```python
import logging

# log to the console ...
logging.basicConfig(level=logging.INFO, force=True)
# ... or to a file instead
# logging.basicConfig(filename='kirch_log.log', level=logging.INFO, force=True)

# silence numba (the same works for any other library whose logs one doesn't want to see)
numba_logger = logging.getLogger('numba')
numba_logger.setLevel(logging.WARNING)
```
and in `benchmark`:

```python
import logging

logger = logging.getLogger(__name__)
...
# replace the direct print with a logger call
logger.info("".join(output))  # was: print("".join(output))
```
@tharittk well done, I like the new formatting with messages from nested benchmark appearing in the right order 😄
I left a few additional comments/thoughts for you... once you have addressed the remaining points that you have in the PR main comment, I'll do a final review
Great work @tharittk ! I just left a few minor comments, but I think we have converged to a final solution 😄
Try to handle the two remaining tasks and then we should be ready to merge this!
@tharittk very good, I like the tutorial 😄 I just left some minor comments related to typos and unclear text...
I just noticed though that you have not yet handled this
```python
# TODO (tharitt): later move to env file or something
ENABLE_BENCHMARK = True
```
which is the reason why I was asking to write a page like https://github.com/tharittk/pylops-mpi/blob/benchmark_instrument/docs/source/adding.rst more than a tutorial... note that I would maybe call it `BENCH_PYLOPS_MPI` (always good to have the name of the library in it so the env variable scope is not ambiguous)
I think it is fine to have the tutorial, but we definitely need a page where we replicate a bit of the same info and, more importantly, explain how one can turn benchmarking on/off. This however brings me to my last, more general question: as we progressed I kind of lost track of the original goal of this (lol), which is timing our communication methods in `_nccl`. At this point we have not added any decorator to them, so how do you envision doing it - a developer does it manually, decorating them when doing some benchmarking and then taking the decorators away? I guess that is a doable approach, although a bit clunky... the other way could be to decorate all of them, since we would have the `ENABLE_BENCHMARK`/`BENCH_PYLOPS_MPI` variable that is generally set to 0 (False) and that `benchmark` checks to decide whether to activate or not... because the way we ended up creating benchmark/mark, it would always log and print the time once added, and that is not what we want in a normal run.
Any thoughts?
I prefer not to pre-decorate the functions in the package for these reasons:
So I would say I think we can keep the environment variable.
@tharittk I totally agree with your judgement, let's go this way... and add the `BENCH_PYLOPS_MPI` variable that, when set to 0, will skip benchmarking... as you say, if unset it defaults to 1. However, you do need to modify `benchmark.py` to handle the new env variable, which does not seem to be done yet.
But I am not super happy with the `benchmarking.rst` file yet. I left a detailed comment that hopefully can help you improve it and make it a bit more different from the tutorial 😄
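For concreteness, a minimal sketch of how `benchmark.py` could gate on the new environment variable; the timing body here is a placeholder, not the PR's actual implementation:

```python
import os
import time
from functools import wraps

# On by default; export BENCH_PYLOPS_MPI=0 to switch benchmarking off
# (variable name as proposed in this thread)
ENABLE_BENCHMARK = os.environ.get("BENCH_PYLOPS_MPI", "1") != "0"


def benchmark(func):
    """Illustrative decorator: a no-op when benchmarking is disabled."""
    if not ENABLE_BENCHMARK:
        # return the original function untouched, so pre-decorated methods
        # in _nccl add no overhead and no output in a normal run
        return func

    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        print(f"{func.__name__}: {time.perf_counter() - start:.6f} s")
        return result

    return wrapper
```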
This PR introduces a `benchmark` decorator that enables users to place `mark(label)` calls anywhere inside a function to record execution time at specific code points. The timing marks are collected and (as of now) printed in a human-readable format after the function completes. The design is inspired by C++-style instrumentation, where developers can inject performance markers directly into the code.
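For illustration, the intended usage reads roughly like the snippet below; only the `benchmark`/`mark` names come from the PR, while the import path and function body are assumptions:

```python
from benchmark import benchmark, mark  # module added in this PR; exact package path assumed


@benchmark
def run_solver(x):
    mark("before allreduce")   # label a point in the code
    # ... communication ...
    mark("after allreduce")
    # ... computation ...
    mark("done")
    # collected timings are printed once run_solver() returns
```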
Changes made:
- `benchmark.py`: a benchmark decorator for lightweight instrumentation
- [Removed] `kirchhoff_bench.py`: a sample script to demonstrate how to use the decorator -- this belongs to the other repo and was thus removed

Ongoing work:
- New/updated documentation to illustrate how to use `benchmark`
- Formatted print and logging
- Take a closer look at the CuPy / NCCL synchronization issue as pointed out in the comment