Commit cfa2cef

Run profiling under perf evals times
I don't know how accurate perf is for functions that take only nanoseconds. My understanding is that running a function `evals` times should take at least a couple of microseconds, so I think this should improve profiling accuracy with negligible extra time taken.
1 parent 1ee189b commit cfa2cef

2 files changed, with 5 additions and 2 deletions.

docs/src/manual.md

Lines changed: 1 addition & 1 deletion
@@ -85,7 +85,7 @@ You can pass the following keyword arguments to `@benchmark`, `@benchmarkable`,
 - `gcsample`: If `true`, run `gc()` before each sample. Defaults to `BenchmarkTools.DEFAULT_PARAMETERS.gcsample = false`.
 - `time_tolerance`: The noise tolerance for the benchmark's time estimate, as a percentage. This is utilized after benchmark execution, when analyzing results. Defaults to `BenchmarkTools.DEFAULT_PARAMETERS.time_tolerance = 0.05`.
 - `memory_tolerance`: The noise tolerance for the benchmark's memory estimate, as a percentage. This is utilized after benchmark execution, when analyzing results. Defaults to `BenchmarkTools.DEFAULT_PARAMETERS.memory_tolerance = 0.01`.
-- `enable_linux_perf`: If `true`, profile using perf once. Defaults to `BenchmarkTools.DEFAULT_PARAMETERS.enable_linux_perf = false`.
+- `enable_linux_perf`: If `true`, profile using perf `evals` times. Defaults to `BenchmarkTools.DEFAULT_PARAMETERS.enable_linux_perf = false`.

 To change the default values of the above fields, one can mutate the fields of `BenchmarkTools.DEFAULT_PARAMETERS`, for example:

src/execution.jl

Lines changed: 4 additions & 1 deletion
@@ -604,8 +604,11 @@ function generate_benchmark_definition(
 $(setup)
 try
     $LinuxPerf.enable!(__linux_perf_bench)
-    # We'll just run it one time.
+    # We'll run it evals times.
     __return_val_2 = $(invocation)
+    for __iter in 2:__evals
+        $(invocation)
+    end
     $LinuxPerf.disable!(__linux_perf_bench)
     # trick the compiler not to eliminate the code
     if rand() < 0
