Lightweight benchmarks for Turing #1534

Closed
@luiarthur

Currently, the contents of benchmarks/ are outdated. Both that directory and the accompanying workflow file (.github/workflows/MicroBenchmarks.yml) need updating.

A set of lightweight benchmarks is needed to measure the performance (speed) of standard inference algorithms and models between (1) consecutive releases and (2) a PR and the latest release.

These benchmarks should run automatically (via GitHub Actions) whenever a PR is made and when a new release is created.

Benchmarks (timings) for each release should be stored (e.g. as a release asset) for regression testing.

Ideally, a warning would be raised if regressions are detected. Differences in timings between releases should also be stored/recorded.
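As a rough illustration of the store-then-compare workflow described above, here is a minimal sketch using BenchmarkTools.jl. The `work` function, the file name, the benchmark keys, and the 10% tolerance are all placeholders chosen for illustration; a real suite would time Turing inference calls and keep the baseline file somewhere persistent, such as a release asset.

```julia
using BenchmarkTools

# Hypothetical stand-in workload; a real suite would time Turing inference calls.
work(n) = sum(sqrt(i) for i in 1:n)

suite = BenchmarkGroup()
suite["small"] = @benchmarkable work(1_000)
suite["large"] = @benchmarkable work(100_000)

# A short per-benchmark time budget keeps the run lightweight.
baseline_results = run(suite; seconds = 1)

# Store median timings as JSON; this file could be uploaded as a release asset.
baseline_file = joinpath(mktempdir(), "baseline.json")
BenchmarkTools.save(baseline_file, median(baseline_results))

# Later (e.g. on a PR): reload the baseline and compare against a fresh run.
baseline = BenchmarkTools.load(baseline_file)[1]
candidate = median(run(suite; seconds = 1))

# Flag benchmarks whose median time grew by more than 10% (an arbitrary threshold).
judgement = judge(candidate, baseline; time_tolerance = 0.10)
slower = BenchmarkTools.regressions(judgement)
if !isempty(slower)
    @warn "Possible performance regressions" benchmarks = collect(keys(slower))
end
```

Timings on shared CI runners are noisy, so the tolerance would likely need tuning before a warning like this is trustworthy.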

Things to consider:

  • Results from the benchmarks could be stored as GitHub release assets. Open to suggestions for other locations.
  • Visualizing the benchmarks
    • Some users are interested in how performance scales for a particular model, for example with data size or number of features. Good visuals would help in digesting the benchmarks.
  • Models to benchmark
    • A wide variety of models, potentially at different data sizes, should be considered. However, we hope the full set of benchmarks runs quickly (well under an hour). Inference algorithms won't be run to convergence, just long enough to get decent timings.
  • Inference algorithms to benchmark
    • We want to avoid algorithms that adapt in a way that influences timings. For example, NUTS adapts the number of leapfrog steps, which would result in unpredictable timings. HMC, ADVI, GibbsConditional, MH, and PG, for example, are fair game.
  • AD backends to benchmark
  • Other PPLs to compare against Turing
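
To make the points about fixed-work samplers and data-size scaling concrete, here is a hedged sketch of what one benchmark group might look like. The model, the data sizes, the HMC settings (step size 0.05, 10 leapfrog steps), and the iteration count are illustrative assumptions, not a proposed final suite.

```julia
using BenchmarkTools
using Random
using Turing

# Small illustrative model (an assumption, not taken from the issue):
# normal data with unknown mean and variance.
@model function gdemo(x)
    s² ~ InverseGamma(2, 3)
    m ~ Normal(0, sqrt(s²))
    for i in eachindex(x)
        x[i] ~ Normal(m, sqrt(s²))
    end
end

Random.seed!(1)
suite = BenchmarkGroup()
suite["hmc"] = BenchmarkGroup()

for n in (10, 100, 1_000)  # hypothetical data sizes
    model = gdemo(randn(n))
    # HMC with a fixed step size and a fixed number of leapfrog steps does the
    # same amount of work per iteration, unlike NUTS.
    suite["hmc"]["n=$n"] =
        @benchmarkable sample($model, HMC(0.05, 10), 200; progress = false)
end

# Not run to convergence: the goal is a stable timing, not a good posterior.
results = run(suite; seconds = 5)
```

Keying each benchmark by data size keeps the results easy to plot against `n` later.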

Resources for Benchmarking in Julia

  1. BenchmarkTools.jl manual
  2. BenchmarkTools.jl API
  3. GitHub Actions API
