Merged
155 changes: 105 additions & 50 deletions .github/workflows/integration-tests.yml
@@ -10,12 +10,15 @@ concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
@jsign jsign Sep 24, 2025
After the integration test refactoring, as a side effect, the CI runs are a bit better organized in the UI:
(screenshot)


Pipeline:
(screenshot)

cancel-in-progress: true

permissions:
contents: read
pages: write
id-token: write

env:
CARGO_TERM_COLOR: always
ERE_TAG: 0.0.12-7ef4598
OPENVM_RUST_TOOLCHAIN: nightly-2025-08-07

jobs:
witness-generator:
name: Generate EEST benchmark fixtures
runs-on: ubuntu-latest
@@ -43,8 +46,7 @@ jobs:
fi

stateless-validator:
name: "${{ format('{0} / {1} / {2}', matrix.el, matrix.test, matrix.zkvm) }}"
runs-on: [self-hosted-ghr, size-xl-x64]
name: "${{ format('{0} / {1} / {2}', matrix.el, matrix.zkvm, matrix.test) }}"
strategy:
fail-fast: false
matrix:
@@ -55,13 +57,17 @@
- prove_empty_block
zkvm: [sp1, risc0, pico, zisk, openvm]
el: [reth, ethrex]
include:
- zkvm: openvm
threads: 2
Comment on lines +61 to +62
This is an old setup: OpenVM uses quite a lot of RAM per case, so pushing parallelization too far makes the CI machine struggle. But since OpenVM emulation is very fast, this doesn't hurt CI duration much.

For the rest, I configured 12 threads rather than the CI machine's default of 16, mainly to give wall-clock timings some stability.

- threads: 12
exclude:
# Pico
- zkvm: pico
test: prove_empty_block # See https://github.com/eth-act/ere/issues/173
# ZisK
- zkvm: zisk
test: prove_empty_block # ere image intentionally doesn't bake big proving key for CI
# Ethrex
- el: ethrex
test: execute_mainnet_blocks # Still quite heavy to run in CI
@@ -74,37 +80,17 @@ jobs:
- el: ethrex
zkvm: pico # See https://github.com/eth-act/ere/issues/174
- el: ethrex
zkvm: zisk # See https://github.com/eth-act/ere/issues/XXX


steps:
- name: Checkout code
uses: actions/checkout@v4

- name: Install Rust toolchain
uses: dtolnay/rust-toolchain@nightly

- name: Install C toolchain dependencies
run: |
sudo apt-get update
sudo apt-get install -y build-essential clang libclang-dev

- name: Pull ere images
run: |
for variant in "base" "base-${{ matrix.zkvm }}" "cli-${{ matrix.zkvm }}"; do
src="ghcr.io/eth-act/ere/ere-${variant}:${ERE_TAG}"
dst="ere-${variant}:${ERE_TAG}"
docker pull "$src"
docker tag "$src" "$dst"
done

- name: Run benchmark
run: |
${{ matrix.zkvm == 'openvm' && 'RAYON_NUM_THREADS=1' || '' }} RUST_LOG=warn,benchmark_runner=info ZKVM=${{ matrix.zkvm }} EL=${{ matrix.el }} cargo test --release -p integration-tests -- --test-threads=1 ${{ matrix.test }}
zkvm: zisk # See https://github.com/eth-act/ere/issues/186
uses: ./.github/workflows/run-benchmark.yml

Most of the code deleted from this file was extracted into this separate component, for better modularity and to avoid repetition.

with:
test: ${{ matrix.test }}
zkvm: ${{ matrix.zkvm }}
el: ${{ matrix.el }}
upload_results: ${{ startsWith(matrix.test, 'execute_') }}
threads: ${{ matrix.threads }}

custom-guest:
name: "${{ format('{0} / {1}', matrix.test, matrix.zkvm) }}"
runs-on: [self-hosted-ghr, size-xl-x64]
strategy:
fail-fast: false
matrix:
@@ -120,27 +106,96 @@ jobs:
zkvm: zisk
- test: prove_panic_guest
zkvm: zisk
uses: ./.github/workflows/run-benchmark.yml
with:
test: ${{ matrix.test }}
zkvm: ${{ matrix.zkvm }}
el: 'none'
threads: 12

generate-benchmark-website:

This job and the next one are new: they pull the uploaded artifacts, merge them, generate the website, and publish it.

Pending: run this only on master; for now it always runs so the PR can be tested.

name: Generate Benchmark Website
needs: [stateless-validator]
if: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' || github.event_name == 'workflow_dispatch' }}
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4

- name: Install Rust toolchain
uses: dtolnay/rust-toolchain@nightly
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: '3.10'

- name: Install C toolchain dependencies
run: |
sudo apt-get update
sudo apt-get install -y build-essential clang libclang-dev

- name: Pull ere images
- name: Download all benchmark artifacts
uses: actions/download-artifact@v4
with:
pattern: benchmark-results-*
path: ./downloaded-artifacts

- name: Merge results and generate website
run: |
for variant in "base" "base-${{ matrix.zkvm }}" "cli-${{ matrix.zkvm }}"; do
src="ghcr.io/eth-act/ere/ere-${variant}:${ERE_TAG}"
dst="ere-${variant}:${ERE_TAG}"
docker pull "$src"
docker tag "$src" "$dst"
mkdir -p ./merged-results
echo "Downloaded artifacts:"
ls -la ./downloaded-artifacts/ || echo "No artifacts found"

# Extract all tar.gz files and merge them
for artifact_dir in ./downloaded-artifacts/*/; do
if [ -d "$artifact_dir" ]; then
echo "Processing $artifact_dir"
for tarfile in "$artifact_dir"/*.tar.gz; do
if [ -f "$tarfile" ]; then
echo "Extracting $tarfile"
tar -xzf "$tarfile" -C ./merged-results --skip-old-files 2>/dev/null
fi
done
fi
done

- name: Run benchmark
run: CI=1 RUST_LOG=warn,benchmark_runner=info ZKVM=${{ matrix.zkvm }} cargo test --release -p integration-tests -- --test-threads=1 ${{ matrix.test }}

echo "Merged results structure:"
ls -la ./merged-results/
if [ -d ./merged-results/zkevm-metrics ]; then
echo "Contents of zkevm-metrics:"
ls -la ./merged-results/zkevm-metrics/
fi

echo "Generating website..."
python3 scripts/generate-website.py -i ./merged-results -o index.html

- name: Upload website artifact
uses: actions/upload-artifact@v4
with:
name: benchmark-website
path: index.html
retention-days: 90

deploy-pages:
name: Deploy Benchmark Website
needs: generate-benchmark-website
if: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' || github.event_name == 'workflow_dispatch' }}
runs-on: ubuntu-latest
permissions:
contents: read
pages: write
id-token: write
environment:
name: github-pages
url: ${{ steps.deployment.outputs.page_url }}
steps:
- name: Configure GitHub Pages
uses: actions/configure-pages@v5

- name: Download website artifact
uses: actions/download-artifact@v4
with:
name: benchmark-website
path: ./website

- name: Upload Pages artifact
uses: actions/upload-pages-artifact@v3
with:
path: ./website

- name: Deploy to GitHub Pages
id: deployment
uses: actions/deploy-pages@v4
76 changes: 76 additions & 0 deletions .github/workflows/run-benchmark.yml
@@ -0,0 +1,76 @@
name: Run Benchmark
on:
workflow_call:
inputs:
test:
required: true
type: string
description: Test to run
upload_results:
required: false
type: boolean
default: false
description: Whether to create and upload benchmark results
zkvm:
required: true
type: string
description: ZKVM to use
el:
required: false
type: string
default: ''
description: Execution layer client
threads:
required: true
type: number
description: Number of threads for RAYON_NUM_THREADS

jobs:
run-benchmark:
name: ${{ inputs.zkvm }} - ${{ inputs.test }}
runs-on: [self-hosted-ghr, size-xl-x64]
env:
ERE_TAG: 0.0.12-7ef4598
OPENVM_RUST_TOOLCHAIN: nightly-2025-08-07
steps:
- name: Checkout code
uses: actions/checkout@v4

- name: Install Rust toolchain
uses: dtolnay/rust-toolchain@nightly

- name: Install C toolchain dependencies
run: |
sudo apt-get update
sudo apt-get install -y build-essential clang libclang-dev

- name: Pull ere images
run: |
for variant in "base" "base-${{ inputs.zkvm }}" "cli-${{ inputs.zkvm }}"; do
src="ghcr.io/eth-act/ere/ere-${variant}:${ERE_TAG}"
dst="ere-${variant}:${ERE_TAG}"
docker pull "$src"
docker tag "$src" "$dst"
done

- name: Run benchmark
run: |
RAYON_NUM_THREADS=${{ inputs.threads }} \
RUST_LOG=warn,benchmark_runner=info \
ZKVM=${{ inputs.zkvm }} \
EL=${{ inputs.el }} \
WORKLOAD_OUTPUT_DIR=./zkevm-metrics \
cargo test --release -p integration-tests -- --test-threads=1 ${{ inputs.test }}

- name: Create results archive
if: ${{ inputs.upload_results }}
run: |
cd tests && tar -czvf benchmark-results.tar.gz ./zkevm-metrics

- name: Upload benchmark results
if: ${{ inputs.upload_results }}
uses: actions/upload-artifact@v4
with:
name: benchmark-results-${{ inputs.el }}-${{ inputs.zkvm }}-${{ inputs.test }}
path: tests/benchmark-results.tar.gz
retention-days: 90
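The per-job artifact naming above can be sketched to show why the later `pattern: benchmark-results-*` download picks up every matrix job without collisions (a minimal illustration; the helper function is hypothetical):

```shell
# Each matrix job uploads its archive under a unique name built from
# its (el, zkvm, test) coordinates, mirroring the expression in the
# upload-artifact step above.
artifact_name() {
  echo "benchmark-results-$1-$2-$3"
}

artifact_name reth sp1 execute_mainnet_blocks
artifact_name ethrex risc0 execute_eest_fixtures
```

Because every name starts with `benchmark-results-`, a single glob in `download-artifact` fetches all of them for the merge step.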
4 changes: 2 additions & 2 deletions README.md
@@ -24,7 +24,7 @@ The workspace is organized into several key components:
- **`ere-guests/`**: Directory containing guest program implementations organized by program type, with each type containing implementations for different zkVM platforms. See the [Guest Program Types](#guest-program-types) section for detailed information about each type.
- **`zkevm-fixtures`**: (Git submodule) Contains the Ethereum execution layer test fixtures used by `witness-generator-cli`.
- **`zkevm-fixtures-input`**: Default directory where `witness-generator-cli` saves individual fixture files (`.json`) that are consumed by `ere-hosts`.
- **`zkevm-metrics`**: Directory where benchmark results (cycle counts) are stored by the host programs, organized by zkVM type.
- **`zkevm-metrics`**: Directory where benchmark results (cycle counts) are stored by the host programs. For `stateless-validator` guest programs, results are organized by execution client (EL) then zkVM type (e.g., `zkevm-metrics/reth/sp1/`, `zkevm-metrics/ethrex/risc0/`). For other guest program types, results are organized directly by zkVM type.
- **`scripts`**: Contains helper scripts (e.g., fetching fixtures).
- **`xtask`**: Cargo xtask runner for automating tasks.
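The `zkevm-metrics` layout described above can be illustrated with a small sketch (file names here are hypothetical examples, not actual fixtures):

```shell
# Stateless-validator results nest under the EL, then the zkVM;
# other guest program results sit directly under the zkVM directory.
mkdir -p zkevm-metrics/reth/sp1 zkevm-metrics/ethrex/risc0 zkevm-metrics/sp1
touch zkevm-metrics/reth/sp1/block-1.json       # stateless-validator (reth on sp1)
touch zkevm-metrics/ethrex/risc0/block-1.json   # stateless-validator (ethrex on risc0)
touch zkevm-metrics/sp1/empty-program.json      # other guest program type
find zkevm-metrics -name '*.json' | sort
```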

@@ -61,7 +61,7 @@ Each zkVM benchmark implementation follows a common pattern using the EreDockerized
- For stateless-validator, supports multiple execution clients (`--execution-client reth` or `--execution-client ethrex`).
- Automatically handles zkVM compilation and execution through Docker containers.
- Collects cycle count metrics reported by each zkVM platform.
- Saves results using the `metrics` crate into the appropriate subdirectory within `zkevm-metrics/`.
- Saves results using the `metrics` crate into the appropriate subdirectory within `zkevm-metrics/`. For `stateless-validator` guest programs, results are organized by execution client then zkVM type. For other guest program types, results are organized directly by zkVM type.

3. **Automatic zkVM Management:**
- All zkVMs are now managed through EreDockerized, eliminating the need for manual toolchain setup.
3 changes: 3 additions & 0 deletions crates/benchmark-runner/src/runner.rs
@@ -18,6 +18,8 @@ use crate::guest_programs::{GuestIO, GuestMetadata, OutputVerifier, OutputVerifierResult};
pub struct RunConfig {
/// Output folder where benchmark results will be stored
pub output_folder: PathBuf,
/// Optional subfolder within the output folder
pub sub_folder: Option<String>,
/// Action to perform: either proving or executing
pub action: Action,
/// Force rerun benchmarks even if output files already exist
@@ -70,6 +72,7 @@
let zkvm_name = format!("{}-v{}", zkvm.name(), zkvm.sdk_version());
let out_path = config
.output_folder
.join(config.sub_folder.as_deref().unwrap_or(""))
.join(format!("{zkvm_name}/{}.json", io.name));

if !config.force_rerun && out_path.exists() {
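The effect of the new `sub_folder` field can be sketched in shell terms (illustrative only; the real composition happens in `runner.rs`, and the helper function here is hypothetical):

```shell
# out_path = output_folder [/ sub_folder] / zkvm_name / io_name.json
compose_out_path() {
  if [ -n "$2" ]; then
    echo "$1/$2/$3/$4.json"
  else
    echo "$1/$3/$4.json"
  fi
}

# stateless-validator passes the EL name as the subfolder...
compose_out_path zkevm-metrics reth sp1-v5.0.0 block-1
# ...while other guest programs leave it unset, preserving the old layout.
compose_out_path zkevm-metrics "" sp1-v5.0.0 block-1
```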
2 changes: 1 addition & 1 deletion crates/benchmark-runner/src/stateless_validator.rs
@@ -28,7 +28,7 @@ use zkvm_interface::Input;
use crate::guest_programs::{GuestIO, GuestMetadata, OutputVerifier, OutputVerifierResult};

/// Execution client variants.
#[derive(Debug, Clone, PartialEq, Eq, EnumString, AsRefStr)]
#[derive(Debug, Copy, Clone, PartialEq, Eq, EnumString, AsRefStr)]
#[strum(ascii_case_insensitive)]
pub enum ExecutionClient {
/// Reth stateless block validation guest program.
6 changes: 3 additions & 3 deletions crates/ere-hosts/README.md
@@ -117,7 +117,7 @@ cargo run --release -- stateless-validator --execution-client reth --action prove

### Force Rerun

By default, the benchmarker will skip tests that already have output files in the `zkevm-metrics/` directory to avoid redundant computation. Use `--force-rerun` to override this behavior:
By default, the benchmarker will skip tests that already have output files in the `zkevm-metrics/` directory to avoid redundant computation. For `stateless-validator` benchmarks, output files are organized by execution client (e.g., `zkevm-metrics/reth/`, `zkevm-metrics/ethrex/`). Use `--force-rerun` to override this behavior:

```bash
# Skip tests that already have results (default behavior)
@@ -142,7 +142,7 @@ cargo run --release -- stateless-validator --execution-client reth --output-folder
cargo run --release -- stateless-validator --execution-client reth --output-folder /tmp/benchmark-results
```

The benchmark results will be organized by zkVM type within the specified folder (e.g., `my-custom-results/sp1/`, `my-custom-results/risc0/`, etc.).
The benchmark results will be organized by execution client then zkVM type for `stateless-validator` benchmarks (e.g., `my-custom-results/reth/sp1/`, `my-custom-results/ethrex/risc0/`), and directly by zkVM type for other guest program types (e.g., `my-custom-results/sp1/`, `my-custom-results/risc0/`).

### Combined Examples

@@ -211,7 +211,7 @@ cargo run --release -- empty-program
| `--resource` | `-r` | Choose compute resource type | `cpu` | `cpu`, `gpu` |
| `--action` | `-a` | Select benchmark operation | `execute` | `execute`, `prove` |
| `--input-folder` | `-i` | Input folder containing fixture files (stateless-validator and block-encoding-length) | `zkevm-fixtures-input` | Any valid directory path |
| `--output-folder` | `-o` | Output folder for benchmark results | `zkevm-metrics` | Any valid directory path |
| `--output-folder` | `-o` | Output folder for benchmark results (organized by execution client for stateless-validator) | `zkevm-metrics` | Any valid directory path |
| `--loop-count` | - | Number of times to loop the benchmark (block-encoding-length only) | Required for block-encoding-length | Any positive integer |
| `--format` | `-f` | Encoding format for block-encoding-length benchmark | Required for block-encoding-length | `rlp`, `ssz` |
| `--force-rerun` | - | Rerun benchmarks even if output files already exist | `false` | `true`, `false` |