1.4.1 upgrade #45

Schaeff · 2025-10-28T18:25:12Z

Released under this tag

) Resolves INT-4950. Some CUDA 13.0 optimizer choices caused the kernel `fri_reduced_opening_tracegen` to use 80 registers (previously 58), which pushed the total number of registers per block over the limit set by the device. The [benchmark difference](https://github.com/axiom-crypto/openvm-reth-benchmark/actions/runs/17476936661) on CUDA 12.9 is negligible. Benchmarks to measure the savings from switching to CUDA 13.0 will be run later.
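To see why a jump from 58 to 80 registers per thread can make a launch fail, the arithmetic can be sketched as follows. The per-block limit of 65,536 registers and the block size of 1,024 threads are assumptions for illustration (the commit message states neither), and the check ignores register allocation granularity.

```rust
/// Assumed per-block register limit. 65,536 32-bit registers is typical for
/// recent NVIDIA GPUs, but the real value comes from the device properties.
const ASSUMED_MAX_REGS_PER_BLOCK: u32 = 65_536;

/// Returns true if a block of `threads_per_block` threads, each using
/// `regs_per_thread` registers, fits within the assumed per-block limit.
fn block_fits(regs_per_thread: u32, threads_per_block: u32) -> bool {
    regs_per_thread * threads_per_block <= ASSUMED_MAX_REGS_PER_BLOCK
}
```

Under these assumptions, 58 × 1,024 = 59,392 registers fits, while 80 × 1,024 = 81,920 does not.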

…g#2121)

- pass `pc`, `instret` and `instret_end`/`max_execution_cost`/`segment_check_insns` by value in execution handlers so that they are passed in registers
- add `likely`, `unlikely` hints for suspension/termination in `tco`

[benchmark comparison](https://github.com/axiom-crypto/openvm-reth-benchmark/actions/runs/17513217695#summary-49747838308)

Towards INT-4921
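As a rough illustration of both ideas, here is a minimal sketch, not the OpenVM handler code. The `step` function, its signature, and its return shape are hypothetical. The `unlikely` helper uses the common stable-Rust idiom of calling a `#[cold]` function on the rare branch; passing small scalars by value allows, but does not guarantee, that the compiler keeps them in registers.

```rust
/// Marked cold so the optimizer treats any branch that calls it as rare.
#[cold]
#[inline(never)]
fn cold_path() {}

/// Stable-Rust stand-in for an `unlikely` branch hint.
#[inline(always)]
fn unlikely(cond: bool) -> bool {
    if cond {
        cold_path();
    }
    cond
}

/// Hypothetical handler: `pc`, `instret` and `instret_end` are plain scalars
/// passed by value rather than fields behind a `&mut` state reference.
/// Returns the updated `(pc, instret)` and whether execution should suspend.
fn step(pc: u32, instret: u64, instret_end: u64) -> (u32, u64, bool) {
    if unlikely(instret >= instret_end) {
        // Rare path: the instruction budget is exhausted, so suspend.
        return (pc, instret, true);
    }
    // Hot path: advance to the next 4-byte instruction.
    (pc.wrapping_add(4), instret + 1, false)
}
```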

)

- update the `create_tco_handler` macro to `create_handler`, which now automatically sets `exit_code` based on whether the execute impl returns `Result::Err`. It acts as a simple wrapper for execute impls that don't return a `Result`
- only do exit checks for executors that can exit in tco mode, i.e. for execute impls that return `Result`
- feature gate all non-tco functions with `#[cfg(not(feature = "tco"))]`

[benchmark comparison](https://github.com/axiom-crypto/openvm-reth-benchmark/actions/runs/17590682650#summary-49971529404)

Towards INT-4921

…T refactor (openvm-org#2127)

Updates `openvm-cuda-backend` with:

- Performance update for better GPU memory usage: openvm-org/stark-backend#114
- Refactor to avoid some sporadic issues with NTT params initialization: openvm-org/stark-backend#123

---------

Co-authored-by: Jonathan Wang <31040440+jonathanpwang@users.noreply.github.com>

…m-org#2137)

- add `ProgramAir` height as a constant trace height in metered execution

This doesn't seem to affect the [benchmark](https://github.com/axiom-crypto/openvm-reth-benchmark/actions/runs/17747443362), which is good.

Replace the optimistic execution/segmentation with a checkpointing approach: record the last `trace_height`/`instret` values that are below the thresholds and use these values as the segment boundaries.

This should make segmentation more predictable for downstream usage, since the segments should satisfy the thresholds. The only caveat is segmentation that happens when there is no checkpoint to fall back to, i.e. when we overshoot the threshold before the first segmentation check.

This requires storing some extra state and results in a higher segment count than before for the same thresholds. It also makes execution slightly slower, since we're doing some extra work now.

- [benchmark run](https://github.com/axiom-crypto/openvm-reth-benchmark/actions/runs/17750607194) with 0.7B max cells
- [benchmark run](https://github.com/axiom-crypto/openvm-reth-benchmark/actions/runs/17771662984) with 1.2B max cells
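The checkpointing idea can be modelled with a small sketch. This is a simplified illustration under stated assumptions, not the OpenVM implementation: the `Check` type, the single `max_height` threshold, and the per-check `height_delta` are hypothetical, and the real code tracks more state. The function returns only the cut points; the final, still-open segment is not emitted.

```rust
/// One observation made at a segmentation check (hypothetical shape).
#[derive(Clone, Copy)]
struct Check {
    /// Instruction count at the time of the check.
    instret: u64,
    /// Trace height added since the previous check.
    height_delta: u64,
}

/// Returns the `instret` values at which segments end.
fn segment_ends(checks: &[Check], max_height: u64) -> Vec<u64> {
    let mut ends = Vec::new();
    // Height accumulated in the current segment.
    let mut height = 0u64;
    // Last (instret, height) seen while still within the threshold.
    let mut checkpoint: Option<(u64, u64)> = None;

    for c in checks {
        height += c.height_delta;
        if height > max_height {
            if let Some((cp_instret, cp_height)) = checkpoint.take() {
                // Fall back: end the segment at the last good checkpoint and
                // carry the work done since then into the next segment.
                ends.push(cp_instret);
                height -= cp_height;
            }
            if height > max_height {
                // No checkpoint to fall back to: the segment overshoots.
                ends.push(c.instret);
                height = 0;
                continue;
            }
        }
        checkpoint = Some((c.instret, height));
    }
    ends
}
```

With `max_height = 10` and checks `(100, 6), (200, 3), (300, 4), (400, 20)`, the first cut falls back to the checkpoint at 200, the second to the one at 300, and the last check overshoots with nothing to fall back to, so the model cuts at 400.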

…#2140)

- E2 execution supports suspension every segment. [Reth-benchmark](https://github.com/axiom-crypto/openvm-reth-benchmark/actions/runs/17782856127) doesn't show a performance difference.
- `VmExecState`/`VmState` supports clone. The benchmark shows that cloning a VM state which only uses address space 2 takes 0.3~0.6ms.

closes INT-5066

closes INT-5067

### What

- updates install instructions link

| Before | After |
|--------|--------|
| <img width="1293" height="775" alt="image" src="https://github.com/user-attachments/assets/f30738c1-94a9-49b1-b0ce-1afc19739970" /> | <img width="1293" height="775" alt="image" src="https://github.com/user-attachments/assets/308f5473-e931-4745-a436-d53c019a4223" /> |

The `two_modular_limbs_list` constant declared by `moduli_init!` and used later in the complex setup macro was not guaranteed to have a large enough alignment. This caused an issue in some use cases. This change aligns the public constant `two_modular_limbs_list` to the max block size of a modulus among all declared moduli, so that no memory alignment issues arise.

Fixes INT-5135

---------

Co-authored-by: Jonathan Wang <31040440+jonathanpwang@users.noreply.github.com>
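One general way to give a byte-array constant a guaranteed alignment in Rust is a `#[repr(align(N))]` wrapper. The sketch below only illustrates that technique; the wrapper type, the static's name, the 48-byte length, and the alignment of 32 (standing in for "max block size among the declared moduli") are assumptions, and the code that `moduli_init!` actually generates may differ.

```rust
/// Wrapper that forces its contents to a 32-byte alignment.
/// The value 32 is an assumed stand-in for the max modulus block size.
#[repr(C, align(32))]
struct Aligned<T>(T);

/// Hypothetical limbs list; a bare `[u8; 48]` would only be 1-byte aligned.
static LIMBS_LIST: Aligned<[u8; 48]> = Aligned([0u8; 48]);

/// Checks whether `ptr` is a multiple of `align` bytes.
fn is_aligned_to(ptr: *const u8, align: usize) -> bool {
    (ptr as usize) % align == 0
}
```

Because the wrapper is `repr(C)` with a single field, the array starts at offset 0 and inherits the wrapper's alignment.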

For guest programs with MSRV 1.87+, it is convenient for `cargo openvm build` to use a newer Rust toolchain. We chose `2025-08-02` since that is when the latest stable release, Rust 1.90.0, branched from master: https://releases.rs/docs/1.90.0/

Note that users can always override the toolchain version with the env var `OPENVM_RUST_TOOLCHAIN`.

workflow run: https://github.com/axiom-crypto/openvm-reth-benchmark/actions/runs/18085897389/job/51456939479

In an edge case where the program panics right after we start `build_async` on the memory merkle subtrees, the drop order could be such that the `initial_memory` buffers are dropped on the default stream first, while the `build_async` kernels are still running and using those buffers. This leads to a deadloop. I fixed it by forcing the subtrees to drop first (which should sync their special streams) before dropping `initial_memory`.

compare: https://github.com/axiom-crypto/openvm-reth-benchmark/actions/runs/18733111153
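Forcing a drop order in Rust can be sketched as below. Struct fields normally drop in declaration order, so an explicit `Drop` impl is one way to make the subtrees go first regardless of how the fields are declared, including during panic unwinding. The types here are hypothetical stand-ins that only record the order in which they are dropped; no GPU streams are involved.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Stand-in for a GPU resource; records its name when dropped.
struct Tracked {
    name: &'static str,
    log: Log,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Declared so that `initial_memory` would drop first by default.
struct MemoryState {
    initial_memory: Option<Tracked>,
    subtrees: Option<Tracked>,
}

impl Drop for MemoryState {
    fn drop(&mut self) {
        // Drop the subtrees first (in the real code this is where their
        // streams would sync), then release the buffers they were reading.
        drop(self.subtrees.take());
        drop(self.initial_memory.take());
    }
}

/// Builds a `MemoryState`, drops it, and returns the observed drop order.
fn drop_order() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let state = MemoryState {
        initial_memory: Some(Tracked { name: "initial_memory", log: log.clone() }),
        subtrees: Some(Tracked { name: "subtrees", log: log.clone() }),
    };
    drop(state);
    let order = log.borrow().to_vec();
    order
}
```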

When building autoprecompile chips in GPU in `powdr-labs/powdr`, we need access to `histogram.cuh` in `openvm/circuit-primitives` to modify counts of periphery chips. This requires exporting the include dir in `openvm/circuit-primitives` for downstream crates (`powdr-openvm`). We believe this (and potentially the same for other crates) to be a general client-extension use case.

The way we export `openvm/circuit-primitives` to `powdr-labs/powdr-openvm` is EXACTLY the same as how `stark-backend/cuda-common` is exported to `openvm`:

- https://github.com/openvm-org/stark-backend/blob/main/crates/cuda-common/build.rs#L18-L19
- https://github.com/openvm-org/stark-backend/blob/main/crates/cuda-common/Cargo.toml#L7

---------

Co-authored-by: Schaeff <thibaut@powdrlabs.com>
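For readers unfamiliar with the Cargo mechanism involved: a package that sets a `links` key in its `Cargo.toml` can print `cargo:KEY=VALUE` from its build script, and the build scripts of its direct dependents then see an environment variable named `DEP_<LINKS>_<KEY>`. The helpers below sketch the two strings involved. The `cuda/include` path and the `links` value used in the usage note are assumptions for illustration; the linked `build.rs` and `Cargo.toml` lines are the authoritative version.

```rust
use std::path::Path;

/// Directive an exporting crate's build script would print to stdout.
/// `cuda/include` is an assumed location for the header directory.
fn include_directive(manifest_dir: &Path) -> String {
    format!("cargo:include={}", manifest_dir.join("cuda/include").display())
}

/// Name of the environment variable a direct dependent's build script sees,
/// given the exporting crate's `links` value (upper-cased, `-` becomes `_`).
fn dependent_env_var(links: &str) -> String {
    format!("DEP_{}_INCLUDE", links.to_uppercase().replace('-', "_"))
}
```

In an exporting `build.rs`, this would be used as `println!("{}", include_directive(Path::new(env!("CARGO_MANIFEST_DIR"))))`, and a dependent would read the path with `std::env::var(dependent_env_var("circuit-primitives"))`, where `circuit-primitives` is a hypothetical `links` value.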

Golovanov399 and others added 30 commits September 1, 2025 10:32

chore: mention "keygen before commit" in the docs (openvm-org#2111)

d1d600d

ci(vocs): update pnpm lock file (openvm-org#2112)

f14d4c4

ci(vocs): hard copy all symlink dirs (openvm-org#2113)

ea42882

deployed: https://github.com/openvm-org/openvm/actions/runs/17385905364

ci(ami): add RunsOn image for cuda 12.9 (openvm-org#2115)

4048fe6

fix(ci/docs): recommend CUDA toolkit 12.9 (openvm-org#2116)

a72f8a7

ci: build docs with cuda feature (openvm-org#2117)

d35aac5

closes INT-4936

ci: add release workflow to upload agg,halo2 pk to s3 (openvm-org#2114)

721516b

closes INT-4924

chore(audit): ignore cargo audit false positive (openvm-org#2120)

a1b95ff

docs: add whitepaper link to vocs (openvm-org#2119)

19cccf3

ci: skip debug during CI air test (openvm-org#2118)

b9f94ff

ci: script + workflow to run clang-tidy during CUDA lints (openvm-org…

ce26f07

…#2122)

docs(vocs): use versioned rustdocs redirect (openvm-org#2125)

4378963

chore(cuda): use newer ami (openvm-org#2131)

628596b

ci(cuda-13): clean git cache (openvm-org#2132)

2e6631f

git cache was saving stark-backend build with cuda12.9 which doesn't work on new runner images with cuda13.0 closes INT-4948

ci(runs-on): set CUDA_ARCH env var in pre-install (openvm-org#2133)

6c11e67

fix: add std feature to openvm-prof serde (openvm-org#2135)

69bb1e1

chore(cuda): update poseidon2 kernel to use memory manager (openvm-or…

849e2ac

…g#2108) Co-authored-by: stephenh-axiom-xyz <stephenh@intrinsictech.xyz>

fix: log more information for segments (openvm-org#2143)

c31f78e

- log total cells, total interactions and max trace height for each segment

chore: Rename execute_metered_until_suspension to `execute_metered_…

9b7e0a7

…until_suspend` (openvm-org#2144)

chore: bump stark-backend to v1.2.1-rc.2 (openvm-org#2148)

f21b21d

includes openvm-org/stark-backend#131

chore(cuda): BabyBear unified + CUDA_DEBUG + opener in natural order (o…

e83607e

…penvm-org#2149) Co-authored-by: Jonathan Wang <31040440+jonathanpwang@users.noreply.github.com>

jonathanpwang and others added 23 commits September 28, 2025 23:58

fix: execution from non-zero instret (openvm-org#2155)

b804923

`InterpretedInstance::execute_from_state` for pure execution was using `num_insns` instead of `instret_end` in the `ExecutionCtx` constructor. This was not caught because we currently only ever execute from `instret=0`.
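The effect of the bug can be sketched with a simplified model. The `Ctx` type and helper functions are hypothetical, and the sketch assumes that `instret_end` is an absolute bound equal to the starting `instret` plus the number of instructions to run, which is inferred from the description rather than taken from the code.

```rust
/// Hypothetical, simplified execution context that stops once the
/// instruction counter reaches `instret_end`.
struct Ctx {
    instret_end: u64,
}

/// Builds the context with an absolute end bound: start plus budget.
fn ctx_for(start_instret: u64, num_insns: u64) -> Ctx {
    Ctx { instret_end: start_instret.saturating_add(num_insns) }
}

/// Number of instructions that run before the bound is reached.
fn instructions_executed(start_instret: u64, ctx: &Ctx) -> u64 {
    ctx.instret_end.saturating_sub(start_instret)
}
```

Starting from `instret = 0`, passing `num_insns` directly gives the same bound as `ctx_for`, which is why the mistake went unnoticed. Starting from `instret = 50` with a budget of 100, the mistaken bound of 100 would allow only 50 instructions instead of 100.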

chore: add sdk app prove without verify (openvm-org#2164)

426091b

Co-authored-by: Jonathan Wang <31040440+jonathanpwang@users.noreply.github.com>

chore(cuda): keccakf memcpy handling of padding (openvm-org#2167)

9663929

Currently the invalid memory access is never used, but it's still better to address it so `compute-analyzer` does not complain.

feat(cuda): VPMM v3 and async app prover (openvm-org#2165)

8141dac

Co-authored-by: Jonathan Wang <31040440+jonathanpwang@users.noreply.github.com> Co-authored-by: Ayush Shukla <ayush@axiom.xyz>

chore: Remove send + sync bound on executor (openvm-org#2156)

2796704

It is not required after a small refactor.

feat(cuda): use VPMM v3.1 (openvm-org#2169)

b4bed6e

Co-authored-by: Jonathan Wang <31040440+jonathanpwang@users.noreply.github.com>

docs(transpiler): clarify itof function definition (openvm-org#2170)

1b073f8

audit: v1.4.1 report (openvm-org#2172)

1c4a0b4

chore(docs): update link to example repo (openvm-org#2176)

2a63e2a

Renamed `openvm-examples` since it has more than one example.

feat(cli): add SegmentationArgs to prove command (openvm-org#2178)

0af2248

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

release: v1.4.1 (openvm-org#2171)

05cb6a1

- [x] Update workspace version - [x] Update changelog

pull 1.4.1

6ea978f

fix: order of fields in metered execution tracing log (openvm-org#2184)

c2e376e

fix log_pc

f7cfa9e

update pc tracing

65f4e69

fix pc logging

60073d7

perf: avoid re-computing memory merkle tree for user public values pr…

11bad01

…oof (openvm-org#2207) closes INT-4314

ci: fix codspeed instrumentation benchmark (openvm-org#2231)

7e94889

Merge remote-tracking branch 'axiom/main' into HEAD

139bccd

Merge pull request #49 from powdr-labs/ovm-11-16

94d5cac

merge openvm main as of 11/16 (7e94889) over tag `v1.4.1-powdr`
