forked from openvm-org/openvm
1.4.1 upgrade #45
Open: Schaeff wants to merge 53 commits into main from 1.4.1-upgrade
closes INT-4936
) Resolves INT-4950. Some CUDA 13.0 optimizer choices caused the kernel `fri_reduced_opening_tracegen` to use 80 registers (previously 58), so the total number of registers per block exceeded the limit set by the device. The [benchmark difference](https://github.com/axiom-crypto/openvm-reth-benchmark/actions/runs/17476936661) on CUDA 12.9 is negligible. Benchmarks to determine the savings gained by switching to CUDA 13.0 will be run later.
…g#2121)
- pass `pc`, `instret` and `instret_end`/`max_execution_cost`/`segment_check_insns` by value in execution handlers so that they are passed in registers
- add `likely`, `unlikely` hints for suspension/termination in `tco`

[benchmark comparison](https://github.com/axiom-crypto/openvm-reth-benchmark/actions/runs/17513217695#summary-49747838308)

Towards INT-4921
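The branch hints described above can be approximated on stable Rust with a `#[cold]` marker function. The following is a minimal sketch, not the code from this PR: `cold_path` and `run` are hypothetical names, and the actual `likely`/`unlikely` helpers in OpenVM may be implemented differently. It also shows the loop counters being passed by value rather than through a reference to a context struct.

```rust
/// Marker function: a call to a `#[cold]` function tells the optimizer
/// that the branch containing the call is rarely taken.
#[cold]
#[inline(never)]
fn cold_path() {}

/// Hint that `cond` is usually true.
#[inline(always)]
pub fn likely(cond: bool) -> bool {
    if !cond {
        cold_path();
    }
    cond
}

/// Hint that `cond` is usually false.
#[inline(always)]
pub fn unlikely(cond: bool) -> bool {
    if cond {
        cold_path();
    }
    cond
}

/// Hypothetical execution loop. `pc`, `instret` and `instret_end` are plain
/// integers passed by value, so the compiler is free to keep them in registers.
pub fn run(mut pc: u32, mut instret: u64, instret_end: u64) -> (u32, u64) {
    loop {
        // Termination is the rare case, so it is marked as unlikely.
        if unlikely(instret >= instret_end) {
            return (pc, instret);
        }
        pc = pc.wrapping_add(4); // stand-in for executing one instruction
        instret += 1;
    }
}
```

The hints never change the result of the condition; they only influence code layout, so removing them is always safe.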
The git cache was saving a stark-backend build made with CUDA 12.9, which doesn't work on the new runner images with CUDA 13.0.

closes INT-4948
)
- update the `create_tco_handler` macro to `create_handler`, which now automatically sets `exit_code` based on whether the execute impl returns `Result::Err`. It acts as a simple wrapper for execute impls that don't return a `Result`
- only do exit checks for executors that can exit in tco mode, i.e. for execute impls that return `Result`
- feature gate all non-tco functions with `#[cfg(not(feature = "tco"))]`

[benchmark comparison](https://github.com/axiom-crypto/openvm-reth-benchmark/actions/runs/17590682650#summary-49971529404)

Towards INT-4921
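One way to picture a single wrapper that handles both fallible and infallible execute impls is a small trait over the return type. This is a simplified sketch under stated assumptions, not the real `create_handler` macro: `State`, `ExecOutcome`, the example execute functions and the exit code value `1` are all hypothetical.

```rust
/// Hypothetical, simplified VM state.
#[derive(Default)]
pub struct State {
    pub pc: u32,
    pub exit_code: Option<u32>,
}

/// Hypothetical trait: maps the return value of an execute impl to an
/// optional exit code. `Some(code)` means execution must stop.
pub trait ExecOutcome {
    fn exit_code(self) -> Option<u32>;
}

/// Execute impls that return `()` can never exit, so the check is a
/// constant `None` that the optimizer can remove.
impl ExecOutcome for () {
    fn exit_code(self) -> Option<u32> {
        None
    }
}

/// Execute impls that return `Result` exit on `Err`.
impl<E> ExecOutcome for Result<(), E> {
    fn exit_code(self) -> Option<u32> {
        match self {
            Ok(()) => None,
            Err(_) => Some(1), // assumed error exit code
        }
    }
}

/// Sketch of a wrapper macro: generates a handler that runs the execute
/// impl and records an exit code only if the impl reported an error.
macro_rules! create_handler {
    ($handler:ident, $execute:ident) => {
        pub fn $handler(state: &mut State) {
            if let Some(code) = ExecOutcome::exit_code($execute(state)) {
                state.exit_code = Some(code);
            }
        }
    };
}

fn execute_add(state: &mut State) {
    state.pc += 4;
}

fn execute_trap(_state: &mut State) -> Result<(), &'static str> {
    Err("trap")
}

create_handler!(add_handler, execute_add);
create_handler!(trap_handler, execute_trap);
```

With this shape, the generated handler for an infallible impl contains no exit check after optimization, which matches the intent of only checking executors that can exit.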
…T refactor (openvm-org#2127)

Updates `openvm-cuda-backend` with:
- Performance update for better GPU memory usage: openvm-org/stark-backend#114
- Refactor to avoid some sporadic issues with NTT params initialization: openvm-org/stark-backend#123

---------

Co-authored-by: Jonathan Wang <31040440+jonathanpwang@users.noreply.github.com>
…m-org#2137)
- add `ProgramAir` height as a constant trace height in metered execution

This doesn't seem to affect the [benchmark](https://github.com/axiom-crypto/openvm-reth-benchmark/actions/runs/17747443362), which is good.
…g#2108) Co-authored-by: stephenh-axiom-xyz <stephenh@intrinsictech.xyz>
Replace the optimistic execution/segmentation with a checkpointing approach that records the last `trace_height`/`instret` values that are below the thresholds and uses these values for the segments.

This should make the segmentation more predictable for downstream usage, since the segments should satisfy the thresholds. The only caveat is when segmentation happens with no checkpoint to fall back to, i.e. if we overshoot the threshold before the first segmentation check.

This requires storing some extra state and results in a higher segment count than before for the same thresholds. It also makes execution slightly slower, since we're doing some extra work now.

- [benchmark run](https://github.com/axiom-crypto/openvm-reth-benchmark/actions/runs/17750607194) with 0.7B max cells
- [benchmark run](https://github.com/axiom-crypto/openvm-reth-benchmark/actions/runs/17771662984) with 1.2B max cells
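The checkpointing idea can be illustrated with a toy model. This sketch is an assumption-laden simplification, not the metered execution code: it tracks a single trace height, takes the per-instruction height contributions as a slice, and the function name `segment_boundaries` and its parameters are hypothetical.

```rust
/// Returns the `instret` values at which segments end.
///
/// `heights[i]` is the trace height added by instruction `i`. A segmentation
/// check runs every `check_every` instructions. Each check that finds the
/// current segment within `max_height` becomes the new checkpoint; a check
/// that finds it over the limit cuts the segment at the last checkpoint.
pub fn segment_boundaries(heights: &[u64], check_every: usize, max_height: u64) -> Vec<usize> {
    assert!(check_every > 0);
    let mut boundaries = Vec::new();
    // (instret, cumulative height) at the start of the current segment.
    let mut seg_start = (0usize, 0u64);
    // Last check point at which the current segment was within the limit.
    let mut checkpoint: Option<(usize, u64)> = None;
    let mut total = 0u64;

    for (i, h) in heights.iter().enumerate() {
        total += h;
        let instret = i + 1;
        if instret % check_every != 0 {
            continue; // not a segmentation check
        }
        while total - seg_start.1 > max_height {
            // Fall back to the checkpoint. If there is none, we overshot
            // before any valid checkpoint existed, so cut right here.
            let cut = checkpoint.take().unwrap_or((instret, total));
            boundaries.push(cut.0);
            seg_start = cut;
        }
        if instret > seg_start.0 {
            checkpoint = Some((instret, total));
        }
    }
    boundaries
}
```

For example, with twelve instructions of height 1, a check every 2 instructions and a limit of 5, the segments end at `instret` 4 and 8: the check at 6 sees a height of 6 and falls back to the checkpoint at 4. An optimistic scheme would instead have cut at 6, above the limit, which is why the checkpointing variant produces more segments for the same thresholds.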
- log total cells, total interactions and max trace height for each segment
…#2140)
- E2 execution supports suspension every segment. [Reth-benchmark](https://github.com/axiom-crypto/openvm-reth-benchmark/actions/runs/17782856127) doesn't show a performance difference.
- `VmExecState`/`VmState` supports clone. The benchmark shows that cloning a VM state which only uses address space 2 takes 0.3 to 0.6 ms.

closes INT-5066
closes INT-5067
### What
- updates install instructions link

| Before | After |
|--------|--------|
| <img width="1293" height="775" alt="image" src="https://github.com/user-attachments/assets/f30738c1-94a9-49b1-b0ce-1afc19739970" /> | <img width="1293" height="775" alt="image" src="https://github.com/user-attachments/assets/308f5473-e931-4745-a436-d53c019a4223" /> |
…penvm-org#2149) Co-authored-by: Jonathan Wang <31040440+jonathanpwang@users.noreply.github.com>
The `two_modular_limbs_list` constant declared by `moduli_init!` and used later in the complex setup macro was not guaranteed to have a large enough alignment. This caused an issue in some use cases.

This change aligns the public constant `two_modular_limbs_list` to the max block size of a modulus among all moduli declared, to ensure no memory alignment issues arise.

Fixes INT-5135

---------

Co-authored-by: Jonathan Wang <31040440+jonathanpwang@users.noreply.github.com>
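To illustrate why the alignment was not guaranteed: a plain byte array such as `[u8; N]` has alignment 1, so the linker may place it at any address. One way to force a larger alignment is a wrapper type with `#[repr(align(...))]`. The sketch below assumes a max block size of 32 bytes and uses a hypothetical `AlignedLimbs` wrapper with made-up contents; the code that `moduli_init!` actually generates may look different.

```rust
/// Hypothetical wrapper that forces 32-byte alignment on its contents.
/// 32 stands in for the max block size among all declared moduli;
/// `repr(align)` needs a literal, so a macro would have to pick the value.
#[repr(align(32))]
pub struct AlignedLimbs<const N: usize>(pub [u8; N]);

/// Stand-in for the generated constant; the bytes here are illustrative only.
#[allow(non_upper_case_globals)]
pub static two_modular_limbs_list: AlignedLimbs<6> = AlignedLimbs([1, 0, 0, 2, 0, 0]);

/// Returns true if the static's address is a multiple of 32.
pub fn limbs_are_aligned() -> bool {
    (&two_modular_limbs_list as *const AlignedLimbs<6> as usize) % 32 == 0
}
```

The wrapper changes only the placement of the data, not its contents, so readers of the inner array are unaffected.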
For guest programs with MSRV 1.87+, it is convenient for `cargo openvm build` to use a newer Rust toolchain. We chose `2025-08-02` since that is when the latest stable release, Rust 1.90.0, branched from master: https://releases.rs/docs/1.90.0/

Note that users can always override the toolchain version with the env var `OPENVM_RUST_TOOLCHAIN`.

workflow run: https://github.com/axiom-crypto/openvm-reth-benchmark/actions/runs/18085897389/job/51456939479
`InterpretedInstance::execute_from_state` for pure execution was using `num_insns` instead of `instret_end` in the `ExecutionCtx` constructor. This was not caught because we currently only ever execute from `instret=0`.
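The difference only shows up when execution resumes from a non-zero `instret`. The sketch below assumes that `instret_end` is the starting `instret` plus `num_insns`; `ExecutionCtx` here is a hypothetical one-field stand-in, and `ctx_buggy`/`ctx_fixed` are illustrative names, not functions from the codebase.

```rust
/// Hypothetical, simplified stand-in for the execution context: it only
/// stores the absolute instruction count at which execution must stop.
pub struct ExecutionCtx {
    pub instret_end: u64,
}

impl ExecutionCtx {
    pub fn new(instret_end: u64) -> Self {
        Self { instret_end }
    }
}

/// Buggy variant: passes the relative count `num_insns` where an absolute
/// end point is expected. Correct only when starting from `instret = 0`.
pub fn ctx_buggy(_from_instret: u64, num_insns: u64) -> ExecutionCtx {
    ExecutionCtx::new(num_insns)
}

/// Fixed variant: converts the relative count into an absolute end point.
pub fn ctx_fixed(from_instret: u64, num_insns: u64) -> ExecutionCtx {
    ExecutionCtx::new(from_instret.saturating_add(num_insns))
}
```

Starting from `instret = 0`, both variants agree, which is why the bug went unnoticed. Resuming at `instret = 100` for 50 instructions, the buggy variant yields an end point of 50, which is already in the past, while the fixed one yields 150.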
Co-authored-by: Jonathan Wang <31040440+jonathanpwang@users.noreply.github.com>
Currently the invalid memory access is never used, but it's still better to address it so `compute-analyzer` does not complain.
Co-authored-by: Jonathan Wang <31040440+jonathanpwang@users.noreply.github.com> Co-authored-by: Ayush Shukla <ayush@axiom.xyz>
It is not required after a small refactor.
Co-authored-by: Jonathan Wang <31040440+jonathanpwang@users.noreply.github.com>
Renamed `openvm-examples` since it has more than one example.
In an edge case, if the program panics right after we start `build_async` on the memory merkle subtrees, the drop order could be such that the `initial_memory` buffers on the default stream are dropped first, while the `build_async` kernels are still running and using those buffers. This leads to a deadloop. I fixed it by forcing the subtrees to be dropped first (which should sync their special streams) before `initial_memory` is dropped.

compare: https://github.com/axiom-crypto/openvm-reth-benchmark/actions/runs/18733111153
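Rust drops struct fields in declaration order, including during panic unwinding, so one way to force such an ordering is to declare the dependent field first. The sketch below demonstrates only the language rule with hypothetical types (`Tracked`, `MemoryInventory`); the actual fix may use an explicit `Drop` impl instead, and the real subtree drop would also synchronize its CUDA stream.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared log that records the order in which values are dropped.
type Log = Rc<RefCell<Vec<&'static str>>>;

/// Stand-in for a GPU resource: records its name when dropped.
struct Tracked {
    name: &'static str,
    log: Log,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        // In the real code, dropping a subtree would sync its stream here.
        self.log.borrow_mut().push(self.name);
    }
}

/// Hypothetical container. Fields are dropped in declaration order, so
/// `subtrees` (whose kernels read the buffers) is declared before
/// `initial_memory` (the buffers themselves).
#[allow(dead_code)]
pub struct MemoryInventory {
    subtrees: Tracked,
    initial_memory: Tracked,
}

/// Builds an inventory whose two fields report their drops to `log`.
pub fn new_inventory(log: Log) -> MemoryInventory {
    MemoryInventory {
        subtrees: Tracked { name: "subtrees", log: log.clone() },
        initial_memory: Tracked { name: "initial_memory", log },
    }
}
```

Swapping the two field declarations would reverse the order, which is the situation the commit describes: the buffers are freed while work that uses them may still be in flight.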
When building autoprecompile chips in GPU in `powdr-labs/powdr`, we need access to `histogram.cuh` in `openvm/circuit-primitives` to modify counts of periphery chips. This requires exporting the include dir in `openvm/circuit-primitives` for downstream crates (`powdr-openvm`). We believe this (and potentially other crates) to be general client extension usage.

The way we export `openvm/circuit-primitives` to `powdr-labs/powdr-openvm` is EXACTLY the same as how `stark-backend/cuda-common` is exported to `openvm`:
- https://github.com/openvm-org/stark-backend/blob/main/crates/cuda-common/build.rs#L18-L19
- https://github.com/openvm-org/stark-backend/blob/main/crates/cuda-common/Cargo.toml#L7

---------

Co-authored-by: Schaeff <thibaut@powdrlabs.com>
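Cargo's general mechanism for this kind of export is build-script metadata: a crate that sets a `links` key in its `Cargo.toml` can print `cargo:KEY=VALUE` from `build.rs`, and the build scripts of its direct dependents then see the value in the environment variable `DEP_<LINKS>_<KEY>`. The sketch below assumes the export works this way; the `cuda/include` directory layout and the helper name `include_directive` are hypothetical.

```rust
use std::path::Path;

/// Builds the Cargo directive that exports an include directory to the
/// build scripts of dependent crates. The `cuda/include` layout is assumed.
pub fn include_directive(manifest_dir: &Path) -> String {
    let include_dir = manifest_dir.join("cuda").join("include");
    format!("cargo:include={}", include_dir.display())
}

// In the exporting crate's build.rs, `main` would print the directive:
//
//     let dir = std::env::var("CARGO_MANIFEST_DIR").unwrap();
//     println!("{}", include_directive(Path::new(&dir)));
//
// A dependent crate's build.rs would then read `DEP_<LINKS>_INCLUDE`,
// where <LINKS> is the upper-cased `links` value of the exporting crate,
// and pass that path to the CUDA compiler as an include directory.
```

The `links` key is required for the metadata to be forwarded, and it is only visible to direct dependents, so each crate along the chain has to re-export what its own dependents need.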
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- [x] Update workspace version
- [x] Update changelog
…oof (openvm-org#2207) closes INT-4314
merge openvm main as of 11/16 (7e94889) over tag `v1.4.1-powdr`