Conversation

@psilva261
Collaborator

This is still very much at the proof-of-concept stage, and only TinyGo is an option at this point.

Contributor

@kevaundray kevaundray left a comment

Thanks! WAMR was the last approach for converting C to RISC-V. Does this work out of the box with Go programs? (Or maybe you could say what is missing, such that only TinyGo works right now.)

cc @marcinbugaj for visibility

@psilva261
Collaborator Author

No worries! I'm getting a lot of obscure errors, mostly about memory. Maybe it's just a problem in the linker script, and TinyGo works more easily because of its lower footprint. (I've already tried increasing memory.) Also, a lot of platform functions only have a rudimentary implementation at the moment.

By the way, I've also seen the hint in the README. During development I hit similar relocation errors, although these don't appear with the current parameters.

@psilva261
Collaborator Author

I added a few more improvements, so for now at least basic Go programs work out of the box. However, I noticed that the stateless example depends on __floatundisf.

@marcinbugaj
Collaborator

marcinbugaj commented Dec 8, 2025

@psilva261, I'm curious whether it's possible to build the "stateless.wasm" used in this line with WAMR:

./docker/wasm2c-package.sh examples/build-wasm/go/stateless.wasm build/c-packages/stateless
and to compare the performance with the other approaches.

@psilva261
Collaborator Author

> @psilva261, I'm curious whether it's possible to build the "stateless.wasm" used in this line with WAMR:
>
> ./docker/wasm2c-package.sh examples/build-wasm/go/stateless.wasm build/c-packages/stateless
>
> and to compare the performance with the other approaches.

@marcinbugaj To be precise, the WAMR setup isn't doing the C compilation step. Instead, there's an LLVM-based AOT step with wamrc. At least the examples can be reused, and the rest of the pipeline is similar.

So compiling it with

time ./platform/riscv-wamr-qemu/scripts/wasm2wamr-qemu.sh \
    examples/build-wasm/go/stateless.wasm \
    build/bin/stateless.wamr.elf

takes half an hour (with --opt-level=0). There is a hack to speed it up: removing --bounds-checks=1.

https://github.com/eth-act/wasrisc/pull/3/files#diff-3a473d492cab22d22c5faf8f3c61e6bce7a890829d541e94219c83ab439129f0R93

Then it takes just 3 minutes, but there's a fatal error when returning from main. That may not be too complicated to fix, and dropping the option seems acceptable because Go already does bounds checks anyway.

And for running I get:

./docker/docker-shell.sh qemu-riscv64 -plugin /libinsn.so build/bin/stateless.wamr.elf -nographic
...
cpu 0 insns: 3338
total insns: 3338

Or directly:

qemu-system-riscv64 -machine virt -m 1024M -bios none \
    -kernel build/bin/stateless.wamr.elf -nographic
...
Read witness (1212 bytes)
Witness decoded - contains 1 headers, 5 state nodes, 1 code entries
Read block (6130 bytes)
...
Block decoded - #10 with 51 transactions
...
ExecuteStateless succeeded!
State root: 035d6882afc12fcf2cb1e3b862b2db3ed2be74e091b0be25c9a0bda5a131a091
Receipt root: ea25bd6bfa6a11ae4ca5b6a65436b9818d63f01853fdb911688c65d7234d4856
Gas used: 1071000

I'm not sure how much sense this output makes yet; this is just to give an update. I wonder what the best way to call stateless is, and what a good example input would be.

@marcinbugaj
Collaborator

marcinbugaj commented Dec 12, 2025

The result from running qemu-system-riscv64 is correct!

The result from ./docker/docker-shell.sh qemu-riscv64 is incorrect, but I suppose that's expected: your binary targets a bare-metal environment, so it's not supposed to be run by the user-mode emulator qemu-riscv64.

I'll try to get the number of instructions executed with qemu-system-riscv64.

@psilva261
Collaborator Author

Awesome, I see, that makes sense.

@marcinbugaj
Collaborator

In this Dockerfile there is a COPY directive, but the recent main changed from:

docker build -f "$SCRIPT_DIR/Dockerfile" -t "$IMAGE_NAME" "$PROJECT_ROOT"

to

docker build -f "$SCRIPT_DIR/Dockerfile" -t "$IMAGE_NAME" "$SCRIPT_DIR"

to avoid sending a possibly multi-GB build context to the Docker daemon.

Could you please address that problem in this PR?

@marcinbugaj
Collaborator

marcinbugaj commented Dec 15, 2025

When I run

qemu-system-riscv64 -machine virt -m 1024M -bios none -kernel build/bin/stateless.wamr.elf -nographic

the program hangs spinning at the end at

wasm_runtime_unload...

Could that be fixed? Without that, it will be difficult to measure the number of executed instructions with -plugin /libinsn.so.

@marcinbugaj
Collaborator

marcinbugaj commented Dec 15, 2025

With the most recent changes from 553bdce, I tried to build the stateless example with the following changes for O3 optimization:

diff --git a/platform/riscv-wamr-qemu/scripts/wasm2wamr-qemu.sh b/platform/riscv-wamr-qemu/scripts/wasm2wamr-qemu.sh
index da746c1..e125613 100755
--- a/platform/riscv-wamr-qemu/scripts/wasm2wamr-qemu.sh
+++ b/platform/riscv-wamr-qemu/scripts/wasm2wamr-qemu.sh
@@ -88,7 +88,7 @@ echo ""
     --target-abi=lp64 \
     --cpu=generic-rv64 \
     --cpu-features='+i,+m,+a' \
-    --opt-level=0 \
+    --opt-level=3 \
     --size-level=1 \
     --bounds-checks=1 \
     -o $OUTPUT.riscv64.wamr $1
@@ -120,7 +120,7 @@ CFLAGS=(
     -D__bool_true_false_are_defined
     -ffunction-sections
     -fdata-sections
-    -O0
+    -O3
     -g
     -Wall
 )

When I run:

./platform/riscv-wamr-qemu/scripts/wasm2wamr-qemu.sh examples/build-wasm/go/stateless.wasm  build/bin/stateless.wamr.elf

I got:

$ qemu-system-riscv64 -machine virt -m 1024M -bios none -kernel build/bin/stateless.wamr.elf -nographic
[00:00:00:000 - 0]: Warning: loader mmap memory address is not in the first 2 Gigabytes of the process address space.
[00:00:00:000 - 0]: Warning: loader mmap memory address is not in the first 2 Gigabytes of the process address space.
[00:00:00:000 - 0]: Warning: loader mmap memory address is not in the first 2 Gigabytes of the process address space.
runtime load module failed: AOT module load failed: relocation truncated to fit R_RISCV_PCREL_LO12_I failed.

and the program hangs. Is that a known issue? Is it possible to compile the program with O3 optimizations? The same happens with O1.
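
For context on the error above: R_RISCV_PCREL_LO12_I carries the low 12 bits of a PC-relative address and pairs with R_RISCV_PCREL_HI20, so together the two can reach only about 2 GiB in either direction from the instruction. One plausible reading, which is an assumption and not confirmed in this thread, is that with higher optimization levels some target ends up outside that window. A minimal shell sketch of the range check follows; the function name and all addresses are hypothetical and chosen only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: can TARGET be reached from PC with an auipc + 12-bit immediate pair?
# The function name and the addresses below are illustrative assumptions.

# fits_pcrel32 PC TARGET: exit status 0 if the offset is encodable.
fits_pcrel32() {
    local pc=$1 target=$2
    local offset=$(( target - pc ))
    # The low 12 bits are sign-extended, so the high part is computed from
    # offset + 0x800; that sum has to fit into a signed 32-bit value.
    local rounded=$(( offset + 0x800 ))
    (( rounded >= -(1 << 31) && rounded < (1 << 31) ))
}

code=$(( 0x80000000 ))            # hypothetical address of the instruction
near=$(( code + 0x10000000 ))     # 256 MiB away
far=$(( code + 0x100000000 ))     # 4 GiB away

if fits_pcrel32 "$code" "$near"; then echo "near: fits"; else echo "near: truncated"; fi
if fits_pcrel32 "$code" "$far"; then echo "far: fits"; else echo "far: truncated"; fi
```

With these made-up addresses the first target is reachable and the second is not, which is the situation a "relocation truncated to fit" message describes.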

@psilva261
Collaborator Author

The OpenSBI BIOS (-bios default instead of -bios none) is now used in order to have a shutdown ecall. I wasn't able to get the libinsn plugin running, but there are now measurements in the code itself. The plugin is included anyway, with a patch in the Dockerfile. (It seems this is already on its way to being patched upstream.) E.g. for stateless:

...
ExecuteStateless succeeded!
State root: 035d6882afc12fcf2cb1e3b862b2db3ed2be74e091b0be25c9a0bda5a131a091
Receipt root: ea25bd6bfa6a11ae4ca5b6a65436b9818d63f01853fdb911688c65d7234d4856
Gas used: 1071000
wasm_runtime_unload...
instructions runtime init:        104948
instructions runtime load:        268799787
instructions runtime instantiate: 26411864
instructions runtime exec:        5109822536
instructions runtime unload:      10621095
instructions in total:            5415760230
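
As a quick sanity check on the log above, the per-phase counts can be summed and turned into shares. This is only a sketch with integer arithmetic in bash; the function names are illustrative and not part of the repository, and the numbers are copied from the log.

```shell
#!/usr/bin/env bash
# Sketch: share of each runtime phase in the instruction counts quoted above.
# Function names are illustrative assumptions, not part of the repository.

# share_bp PART TOTAL: share of PART in TOTAL in basis points, truncated.
share_bp() {
    echo $(( $1 * 10000 / $2 ))
}

# print_share LABEL PART TOTAL: prints the share as a percentage.
print_share() {
    local bp
    bp=$(share_bp "$2" "$3")
    printf '%-12s %3d.%02d%%\n' "$1" $(( bp / 100 )) $(( bp % 100 ))
}

init=104948
load=268799787
instantiate=26411864
exec_phase=5109822536
unload=10621095
total=$(( init + load + instantiate + exec_phase + unload ))

echo "sum of phases: $total"
print_share init        "$init"        "$total"
print_share load        "$load"        "$total"
print_share instantiate "$instantiate" "$total"
print_share exec        "$exec_phase"  "$total"
print_share unload      "$unload"      "$total"
```

The sum matches the reported total, and execution accounts for roughly 94% of all instructions, so load and instantiate overhead is small for this example.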

> runtime load module failed: AOT module load failed: relocation truncated to fit R_RISCV_PCREL_LO12_I failed.

I didn't know the issue was still there. At least LLVM 20 is now referenced, which in theory supports the large code model, although more changes would be needed to get it working.

> In this Dockerfile there is COPY directive but the recent main changed ...

The WAMR code is now cloned from a separate repo

@marcinbugaj
Collaborator

marcinbugaj commented Dec 18, 2025

I was able to get the output of libinsn.so by using the following options:

./docker/docker-shell.sh qemu-system-riscv64 -d plugin -machine virt -m 1024M -plugin /libinsn.so -kernel build/bin/stateless.wamr.elf -nographic

The option -d plugin was crucial to get the output.

The tail of the log is:

ExecuteStateless succeeded!
State root: 035d6882afc12fcf2cb1e3b862b2db3ed2be74e091b0be25c9a0bda5a131a091
Receipt root: ea25bd6bfa6a11ae4ca5b6a65436b9818d63f01853fdb911688c65d7234d4856
Gas used: 1071000
wasm_runtime_unload...
instructions runtime init:        5156916
instructions runtime load:        3146282344
instructions runtime instantiate: 352465539
instructions runtime exec:        86502855539
instructions runtime unload:      174458148
instructions in total:            90181218486
cpu 0 insns: 5252625086
total insns: 5252625086

So the measurements on my machine with the instret CSR counter differ from yours: "runtime exec" is about 86e9 on my machine and about 5e9 on yours.

Interestingly enough, libinsn.so reported about 5e9, which is consistent with the measurement on your machine.
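
To put numbers on the discrepancy, here is a small sketch that compares the totals quoted in this thread using integer arithmetic in bash. The function names are illustrative assumptions; the counts are copied from the two logs.

```shell
#!/usr/bin/env bash
# Sketch: compare the instruction totals quoted in this thread.
# Function names are illustrative assumptions.

# ratio_x100 A B: A divided by B, scaled by 100 and truncated.
ratio_x100() {
    echo $(( $1 * 100 / $2 ))
}

# format_ratio A B: prints A/B with two decimals, e.g. "1.50x".
format_ratio() {
    local r
    r=$(ratio_x100 "$1" "$2")
    printf '%d.%02dx\n' $(( r / 100 )) $(( r % 100 ))
}

instret_total=90181218486    # "instructions in total" in the log above
plugin_total=5252625086      # "total insns" reported by libinsn.so
earlier_total=5415760230     # "instructions in total" in the earlier log

echo "instret vs plugin, same run:     $(format_ratio "$instret_total" "$plugin_total")"
echo "plugin vs earlier instret total: $(format_ratio "$plugin_total" "$earlier_total")"
```

Within the same run the instret-based total is about 17 times the plugin count, while the plugin count is within a few percent of the earlier instret-based total.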

I'd be grateful if you could, within the scope of this PR, add the steps for WAMR compilation and profiling with libinsn.so to go_benchmark.sh. Also, please update the Go benchmarks section in README.md. It's worth mentioning that the compilation uses -O0 and that higher optimization levels don't work with WAMR.

So far it seems that wasmtime and wasmer with Cranelift are the fastest, being "only" ~3.5 times slower than direct compilation from Go to RV64.

@marcinbugaj
Collaborator

> runtime load module failed: AOT module load failed: relocation truncated to fit R_RISCV_PCREL_LO12_I failed.
>
> I didn't know the issue was still there. At least LLVM 20 is now referenced, which in theory supports the large code model, although more changes would be needed to get it working.

Are you aware of an existing issue for that problem in https://github.com/bytecodealliance/wasm-micro-runtime/issues ? If not, perhaps it makes sense to create one. In the end, WAMR with O3 would be rather crucial for the aforementioned benchmarks.

@psilva261
Collaborator Author

Yeah, the measurement using instret only works when setting -icount shift=0. OK, but it's great that the plugin can be used anyway.

There are open PRs to upgrade LLVM, but they are blocked by problems with the xtensa platform...
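
For reference, -icount shift=N tells QEMU to advance virtual time by 2^N ns per executed instruction, so shift=0 means 1 ns per instruction. The sketch below only assembles and prints a possible invocation; combining -icount with the other flags used earlier in this thread is an assumption, and the helper name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print a QEMU command line that enables -icount shift=0.
# qemu_cmd is a hypothetical helper; the flag combination is an assumption
# assembled from the commands quoted in this thread.

qemu_cmd() {
    local elf=$1
    local cmd=(
        qemu-system-riscv64
        -machine virt -m 1024M
        -bios default        # OpenSBI, needed for the shutdown ecall
        -icount shift=0      # one instruction per ns of virtual time
        -kernel "$elf"
        -nographic
    )
    # Dry run: print the command instead of executing it.
    printf '%s\n' "${cmd[*]}"
}

qemu_cmd build/bin/stateless.wamr.elf
```

Printing the command first makes it easy to review the flags before running it inside ./docker/docker-shell.sh.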

bytecodealliance/wasm-micro-runtime#4213 (LLVM 19 is needed to fix the compile times)
bytecodealliance/wasm-micro-runtime#4654 (LLVM 20 is the first version with code model large support on RISC-V)

I've seen problems with RISC-V 64 mentioned in the project, though:

> XIP is not fully supported yet on RISCV64, some relocations can not be resolved

https://github.com/bytecodealliance/wasm-micro-runtime/blob/main/.github/workflows/spec_test_on_nuttx.yml#L134

But yeah, I can create a specific ticket regarding the optimization and update the PR.

@marcinbugaj
Collaborator

@kevaundray , @psilva261 is this PR ready to be merged?

@psilva261
Collaborator Author

From my side, this covers all the points I currently have open for WAMR AOT support.

There's also an update in the WAMR ticket and, luckily, help was offered, so it might make sense to do follow-up work outside of this PR.

Of course, WAMR is quite open-ended with all its possible features. Some things that could be interesting:

  • WAMR AOT XIP, which supports isolated ROM and RAM like in ziskemu. (My suggestion would be to wait before looking into that, at least until all relative relocation types are supported in WAMR AOT itself.)

  • WAMR JIT might also be interesting, but then executing code from RAM would of course be a hard requirement.

By the way, the WAMR fork referenced from the Dockerfile is also currently in my own namespace: https://github.com/psilva261/wasm-micro-runtime-zkvm (Platform support for baremetal RISC-V, LLVM 20, missing symbols, build file)

@marcinbugaj
Collaborator

That's a massive amount of work, and I'd be grateful if you could merge the PR. The follow-up work could be covered by subsequent PRs.

@psilva261
Collaborator Author

Awesome, thanks a lot!

@psilva261 psilva261 merged commit 6220f87 into eth-act:master Jan 6, 2026
@psilva261 psilva261 deleted the add-wamr branch January 6, 2026 14:38