LLM Ollama Implementation #12

Open
wants to merge 1 commit into base: main

Conversation

@Nanashi-lab Nanashi-lab commented May 12, 2025

/closes #6
/claim #6

A video demonstrating the test process: Link to video
It covers both the worker invoke and worker stream paths.

To run the tests, follow these steps:

  1. Ensure Docker is installed and running.
  2. Pull the required Ollama models:
    docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
    docker exec -it ollama ollama pull moondream # for image test
    docker exec -it ollama ollama pull qwen3:1.7b # for text tests
  3. In the main project directory, build the application:
    cargo make build
  4. In the test directory, deploy the application using the specific build profile:
    golem-cli app deploy --build-profile ollama-debug
  5. No environment variables are needed; the base URL defaults to localhost:11434 and can be overridden if needed. Create the worker and run the tests.
  • Tests 1, 2, 3, 5, 6: ✅

  • Test 4: ⚠️ Partial success

  • Test 4 (Tool Use): Ollama's streaming doesn't fully support tool calls yet (refer to the Ollama API docs). The code is in place, so once Ollama supports this, our code should work as is.

  • Test 5 (Vision with Moondream): This test uses the moondream model (the only one that works on my system) to check whether image understanding (vision) works via Ollama. moondream is an older, small model; it doesn't know Hungarian, but it does reply with some vague idea of the image. (If you repeat the test, it gets it somewhat right within 3-4 tries.)
    The default response for test 5 is empty, but you can see the reply in the stream. Shouldn't it output the reply?

Sending Images to Ollama:

  • Image URLs are not supported by Ollama, but base64 is; golem-llm doesn't currently support base64 images.
  • We first download the image, convert it to base64 text, and then send that text to Ollama in the format it expects (a minimal sketch of this flow follows below).
  • Since the Anthropic API can also accept base64, it would be nice for golem-llm to support base64 directly.
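Below is a minimal sketch of this download-and-encode flow, assuming a blocking reqwest client and the base64 crate; the function name is illustrative, not the actual code in this PR.

    // Sketch: fetch an image from a URL and convert it to base64 text so it
    // can be embedded in an Ollama chat request. Assumes `reqwest` (blocking)
    // and `base64` are available; error handling is simplified.
    use base64::{engine::general_purpose::STANDARD, Engine as _};

    fn image_url_to_base64(url: &str) -> Result<String, Box<dyn std::error::Error>> {
        // Download the raw image bytes.
        let bytes = reqwest::blocking::get(url)?.bytes()?;
        // Encode them as base64, the only image format Ollama accepts.
        Ok(STANDARD.encode(&bytes))
    }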

Ollama API

  • Ollama has two APIs: its own native API (/api/generate, /api/chat) and an experimental OpenAI-compatible one (/v1).
  • We use the OpenAI-compatible /v1 API for chat and vision (/v1/chat/completions), because streaming in the default API uses NDJSON rather than SSE (a request sketch follows below).
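For reference, here is a minimal sketch of a request against that endpoint, assuming a blocking reqwest client (with the json feature) and serde_json; the model name and base URL are examples only, not values hard-coded in the component.

    // Sketch: a non-streaming chat request to Ollama's OpenAI-compatible
    // endpoint. The model name and base URL are examples.
    use serde_json::json;

    fn chat_completion(prompt: &str) -> Result<serde_json::Value, Box<dyn std::error::Error>> {
        let body = json!({
            "model": "qwen3:1.7b",
            "messages": [{ "role": "user", "content": prompt }],
            "stream": false
        });
        let response = reqwest::blocking::Client::new()
            .post("http://localhost:11434/v1/chat/completions")
            .json(&body)
            .send()?;
        Ok(response.json()?)
    }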

@Nanashi-lab (Author)

Could you explain the bit about the test framework? Does the ticket expect a PR to golem's test-framework, which would start a Docker container, download and run Ollama, download the models (vision and text), and then run new tests inspired by test:llm?

That would be a lot of downloads just for a simple test. Currently golem only uses Postgres/SQLite Docker tests within the service tests, both of which are comparatively small downloads.

My PR includes a video recording of me running all 6 tests. @mschuwalow @vigoo please review, thank you.

The testing here was done by creating a component from test:llm with the ollama build profile, creating a worker, and running the tests in golem.

@mschuwalow

The video looks very good.
If you could write a script and do what you did in the video in CI, it would perfectly cover the testing requirements.

I'm currently at a conference and have no cycles to do a proper review, but I'll do that in the next few days.

@jdegoes jdegoes commented May 16, 2025

@Nanashi-lab See #13

@jdegoes jdegoes commented May 19, 2025

@Nanashi-lab It looks like this implementation was at least partially copied from this one. We do not allow copying from existing work and require that all contributions be original.

@Nanashi-lab (Author)

> @Nanashi-lab It looks like this implementation was at least partially copied from this one. We do not allow copying from existing work and require that all contributions be original.

@jdegoes It is the other way around: my implementation was pushed on the 12th, and he force-pushed a commit on the 16th. He has force-pushed twice, so it is hard to see what the original commit was.

Among his git branches there is a branch test-25, and two weeks ago, when the PR was made, his code was -> Link

You can read the whole branch history here - Branch History. I have made a local copy now as a backup.

When the PR was made, he did not have support for images or tool calling. He was using the default API at /api/chat, with no streaming implementation that I can see, since /api/chat streams NDJSON rather than SSE (NDJSON is what Ollama's native API returns).

Now, after the 16th force push:

  1. He uses the OpenAI-compatible API, after my PR did.
  2. He supports images in the same way as mine: download the image, convert it into base64, and then pass it along.
  3. Our code looks awfully similar.

license = "Apache-2.0"
homepage = "https://golem.cloud"
repository = "https://github.com/golemcloud/golem-llm"
description = "WebAssembly component for working with Ollama APIs, integrated into Golem Cloud local model support"

Suggested change
description = "WebAssembly component for working with Ollama APIs, integrated into Golem Cloud local model support"
description = "WebAssembly component for working with Ollama APIs, with special support for Golem Cloud"

@mschuwalow mschuwalow left a comment

Please add automated tests in CI as well. Using the smallest models for Ollama should be reasonably fast.


let response: Response = self
    .client
    .request(Method::POST, format!("{}/chat/completions", self.base_url))

Ollama's OpenAI-compatible API seems to be experimental: https://github.com/ollama/ollama/blob/main/docs/openai.md#openai-compatibility

What is the reason for using that one instead of their native API?

@Nanashi-lab (Author)

Ollama's default API uses NDJSON for streaming, while the streaming support in golem-llm is built for SSE.

@mschuwalow mschuwalow May 19, 2025

There should be nothing special about the way that durability is implemented for SSE. Please try to use the same approach to implement support for NDJSON streams, and then switch to using the native Ollama API.
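For reference, line-based parsing of the native /api/chat NDJSON stream could look roughly like the sketch below; the field names follow Ollama's documented response shape, and the reader type is just an assumption, not a prescription for how to wire this into the durability layer.

    // Sketch: consume Ollama's native /api/chat NDJSON stream line by line.
    // Each line is a standalone JSON object; `message.content` carries the
    // partial text and `done: true` marks the end of the stream.
    use std::io::{BufRead, BufReader, Read};

    fn read_ndjson_stream<R: Read>(body: R) -> Result<String, Box<dyn std::error::Error>> {
        let mut full = String::new();
        for line in BufReader::new(body).lines() {
            let line = line?;
            if line.trim().is_empty() {
                continue; // skip blank keep-alive lines
            }
            let chunk: serde_json::Value = serde_json::from_str(&line)?;
            if let Some(text) = chunk["message"]["content"].as_str() {
                full.push_str(text);
            }
            if chunk["done"].as_bool() == Some(true) {
                break;
            }
        }
        Ok(full)
    }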

};

// Handle tool choice configuration
let tool_choice = if let Some(tc) = config.tool_choice.as_ref() {

@mschuwalow

@Nanashi-lab @varshith257

There is definitely a lot of overlap in your implementations, and I think it's very likely that one of you copied from the other.
As there were some force pushes done by @varshith257, I cannot easily tell right now whose implementation came first.

As both of you are still missing some aspects of the bounty, I ask you to implement the missing features. Do this without copying and without force pushes please. Whoever ends up having the first complete implementation will be awarded the bounty.
In case there are more accusations of copying, we'll use the github push timestamps to decide who was first.
If the push timestamps cannot be used due to force pushes, we'll assume the other person came first.

The following features are missing:

  • Using the native Ollama API (which implies adding durability for NDJSON streams)
  • Automated tests as part of the CI pipeline, using the smallest possible models
