LLM Ollama Implementation #12
base: main
Conversation
Could you explain the bit about the test framework? Does the ticket mean opening a PR in golem, against the test framework, that would start a Docker container, download and run Ollama, download the models (vision, text), and then run new tests inspired by test:llm? That would be a lot of downloads just for a simple test; currently golem only uses Postgres/SQLite Docker containers within the service tests, both of which are comparatively small downloads. I have a PR with a video recording of me running all 6 tests, @mschuwalow @vigoo please review, thank you. The test works by building a component from the test:llm build profile ollama, creating a worker, and running the test in golem.
The video looks very good. I'm currently at a conference and have no cycles to do a proper review, but I'll do that in the next few days.
@Nanashi-lab See #13 |
@Nanashi-lab It looks like this implementation was at least partially copied from this one. We do not allow copying from existing work and require that all contributions be original. |
@jdegoes It is the other way around. My implementation was up on the 12th; he force-pushed a commit on the 16th. He has force-pushed twice, so it is hard to see what the original commit was. Among his git branches there is a branch test-25, and two weeks ago, when the PR was made, his code was -> Link. You can read the whole branch history here - Branch History. I have made a local copy now as a backup. When the PR was made, he did not have support for Image or Tool Calling; he was using the default API. Now - After the 16th force push
license = "Apache-2.0" | ||
homepage = "https://golem.cloud" | ||
repository = "https://github.com/golemcloud/golem-llm" | ||
description = "WebAssembly component for working with Ollama APIs, integrated into Golem Cloud local model support" |
description = "WebAssembly component for working with Ollama APIs, integrated into Golem Cloud local model support" | |
description = "WebAssembly component for working with Ollama APIs, with special support for Golem Cloud" |
Please add automated tests in CI as well. Using the smallest models for Ollama should be reasonably fast.
let response: Response = self
    .client
    .request(Method::POST, format!("{}/chat/completions", self.base_url))
Ollama's OpenAI API seems to be experimental: https://github.com/ollama/ollama/blob/main/docs/openai.md#openai-compatibility
What is the reason for using that one instead of their native API?
Ollama's default API uses NDJSON for streaming, while the streaming support in llm is built around SSE.
There should be nothing special about the way that durability is implemented for SSE. Please try to use the same approach to implement support for NDJSON streams and then switch to using the native Ollama API.
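For reference, here is a minimal sketch (not the crate's actual streaming code, and ignoring the durability/WASI plumbing) of how an NDJSON body can be decoded line by line. It assumes the chunk shape documented for Ollama's native /api/chat responses ("message", "done") and the serde/serde_json crates; the reader type is just a stand-in for whatever byte stream the component actually has.

use std::io::{BufRead, BufReader, Read};

use serde::Deserialize;

// Shape of one streamed /api/chat chunk, per Ollama's docs.
#[derive(Deserialize)]
struct OllamaChatChunk {
    message: Option<OllamaMessage>,
    done: bool,
}

#[derive(Deserialize)]
struct OllamaMessage {
    content: String,
}

// Each NDJSON line is a complete JSON object; unlike SSE there is no
// "data: " prefix to strip and no event/id framing to track.
fn read_ndjson_stream<R: Read>(body: R) -> std::io::Result<String> {
    let mut full = String::new();
    for line in BufReader::new(body).lines() {
        let line = line?;
        if line.trim().is_empty() {
            continue; // skip keep-alive blank lines
        }
        let chunk: OllamaChatChunk = serde_json::from_str(&line)
            .map_err(|e| std::io::Error::new(std::io::ErrorKind::InvalidData, e))?;
        if let Some(msg) = chunk.message {
            full.push_str(&msg.content);
        }
        if chunk.done {
            break;
        }
    }
    Ok(full)
}

The per-line decode is the only piece that differs from the SSE path, which is why the same durability approach should carry over.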
};

// Handle tool choice configuration
let tool_choice = if let Some(tc) = config.tool_choice.as_ref() {
The tool_choice parameter does not seem to be supported: https://github.com/ollama/ollama/blob/main/docs/openai.md?plain=1#L245
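If it helps, a hedged sketch of one way to handle this (Config and its fields here are stand-ins for illustration, not the crate's real types): forward tools but drop tool_choice, since the linked docs list it as unsupported.

use serde_json::{json, Value};

// Hypothetical config holder, only for this sketch.
struct Config {
    tools: Option<Value>,        // already-serialized tool definitions
    tool_choice: Option<String>, // requested tool-choice mode
}

fn request_body(messages: Value, config: &Config) -> Value {
    let mut body = json!({ "messages": messages });
    if let Some(tools) = &config.tools {
        // Assumes the endpoint accepts "tools"; only "tool_choice" is the flagged gap.
        body["tools"] = tools.clone();
    }
    // Intentionally not forwarding config.tool_choice: per the linked Ollama
    // docs it is not a supported field, so sending it risks an error or a
    // silently ignored setting.
    let _ = &config.tool_choice;
    body
}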
There is definitely a lot of overlap in your implementations, and I think it's very likely that one of you copied from the other. As both of you are still missing some aspects of the bounty, I ask you to implement the missing features. Do this without copying and without force pushes, please. Whoever ends up having the first complete implementation will be awarded the bounty. The following features are missing:
/closes #6
/claim #6
A video demonstrating the test process: Link to video
Includes both worker invoke and worker stream.
To run the tests, follow these steps:
Test 1, 2, 3, 5, 6: ✅
Test 4: ⚠️ Partial Success
Test 4 (Tool Use): Ollama's streaming doesn't fully support tool calls; refer to the API. The code is there, so when Ollama supports this, our code should work as is.
Test 5 (Vision with Moondream): This test uses the moondream model (the only one that will work on my system) to check whether image understanding (vision) works via Ollama. moondream is an older, small model that works on my system; it doesn't know Hungarian, but it does reply with some vague idea of the image. (Repeat the test and it gets it somewhat right within 3-4 tries.) The default response for test 5 is empty, but you can see the reply in the stream. Shouldn't it output the reply?
Sending Images to Ollama: The Ollama API has a native interface (/generate, /chat) and an experimental one that acts like OpenAI's (/v1). I use the /v1 one (like OpenAI) for chat and vision (/v1/chat/completions), as streaming in the default API uses NDJSON and not SSE.
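For illustration only, a sketch of the two message shapes involved, assuming the /v1 route accepts OpenAI-style image_url content parts with a base64 data URI while the native /api/chat route takes a plain images array of base64 strings (field names follow the respective docs; nothing here is the PR's actual code).

use serde_json::{json, Value};

// OpenAI-compatible /v1/chat/completions: the image goes in a content part
// as a data URI.
fn openai_style_image_message(prompt: &str, base64_png: &str) -> Value {
    json!({
        "role": "user",
        "content": [
            { "type": "text", "text": prompt },
            { "type": "image_url",
              "image_url": { "url": format!("data:image/png;base64,{base64_png}") } }
        ]
    })
}

// Native /api/chat: the message carries a separate "images" array of
// raw base64 strings.
fn native_style_image_message(prompt: &str, base64_png: &str) -> Value {
    json!({
        "role": "user",
        "content": prompt,
        "images": [base64_png]
    })
}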