Inline Image Support #15

Merged
merged 7 commits into from
May 20, 2025

Nanashi-lab
Contributor

@Nanashi-lab Nanashi-lab commented May 17, 2025

/closes #13
/claim #13

Link to Video -
The video includes runs of test5 and test7 for Anthropic, Grok, OpenAI, and OpenRouter.

The tests are run by:

  • golem-cli app deploy --build-profile <model>-debug
  • Adding a new worker with the provider-specific env
  • golem-cli worker invoke <worker> test5 and golem-cli worker invoke <worker> test7

Changes to the WIT

  record image-source {
    data: list<u8>,
    mime-type: string,
    detail: option<image-detail>,
  }

  variant image-reference {
    url(image-url),
    inline(image-source),
  }

  variant content-part {
    text(string),
    image(image-reference),
  }
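
To make the shape of these types concrete, here is a hand-written Rust mirror of the WIT above. It is only a sketch: the real bindings are generated from the WIT file, and the `ImageUrl` and `ImageDetail` definitions are assumptions, since neither is shown in this PR. `describe` is a hypothetical helper that shows how a provider implementation could match on the nested variant.

```rust
// Hand-written illustration of the WIT types; not the generated bindings.

/// Assumed shape of `image-detail` (not shown in this PR).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ImageDetail {
    Low,
    High,
    Auto,
}

/// Assumed shape of the existing `image-url` record (not shown in this PR).
#[derive(Debug, Clone, PartialEq)]
pub struct ImageUrl {
    pub url: String,
    pub detail: Option<ImageDetail>,
}

/// Mirrors `record image-source`: raw bytes plus a MIME type.
#[derive(Debug, Clone, PartialEq)]
pub struct ImageSource {
    pub data: Vec<u8>,
    pub mime_type: String,
    pub detail: Option<ImageDetail>,
}

/// Mirrors `variant image-reference`: either a URL or inline bytes.
#[derive(Debug, Clone, PartialEq)]
pub enum ImageReference {
    Url(ImageUrl),
    Inline(ImageSource),
}

/// Mirrors `variant content-part`.
#[derive(Debug, Clone, PartialEq)]
pub enum ContentPart {
    Text(String),
    Image(ImageReference),
}

/// Hypothetical helper: one match arm per case a provider has to handle.
pub fn describe(part: &ContentPart) -> String {
    match part {
        ContentPart::Text(t) => format!("text ({} chars)", t.chars().count()),
        ContentPart::Image(ImageReference::Url(u)) => format!("image url {}", u.url),
        ContentPart::Image(ImageReference::Inline(s)) => {
            format!("inline {} image ({} bytes)", s.mime_type, s.data.len())
        }
    }
}
```

Keeping both image forms under a single `image` case means code that only cares about text versus image needs two arms, while provider code can still distinguish URL from inline data one level down.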

In an earlier commit, I used content-part {text, image-url, image-inline}.
I chose the current design over that earlier idea and made both image-url and image-source part of the same image(image-reference) case.
OpenAI, Grok, and OpenRouter support inline images under image-url with a small syntax change, so the implementation is straightforward.
Anthropic has a direct way to pass a Base64-encoded image, so the implementation uses that built-in Base64 support.
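
As a rough illustration of the two encodings described above, the sketch below turns inline bytes into a `data:` URL (the form the OpenAI-style APIs accept in place of a regular image URL) and into an Anthropic-style Base64 source object. This is not the code from this PR: the function names are hypothetical, the Base64 encoder is written by hand only to keep the example dependency-free, and the JSON is built with `format!` on the assumption that the MIME type needs no escaping.

```rust
// Standard Base64 alphabet (RFC 4648), 64 symbols.
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Minimal Base64 encoder with `=` padding, for illustration only.
pub fn base64_encode(data: &[u8]) -> String {
    let mut out = String::with_capacity((data.len() + 2) / 3 * 4);
    for chunk in data.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the tail.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[((n >> 18) & 63) as usize] as char);
        out.push(ALPHABET[((n >> 12) & 63) as usize] as char);
        if chunk.len() > 1 {
            out.push(ALPHABET[((n >> 6) & 63) as usize] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(ALPHABET[(n & 63) as usize] as char);
        } else {
            out.push('=');
        }
    }
    out
}

/// Hypothetical helper: inline bytes as a data URL, usable where an
/// OpenAI-style API expects the image URL string.
pub fn to_data_url(mime_type: &str, data: &[u8]) -> String {
    format!("data:{};base64,{}", mime_type, base64_encode(data))
}

/// Hypothetical helper: the `source` object of an Anthropic image block.
/// Assumes `mime_type` contains no characters that need JSON escaping.
pub fn to_anthropic_source(mime_type: &str, data: &[u8]) -> String {
    format!(
        r#"{{"type":"base64","media_type":"{}","data":"{}"}}"#,
        mime_type,
        base64_encode(data)
    )
}
```

In a real implementation a maintained Base64 crate and a JSON serializer would be the more sensible choice; the point here is only that both provider families need the same bytes-to-Base64 step and differ in how the result is wrapped.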

Test 7

There is a cat.png under data/cat.png in the test. We use the Golem initial file system (configured in YAML) to have the workers import the cat image as a byte array, and the test gets back a response with a description of the image.
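
A minimal sketch of the "import the image as a byte array" step, using only the standard library. The helper names are hypothetical, and the signature check is an addition for illustration; the PR does not say that Test 7 sniffs the format. The path passed to `load_image` would be wherever the initial file system mounts cat.png, which is an assumption here.

```rust
use std::{fs, io, path::Path};

/// Guess a MIME type from the leading magic bytes (PNG and JPEG only).
pub fn sniff_mime_type(bytes: &[u8]) -> Option<&'static str> {
    const PNG: &[u8] = &[0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];
    const JPEG: &[u8] = &[0xFF, 0xD8, 0xFF];
    if bytes.starts_with(PNG) {
        Some("image/png")
    } else if bytes.starts_with(JPEG) {
        Some("image/jpeg")
    } else {
        None
    }
}

/// Read an image file into memory and pair it with its sniffed MIME type,
/// ready to be placed into an inline image record.
pub fn load_image(path: impl AsRef<Path>) -> io::Result<(Vec<u8>, &'static str)> {
    let bytes = fs::read(path)?;
    let mime = sniff_mime_type(&bytes).ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidData, "unrecognized image format")
    })?;
    Ok((bytes, mime))
}
```

A call such as `load_image("/data/cat.png")` (path assumed) would yield the bytes and `"image/png"` for a valid PNG file.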

Test 5

Test 5 previously did not produce any output (unsure why). It now outputs the contents of the LLM call as a string. This is needed so that the CI Ollama test can assert on the output of all tests.

Added a few tests in durability.rs for base64 images.

Add wit changes
Add Durability for Inline
Implement for Openai, Openrouter,anthropic,grok
Add Test7
@Nanashi-lab Nanashi-lab marked this pull request as ready for review May 18, 2025 00:07
@Nanashi-lab
Contributor Author

Nanashi-lab commented May 18, 2025

Build and tests pass, but clippy fails with the errors below. I cannot replicate this locally; cargo make fix passes for me. Similar clippy errors have also shown up in many recent PRs (last few days) in golem, where unit tests fail the clippy check and I cannot replicate that locally either. Not sure why.

I can add #[allow(clippy::result_large_err)] to the code if that is a good way to deal with this; it is not code I have touched.

error: the `Err`-variant returned from this function is very large
  --> llm/src/event_source/mod.rs:37:39
   |
37 |     pub fn new(response: Response) -> Result<Self, Error> {
   |                                       ^^^^^^^^^^^^^^^^^^^
   |
  ::: llm/src/event_source/error.rs:42:5
   |
42 |     InvalidContentType(HeaderValue, Response),
   |     ----------------------------------------- the largest variant contains at least 288 bytes
...
45 |     InvalidStatusCode(StatusCode, Response),
   |     --------------------------------------- the variant `InvalidStatusCode` contains at least 250 bytes
   |
   = help: try reducing the size of `event_source::error::Error`, for example by boxing large elements or replacing it with `Box<event_source::error::Error>`
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#result_large_err
   = note: `-D clippy::result-large-err` implied by `-D warnings`
   = help: to override `-D warnings` add `#[allow(clippy::result_large_err)]`

error: the `Err`-variant returned from this function is very large
  --> llm/src/event_source/mod.rs:97:42
   |
97 | fn check_response(response: Response) -> Result<Response, Error> {
   |                                          ^^^^^^^^^^^^^^^^^^^^^^^
   |
  ::: llm/src/event_source/error.rs:42:5
   |
42 |     InvalidContentType(HeaderValue, Response),
   |     ----------------------------------------- the largest variant contains at least 288 bytes
...
45 |     InvalidStatusCode(StatusCode, Response),
   |     --------------------------------------- the variant `InvalidStatusCode` contains at least 250 bytes
   |
   = help: try reducing the size of `event_source::error::Error`, for example by boxing large elements or replacing it with `Box<event_source::error::Error>`
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#result_large_err

error: could not compile `golem-llm` (lib) due to 2 previous errors
warning: build failed, waiting for other jobs to finish...
error: could not compile `golem-llm` (lib test) due to 2 previous errors
Error while executing command, exit code: 101

@vigoo
Collaborator

vigoo commented May 18, 2025

The new lint errors are coming because of the new Rust version (1.87). CI is always using the latest stable. It's ok to add #allow temporarily to the existing code it is complaining about to make it pass. (In the golem repo you can rebase to the latest main where I already did that)

@Nanashi-lab
Contributor Author

Done. Fixed clippy by adding #[allow(clippy::result_large_err)].
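
For context, the two ways of satisfying this lint can be sketched as follows. This is not the golem-llm code: `BigPayload` is a stand-in for the large `Response` value carried by the real error variants, and all names are hypothetical. The first function silences the lint, as done in this PR; the second follows the lint's own suggestion and boxes the large field, which shrinks the `Err` variant to roughly pointer size.

```rust
use std::mem::size_of;

/// Stand-in for a large value carried inside an error (hypothetical).
pub struct BigPayload(pub [u8; 256]);

/// Error enum whose largest variant embeds the payload by value.
pub enum LargeError {
    InvalidStatus(u16, BigPayload),
    Other,
}

/// Same error, but the payload lives on the heap behind a Box.
pub enum BoxedError {
    InvalidStatus(u16, Box<BigPayload>),
    Other,
}

/// Option 1: keep the large error type and allow the lint on the function.
#[allow(clippy::result_large_err)]
pub fn check_large(status: u16) -> Result<(), LargeError> {
    match status {
        200..=299 => Ok(()),
        0 => Err(LargeError::Other),
        _ => Err(LargeError::InvalidStatus(status, BigPayload([0; 256]))),
    }
}

/// Option 2: box the large field so the Result stays small.
pub fn check_boxed(status: u16) -> Result<(), BoxedError> {
    match status {
        200..=299 => Ok(()),
        0 => Err(BoxedError::Other),
        _ => Err(BoxedError::InvalidStatus(
            status,
            Box::new(BigPayload([0; 256])),
        )),
    }
}

/// Returns (size of LargeError, size of BoxedError) in bytes.
pub fn error_sizes() -> (usize, usize) {
    (size_of::<LargeError>(), size_of::<BoxedError>())
}
```

Allowing the lint is the smaller change and leaves the error type's public shape untouched, which fits a temporary fix; boxing changes the variant's field type and every site that constructs or destructures it.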

@vigoo vigoo merged commit 2ceb941 into golemcloud:main May 20, 2025
4 checks passed
Development

Successfully merging this pull request may close these issues.

Enhance image support with inline images