
feat: adopt new StreamingChunk in Ollama #2109


Merged: 24 commits into main, Aug 5, 2025

Conversation

Amnah199
Contributor

@Amnah199 Amnah199 commented Jul 25, 2025

Related Issues

Proposed Changes:

Adopt the new StreamingChunk and also stream tool calls.

How did you test it?

Updated the tests

Notes for the reviewer

Checklist

Amnah199 and others added 12 commits July 28, 2025 01:48
* first implementation

* tests

* small fixes

* more tests + refinements

* small link fixes

* lint

* adopt a more defensive approach

* use mapped finish reason everywhere
Co-authored-by: anakin87 <44616784+anakin87@users.noreply.github.com>
* chore: Google AI - suggest users to switch to Google GenAI

* docs reference
Co-authored-by: anakin87 <44616784+anakin87@users.noreply.github.com>
* chore: Google Vertex - suggest users to switch to Google GenAI

* docs reference

* more links to docs
Co-authored-by: anakin87 <44616784+anakin87@users.noreply.github.com>
* feat: Amazon Bedrock - multimodal support

* fix

* add pillow test dep

* fixes

* pin latest haystack

* try testing with sonnet 3.7

* try sonnet 4
Co-authored-by: anakin87 <44616784+anakin87@users.noreply.github.com>
@Amnah199 Amnah199 marked this pull request as ready for review July 29, 2025 21:35
@Amnah199 Amnah199 requested a review from a team as a code owner July 29, 2025 21:35
@Amnah199 Amnah199 requested review from anakin87 and removed request for a team July 29, 2025 21:35
@Amnah199 Amnah199 marked this pull request as draft July 29, 2025 21:38
Member

@anakin87 anakin87 left a comment

I left some comments.

I would also suggest that you:

  • locally test some use cases (chat, tool calls) with print_streaming_chunk and confirm that the output is good
  • write unit tests with real Ollama chunks (a rough sketch follows this list)
    • chat - similar to this
    • tool calls - similar to this (but with multiple tool calls if possible; in llama.cpp it was not possible)
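
For illustration, a rough sketch of such a unit test with a handcrafted Ollama tool-call chunk. The chunk layout follows ollama-python's ChatResponse/Message types, and _build_chunk plus its keyword arguments are taken from the diff later in this thread; the generator setup, the ComponentInfo usage, and the final assertions (which assume the json.dumps conversion discussed below) are my assumptions, not this PR's actual test code.

import json

from haystack.dataclasses import ComponentInfo
from ollama import ChatResponse, Message

from haystack_integrations.components.generators.ollama import OllamaChatGenerator


def test_build_chunk_with_real_tool_call_chunk():
    # A chunk shaped like what ollama-python yields mid-stream for a tool call
    raw = ChatResponse(
        model="llama3.2:3b",
        done=False,
        message=Message(
            role="assistant",
            tool_calls=[
                Message.ToolCall(
                    function=Message.ToolCall.Function(
                        name="calculator", arguments={"expression": "7 * (4 + 2)"}
                    )
                )
            ],
        ),
    )
    generator = OllamaChatGenerator(model="llama3.2:3b")
    chunk = generator._build_chunk(
        chunk_response=raw,
        component_info=ComponentInfo(type="OllamaChatGenerator"),
        index=0,
        tool_call_index=0,
    )
    assert chunk.tool_calls[0].tool_name == "calculator"
    assert chunk.tool_calls[0].arguments == json.dumps({"expression": "7 * (4 + 2)"})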

@github-actions github-actions bot added the type:documentation Improvements or additions to documentation label Jul 31, 2025
@Amnah199 Amnah199 marked this pull request as ready for review July 31, 2025 19:47
Comment on lines +352 to +355
tool_call_index += 1
chunk = self._build_chunk(
    chunk_response=raw, component_info=component_info, index=index, tool_call_index=tool_call_index
)
Contributor Author

Ollama doesn't provide a ToolCall index, so one way is to track it ourselves with a running counter.
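
A minimal sketch of that counter-based approach; _build_chunk and its keyword arguments come from the snippet above, while response_stream, streaming_callback, and the loop scaffolding are placeholders rather than the PR's actual code:

tool_call_index = -1  # Ollama provides no index, so keep a running counter
for raw in response_stream:  # placeholder name for the raw Ollama chunk iterator
    for _tool_call in raw.message.tool_calls or []:
        tool_call_index += 1  # one increment per complete streamed tool call
        chunk = self._build_chunk(
            chunk_response=raw, component_info=component_info, index=index, tool_call_index=tool_call_index
        )
        streaming_callback(chunk)  # placeholder for the user-supplied callback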

Comment on lines +708 to +711
assert result["replies"][0].tool_calls[0].tool_name == "calculator"
assert result["replies"][0].tool_calls[0].arguments == {"expression": "7 * (4 + 2)"}
assert result["replies"][0].tool_calls[1].tool_name == "factorial"
assert result["replies"][0].tool_calls[1].arguments == {"n": 5}
Contributor Author

Note that both ToolCalls are part of the same ChatMessage.
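
For reference, the aggregated reply these assertions expect could be built with Haystack's dataclasses like this (illustrative only, not the PR's code):

from haystack.dataclasses import ChatMessage, ToolCall

# One assistant ChatMessage carrying both streamed tool calls
reply = ChatMessage.from_assistant(
    tool_calls=[
        ToolCall(tool_name="calculator", arguments={"expression": "7 * (4 + 2)"}),
        ToolCall(tool_name="factorial", arguments={"n": 5}),
    ]
)
assert len(reply.tool_calls) == 2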

@Amnah199 Amnah199 requested a review from anakin87 August 1, 2025 09:06
Member

@anakin87 anakin87 left a comment

I spent significant time trying to understand how Ollama behaves when it comes to tool calls + streaming, since it seems to be a crucial point for this PR.

In all my experiments, I've found that single tool calls are included in a single chunk, but I am not sure that this always holds true. I've tried mistral-small3.1:24b, llama3.2:3b, llama3.1:8b, qwen3:0.6b, and qwen3:1.7b.

Looking at online resources as well, I cannot find a clear source of truth. At this point, I would open an issue on ollama-python and ask.


Most of the comments I left are related, some are not.

…ators/ollama/chat/chat_generator.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
@anakin87
Member

anakin87 commented Aug 1, 2025

Opened an issue to better understand: ollama/ollama#11633

@anakin87
Member

anakin87 commented Aug 4, 2025

I found this comment in the PR that introduced streaming + tool calls: ollama/ollama#10415 (comment)

It seems to confirm that Ollama streams complete JSON tool calls (not string fragments). I would simplify our implementation in this direction.

@Amnah199
Contributor Author

Amnah199 commented Aug 4, 2025

@anakin87 I found this example where arguments are returned as a JSON object in string format.
https://gist.github.com/philipp-meier/678a4679d0895276f270fac4c046ad14

So do we want to keep an implementation that checks whether args are passed as a str or a dict and handles them accordingly? Either way, we don't expect arguments to be passed as str fragments across multiple chunks.

ToolCallDelta(
    index=tool_call_index,
    tool_name=tool_call["function"]["name"],
    arguments=tool_call["function"]["arguments"],
)
Member

@anakin87 anakin87 Aug 4, 2025

It seems strange that mypy does not complain.
In ollama-python, arguments is Mapping[str, Any] (a dict):
https://github.com/ollama/ollama-python/blob/fe91357d4b9c86d79efe4fabbdfabf9a1e68b07f/ollama/_types.py#L303

while in our ToolCallDelta, arguments is Optional[str]:
https://github.com/deepset-ai/haystack/blob/f2012a4521f4fad35ea8e0dd10530c9688e6eb12/haystack/dataclasses/streaming_chunk.py#L30

Do you think that a conversion is needed (json.dumps)? Could you verify?

Contributor Author

Added a fix. I guess this was missed by mypy because earlier we do chunk_response_dict = chunk_response.model_dump(), which returns a generic Dict[str, Any].
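
A sketch of the failure mode and the fix; the field access follows Ollama's response layout quoted above, while the import path for ToolCallDelta and the surrounding loop are assumptions:

import json

from haystack.dataclasses import ToolCallDelta

chunk_response_dict = chunk_response.model_dump()  # values are typed Any, so mypy loses the Mapping
for tool_call in chunk_response_dict["message"].get("tool_calls") or []:
    delta = ToolCallDelta(
        index=tool_call_index,
        tool_name=tool_call["function"]["name"],
        arguments=json.dumps(tool_call["function"]["arguments"]),  # explicit dict -> str conversion
    )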

@anakin87
Member

anakin87 commented Aug 4, 2025

> @anakin87 I found this example where arguments are returned as a JSON object in string format. https://gist.github.com/philipp-meier/678a4679d0895276f270fac4c046ad14
>
> So do we want to keep an implementation that checks whether args are passed as a str or a dict and handles them accordingly? Either way, we don't expect arguments to be passed as str fragments across multiple chunks.

The example uses curl. My impression is that the python client handles the conversion and returns a dict, so I would not worry about receiving a str.
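
A quick local check of that impression (model name and tool schema are placeholders; the expectation is that the Python client yields arguments as a dict-like Mapping, never a str):

import ollama

tools = [{
    "type": "function",
    "function": {
        "name": "calculator",
        "description": "Evaluate a math expression",
        "parameters": {
            "type": "object",
            "properties": {"expression": {"type": "string"}},
            "required": ["expression"],
        },
    },
}]

for chunk in ollama.chat(
    model="llama3.2:3b",
    messages=[{"role": "user", "content": "What is 7 * (4 + 2)?"}],
    tools=tools,
    stream=True,
):
    for tc in chunk.message.tool_calls or []:
        print(type(tc.function.arguments))  # expect a Mapping, not a str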

Member

@anakin87 anakin87 left a comment

Once the format error is fixed, feel free to merge.

I am not completely sure that everything is perfect, but we'll figure it out.

@anakin87
Member

anakin87 commented Aug 5, 2025

From ollama/ollama#11633 (comment)

> Tool calls should be coming back fully parsed

@Amnah199 Amnah199 merged commit f62cd02 into main Aug 5, 2025
7 checks passed
@Amnah199 Amnah199 deleted the streaming-update-ollama branch August 5, 2025 12:41
Development

Successfully merging this pull request may close these issues.

feat: Update OllamaChatGenerator to use the StreamingChunk fields