Ollama: fix handling of incomplete JSON chunks in stream #1019
Addresses one of the issues raised in #686
Problem
The Ollama LLM can return a single JSON chunk split across multiple parts when the chunk is too long. For example, a response object like `{"response": "Hello"}` may arrive as `{"response": "Hel` in one chunk and `lo"}` in the next, causing a JSON parsing error if each part is parsed on its own.
Solution
Improve the `json_responses_chunk_handler` method to handle incomplete JSON chunks in the stream. If a chunk does not end with `}`, it is considered incomplete and buffered until the next chunk arrives. This prevents JSON parsing errors and ensures all responses are processed correctly.

This PR is heavily inspired by #995 by @berkcaputcu; this implementation doesn't rely on exceptions and also includes additional specs.
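Below is a minimal sketch of the buffering strategy described above. The handler name matches the method mentioned in this PR, but the surrounding structure (the proc interface and the block-based callback) is an assumption for illustration, not the exact code from the diff.

```ruby
require "json"

# Sketch: buffer incomplete JSON chunks until a closing brace arrives.
# The proc signature (chunk, bytes) is assumed for illustration.
def json_responses_chunk_handler(&block)
  buffer = +"" # carries any partial JSON between chunks

  proc do |chunk, _bytes|
    buffer << chunk

    # If the accumulated payload does not end with "}", it is incomplete:
    # keep buffering until a later chunk delivers the closing brace.
    next unless buffer.rstrip.end_with?("}")

    # Ollama streams newline-delimited JSON, so each complete line is
    # parsed and handed to the caller's block.
    buffer.each_line do |line|
      json = line.strip
      block.call(JSON.parse(json)) unless json.empty?
    end
    buffer.clear
  end
end

# Usage: the first chunk is buffered; the second completes the object.
handler = json_responses_chunk_handler { |json| puts json["response"] }
handler.call(%({"response": "Hel), nil)
handler.call(%(lo"}\n), nil) # => prints "Hello"
```

Because incompleteness is detected by inspecting the chunk rather than rescuing `JSON::ParserError`, no exceptions are raised on the happy path, which matches the difference from #995 noted above.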