Draft

Changes from all commits (47 commits):
2e83f64
chore: bump versions
mhordynski Sep 12, 2025
ee680a4
chore: fix docs build
mhordynski Sep 12, 2025
ecf3508
init: TODO agent
jakubduda-dsai Sep 16, 2025
234ec2c
gather tasks results
jakubduda-dsai Sep 16, 2025
661e4f7
feat: support wrapping downstream agents as tools (#819)
akotyla Sep 16, 2025
61d6739
feat: class-based agents (#820)
dazy-ds Sep 17, 2025
1115edb
save
jakubduda-dsai Sep 18, 2025
bd9cb55
todo_manager
jakubduda-dsai Sep 18, 2025
fd87b7f
remove task_id
jakubduda-dsai Sep 18, 2025
63faf90
clear prints
jakubduda-dsai Sep 18, 2025
2f7146a
clear prints
jakubduda-dsai Sep 18, 2025
2d29e5b
remove global container for todo list
jakubduda-dsai Sep 19, 2025
dc1d068
fix: nightly builds
mhordynski Sep 22, 2025
3ca9cd4
fix: docs deployments
mhordynski Sep 22, 2025
3026929
fix: add docs login
mhordynski Sep 22, 2025
759f59e
feat: introduce post processors (#821)
mackurzawa Sep 22, 2025
f67ab91
add humaneval pipeline files
rk-izak Sep 24, 2025
769551f
feat: streaming from downstream agents (#825)
akotyla Sep 25, 2025
51c95a9
minor humaneval changes
rk-izak Sep 25, 2025
a21efb5
add GAIA pipeline + basic extra tools for benchmarking
rk-izak Sep 25, 2025
b9673fb
remove temp comment for lint
rk-izak Sep 25, 2025
d45415f
feat: todo list for agent (#823)
jakubduda-dsai Sep 26, 2025
09f94e4
feat: introduce supervisor post processor (#830)
mackurzawa Sep 26, 2025
8c8697d
add hotpotqa+rag pipeline files
rk-izak Sep 29, 2025
8ac24d8
small lint changes
rk-izak Sep 29, 2025
4998855
Merge remote-tracking branch 'origin/develop' into rki/todo-eval
rk-izak Sep 29, 2025
009c1d5
wrong conflict resolution
rk-izak Sep 29, 2025
fef3376
adjust parser for hotpotqa; remove previous TODO usage and adjust pro…
rk-izak Sep 29, 2025
56c8432
refactor
rk-izak Sep 30, 2025
4b2585a
add changelogs
rk-izak Oct 1, 2025
3a9df86
trailing spaces removal
rk-izak Oct 1, 2025
49bf3aa
remove some extra tools
rk-izak Oct 1, 2025
2b04da9
ruff and mypy refactor
rk-izak Oct 1, 2025
ba77557
initialize orchestator with todo agent
rk-izak Oct 2, 2025
bc7534a
add mock of todo orchestrator
rk-izak Oct 6, 2025
0e1607b
add info about HF CLI and gated datasets
rk-izak Oct 7, 2025
a6db07b
make the evaluation pipelines pluggable with agents; reformat
rk-izak Oct 8, 2025
77ee2af
feat: todo list component (#827)
dazy-ds Oct 10, 2025
61ec71f
Automated UI build
ds-ragbits-robot Oct 10, 2025
ab8de73
Merge remote-tracking branch 'origin/develop' into rki/todo-eval
rk-izak Oct 13, 2025
33ab6b5
remove tutorial draft
rk-izak Oct 13, 2025
2b9b8fb
docs: installation & source fixes (#844)
puzzle-solver Oct 13, 2025
51942e1
Merge branch 'develop' into rki/todo-eval
mhordynski Oct 13, 2025
9650a70
feat: conversation summary (#840)
dazy-ds Oct 13, 2025
7fa44c5
Automated UI build
ds-ragbits-robot Oct 13, 2025
1f27426
Merge branch 'develop' into rki/todo-eval
mhordynski Oct 13, 2025
3bc67aa
wip: add socrates dataset
mackurzawa Oct 13, 2025
53 changes: 17 additions & 36 deletions .github/workflows/nightly-build.yml
@@ -19,43 +19,19 @@ jobs:
ref: develop
fetch-depth: 0

- name: Check if nightly build needed
id: check
run: |
# Get the latest commit hash on develop
COMMIT_HASH=$(git rev-parse --short HEAD)
echo "commit-hash=$COMMIT_HASH" >> "$GITHUB_OUTPUT"

# Check if we already built this commit as nightly
LAST_NIGHTLY_TAG=$(git tag -l "*dev*" --sort=-version:refname | head -1)
if [ -n "$LAST_NIGHTLY_TAG" ]; then
# Get the commit that the last nightly tag points to
LAST_NIGHTLY_COMMIT=$(git rev-list -n 1 $LAST_NIGHTLY_TAG)
CURRENT_COMMIT=$(git rev-parse HEAD)
if [ "$CURRENT_COMMIT" = "$LAST_NIGHTLY_COMMIT" ]; then
echo "should-build=false" >> "$GITHUB_OUTPUT"
echo "No new commits since last nightly build"
exit 0
fi
fi
- name: Install uv
uses: astral-sh/setup-uv@v2
with:
version: ${{ vars.UV_VERSION || '0.6.9' }}

# Generate nightly version
BASE_VERSION=$(python -c "
try:
import tomllib
except ImportError:
import tomli as tomllib
with open('packages/ragbits/pyproject.toml', 'rb') as f:
data = tomllib.load(f)
print(data['project']['version'])
")
# Use timestamp for unique nightly version (PEP 440 compliant)
TIMESTAMP=$(date +%Y%m%d%H%M)
NIGHTLY_VERSION="${BASE_VERSION}.dev${TIMESTAMP}"
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.10"

echo "should-build=true" >> "$GITHUB_OUTPUT"
echo "nightly-version=$NIGHTLY_VERSION" >> "$GITHUB_OUTPUT"
echo "Will build nightly version: $NIGHTLY_VERSION"
- name: Check if nightly build needed
id: check
run: uv run scripts/check_nightly_build.py

build-and-publish:
needs: check-for-changes
@@ -100,6 +76,7 @@ jobs:
git commit -m "chore: update package versions for nightly build ${{ env.NIGHTLY_VERSION }}"
git tag "${{ env.NIGHTLY_VERSION }}"
git push origin "${{ env.NIGHTLY_VERSION }}"
git push origin develop
env:
GH_TOKEN: ${{ secrets.GH_TOKEN }}
NIGHTLY_VERSION: ${{ needs.check-for-changes.outputs.nightly-version }}
@@ -114,7 +91,11 @@ jobs:

- name: Deploy nightly documentation
shell: bash
run: uv run mike deploy --push nightly
run: |
git config user.name "ds-ragbits-robot"
git config user.email "ds-ragbits-robot@users.noreply.github.com"
git fetch origin gh-pages
uv run mike deploy --push --alias-type copy nightly
env:
GH_TOKEN: ${{ secrets.GH_TOKEN }}

3 changes: 3 additions & 0 deletions .github/workflows/publish-docs.yaml
@@ -15,6 +15,9 @@ jobs:
contents: write
steps:
- uses: actions/checkout@v4
with:
ref: gh-pages
fetch-depth: 1

- name: Deploy docs
shell: bash
3 changes: 2 additions & 1 deletion .github/workflows/publish-pypi.yml
@@ -56,6 +56,7 @@ jobs:

- name: Deploy documentation
run: |
uv run mike deploy --push stable
git fetch origin gh-pages
uv run mike deploy --push --alias-type copy stable
env:
GH_TOKEN: ${{ secrets.GH_TOKEN }}
6 changes: 6 additions & 0 deletions docs/api_reference/agents/index.md
@@ -9,3 +9,9 @@
::: ragbits.agents.AgentResultStreaming

::: ragbits.agents.a2a.server.create_agent_server

::: ragbits.agents.post_processors.base

::: ragbits.agents.post_processors.supervisor

::: ragbits.agents.AgentRunContext
29 changes: 28 additions & 1 deletion docs/how-to/agents/define_and_use_agents.md
@@ -32,7 +32,7 @@ Use a structured prompt to instruct the LLM. For details on writing prompts with
from pydantic import BaseModel
from ragbits.core.prompt import Prompt

--8<-- "examples/agents/tool_use.py:51:70"
--8<-- "examples/agents/tool_use.py:51:72"
```

### Run the agent
@@ -49,6 +49,33 @@ The result is an [AgentResult][ragbits.agents.AgentResult], which includes the m

You can find the complete code example in the Ragbits repository [here](https://github.com/deepsense-ai/ragbits/blob/main/examples/agents/tool_use.py).

### Alternative approach: inheritance with `prompt_config`

In addition to explicitly attaching a Prompt instance, Ragbits also supports defining agents through a combination of inheritance and the `@Agent.prompt_config` decorator.

This approach lets you bind input (and optionally output) models directly to your agent class. The agent then derives its prompt structure automatically, without requiring a prompt argument in the constructor.

```python
from pydantic import BaseModel
from ragbits.agents import Agent

--8<-- "examples/agents/with_decorator.py:51:71"
```

The decorator can also accept an output type, allowing you to strongly type both the inputs and outputs of the agent. If you do not explicitly define a `user_prompt`, Ragbits will default to `{{ input }}`.

Once defined, the agent class can be used directly, just like any other subclass of Agent:

```python
import asyncio
from ragbits.agents import Agent
from ragbits.core.llms import LiteLLM

--8<-- "examples/agents/with_decorator.py:73:84"
```

You can find the complete code example in the Ragbits repository [here](https://github.com/deepsense-ai/ragbits/blob/main/examples/agents/with_decorator.py).

## Tool choice
To control which tool is used on the first call, you can use the `tool_choice` parameter. The following options are available:
- "auto": let the model decide whether a tool call is needed
48 changes: 48 additions & 0 deletions docs/how-to/agents/stream_downstream_agents.md
@@ -0,0 +1,48 @@
# How-To: Stream downstream agents with Ragbits

A Ragbits [Agent][ragbits.agents.Agent] can call other agents as tools, creating a chain of reasoning where downstream agents provide structured results to the parent agent.

Using the streaming API, you can observe every chunk of output as it is generated, including tool calls, tool results, and final text - perfect for real-time monitoring or chat interfaces.

## Define a simple tool

A tool is just a Python function returning a JSON-serializable result. Here’s an example tool returning the current time for a given location:

```python
import json

--8<-- "examples/agents/downstream_agents_streaming.py:33:51"
```

## Create a downstream agent

The downstream agent wraps the tool with a prompt, allowing the LLM to use it as a function.

```python
from pydantic import BaseModel
from ragbits.core.prompt import Prompt
from ragbits.agents import Agent
from ragbits.agents._main import AgentOptions
from ragbits.core.llms import LiteLLM

--8<-- "examples/agents/downstream_agents_streaming.py:54:82"
```

## Create a parent QA agent

The parent agent can call downstream agents as tools. This lets the LLM reason and decide when to invoke the downstream agent.

```python
--8<-- "examples/agents/downstream_agents_streaming.py:85:111"
```

## Streaming output from downstream agents

Use `run_streaming` with an [AgentRunContext][ragbits.agents.AgentRunContext] to see output as it happens. Each chunk contains either text, a tool call, or a tool result. You can print agent names when they change and handle downstream agent events.

```python
import asyncio
from ragbits.agents import DownstreamAgentResult

--8<-- "examples/agents/downstream_agents_streaming.py:114:133"
```
167 changes: 167 additions & 0 deletions docs/how-to/agents/use_post_processors.md
@@ -0,0 +1,167 @@
# How-To: Use Post-Processors with Ragbits Agents

Ragbits Agents can be enhanced with post-processors to intercept, validate, log, filter, and modify their outputs. In this guide you will learn how to:

- Create custom post-processors (streaming and non-streaming)
- Attach post-processors to agents in run and streaming modes
- Use and configure the built-in Supervisor post-processor

## Post-Processors Overview

Ragbits provides two types of post-processors:

- **PostProcessor**: Processes the final output after generation, ideal for end-of-run processing.
- **StreamingPostProcessor**: Processes outputs as they are generated, suitable for real-time applications.

### Implementing a custom Post-Processor

To create a custom post-processor, inherit from the appropriate base class ([`PostProcessor`][ragbits.agents.post_processors.base.PostProcessor] or [`StreamingPostProcessor`][ragbits.agents.post_processors.base.StreamingPostProcessor]) and implement the required method.

#### Post-Processor Example

A non-streaming post-processor applies transformations after the entire content is generated.

```python
from ragbits.agents.post_processors.base import PostProcessor


class TruncateProcessor(PostProcessor):
    def __init__(self, max_length: int = 50) -> None:
        self.max_length = max_length

    async def process(self, result, agent, options=None, context=None):
        # Truncate the final content if it exceeds the configured maximum length.
        content = result.content
        if len(content) > self.max_length:
            content = content[: self.max_length] + "... [TRUNCATED]"
        result.content = content
        return result
```

#### Streaming Post-Processor Example

A streaming post-processor can manipulate all information returned during generation, including text, tool calls, etc.

```python
from ragbits.agents.post_processors.base import StreamingPostProcessor


class UpperCaseStreamingProcessor(StreamingPostProcessor):
    async def process_streaming(self, chunk, agent):
        # Upper-case text chunks; pass tool calls and other chunk types through unchanged.
        if isinstance(chunk, str):
            return chunk.upper()
        return chunk
```

## Using Post-Processors

To use post-processors, pass them to the `run` or `run_streaming` methods of the `Agent` class. If you pass a non-streaming processor to `run_streaming`, set `allow_non_streaming=True`. This allows streaming processors to handle content piece by piece during generation, while non-streaming processors apply transformations after the entire output is generated.

```python
import asyncio

from ragbits.agents import Agent
from ragbits.core.llms import LiteLLM


async def main() -> None:
    llm = LiteLLM("gpt-4.1-mini")
    agent = Agent(llm=llm, prompt="You are a helpful assistant.")
    post_processors = [
        UpperCaseStreamingProcessor(),
        TruncateProcessor(max_length=50),
    ]
    stream_result = agent.run_streaming(
        "Tell me about the history of AI.",
        post_processors=post_processors,
        allow_non_streaming=True,
    )
    async for chunk in stream_result:
        if isinstance(chunk, str):
            print(chunk, end="")
    print(f"\nFinal answer:\n{stream_result.content}")


asyncio.run(main())
```

Post-processors offer a flexible way to tailor agent outputs, whether filtering content in real-time or transforming final outputs.
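
The same processors can also be attached to a plain `run` call; below is a minimal sketch that mirrors the streaming example above, assuming the `TruncateProcessor` defined earlier:

```python
# Minimal sketch: apply a non-streaming post-processor to a regular `run` call.
# Assumes the TruncateProcessor class from the example above.
import asyncio

from ragbits.agents import Agent
from ragbits.core.llms import LiteLLM


async def main() -> None:
    llm = LiteLLM("gpt-4.1-mini")
    agent = Agent(llm=llm, prompt="You are a helpful assistant.")
    result = await agent.run(
        "Tell me about the history of AI.",
        post_processors=[TruncateProcessor(max_length=50)],
    )
    print(result.content)


asyncio.run(main())
```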

## Built-in Post-Processors

### Supervisor

The [`SupervisorPostProcessor`][ragbits.agents.post_processors.supervisor.SupervisorPostProcessor] validates the agent’s final response against the executed tool calls and, if needed, triggers an automatic rerun with a correction prompt. It helps catch inconsistencies (e.g., when the response contradicts tool output) and guide the agent to refine its answer. The Supervisor is a non-streaming post-processor: it runs after generation has completed, validating the final output before optionally issuing a correction rerun.

Key capabilities:

- Validates the last assistant response using an LLM-powered validation prompt
- Optionally reruns the agent with a formatted correction prompt derived from validation feedback
- Supports preserving or pruning intermediate history
- Attaches validation metadata to the final `AgentResult`

#### Quick start

```python
from ragbits.agents import Agent
from ragbits.agents.post_processors import SupervisorPostProcessor
from ragbits.agents.post_processors.supervisor import HistoryStrategy
from ragbits.core.llms.litellm import LiteLLM

llm = LiteLLM("gpt-4o-mini", use_structured_output=True)
supervisor = SupervisorPostProcessor(
    llm=llm,
    max_retries=2,
    fail_on_exceed=False,
    history_strategy=HistoryStrategy.PRESERVE,  # Default HistoryStrategy is REMOVE
)

agent = Agent(
    llm=llm,
    prompt="You are a helpful assistant.",
)

# Run inside an async context (e.g. asyncio.run(...)):
result = await agent.run(
    "What is the weather in Tokyo?",
    post_processors=[supervisor],
)
```

#### Configuration

- **llm**: LLM used for validation and formatting structured outputs
- **validation_prompt**: Optional custom prompt class describing the validation output schema
- **correction_prompt**: Optional format string used to create a correction message from validation output
- **max_retries**: How many times to attempt correction-driven reruns
- **fail_on_exceed**: If `True`, raises when retries are exhausted; otherwise returns last result with metadata
- **history_strategy**:
    - `PRESERVE`: keep all messages, including the correction user message and the rerun assistant message
    - `REMOVE`: prune the invalid assistant message and the correction user message, keeping only the final assistant response

#### Custom structured validation and correction

You can define a custom validation output model and prompt to shape the supervisor feedback and correction message:

```python
from pydantic import BaseModel

from ragbits.agents.post_processors import SupervisorPostProcessor
from ragbits.agents.post_processors.supervisor import HistoryStrategy, ValidationInput
from ragbits.core.prompt.prompt import Prompt


class MyValidationOutput(BaseModel):
    is_valid: bool
    errors: list[str]
    fixes: list[str]
    confidence: float


class MyValidationPrompt(Prompt[ValidationInput, MyValidationOutput]):
    system_prompt = "You are an expert validator. Provide clear, actionable feedback."
    user_prompt = (
        "Chat History:\n"
        "{% for message in chat_history %}"
        "\n{{ message.role | title }}: {{ message.content }} (if None it means it's a tool call)"
        "{% endfor %}"
        "\n\nList all errors, possible fixes, and provide a confidence score (0.0-1.0) for your assessment.\n"
    )


correction_prompt = (
    "Previous answer had issues:\n"
    "Errors: {errors}\n"
    "Fixes: {fixes}\n"
    "Confidence: {confidence}\n"
    "Please answer again using the fixes."
)

# `llm` is the LiteLLM instance from the quick start above.
supervisor = SupervisorPostProcessor(
    llm=llm,
    validation_prompt=MyValidationPrompt,
    correction_prompt=correction_prompt,
    max_retries=1,
    history_strategy=HistoryStrategy.PRESERVE,
)
```

The Supervisor appends validation records to `result.metadata` under the `post_processors.supervisor` key as a list of dicts; each entry corresponds to a validation step.
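
A short sketch of reading those records after a run, assuming `result.metadata` behaves like a nested dictionary (the fields inside each record depend on the validation output model):

```python
# Sketch: inspect the supervisor's validation records attached to the result.
# Assumes `result` comes from agent.run(..., post_processors=[supervisor]) and
# that metadata nests dicts under "post_processors" -> "supervisor".
records = result.metadata.get("post_processors", {}).get("supervisor", [])
for step, record in enumerate(records, start=1):
    print(f"Validation step {step}: {record}")
```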
2 changes: 1 addition & 1 deletion docs/tutorials/intro.md
@@ -2,7 +2,7 @@

Let's walk through a quick example of **basic question answering**. Specifically, let's build **a system for answering tech questions**, e.g. about Linux or iPhone apps.

Install the latest Ragbits via `pip install -U ragbits` and follow along.
Install the latest Ragbits via `pip install -U ragbits ragbits-agents` and follow along.

## Configuring the environment
