generated from amazon-archives/__template_Apache-2.0
-
Couldn't load subscription status.
- Fork 452
Open
Labels
bugSomething isn't workingSomething isn't working
Description
Checks
- I have updated to the lastest minor and patch version of Strands
- I have checked the documentation and this is not expected behavior
- I have searched ./issues and there are no duplicates of my issue
Strands Version
1.13.0
Python Version
3.12
Operating System
macOS
Installation Method
pip
Steps to Reproduce
The "test-guardrail-block-cactus" is the test guardrail used in the integration tests, that blocks input and output if it contains the word CACTUS.
import json
from strands import Agent, tool
from strands.models import BedrockModel
import boto3
def get_guardrail_id(client, guardrail_name):
def get_guardrail():
client = boto3.client("bedrock", region_name="us-west-2")
guardrail_name = "test-guardrail-block-cactus"
response = client.list_guardrails()
for guardrail in response.get("guardrails", []):
if guardrail["name"] == guardrail_name:
return guardrail["id"]
return None
@tool
def get_users() -> str:
"List my users"
return """|Name|Email|
|Jerry Barry|jerry@gmail.com|
|CACTUS|cactus@email.com|
"""
bedrock_model = BedrockModel(
model_id="...",
guardrail_id=get_guardrail(),
guardrail_version="DRAFT",
guardrail_redact_input = True, # default
guardrail_redact_output = False, # default
)
# Create agent with the guardrail-protected model
agent = Agent(
system_prompt="You are a helpful assistant.",
model=bedrock_model,
tools=[get_users]
)
# Use the protected agent for conversations
response = agent("Who are my users?")
assert response.stop_reason == "guardrail_intervened"
print(f"Conversation after first call: {json.dumps(agent.messages, indent=4)}")
response = agent("Hello")
print(f"Entire conversation: {json.dumps(agent.messages, indent=4)}")Expected Behavior
Bedrock Guardrails do not actually check tool outputs. In this case, the guardrails triggered on the model output.
My expectation would be that the model output is indeed redacted, but:
- the model input should not be redacted if only "output" guardrails triggered
- in any case, the part of the input that contains tool outputs should certainly not be redacted as it breaks the conversation
Entire conversation: [
{
"role": "user",
"content": [
{
"text": "Who are my users?"
}
]
},
{
"role": "assistant",
"content": [
{
"text": "I'll help you get a list of your users."
},
{
"toolUse": {
"toolUseId": "tooluse_-lfnT1OcSQqqJRusDse-1g",
"name": "get_users",
"input": {}
}
}
]
},
{
"role": "user",
"content": [
{
"toolResult": {
"toolUseId": "tooluse_-lfnT1OcSQqqJRusDse-1g",
"status": "success",
"content": [
{
"text": "|Name|Email|\n|Jerry Bramby|jerry@gmail.com|\n|CACTUS|cactus@email.com|\n"
}
]
}
}
]
},
{
"role": "assistant",
"content": [
{
"text": "BLOCKED BY GUARDRAILS (due to model output)"
}
]
},
{
"role": "user",
"content": [
{
"text": "Hello.."
}
]
},
{
"role": "assistant",
"content": [
{
"text": "Hello! How can I help you today?"
}
]
}
]
Actual Behavior
Having guardrail_redact_input=True, and output guardrails being triggered, the tool_result (being an input) is redacted by strands, breaking the conversation history and causing an error next the time the agent is prompted.
Conversation after first call: [
{
"role": "user",
"content": [
{
"text": "Who are my users?"
}
]
},
{
"role": "assistant",
"content": [
{
"text": "I'll help you get a list of your users."
},
{
"toolUse": {
"toolUseId": "tooluse_hzAn37L3QmCuULNdi6YAWA",
"name": "get_users",
"input": {}
}
}
]
},
{
"role": "user",
"content": [
{
"text": "[User input redacted.]"
}
]
},
{
"role": "assistant",
"content": [
{
"text": ""BLOCKED BY GUARDRAILS (due to output)"
}
]
}
]
Traceback (most recent call last):
....
botocore.errorfactory.ValidationException: An error occurred (ValidationException) when calling the ConverseStream operation:
The model returned the following errors: messages.2: `tool_use` ids were found without `tool_result` blocks immediately after:
tooluse_hzAn37L3QmCuULNdi6YAWA. Each `tool_use` block must have a corresponding `tool_result` block in the next message.
└ Bedrock region: us-west-2
└ Model id: ...Additional Context
No response
Possible Solution
See the PR #1080
- strands should only redact the "output" if only "output" guardrails triggered
- This changes the existing behavior of redacting both input and output if one of the two if guardrailed. I'm not sure how intentional this is, but it might need discussion.
- In any case, blocks that contains tool outputs should certainly not be redacted in any case, as it breaks the conversation.
Related Issues
No response
alesanfra and leotac
Metadata
Metadata
Assignees
Labels
bugSomething isn't workingSomething isn't working