# Conversation Pruning

This guide demonstrates how to use the `PruningConversationManager` to selectively manage conversation history through pluggable pruning strategies.

## Overview

The `PruningConversationManager` provides a flexible approach to conversation management that:

- **Preserves Structure**: Unlike summarization, pruning maintains the conversation's message structure
- **Uses Strategies**: Employs pluggable strategies to determine what to prune and how
- **Preserves Selectively**: Keeps important messages (initial and recent) while pruning middle content
- **Prunes Proactively**: Can automatically prune when approaching token limits
- **Stays Context-Aware**: Makes intelligent decisions based on message content and context

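Throughout this guide, strategies implement a small `PruningStrategy` interface. The exact definition lives in the library; a sketch inferred from the examples in this guide (not the library's literal source) looks roughly like:

```python
from abc import ABC, abstractmethod
from typing import Any, Optional

class PruningStrategy(ABC):
    """Sketch of the strategy contract assumed by the examples below."""

    @abstractmethod
    def should_prune_message(self, message: dict, context: dict) -> bool:
        """Decide whether this strategy wants to prune the given message."""

    @abstractmethod
    def prune_message(self, message: dict, agent: Any) -> Optional[dict]:
        """Return a compressed replacement message, or None to drop it."""

    @abstractmethod
    def get_strategy_name(self) -> str:
        """Name reported in logs and debugging output."""
```

Each strategy answers two questions: *should* this message be pruned, and if so, *what* should replace it (a compressed message, or nothing).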
## Basic Usage

### Simple Tool Result Pruning

The most common use case is compressing large tool results that can consume significant context space:

```python
from strands import Agent
from strands.agent.conversation_manager import PruningConversationManager
from strands.agent.conversation_manager.strategies import LargeToolResultPruningStrategy

# Create a strategy to compress large tool results
tool_result_strategy = LargeToolResultPruningStrategy(
    max_tool_result_tokens=10_000,  # Compress results larger than 10k tokens
    compression_template="[Tool result compressed: {original_size} → {compressed_size} tokens. Status: {status}]"
)

# Create the pruning manager
conversation_manager = PruningConversationManager(
    pruning_strategies=[tool_result_strategy],
    preserve_recent_messages=3,     # Keep 3 most recent messages
    preserve_initial_messages=1,    # Keep 1 initial message
    enable_proactive_pruning=True,  # Enable automatic pruning
    pruning_threshold=0.8,          # Prune when 80% of context is used
    context_window_size=100_000     # 100k token context window
)

agent = Agent(
    conversation_manager=conversation_manager
)

# The agent will now automatically compress large tool results
# and proactively prune when the conversation grows too large
```

### Multiple Pruning Strategies

You can combine multiple strategies for comprehensive pruning:

```python
from strands import Agent
from strands.agent.conversation_manager import PruningConversationManager, PruningStrategy
from strands.agent.conversation_manager.strategies import LargeToolResultPruningStrategy

# Strategy 1: Compress large tool results
tool_result_strategy = LargeToolResultPruningStrategy(
    max_tool_result_tokens=5_000
)

# Strategy 2: Create custom strategies by implementing the PruningStrategy interface
class OldMessagePruningStrategy(PruningStrategy):
    """Custom strategy to remove very old messages."""

    def should_prune_message(self, message, context):
        # Prune messages that are more than 20 messages old
        return context["message_index"] < context["total_messages"] - 20

    def prune_message(self, message, agent):
        # Remove the message entirely
        return None

    def get_strategy_name(self):
        return "OldMessagePruningStrategy"

old_message_strategy = OldMessagePruningStrategy()

# Combine strategies
conversation_manager = PruningConversationManager(
    pruning_strategies=[tool_result_strategy, old_message_strategy],
    preserve_recent_messages=5,
    preserve_initial_messages=2
)

agent = Agent(conversation_manager=conversation_manager)
```

## Advanced Configuration

### Fine-Tuning Preservation Settings

Control exactly which messages are preserved during pruning:

```python
conversation_manager = PruningConversationManager(
    pruning_strategies=[LargeToolResultPruningStrategy()],
    preserve_initial_messages=3,  # Keep first 3 messages (system prompt, initial exchange)
    preserve_recent_messages=5,   # Keep last 5 messages (recent context)
    enable_proactive_pruning=True,
    pruning_threshold=0.6,        # More aggressive - prune at 60% capacity
    context_window_size=150_000
)
```

### Reactive vs. Proactive Pruning

```python
# Reactive only - prune only when the context window is exceeded
reactive_manager = PruningConversationManager(
    pruning_strategies=[LargeToolResultPruningStrategy()],
    enable_proactive_pruning=False  # Disable proactive pruning
)

# Proactive - prune before hitting limits
proactive_manager = PruningConversationManager(
    pruning_strategies=[LargeToolResultPruningStrategy()],
    enable_proactive_pruning=True,
    pruning_threshold=0.7,  # Prune when 70% full
    context_window_size=200_000
)
```

## Custom Pruning Strategies

Create your own pruning strategies by implementing the `PruningStrategy` interface:

```python
import copy
from typing import Optional

from strands.agent.conversation_manager import PruningStrategy
from strands.types.content import Message

class TokenBasedPruningStrategy(PruningStrategy):
    """Prune messages based on token count."""

    def __init__(self, max_message_tokens: int = 1000):
        self.max_message_tokens = max_message_tokens

    def should_prune_message(self, message: Message, context) -> bool:
        """Prune messages that exceed the token limit."""
        return context["token_count"] > self.max_message_tokens

    def prune_message(self, message: Message, agent) -> Optional[Message]:
        """Truncate the message content."""
        # Deep copy so mutating nested content blocks does not alter the original message
        pruned_message = copy.deepcopy(message)

        for content in pruned_message.get("content", []):
            if "text" in content:
                text = content["text"]
                if len(text) > 500:  # Truncate long text
                    content["text"] = text[:500] + "... [truncated]"

        return pruned_message

    def get_strategy_name(self) -> str:
        return "TokenBasedPruningStrategy"

# Use the custom strategy
custom_strategy = TokenBasedPruningStrategy(max_message_tokens=2000)
conversation_manager = PruningConversationManager(
    pruning_strategies=[custom_strategy]
)
```

### Content-Aware Pruning Strategy

Create strategies that understand message content:

```python
class DebugMessagePruningStrategy(PruningStrategy):
    """Remove debug and logging messages to save context space."""

    def should_prune_message(self, message: Message, context) -> bool:
        """Identify debug messages by content patterns."""
        for content in message.get("content", []):
            if "text" in content:
                text = content["text"].lower()
                # Look for debug patterns
                debug_patterns = ["debug:", "log:", "trace:", "verbose:"]
                if any(pattern in text for pattern in debug_patterns):
                    return True
        return False

    def prune_message(self, message: Message, agent) -> Optional[Message]:
        """Remove debug messages entirely."""
        return None  # Remove the message completely

    def get_strategy_name(self) -> str:
        return "DebugMessagePruningStrategy"
```

## Tool Result Compression

The `LargeToolResultPruningStrategy` provides sophisticated compression for tool results:

```python
# Detailed configuration for tool result compression
tool_strategy = LargeToolResultPruningStrategy(
    max_tool_result_tokens=25_000,  # Compress results larger than 25k tokens
    compression_template=(
        "[COMPRESSED] Original: {original_size} tokens → Compressed: {compressed_size} tokens\n"
        "Status: {status}\n"
        "--- Compressed Content Below ---"
    ),
    enable_llm_compression=False  # Use simple compression (LLM compression not yet implemented)
)

conversation_manager = PruningConversationManager(
    pruning_strategies=[tool_strategy],
    preserve_recent_messages=4,
    preserve_initial_messages=2
)
```

### Understanding Tool Result Compression

The strategy compresses tool results by:

1. **Text Truncation**: Long text content is truncated with indicators
2. **JSON Summarization**: Large JSON objects are replaced with metadata and samples
3. **Metadata Preservation**: Tool status and IDs are always preserved
4. **Compression Notes**: Clear indicators show what was compressed

Example of compressed output:
```
[Tool result compressed: 15000 tokens → 500 tokens. Status: success]
{
  "_compressed": true,
  "_n_original_keys": 150,
  "_size": 15000,
  "_type": "dict",
  "sample_key_1": "sample_value_1",
  "sample_key_2": "sample_value_2"
}
```
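A JSON summary of this shape can be produced with logic along these lines. This is a simplified, hypothetical sketch: `summarize_dict` is not a library function, and `_size` here counts serialized characters rather than tokens.

```python
import json

def summarize_dict(data: dict, sample_size: int = 2) -> dict:
    """Replace a large dict with metadata plus a few sample entries."""
    summary = {
        "_compressed": True,
        "_n_original_keys": len(data),
        "_size": len(json.dumps(data)),  # character count stands in for a token count
        "_type": type(data).__name__,
    }
    # Keep a couple of representative entries so the result stays inspectable
    for key in list(data)[:sample_size]:
        summary[key] = data[key]
    return summary

large_result = {f"sample_key_{i}": f"sample_value_{i}" for i in range(1, 151)}
print(summarize_dict(large_result)["_n_original_keys"])  # 150
```

The key design point is that the replacement stays valid JSON: downstream tooling (and the model itself) can still parse it and see what was lost.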

## Monitoring and Debugging

### Tracking Pruning Activity

```python
# Access pruning statistics
print(f"Messages removed: {conversation_manager.removed_message_count}")

# Get current state for debugging
state = conversation_manager.get_state()
print(f"Manager state: {state}")
```

### Logging Pruning Decisions

Enable logging to see pruning decisions:

```python
import logging

# Enable debug logging for pruning
logging.getLogger("strands.agent.conversation_manager.pruning_conversation_manager").setLevel(logging.DEBUG)
logging.getLogger("strands.agent.conversation_manager.strategies.tool_result_pruning").setLevel(logging.DEBUG)

# Now you'll see detailed logs about pruning decisions
```

## Best Practices

### 1. Choose Appropriate Thresholds

```python
# For long-running conversations with large tool results
conversation_manager = PruningConversationManager(
    pruning_strategies=[LargeToolResultPruningStrategy(max_tool_result_tokens=10_000)],
    pruning_threshold=0.6,       # Prune early to avoid context overflow
    preserve_recent_messages=5,  # Keep enough recent context
    preserve_initial_messages=2  # Preserve system setup
)
```

### 2. Preserve Critical Messages

```python
# For conversations where the initial setup is crucial
conversation_manager = PruningConversationManager(
    pruning_strategies=[LargeToolResultPruningStrategy()],
    preserve_initial_messages=5,  # Keep more initial context
    preserve_recent_messages=3,   # Standard recent context
    pruning_threshold=0.8         # Less aggressive pruning
)
```

### 3. Combine with Other Managers

You can switch between different conversation managers based on the use case:

```python
from strands.agent.conversation_manager import SummarizingConversationManager

# Use pruning for tool-heavy conversations
pruning_manager = PruningConversationManager(
    pruning_strategies=[LargeToolResultPruningStrategy()]
)

# Use summarizing for text-heavy conversations
summarizing_manager = SummarizingConversationManager()

# Switch based on conversation characteristics
# (has_many_tool_results is a flag you compute from the history yourself)
if has_many_tool_results:
    agent.conversation_manager = pruning_manager
else:
    agent.conversation_manager = summarizing_manager
```
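One simple heuristic for that flag is to count `toolResult` content blocks in the history. This is a sketch: `count_tool_results` is a hypothetical helper, and it assumes the agent exposes its history as a list of message dicts (e.g. `agent.messages`).

```python
def count_tool_results(messages) -> int:
    """Count toolResult content blocks across a list of messages."""
    return sum(
        1
        for message in messages
        for content in message.get("content", [])
        if "toolResult" in content
    )

# Example history with one tool result; in practice, pass agent.messages
messages = [
    {"role": "user", "content": [{"text": "Fetch the report"}]},
    {"role": "assistant", "content": [{"toolResult": {"status": "success"}}]},
]
has_many_tool_results = count_tool_results(messages) > 10
print(count_tool_results(messages))  # 1
```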

## Common Use Cases

### 1. API Integration Agents

For agents that make many API calls with large responses:

```python
api_strategy = LargeToolResultPruningStrategy(
    max_tool_result_tokens=5_000,  # API responses can be large
    compression_template="[API Response compressed: {original_size} → {compressed_size} tokens]"
)

conversation_manager = PruningConversationManager(
    pruning_strategies=[api_strategy],
    preserve_recent_messages=4,  # Keep recent API context
    pruning_threshold=0.7
)
```

### 2. Data Analysis Agents

For agents processing large datasets:

```python
data_strategy = LargeToolResultPruningStrategy(
    max_tool_result_tokens=15_000,  # Data outputs can be very large
)

conversation_manager = PruningConversationManager(
    pruning_strategies=[data_strategy],
    preserve_initial_messages=3,  # Keep data setup context
    preserve_recent_messages=5,   # Keep recent analysis
    pruning_threshold=0.6         # Aggressive pruning for large data
)
```

### 3. Code Generation Agents

For agents that generate and execute code:

```python
import copy

# Custom strategy for code execution results
class CodeExecutionPruningStrategy(PruningStrategy):
    def should_prune_message(self, message, context):
        # Prune large code execution outputs
        if context["has_tool_result"]:
            for content in message.get("content", []):
                if "toolResult" in content:
                    # Check if it's a code execution result
                    tool_result = content["toolResult"]
                    if "code_execution" in str(tool_result).lower():
                        return context["token_count"] > 2000
        return False

    def prune_message(self, message, agent):
        # Compress code execution results
        # Deep copy so nested edits do not mutate the original message
        pruned_message = copy.deepcopy(message)
        for content in pruned_message.get("content", []):
            if "toolResult" in content:
                result = content["toolResult"]
                if result.get("content"):
                    # Keep only the first and last few lines of output
                    for result_content in result["content"]:
                        if "text" in result_content:
                            lines = result_content["text"].split('\n')
                            if len(lines) > 20:
                                compressed = (
                                    '\n'.join(lines[:5]) +
                                    f'\n... [{len(lines) - 10} lines omitted] ...\n' +
                                    '\n'.join(lines[-5:])
                                )
                                result_content["text"] = compressed
        return pruned_message

    def get_strategy_name(self):
        return "CodeExecutionPruningStrategy"

code_strategy = CodeExecutionPruningStrategy()
conversation_manager = PruningConversationManager(
    pruning_strategies=[code_strategy],
    preserve_recent_messages=3
)
```

This guide has shown how to use conversation pruning to manage context while preserving important information and the conversation's structure.