
Commit 0501e6a

simplify
1 parent 4311d96 commit 0501e6a

File tree

1 file changed: +15, -249 lines changed

docs/user-guides/advanced/bot-thinking-guardrails.md

Lines changed: 15 additions & 249 deletions
@@ -4,20 +4,13 @@ Modern reasoning-capable LLMs expose their internal thought process as reasoning
 
 NeMo Guardrails allows you to inspect and control these reasoning traces by extracting them and making them available throughout your guardrails configuration. This enables you to write guardrails that can block responses based on the model's reasoning process, enhance moderation decisions with reasoning context, or monitor reasoning patterns.
 
-This guide shows you how to access and guardrail reasoning content using the three main mechanisms NeMo Guardrails provides.
-
 ```{important}
-The examples in this guide range from minimal toy examples (for understanding concepts) to complete reference implementations. They are designed to teach you how to access and work with `bot_thinking` in different contexts, not as production-ready code to copy-paste.
-
-- **Toy examples** demonstrate basic variable access and simple pattern matching for learning purposes
-- **Reference implementations** show complete working configurations from the repository ([examples/configs/self_check_thinking/](https://github.com/NVIDIA/NeMo-Guardrails/tree/bc799fbb05e1f12f1ba79461f7f7378b3af50c22/examples/configs/self_check_thinking))
-
-You should adapt these patterns to your specific use case with appropriate validation, error handling, and business logic for your application.
+The examples in this guide range from minimal toy examples (for understanding concepts) to complete reference implementations. They are designed to teach you how to access and work with `bot_thinking` in different contexts, not as production-ready code to copy-paste. Adapt these patterns to your specific use case with appropriate validation, error handling, and business logic for your application.
 ```
 
-## How Reasoning Content is Made Available
+## Accessing Reasoning Content
 
-When an LLM generates a response with reasoning traces, NeMo Guardrails automatically extracts the reasoning and makes it available through:
+When an LLM generates a response with reasoning traces, NeMo Guardrails automatically extracts the reasoning and makes it available in three ways:
 
 ### In Colang Flows: `$bot_thinking` Variable
 
@@ -26,29 +19,22 @@ The reasoning content is available as a context variable in Colang output rails:
 ```colang
 define flow check_reasoning
   if $bot_thinking
-    # Access reasoning content
     $captured_reasoning = $bot_thinking
 ```
 
-```{note}
-This is a minimal example to demonstrate variable access. See [Using Reasoning in Self-Check Output](#using-reasoning-in-self-check-output) for a complete reference implementation.
-```
-
-### In Actions: `context.get("bot_thinking")`
+### In Custom Actions: `context.get("bot_thinking")`
 
-When writing custom Python actions, you can access the reasoning via the context dictionary:
+When writing Python actions, you can access the reasoning via the context dictionary:
 
 ```python
 @action(is_system_action=True)
-async def my_custom_check(context: Optional[dict] = None):
+async def check_reasoning(context: Optional[dict] = None):
     bot_thinking = context.get("bot_thinking")
     if bot_thinking and "sensitive" in bot_thinking:
         return False
     return True
 ```
 
-This is the same pattern used in the built-in `self_check_output` action, which accesses `bot_thinking` from context and passes it to the prompt template.
-
 ### In Prompt Templates: `{{ bot_thinking }}`
 
 When rendering prompts for LLM tasks (like self-check output), the reasoning is available as a Jinja2 template variable:
@@ -66,26 +52,13 @@ prompts:
       Should this be blocked (Yes or No)?
 ```
 
+**Important**: Always check if reasoning exists before using it, as not all models provide reasoning traces.
+
 ## Guardrailing with Output Rails
 
 Output rails can use the `$bot_thinking` variable to inspect and control responses based on reasoning content.
 
-### Basic Example: Checking for Reasoning
-
-```colang
-define flow log_reasoning_presence
-  if $bot_thinking
-    $has_reasoning = True
-    log "Reasoning detected: {$bot_thinking}"
-  else
-    $has_reasoning = False
-```
-
-```{note}
-This is a toy example for learning purposes. For production use, see [Using Reasoning in Self-Check Output](#using-reasoning-in-self-check-output) below.
-```
-
-### Blocking Based on Reasoning Patterns
+### Basic Pattern Matching
 
 ```colang
 define bot refuse to respond
@@ -113,9 +86,9 @@ This demonstrates basic pattern matching for learning purposes. Real implementat
 
 ## Guardrailing with Custom Actions
 
-You can write custom Python actions that access reasoning content through the context dictionary.
+For complex validation logic or reusable checks across multiple flows, write custom Python actions:
 
-Create a file `config/actions.py`:
+**config/actions.py**:
 
 ```python
 from typing import Optional
@@ -141,7 +114,7 @@ async def check_reasoning_quality(context: Optional[dict] = None):
     return True
 ```
 
-Create a flow that uses this action in `config/rails/reasoning_check.co`:
+**config/rails/reasoning_check.co**:
 
 ```colang
 define bot refuse to respond
@@ -164,13 +137,9 @@ rails:
       - quality_check_reasoning
 ```
 
-```{note}
-This example shows how to structure a custom action that accesses `bot_thinking`. Adapt the validation logic to your specific requirements.
-```
-
 ## Using Reasoning in Self-Check Output
 
-This is the **complete reference implementation** from [examples/configs/self_check_thinking/](https://github.com/NVIDIA/NeMo-Guardrails/tree/bc799fbb05e1f12f1ba79461f7f7378b3af50c22/examples/configs/self_check_thinking) showing how `bot_thinking` is actually used in the codebase. This pattern provides reasoning traces to your self-check output rail, allowing the moderation LLM to make more informed decisions by considering both the response and the reasoning process.
+This is the **complete reference implementation** from [examples/configs/self_check_thinking/](https://github.com/NVIDIA/NeMo-Guardrails/tree/bc799fbb05e1f12f1ba79461f7f7378b3af50c22/examples/configs/self_check_thinking), showing how `bot_thinking` is used in practice. This pattern provides reasoning traces to your self-check output rail, allowing the moderation LLM to make more informed decisions.
 
 ### Configuration
 
@@ -218,214 +187,11 @@ prompts:
 
 The `{% if bot_thinking %}` conditional ensures the prompt works with both reasoning and non-reasoning models. When reasoning is available, the self-check LLM can evaluate both the final response and the reasoning process.
 
-**Explore the complete implementation**: You can find the full working configuration in [examples/configs/self_check_thinking/](https://github.com/NVIDIA/NeMo-Guardrails/tree/bc799fbb05e1f12f1ba79461f7f7378b3af50c22/examples/configs/self_check_thinking) with all files (config.yml, prompts.yml) ready to use as a reference for your own implementation.
-
-## Managing Reasoning in API Responses
-
-### Breaking Change in v0.18.0
-
-Prior to v0.18.0, reasoning traces were prepended directly to the response content. Starting from v0.18.0, reasoning is handled separately:
-
-### With `GenerationOptions` (Structured Access)
-
-When using the Python API with `GenerationOptions`, reasoning is available in the separate `reasoning_content` field:
-
-```python
-from nemoguardrails import RailsConfig, LLMRails
-
-config = RailsConfig.from_path("./config")
-rails = LLMRails(config)
-
-result = rails.generate_async(
-    messages=[{"role": "user", "content": "What is 2+2?"}]
-)
-
-if result.reasoning_content:
-    print("Reasoning:", result.reasoning_content)
-
-print("Response:", result.response[0]["content"])
-```
-
-This is the recommended approach as it provides clean separation between reasoning and response content.
-
-### Without `GenerationOptions` (Tagged String)
-
-When calling without `GenerationOptions` (e.g., via dict/string response), reasoning is wrapped in `<thinking>` tags:
-
-```python
-response = rails.generate(
-    messages=[{"role": "user", "content": "What is 2+2?"}]
-)
-
-print(response["content"])
-```
-
-Output:
-
-```
-<think>Let me calculate: 2 plus 2 equals 4.</think>
-The answer is 4.
-```
-
-This maintains backward compatibility while still separating reasoning from the actual response.
-
-## Complete Working Example
-
-This example combines multiple patterns for demonstration purposes. Use it as a reference to understand how different access methods can work together, but adapt each component to your specific needs.
-
-### Directory Structure
-
-```
-config/
-├── config.yml
-├── prompts.yml
-├── actions.py
-└── rails/
-    └── reasoning_checks.co
-```
-
-### Model and Rails Configuration
-
-**config.yml**:
-
-```yaml
-models:
-  - type: main
-    engine: <your_engine>
-    model: <your_reasoning_model>
-  - type: self_check_output
-    model: <your_moderation_model>
-    engine: <your_engine>
-
-rails:
-  output:
-    flows:
-      - self check output
-      - block_sensitive_reasoning
-      - custom_reasoning_check
-```
-
-### Self-Check Output Prompt
-
-**prompts.yml**:
-
-```yaml
-prompts:
-  - task: self_check_output
-    content: |
-      Your task is to check if the bot message complies with company policy.
-
-      Company policy:
-      - No explicit content
-      - No abusive language
-      - No harmful content
-
-      Bot message: "{{ bot_response }}"
-
-      {% if bot_thinking %}
-      Bot reasoning: "{{ bot_thinking }}"
-      {% endif %}
-
-      Should this be blocked (Yes or No)?
-      Answer:
-```
-
-### Custom Reasoning Validation
-
-**actions.py**:
-
-```python
-from typing import Optional
-from nemoguardrails.actions import action
-
-@action(is_system_action=True)
-async def check_reasoning_patterns(context: Optional[dict] = None):
-    bot_thinking = context.get("bot_thinking")
-
-    if not bot_thinking:
-        return True
-
-    blocked_terms = ["proprietary", "confidential", "trade secret"]
-
-    for term in blocked_terms:
-        if term in bot_thinking.lower():
-            return False
-
-    return True
-```
-
-### Reasoning Guard Flows
-
-**rails/reasoning_checks.co**:
-
-```colang
-define bot refuse to respond
-  "I'm sorry, I can't respond to that."
-
-define flow block_sensitive_reasoning
-  if $bot_thinking
-    if "internal only" in $bot_thinking
-      bot refuse to respond
-      stop
-
-define flow custom_reasoning_check
-  $is_allowed = execute check_reasoning_patterns
-
-  if not $is_allowed
-    bot refuse to respond
-    stop
-```
-
-## Best Practices
-
-### Always Use Conditional Checks
-
-Always check if reasoning exists before using it, as not all models provide reasoning traces:
-
-```colang
-if $bot_thinking
-  # Your logic here
-```
-
-```python
-bot_thinking = context.get("bot_thinking")
-if bot_thinking:
-    # Your logic here
-```
-
-```yaml
-{% if bot_thinking %}
-Bot reasoning: "{{ bot_thinking }}"
-{% endif %}
-```
-
-### When to Use Rails vs Custom Actions
-
-- **Use Colang Rails** for simple pattern matching and logic directly in your flow definitions
-- **Use Custom Actions** for complex logic, external API calls, or when you need reusable validation logic across multiple flows
-
-### Privacy Considerations
-
-Reasoning traces may contain sensitive information about the model's decision-making process. Consider:
-
-- Filtering or redacting sensitive content before logging reasoning traces
-- Applying guardrails to prevent reasoning from revealing proprietary information
-- Being mindful of what reasoning content is included in API responses
-
-### Model Configuration
-
-Ensure you're using a reasoning-capable model as your main model:
-
-```yaml
-models:
-  - type: main
-    engine: <your_engine>
-    model: <your_reasoning_model>
-```
+**Explore the complete implementation**: You can find the full working configuration in [examples/configs/self_check_thinking/](https://github.com/NVIDIA/NeMo-Guardrails/tree/bc799fbb05e1f12f1ba79461f7f7378b3af50c22/examples/configs/self_check_thinking) with all files ready to use as a reference for your own implementation.
 
 ## See Also
 
+- [LLM Configuration - Using LLMs with Reasoning Traces](../configuration-guide/llm-configuration.md#using-llms-with-reasoning-traces) - API response handling and breaking changes
 - [Output Rails](../../getting-started/5-output-rails/README.md) - General guide on output rails
-- [Generation Options](./generation-options.md) - Using generation options for structured API responses
 - [Self-Check Output Example](https://github.com/NVIDIA/NeMo-Guardrails/tree/bc799fbb05e1f12f1ba79461f7f7378b3af50c22/examples/configs/self_check_thinking) - Complete working configuration
 - [Custom Actions](../../colang-language-syntax-guide.md#actions) - Guide on writing custom actions
